From robbin.ehn at oracle.com Mon Mar 2 10:16:44 2020
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 2 Mar 2020 11:16:44 +0100
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
Message-ID:

Hi,

On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> Hi,
>
> I had a look at the progress of this change. Nothing
> happened since Richard posted his update using more
> handshakes [1].
> But we (SAP) would appreciate it a lot if this change could
> be successfully reviewed and pushed.
>
> I think there is basic understanding that this
> change is helpful. It fixes a number of issues with JVMTI,
> and will deliver the same performance benefits as EA
> does in current production mode for debugging scenarios.
>
> This is important for us as we run our VMs prepared
> for debugging in production mode.
>
> I understand that Robbin proposed to replace the usage of
> _suspend_flag with handshakes. Apparently, async handshakes
> are needed to do so. We have been waiting a while for removal
> of the _suspend_flag / introduction of async handshakes [2].
> What is the status here?

I have an old prototype which I would like to continue to work on.
So do not assume async handshakes will make 15.
Even if they did, I think there is a lot more investigative work needed to remove
_suspend_flag.

>
> I think we should no longer wait, but proceed with
> this change. We will look into removing the usage of
> suspend_flag introduced here once it is possible to implement
> it with handshakes.

Yes, sure.

>> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/

DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
You can move both declaration and definition to that file, no need to clobber
thread.[c|h]pp.
(and the static function deopt_objs_alot_thread_entry)

Does JvmtiDeferredUpdates really need to be in thread.hpp? Can't it be in its own
hpp file? It doesn't seem right to add JVM TI classes to thread.hpp.

Note that we also think we may have a bug in deopt:
https://bugs.openjdk.java.net/browse/JDK-8238237

I think it would be best, if possible, to push after that is resolved.

Not even nearly a full review :)

Thanks, Robbin

>> Incremental:
>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
>>
>> I was not able to eliminate the additional suspend flag now. I'll take care of this as soon as the
>> existing suspend-resume-mechanism is reworked.
>>
>> Testing:
>>
>> Nightly tests @SAP:
>>
>> JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests
>> with fastdebug and release builds on all platforms
>>
>> Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel for 24h
>>
>> Thanks, Richard.
>>
>>
>> More details on the changes:
>>
>> * Hide DeoptimizeObjectsALotThread from external view.
>>
>> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
>>   It used to be _safepoint_check_sometimes, which will be eliminated sooner or later.
>>   I added explicit thread state changes with ThreadBlockInVM to code paths where we can wait()
>>   on EscapeBarrier_lock to become safepoint safe.
>>
>> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of vm operation
>>   VM_ThreadSuspendAllForObjDeopt.
>>
>> * Removed uses of Threads_lock. When adding a new thread we suspend it iff EA optimizations are
>>   being reverted. In the previous version we were waiting on Threads_lock while EA optimizations
>>   were reverted. See EscapeBarrier::thread_added().
>>
>> * Made tests require Xmixed compilation mode.
>>
>> * Made tests agnostic regarding tiered compilation.
>>   I.e.
tc isn't disabled anymore, and the tests can be run with tc enabled or disabled.
>>
>> * Exercising EATests.java as well with stress test options DeoptimizeObjectsALot*
>>   Due to the non-deterministic deoptimizations some tests need to be skipped.
>>   We do this to prevent bit-rot of the stress test code.
>>
>> * Executing EATests.java as well with graal if available. Driver for this is
>>   EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all the new debug info
>>   (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
>>   And graal does not yet support the JVMTI operations force early return and pop frame.
>>
>> * Removed tracing from new jdi tests in EATests.java. Too much trace output before the debugging
>>   connection is established can cause deadlock because output buffers fill up.
>>   (See https://bugs.openjdk.java.net/browse/JDK-8173304)
>>
>> * Many copyright year changes and smaller clean-up changes of testing code (trailing white-space and
>>   the like).
>>
>>
>> -----Original Message-----
>> From: David Holmes
>> Sent: Thursday, 19 December 2019 03:12
>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> I think my issue is with the way EliminateNestedLocks works so I'm going
>> to look into that more deeply.
>>
>> Thanks for the explanations.
>>
>> David
>>
>> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>> > > > Some further queries/concerns:
>>> > > >
>>> > > > src/hotspot/share/runtime/objectMonitor.cpp
>>> > > >
>>> > > > Can you please explain the changes to ObjectMonitor::wait:
>>> > > >
>>> > > > ! _recursions = save // restore the old recursion count
>>> > > > !
+ jt->get_and_reset_relock_count_after_wait(); //
>>> > > > increased by the deferred relock count
>>> > > >
>>> > > > what is the "deferred relock count"? I gather it relates to
>>> > > >
>>> > > > "The code was extended to be able to deoptimize objects of a frame that
>>> > > > is not the top frame and to let another thread than the owning thread do
>>> > > > it."
>>> > >
>>> > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
>>> > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
>>> > > locking. New with the enhancement is that we do this also just before object references are
>>> > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we
>>> > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized,
>>> > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again
>>> > > (relocking twice would be incorrect of course). Deferred updates are copied into the new
>>> > > interpreter frames.
>>> > >
>>> > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to
>>> > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking
>>> > > is deferred until T owns the monitor again. This is what the piece of code above does.
>>> >
>>> > Sorry I need some more detail here. How can you wait() on an object
>>> > monitor if the object allocation and/or locking was optimised away? And
>>> > what is a "non-local object" in this context? Isn't EA restricted to
>>> > thread-confined objects?
>>>
>>> "Non-local object" is an object that escapes its thread. The issue I'm addressing with the changes
>>> in ObjectMonitor::wait is almost unrelated to EA.
It is caused by EliminateNestedLocks, where C2
>>> eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it
>>> is locked and you can call wait() on it.
>>>
>>> EliminateLocks is the C2 option that controls lock elimination based on EA. Both optimizations have
>>> in common that objects with eliminated locking need to be relocked when deoptimizing a frame,
>>> i.e. when replacing a compiled frame with equivalent interpreter
>>> frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can
>>> be a mix of eliminated nested locks and locks of not-escaping objects.
>>>
>>> New with the enhancement: I call relock_objects earlier, just before objects potentially
>>> escape. But then later when the owning compiled frame gets deoptimized, I must not do it again:
>>>
>>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
>>>
>>>   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
>>>       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
>>>     bool unused;
>>>     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
>>>   }
>>>
>>> Now when calling relock_objects early it is quite possible that I have to relock an object the
>>> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
>>> introduce relock_count_after_wait to JavaThread.
>>>
>>> > Is it just that some of the locking gets optimized away e.g.
>>> >
>>> > synchronized(obj) {
>>> >   synchronized(obj) {
>>> >     synchronized(obj) {
>>> >       obj.wait();
>>> >     }
>>> >   }
>>> > }
>>> >
>>> > If this is reduced to a form as-if it were a single lock of the monitor
>>> > (due to EA) and the wait() triggers a JVM TI event which leads to the
>>> > escape of "obj" then we need to reconstruct the true lock state, and so
>>> > when the wait() internally unblocks and reacquires the monitor it has to
>>> > set the true recursion count to 3, not the 1 that it appeared to be when
>>> > wait() was initially called. Is that the scenario?
>>>
>>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
>>> triggered by wait.
>>>
>>> Add
>>>
>>> LocalObject l1 = new LocalObject();
>>>
>>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
>>> question.
>>>
>>> See that relocking/reallocating is transactional. If it is done, then it is done for /all/ objects
>>> in scope and it is done at most once. It wouldn't be quite so easy to split this into relocking of
>>> nested and EA-based eliminated locks.
>>>
>>> > If so I find this truly awful. Anyone using wait() in a realistic form
>>> > requires a notification and so the object cannot be thread confined. In
>>>
>>> It is not thread confined.
>>>
>>> > which case I would strongly argue that upon hitting the wait() the deopt
>>> > should occur unconditionally and so the lock state is correct before we
>>> > wait and so we don't need to mess with the recursion count internally
>>> > when we reacquire the monitor.
>>> >
>>> > >
>>> > > > which I don't like the sound of at all when it comes to ObjectMonitor
>>> > > > state. So I'd like to understand in detail exactly what is going on here
>>> > > > and why. This is a very intrusive change that seems to badly break
>>> > > > encapsulation and impacts future changes to ObjectMonitor that are under
>>> > > > investigation.
>>> > >
>>> > > I would not regard this as breaking encapsulation. Certainly not badly.
>>> > >
>>> > > I've added a property relock_count_after_wait to JavaThread. The property is well
>>> > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free
>>> > > in choosing a way to do that as long as that property is taken into account. This is hardly a
>>> > > limitation.
>>> >
>>> > I do think this badly breaks encapsulation as you have to add a callout
>>> > from the guts of the ObjectMonitor code to reach into the thread to get
>>> > this lock count adjustment. I understand why you have had to do this but
>>> > I would much rather see a change to the EA optimisation strategy so that
>>> > this is not needed.
>>> >
>>> > > Note also that the property is a straightforward extension of the existing concept of deferred
>>> > > local updates. It is embedded into the structure holding them. So not even the footprint of a
>>> > > JavaThread is enlarged if no deferred updates are generated.
>>> >
>>> > [...]
>>> >
>>> > >
>>> > > I'm actually duplicating the existing external suspend mechanism, because a thread can be
>>> > > suspended at most once. And hey, I don't like that either! But it seems not unlikely that the
>>> > > duplicate can be removed together with the original and the new type of handshakes that will be
>>> > > used for thread suspend can be used for object deoptimization too. See today's discussion in
>>> > > JDK-8227745 [2].
>>> >
>>> > I hope that discussion bears some fruit, at the moment it seems not to
>>> > be possible to use handshakes here. :(
>>> >
>>> > The external suspend mechanism is a royal pain in the proverbial that we
>>> > have to carefully live with. The idea that we're duplicating that for
>>> > use in another fringe area of functionality does not thrill me at all.
>>> >
>>> > To be clear, I understand the problem that exists and that you wish to
>>> > solve, but for the runtime parts I balk at the complexity cost of
>>> > solving it.
>>>
>>> I know it's complex, but it is by no means rocket science.
>>>
>>> Also I find it hard to imagine another fix for JDK-8233915 besides changing the JVM TI specification.
>>>
>>> Thanks, Richard.
>>>
>>> -----Original Message-----
>>> From: David Holmes
>>> Sent: Tuesday, 17 December 2019 08:03
>>> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>>>
>>>
>>>
>>> David
>>>
>>> On 17/12/2019 4:57 pm, David Holmes wrote:
>>>> Hi Richard,
>>>>
>>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
>>>>> Hi David,
>>>>>
>>>>>    > Some further queries/concerns:
>>>>>    >
>>>>>    > src/hotspot/share/runtime/objectMonitor.cpp
>>>>>    >
>>>>>    > Can you please explain the changes to ObjectMonitor::wait:
>>>>>    >
>>>>>    > !   _recursions = save      // restore the old recursion count
>>>>>    > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>>    > increased by the deferred relock count
>>>>>    >
>>>>>    > what is the "deferred relock count"? I gather it relates to
>>>>>    >
>>>>>    > "The code was extended to be able to deoptimize objects of a frame that
>>>>>    > is not the top frame and to let another thread than the owning thread do
>>>>>    > it."
>>>>>
>>>>> Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
>>>>> replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
>>>>> locking.
New with the enhancement is that we do this also just before
>>>>> object references are acquired through JVMTI. In this case we deoptimize also the owning compiled
>>>>> frame C and we register deoptimized objects as deferred updates. When control returns to C it
>>>>> gets deoptimized, we notice that objects are already deoptimized (reallocated and relocked), so we
>>>>> don't do it again (relocking twice would be incorrect of course). Deferred updates are copied into
>>>>> the new interpreter frames.
>>>>>
>>>>> Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to be
>>>>> relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking is
>>>>> deferred until T owns the monitor again. This is what the piece of code above does.
>>>>
>>>> Sorry I need some more detail here. How can you wait() on an object
>>>> monitor if the object allocation and/or locking was optimised away? And
>>>> what is a "non-local object" in this context? Isn't EA restricted to
>>>> thread-confined objects?
>>>>
>>>> Is it just that some of the locking gets optimized away e.g.
>>>>
>>>> synchronized(obj) {
>>>>   synchronized(obj) {
>>>>     synchronized(obj) {
>>>>       obj.wait();
>>>>     }
>>>>   }
>>>> }
>>>>
>>>> If this is reduced to a form as-if it were a single lock of the monitor
>>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
>>>> escape of "obj" then we need to reconstruct the true lock state, and so
>>>> when the wait() internally unblocks and reacquires the monitor it has to
>>>> set the true recursion count to 3, not the 1 that it appeared to be when
>>>> wait() was initially called. Is that the scenario?
>>>>
>>>> If so I find this truly awful. Anyone using wait() in a realistic form
>>>> requires a notification and so the object cannot be thread confined.
In
>>>> which case I would strongly argue that upon hitting the wait() the deopt
>>>> should occur unconditionally and so the lock state is correct before we
>>>> wait and so we don't need to mess with the recursion count internally
>>>> when we reacquire the monitor.
>>>>
>>>>>
>>>>>    > which I don't like the sound of at all when it comes to ObjectMonitor
>>>>>    > state. So I'd like to understand in detail exactly what is going on here
>>>>>    > and why. This is a very intrusive change that seems to badly break
>>>>>    > encapsulation and impacts future changes to ObjectMonitor that are under
>>>>>    > investigation.
>>>>>
>>>>> I would not regard this as breaking encapsulation. Certainly not badly.
>>>>>
>>>>> I've added a property relock_count_after_wait to JavaThread. The property is well
>>>>> encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free in
>>>>> choosing a way to do that as long as that property is taken into account. This is hardly a
>>>>> limitation.
>>>>
>>>> I do think this badly breaks encapsulation as you have to add a callout
>>>> from the guts of the ObjectMonitor code to reach into the thread to get
>>>> this lock count adjustment. I understand why you have had to do this but
>>>> I would much rather see a change to the EA optimisation strategy so that
>>>> this is not needed.
>>>>
>>>>> Note also that the property is a straight forward extension of the existing concept of deferred
>>>>> local updates. It is embedded into the structure holding them. So not even the footprint of a
>>>>> JavaThread is enlarged if no deferred updates are generated.
>>>>>
>>>>>    > ---
>>>>>    >
>>>>>    > src/hotspot/share/runtime/thread.cpp
>>>>>    >
>>>>>    > Can you please explain why JavaThread::wait_for_object_deoptimization
>>>>>    > has to be handcrafted in this way rather than using proper transitions.
>>>>>
>>>>>
>>>>> I wrote wait_for_object_deoptimization taking JavaThread::java_suspend_self_with_safepoint_check
>>>>> as template. So in short: for the same reasons :)
>>>>>
>>>>> Threads reach both methods as part of thread state transitions, therefore special handling is
>>>>> required to change thread state on top of ongoing transitions.
>>>>>
>>>>>    > We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>>>    > it being added back (effectively). This seems like it may be something
>>>>>    > that handshakes could be used for.
>>>>>
>>>>> Deopt suspend used to be something rather different with a similar name[1]. It is not being added back.
>>>>
>>>> I stand corrected. Despite comments in the code to the contrary
>>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
>>>> cleanup in this area 13 years ago :)
>>>>
>>>>>
>>>>> I'm actually duplicating the existing external suspend mechanism, because a thread can be suspended
>>>>> at most once. And hey, I don't like that either! But it seems not unlikely that the duplicate can
>>>>> be removed together with the original and the new type of handshakes that will be used for
>>>>> thread suspend can be used for object deoptimization too. See today's
>>>>> discussion in JDK-8227745 [2].
>>>>
>>>> I hope that discussion bears some fruit, at the moment it seems not to
>>>> be possible to use handshakes here. :(
>>>>
>>>> The external suspend mechanism is a royal pain in the proverbial that we
>>>> have to carefully live with. The idea that we're duplicating that for
>>>> use in another fringe area of functionality does not thrill me at all.
>>>>
>>>> To be clear, I understand the problem that exists and that you wish to
>>>> solve, but for the runtime parts I balk at the complexity cost of
>>>> solving it.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Thanks, Richard.
>>>>>
>>>>> [1] Deopt suspend was something like an async. handshake for architectures with register windows,
>>>>>     where patching the return pc for deoptimization of a compiled frame was racy if the owner thread
>>>>>     was in native code. Instead a "deopt" suspend flag was set on which the thread patched its own
>>>>>     frame upon return from native. So no thread was suspended. It got its name only from the name of
>>>>>     the flags.
>>>>>
>>>>> [2] Discussion about using handshakes to sync. with the target thread:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes
>>>>> Sent: Friday, 13 December 2019 00:56
>>>>> To: Reingruber, Richard ;
>>>>> serviceability-dev at openjdk.java.net;
>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>> Performance in the Presence of JVMTI Agents
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> Some further queries/concerns:
>>>>>
>>>>> src/hotspot/share/runtime/objectMonitor.cpp
>>>>>
>>>>> Can you please explain the changes to ObjectMonitor::wait:
>>>>>
>>>>> !   _recursions = save      // restore the old recursion count
>>>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>> increased by the deferred relock count
>>>>>
>>>>> what is the "deferred relock count"? I gather it relates to
>>>>>
>>>>> "The code was extended to be able to deoptimize objects of a frame that
>>>>> is not the top frame and to let another thread than the owning thread do
>>>>> it."
>>>>>
>>>>> which I don't like the sound of at all when it comes to ObjectMonitor
>>>>> state. So I'd like to understand in detail exactly what is going on here
>>>>> and why.
This is a very intrusive change that seems to badly break
>>>>> encapsulation and impacts future changes to ObjectMonitor that are under
>>>>> investigation.
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/thread.cpp
>>>>>
>>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
>>>>> has to be handcrafted in this way rather than using proper transitions.
>>>>>
>>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>>> it being added back (effectively). This seems like it may be something
>>>>> that handshakes could be used for.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>> On 12/12/2019 7:02 am, David Holmes wrote:
>>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>>     > Most of the details here are in areas I can comment on in detail, but I
>>>>>>>     > did take an initial general look at things.
>>>>>>>
>>>>>>> Thanks for taking the time!
>>>>>>
>>>>>> Apologies the above should read:
>>>>>>
>>>>>> "Most of the details here are in areas I *can't* comment on in detail
>>>>>> ..."
>>>>>>
>>>>>> David
>>>>>>
>>>>>>>     > The only thing that jumped out at me is that I think the
>>>>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>>     >
>>>>>>>     > +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>
>>>>>>> Yes, it should. Will add the method like above.
>>>>>>>
>>>>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>>>>     > active testing this will just bit-rot.
>>>>>>>
>>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger workload. I will add a minimal test
>>>>>>> to keep it fresh.
>>>>>>>
>>>>>>>     > Also on the tests I don't understand your @requires clause:
>>>>>>>     >
>>>>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>>     > (vm.opt.TieredCompilation != true))
>>>>>>>     >
>>>>>>>
> This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>>     > our normal mode of operation. ??
>>>>>>>     >
>>>>>>>
>>>>>>> I removed the clause. I guess I wanted to target the tests towards the code they are supposed to
>>>>>>> test, and it's easier to analyze failures w/o tiered compilation and with just one compiler thread.
>>>>>>>
>>>>>>> Additionally I will make use of compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: David Holmes
>>>>>>> Sent: Wednesday, 11 December 2019 08:03
>>>>>>> To: Reingruber, Richard ;
>>>>>>> serviceability-dev at openjdk.java.net;
>>>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>>>> Performance in the Presence of JVMTI Agents
>>>>>>>
>>>>>>> Hi Richard,
>>>>>>>
>>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I would like to get reviews please for
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>>>>
>>>>>>>> Corresponding RFE:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>>>>
>>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>>>>
>>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
>>>>>>>> change is being tested at SAP since I posted the first RFR some months ago.
>>>>>>>>
>>>>>>>> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
>>>>>>>> agents request capabilities that allow them to access local variable values. E.g.
if you start-up
>>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
>>>>>>>> from the beginning, well before a debugger attaches -- if ever one should do so. With the
>>>>>>>> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
>>>>>>>> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
>>>>>>>> you'll find more details.
>>>>>>>
>>>>>>> Most of the details here are in areas I can comment on in detail, but I
>>>>>>> did take an initial general look at things.
>>>>>>>
>>>>>>> The only thing that jumped out at me is that I think the
>>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>>
>>>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>
>>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>>>> active testing this will just bit-rot.
>>>>>>>
>>>>>>> Also on the tests I don't understand your @requires clause:
>>>>>>>
>>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>> (vm.opt.TieredCompilation != true))
>>>>>>>
>>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>> our normal mode of operation. ??
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>>>>
>>>>>>>>

From kevin.walls at oracle.com Mon Mar 2 10:47:16 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Mon, 2 Mar 2020 10:47:16 +0000
Subject: RFR(S): hs_err elapsed time in seconds is not accurate enough
Message-ID: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>

Hi,

(s11y and runtime opinions both relevant)

A few times in the last month I've really wanted to compare the Events
logged in the hs_err file, and the time of the JVM's crash.

"elapsed time" in hs_err is only accurate to one second, and has been
since before jdk5 was created.

The diff below changes the format string and uses the non-rounded time
value (I don't see a need to change the other integer arithmetic
here), and we can enjoy hs_errs with detail like:

...
Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds (0d 0h 0m 5s)
...

Thanks
Kevin

/jdk/open$ hg diff
diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp
--- a/src/hotspot/share/runtime/os.cpp
+++ b/src/hotspot/share/runtime/os.cpp
@@ -1016,9 +1016,8 @@
   }

   double t = os::elapsedTime();
-  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
-  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
-  //       before printf. We lost some precision, but who cares?
+  // NOTE: a crash using printf("%f",...) on Linux was historically noted here
+  //       (before the jdk5 repo was created).
   int eltime = (int)t;  // elapsed time in seconds

   // print elapsed time in a human-readable format:
@@ -1029,7 +1028,7 @@
   int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
   int minute_secs = elmins * secs_per_min;
   int elsecs = (eltime - day_secs - hour_secs - minute_secs);
-
  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs);
+  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, eldays, elhours, elmins, elsecs);
 }

From kevin.walls at oracle.com Mon Mar 2 10:48:13 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Mon, 2 Mar 2020 10:48:13 +0000
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate enough
In-Reply-To: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
Message-ID:

Oops, and with the bug ID in the title and JBS link:

https://bugs.openjdk.java.net/browse/JDK-8240295

On 02/03/2020 10:47, Kevin Walls wrote:
> Hi,
>
> (s11y and runtime opinions both relevant)
>
> A few times in the last month I've really wanted to compare the Events
> logged in the hs_err file, and the time of the JVM's crash.
>
> "elapsed time" in hs_err is only accurate to one second, and has been
> since before jdk5 was created.
>
> The diff below changes the format string and uses the non-rounded time
> value (I don't see a need to change the other integer arithmetic
> here), and we can enjoy hs_errs with detail like:
>
> ...
> Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds (0d 0h 0m 5s)
> ...
>
> Thanks
> Kevin
>
>
> /jdk/open$ hg diff
> diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp
> --- a/src/hotspot/share/runtime/os.cpp
> +++ b/src/hotspot/share/runtime/os.cpp
> @@ -1016,9 +1016,8 @@
>    }
>
>    double t = os::elapsedTime();
> -  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
> -  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
> -  //       before printf. We lost some precision, but who cares?
> +  // NOTE: a crash using printf("%f",...) on Linux was historically noted here
> +  //       (before the jdk5 repo was created).
>    int eltime = (int)t;  // elapsed time in seconds
>
>
// print elapsed time in a human-readable format:
> @@ -1029,7 +1028,7 @@
>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>    int minute_secs = elmins * secs_per_min;
>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime,
> eldays, elhours, elmins, elsecs);
> +  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t,
> eldays, elhours, elmins, elsecs);
>  }
>
>

From linzang at tencent.com Mon Mar 2 13:56:52 2020
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Mon, 2 Mar 2020 13:56:52 +0000
Subject: JDK-8215624 add parallel heap inspection support for jmap histo(G1)(Internet mail)
In-Reply-To:
References: <11bca96c0e7745f5b2558cc49b42b996@tencent.com>
Message-ID: <2EDF28BF-94D5-4F2E-B96E-2C45948AD454@tencent.com>

Dear all,

Let me try to ease the review work with some explanation :P

The patch aims to speed up heap iteration for jmap -histo. In my
experience this is necessary when investigating large heaps. E.g., in a
big data scenario I ran jmap -histo against a 180GB heap, and it took
quite a while.

And if my understanding is correct, even jmap -histo without the "live"
option inspects the heap with the heap lock held, so it is very likely
to block mutator threads in allocation-sensitive scenarios. I would say
the faster the heap inspection runs, the shorter the time the mutators
are blocked. This is why parallel iteration for jmap is necessary.

I think parallel heap inspection should be applied to all kinds of
heaps. However, since the heap layout differs between GCs, it takes a
lot of time to understand every heap layout well enough to make the
whole change. IMO, it is not wise to put the whole solution into one
huge patch, which would also be even harder to review.
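As a rough illustration of why parallel iteration helps (only a sketch of the general idea, not code from the patch): each worker builds a private class histogram over its own slice of the objects, and the partial tables are merged once all workers are done. `Obj`, `Histogram`, `merge_into` and `parallel_histogram` are hypothetical names made up for this example, and a plain vector stands in for the heap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a heap object: only class name and size matter here.
struct Obj {
  std::string klass;
  std::size_t bytes;
};

// Per-class totals, similar in spirit to one row of a jmap -histo report.
struct Entry {
  std::size_t count = 0;
  std::size_t bytes = 0;
};
using Histogram = std::map<std::string, Entry>;

// Fold one worker's private table into the combined table.
void merge_into(Histogram& total, const Histogram& part) {
  for (const auto& kv : part) {
    total[kv.first].count += kv.second.count;
    total[kv.first].bytes += kv.second.bytes;
  }
}

// Each worker scans its own contiguous slice and writes only to its own
// table, so no locking is needed until the final single-threaded merge.
Histogram parallel_histogram(const std::vector<Obj>& heap, unsigned num_workers) {
  assert(num_workers > 0);
  std::vector<Histogram> parts(num_workers);
  std::vector<std::thread> workers;
  const std::size_t chunk = (heap.size() + num_workers - 1) / num_workers;
  for (unsigned w = 0; w < num_workers; ++w) {
    workers.emplace_back([&heap, &parts, chunk, w] {
      const std::size_t begin = std::min(heap.size(), w * chunk);
      const std::size_t end = std::min(heap.size(), begin + chunk);
      for (std::size_t i = begin; i < end; ++i) {
        Entry& e = parts[w][heap[i].klass];
        e.count += 1;
        e.bytes += heap[i].bytes;
      }
    });
  }
  for (auto& t : workers) {
    t.join();
  }
  Histogram total;
  for (const auto& p : parts) {
    merge_into(total, p);
  }
  return total;
}
```

Calling `parallel_histogram(objects, 2)` gives the same totals as a single-threaded scan; only the scanning work is split across workers. How a real heap is partitioned is GC specific and is not modelled here.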
So I plan to implement it incrementally. The first patch (this one) is
meant to settle the implementation details: how jmap accepts the new
option and passes it to the attachListener of the JVM process, how to
make the parallel inspection closure generic enough to be extended
easily to different heap layouts, and how to implement the heap
inspection for a specific GC's heap. This patch uses G1's heap as the
starting point.

This patch actually does several things:

1. Adds an option "parallelThreadNum=<N>" to jmap -histo. By default N
   is 0, which lets the JVM decide how many threads to use for heap
   inspection. Setting the option to 1 disables parallel heap
   inspection. (More details in the CSR:
   https://bugs.openjdk.java.net/browse/JDK-8239290)

2. Changes how jmap passes arguments, see
   http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html
   Originally the options were passed as separate arguments to
   attachListener; with this patch all options are composed into a
   single string. So arg_count_max in attachListener.hpp does not need
   to change, which avoids the compatibility issue discussed at
   https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html

3. Adds an abstract class ParHeapInspectTask in heapInspection.hpp /
   heapInspection.cpp. Its work(uint worker_id) method prepares the data
   structure (KlassInfoTable) needed by every parallel worker thread and
   then calls do_object_iterate_parallel(), which is the heap-specific
   implementation. I also added some mechanisms to KlassInfoTable to
   support parallel iteration, such as merge().

4. In the specific heap (G1 in this patch), creates a subclass of
   ParHeapInspectTask that implements do_object_iterate_parallel() for
   parallel heap inspection. For G1, it simply invokes g1CollectedHeap's
   object_iterate_parallel().

5. Adds a related test.

6. It should be easy to extend this patch to other kinds of heaps by
   creating a subclass of ParHeapInspectTask and implementing
   do_object_iterate_parallel().

I hope this information helps with the code review and initiates the
discussion :-)

Thanks!
BRs,
Lin

>On 2020/2/19, 9:40 AM, "linzang(臧琳)" wrote:
>
> Re-post this RFR with correct enhancement number to make it trackable.
> please ignore the previous wrong post. sorry for troubles.
>
> webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/
> Hi bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> --------------
> Lin
> >Hi Lin,
> >
> >Could you, please, re-post your RFR with the right enhancement number in
> >the message subject?
> >It will be more trackable this way.
> >
> >Thanks,
> >Serguei
> >
> >
> >On 2/17/20 10:29 PM, linzang(臧琳) wrote:
> >> Dear David,
> >> Thanks a lot!
> >> I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/.
> >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration.
> >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap.
> >>
> >> Thanks,
> >> --------------
> >> Lin
> >>> Hi Lin,
> >>>
> >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC
> >>> worker threads, and whether it needs to be extended beyond G1.
> >>> > >>> I happened to spot one nit when browsing: > >>> > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > >>> > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > >>> + BoolObjectClosure* filter, > >>> + size_t* missed_count, > >>> + size_t thread_num) { > >>> + return NULL; > >>> > >>> s/NULL/false/ > >>> > >>> Cheers, > >>> David > >>> > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > >>>> Dear All, > >>>> May I ask your help to review the follow changes: > >>>> webrev: > >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > >>>> my simple test shown it can speed up 2x of jmap -histo with > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. > >>>> > >>>> ------------------------------------------------------------------------ > >>>> BRs, > >>>> Lin > >> > > > From david.holmes at oracle.com Tue Mar 3 01:11:02 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Mar 2020 11:11:02 +1000 Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate enough In-Reply-To: References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com> Message-ID: <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com> Hi Kevin, On 2/03/2020 8:48 pm, Kevin Walls wrote: > Oops, and with the bug ID in the title and JBS link: > https://bugs.openjdk.java.net/browse/JDK-8240295 > > > On 02/03/2020 10:47, Kevin Walls wrote: >> Hi, >> >> (s11y and runtime opinions both relevant) >> >> A few times in the last month I've really wanted to compare the Events >> logged in the hs_err file, and the time of the JVM's crash. >> >> "elapsed time" in hs_err is only accurate to one second, and has been >> since before jdk5 was created. 
>>
>> The diff below changes the format string and uses the non-rounded time
>> value (I don't see a need to change the other integer arithmetic
>> here), and we can enjoy hs_errs with detail like:
>>
>> ...
>> Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds (0d
>> 0h 0m 5s)
>> ...
>>
>> Thanks
>> Kevin
>>
>>
>> /jdk/open$ hg diff
>> diff --git a/src/hotspot/share/runtime/os.cpp
>> b/src/hotspot/share/runtime/os.cpp
>> --- a/src/hotspot/share/runtime/os.cpp
>> +++ b/src/hotspot/share/runtime/os.cpp
>> @@ -1016,9 +1016,8 @@
>>    }
>>
>>    double t = os::elapsedTime();
>> -  // NOTE: It tends to crash after a SEGV if we want to
>> printf("%f",...) in
>> -  //       Linux. Must be a bug in glibc ? Workaround is to round "t"
>> to int
>> -  //       before printf. We lost some precision, but who cares?
>> +  // NOTE: a crash using printf("%f",...) on Linux was historically
>> noted here
>> +  //       (before the jdk5 repo was created).

Just because it is old doesn't mean it no longer applies. printf is not
async-signal-safe - we know that but we try to use it anyway. Maybe %f
is even less async-signal-safe? This may get through testing okay but
cause problems with real crashes in the field.

What about breaking the time up into two ints: seconds and nanos?

Cheers,
David
-----

>>    int eltime = (int)t;  // elapsed time in seconds
>>
>>    // print elapsed time in a human-readable format:
>> @@ -1029,7 +1028,7 @@
>>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>    int minute_secs = elmins * secs_per_min;
>>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime,
>> eldays, elhours, elmins, elsecs);
>> +  
st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, >> eldays, elhours, elmins, elsecs); >> ?} >> >> From ramkumar.sunderbabu at oracle.com Tue Mar 3 08:52:18 2020 From: ramkumar.sunderbabu at oracle.com (Ramkumar Sunderbabu) Date: Tue, 3 Mar 2020 00:52:18 -0800 (PST) Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test javax/management/loading/MletParserLocaleTest.java reduce default timeout Message-ID: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default> Hi all, Please review this patch. Removed "timeout=5" from the Tests so that default timeout is used. JBS: https://bugs.openjdk.java.net/browse/JDK-8153430 Webrev: http://cr.openjdk.java.net/~rsunderbabu/8153430/webrev.00/ Testing: Locally tested with "-Xcomp" option on a linux-64 machine. Thanks, Ram -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.reingruber at sap.com Tue Mar 3 20:22:46 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Tue, 3 Mar 2020 20:22:46 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Hi Robbin, > > I understand that Robbin proposed to replace the usage of > > _suspend_flag with handshakes. Apparently, async handshakes > > are needed to do so. We have been waiting a while for removal > > of the _suspend_flag / introduction of async handshakes [2]. > > What is the status here? > I have an old prototype which I would like to continue to work on. > So do not assume asynch handshakes will make 15. > Even if it would, I think there are a lot more investigate work to remove > _suspend_flag. Let us know, if we can be of any help to you and be it only testing. 
> >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > You can move both declaration and definition to that file, no need to clobber > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) Will do. > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's own > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. You are right. It shouldn't be declared in thread.hpp. I will look into that. > Note that we also think we may have a bug in deopt: > https://bugs.openjdk.java.net/browse/JDK-8238237 > I think it would be best, if possible, to push after that is resolved. Sure. > Not even nearly a full review :) I know :) Anyways, thanks a lot, Richard. -----Original Message----- From: Robbin Ehn Sent: Monday, March 2, 2020 11:17 AM To: Lindenmaier, Goetz ; Reingruber, Richard ; David Holmes ; Vladimir Kozlov (vladimir.kozlov at oracle.com) ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi, On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote: > Hi, > > I had a look at the progress of this change. Nothing > happened since Richard posted his update using more > handshakes [1]. > But we (SAP) would appreciate a lot if this change could > be successfully reviewed and pushed. > > I think there is basic understanding that this > change is helpful. It fixes a number of issues with JVMTI, > and will deliver the same performance benefits as EA > does in current production mode for debugging scenarios. > > This is important for us as we run our VMs prepared > for debugging in production mode. > > I understand that Robbin proposed to replace the usage of > _suspend_flag with handshakes. Apparently, async handshakes > are needed to do so. 
We have been waiting a while for removal > of the _suspend_flag / introduction of async handshakes [2]. > What is the status here? I have an old prototype which I would like to continue to work on. So do not assume asynch handshakes will make 15. Even if it would, I think there are a lot more investigate work to remove _suspend_flag. > > I think we should no longer wait, but proceed with > this change. We will look into removing the usage of > suspend_flag introduced here once it is possible to implement > it with handshakes. Yes, sure. >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ DeoptimizeObjectsALotThread is only used in compileBroker.cpp. You can move both declaration and definition to that file, no need to clobber thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's own hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. Note that we also think we may have a bug in deopt: https://bugs.openjdk.java.net/browse/JDK-8238237 I think it would be best, if possible, to push after that is resolved. Not even nearly a full review :) Thanks, Robbin >> Incremental: >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/ >> >> I was not able to eliminate the additional suspend flag now. I'll take care of this >> as soon as the >> existing suspend-resume-mechanism is reworked. >> >> Testing: >> >> Nightly tests @SAP: >> >> JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance >> Suite, SAP specific tests >> with fastdebug and release builds on all platforms >> >> Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel >> for 24h >> >> Thanks, Richard. >> >> >> More details on the changes: >> >> * Hide DeoptimizeObjectsALotThread from external view. >> >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock. 
>> It used to be _safepoint_check_sometimes, which will be eliminated sooner or >> later. >> I added explicit thread state changes with ThreadBlockInVM to code paths >> where we can wait() >> on EscapeBarrier_lock to become safepoint safe. >> >> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads >> instead of vm operation >> VM_ThreadSuspendAllForObjDeopt. >> >> * Removed uses of Threads_lock. When adding a new thread we suspend it iff >> EA optimizations are >> being reverted. In the previous version we were waiting on Threads_lock >> while EA optimizations >> were reverted. See EscapeBarrier::thread_added(). >> >> * Made tests require Xmixed compilation mode. >> >> * Made tests agnostic regarding tiered compilation. >> I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or >> disabled. >> >> * Exercising EATests.java as well with stress test options >> DeoptimizeObjectsALot* >> Due to the non-deterministic deoptimizations some tests need to be skipped. >> We do this to prevent bit-rot of the stress test code. >> >> * Executing EATests.java as well with graal if available. Driver for this is >> EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all >> the new debug info >> (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp). >> And graal does not yet support the JVMTI operations force early return and >> pop frame. >> >> * Removed tracing from new jdi tests in EATests.java. Too much trace output >> before the debugging >> connection is established can cause deadlock because output buffers fill up. >> (See https://bugs.openjdk.java.net/browse/JDK-8173304) >> >> * Many copyright year changes and smaller clean-up changes of testing code >> (trailing white-space and >> the like). >> >> >> -----Original Message----- >> From: David Holmes >> Sent: Donnerstag, 19. 
Dezember 2019 03:12 >> To: Reingruber, Richard ; serviceability- >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- >> runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) >> >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in >> the Presence of JVMTI Agents >> >> Hi Richard, >> >> I think my issue is with the way EliminateNestedLocks works so I'm going >> to look into that more deeply. >> >> Thanks for the explanations. >> >> David >> >> On 18/12/2019 12:47 am, Reingruber, Richard wrote: >>> Hi David, >>> >>> > > > Some further queries/concerns: >>> > > > >>> > > > src/hotspot/share/runtime/objectMonitor.cpp >>> > > > >>> > > > Can you please explain the changes to ObjectMonitor::wait: >>> > > > >>> > > > ! _recursions = save // restore the old recursion count >>> > > > ! + jt->get_and_reset_relock_count_after_wait(); // >>> > > > increased by the deferred relock count >>> > > > >>> > > > what is the "deferred relock count"? I gather it relates to >>> > > > >>> > > > "The code was extended to be able to deoptimize objects of a >>> > > frame that >>> > > > is not the top frame and to let another thread than the owning >>> > > thread do >>> > > > it." >>> > > >>> > > Yes, these relate. Currently EA based optimizations are reverted, when a >> compiled frame is >>> > > replaced with corresponding interpreter frames. Part of this is relocking >> objects with eliminated >>> > > locking. New with the enhancement is that we do this also just before >> object references are >>> > > acquired through JVMTI. In this case we deoptimize also the owning >> compiled frame C and we >>> > > register deoptimized objects as deferred updates. When control returns >> to C it gets deoptimized, >>> > > we notice that objects are already deoptimized (reallocated and >> relocked), so we don't do it again >>> > > (relocking twice would be incorrect of course). 
Deferred updates are >> copied into the new >>> > > interpreter frames. >>> > > >>> > > Problem: relocking is not possible if the target thread T is waiting on the >> monitor that needs to >>> > > be relocked. This happens only with non-local objects with >> EliminateNestedLocks. Instead relocking >>> > > is deferred until T owns the monitor again. This is what the piece of >> code above does. >>> > >>> > Sorry I need some more detail here. How can you wait() on an object >>> > monitor if the object allocation and/or locking was optimised away? And >>> > what is a "non-local object" in this context? Isn't EA restricted to >>> > thread-confined objects? >>> >>> "Non-local object" is an object that escapes its thread. The issue I'm >> addressing with the changes >>> in ObjectMonitor::wait are almost unrelated to EA. They are caused by >> EliminateNestedLocks, where C2 >>> eliminates recursive locking of an already owned lock. The lock owning object >> exists on the heap, it >>> is locked and you can call wait() on it. >>> >>> EliminateLocks is the C2 option that controls lock elimination based on EA. >> Both optimizations have >>> in common that objects with eliminated locking need to be relocked when >> deoptimizing a frame, >>> i.e. when replacing a compiled frame with equivalent interpreter >>> frames. Deoptimization::relock_objects does that job for /all/ eliminated >> locks in scope. /All/ can >>> be a mix of eliminated nested locks and locks of not-escaping objects. >>> >>> New with the enhancement: I call relock_objects earlier, just before objects >> pontentially >>> escape. 
But then later when the owning compiled frame gets deoptimized, I must not do it again:
>>>
>>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
>>>
>>> 373   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
>>> 374       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
>>> 375     bool unused;
>>> 376     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
>>> 377   }
>>>
>>> Now when calling relock_objects early it is quite possible that I have to relock an object the
>>> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
>>> introduce relock_count_after_wait to JavaThread.
>>>
>>> > Is it just that some of the locking gets optimized away e.g.
>>> >
>>> > synchronised(obj) {
>>> >   synchronised(obj) {
>>> >     synchronised(obj) {
>>> >       obj.wait();
>>> >     }
>>> >   }
>>> > }
>>> >
>>> > If this is reduced to a form as-if it were a single lock of the monitor
>>> > (due to EA) and the wait() triggers a JVM TI event which leads to the
>>> > escape of "obj" then we need to reconstruct the true lock state, and so
>>> > when the wait() internally unblocks and reacquires the monitor it has to
>>> > set the true recursion count to 3, not the 1 that it appeared to be when
>>> > wait() was initially called. Is that the scenario?
>>>
>>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
>>> triggered by wait.
>>>
>>> Add
>>>
>>> LocalObject l1 = new LocalObject();
>>>
>>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
>>> question.
>>>
>>> See that relocking/reallocating is transactional. If it is done then for /all/ objects in scope and it is
>>> done at most once. It wouldn't be quite so easy to split this in relocking of nested/EA-based
>>> eliminated locks.
>>>
>>> > If so I find this truly awful.
Anyone using wait() in a realistic form >>> > requires a notification and so the object cannot be thread confined. In >>> >>> It is not thread confined. >>> >>> > which case I would strongly argue that upon hitting the wait() the deopt >>> > should occur unconditionally and so the lock state is correct before we >>> > wait and so we don't need to mess with the recursion count internally >>> > when we reacquire the monitor. >>> > >>> > > >>> > > > which I don't like the sound of at all when it comes to ObjectMonitor >>> > > > state. So I'd like to understand in detail exactly what is going on here >>> > > > and why. This is a very intrusive change that seems to badly break >>> > > > encapsulation and impacts future changes to ObjectMonitor that are >> under >>> > > > investigation. >>> > > >>> > > I would not regard this as breaking encapsulation. Certainly not badly. >>> > > >>> > > I've added a property relock_count_after_wait to JavaThread. The >> property is well >>> > > encapsulated. Future ObjectMonitor implementations have to deal with >> recursion too. They are free >>> > > in choosing a way to do that as long as that property is taken into >> account. This is hardly a >>> > > limitation. >>> > >>> > I do think this badly breaks encapsulation as you have to add a callout >>> > from the guts of the ObjectMonitor code to reach into the thread to get >>> > this lock count adjustment. I understand why you have had to do this but >>> > I would much rather see a change to the EA optimisation strategy so that >>> > this is not needed. >>> > >>> > > Note also that the property is a straight forward extension of the >> existing concept of deferred >>> > > local updates. It is embedded into the structure holding them. So not >> even the footprint of a >>> > > JavaThread is enlarged if no deferred updates are generated. >>> > >>> > [...] 
>>> > >>> > > >>> > > I'm actually duplicating the existing external suspend mechanism, >> because a thread can be >>> > > suspended at most once. And hey, and don't like that either! But it >> seems not unlikely that the >>> > > duplicate can be removed together with the original and the new type >> of handshakes that will be >>> > > used for thread suspend can be used for object deoptimization too. See >> today's discussion in >>> > > JDK-8227745 [2]. >>> > >>> > I hope that discussion bears some fruit, at the moment it seems not to >>> > be possible to use handshakes here. :( >>> > >>> > The external suspend mechanism is a royal pain in the proverbial that we >>> > have to carefully live with. The idea that we're duplicating that for >>> > use in another fringe area of functionality does not thrill me at all. >>> > >>> > To be clear, I understand the problem that exists and that you wish to >>> > solve, but for the runtime parts I balk at the complexity cost of >>> > solving it. >>> >>> I know it's complex, but by far no rocket science. >>> >>> Also I find it hard to imagine another fix for JDK-8233915 besides changing >> the JVM TI specification. >>> >>> Thanks, Richard. >>> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Dienstag, 17. Dezember 2019 08:03 >>> To: Reingruber, Richard ; serviceability- >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- >> runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) >> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance >> in the Presence of JVMTI Agents >>> >>> >>> >>> David >>> >>> On 17/12/2019 4:57 pm, David Holmes wrote: >>>> Hi Richard, >>>> >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote: >>>>> Hi David, >>>>> >>>>> ?? > Some further queries/concerns: >>>>> ?? > >>>>> ?? > src/hotspot/share/runtime/objectMonitor.cpp >>>>> ?? > >>>>> ?? > Can you please explain the changes to ObjectMonitor::wait: >>>>> ?? 
> >>>>> ?? > !?? _recursions = save????? // restore the old recursion count >>>>> ?? > !???????????????? + jt->get_and_reset_relock_count_after_wait(); // >>>>> ?? > increased by the deferred relock count >>>>> ?? > >>>>> ?? > what is the "deferred relock count"? I gather it relates to >>>>> ?? > >>>>> ?? > "The code was extended to be able to deoptimize objects of a >>>>> frame that >>>>> ?? > is not the top frame and to let another thread than the owning >>>>> thread do >>>>> ?? > it." >>>>> >>>>> Yes, these relate. Currently EA based optimizations are reverted, when >>>>> a compiled frame is replaced >>>>> with corresponding interpreter frames. Part of this is relocking >>>>> objects with eliminated >>>>> locking. New with the enhancement is that we do this also just before >>>>> object references are acquired >>>>> through JVMTI. In this case we deoptimize also the owning compiled >>>>> frame C and we register >>>>> deoptimized objects as deferred updates. When control returns to C it >>>>> gets deoptimized, we notice >>>>> that objects are already deoptimized (reallocated and relocked), so we >>>>> don't do it again (relocking >>>>> twice would be incorrect of course). Deferred updates are copied into >>>>> the new interpreter frames. >>>>> >>>>> Problem: relocking is not possible if the target thread T is waiting >>>>> on the monitor that needs to be >>>>> relocked. This happens only with non-local objects with >>>>> EliminateNestedLocks. Instead relocking is >>>>> deferred until T owns the monitor again. This is what the piece of >>>>> code above does. >>>> >>>> Sorry I need some more detail here. How can you wait() on an object >>>> monitor if the object allocation and/or locking was optimised away? And >>>> what is a "non-local object" in this context? Isn't EA restricted to >>>> thread-confined objects? >>>> >>>> Is it just that some of the locking gets optimized away e.g. >>>> >>>> synchronised(obj) { >>>> ? synchronised(obj) { >>>> ??? 
synchronised(obj) { >>>> ????? obj.wait(); >>>> ??? } >>>> ? } >>>> } >>>> >>>> If this is reduced to a form as-if it were a single lock of the monitor >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the >>>> escape of "obj" then we need to reconstruct the true lock state, and so >>>> when the wait() internally unblocks and reacquires the monitor it has to >>>> set the true recursion count to 3, not the 1 that it appeared to be when >>>> wait() was initially called. Is that the scenario? >>>> >>>> If so I find this truly awful. Anyone using wait() in a realistic form >>>> requires a notification and so the object cannot be thread confined. In >>>> which case I would strongly argue that upon hitting the wait() the deopt >>>> should occur unconditionally and so the lock state is correct before we >>>> wait and so we don't need to mess with the recursion count internally >>>> when we reacquire the monitor. >>>> >>>>> >>>>> ?? > which I don't like the sound of at all when it comes to >>>>> ObjectMonitor >>>>> ?? > state. So I'd like to understand in detail exactly what is going >>>>> on here >>>>> ?? > and why.? This is a very intrusive change that seems to badly break >>>>> ?? > encapsulation and impacts future changes to ObjectMonitor that >>>>> are under >>>>> ?? > investigation. >>>>> >>>>> I would not regard this as breaking encapsulation. Certainly not badly. >>>>> >>>>> I've added a property relock_count_after_wait to JavaThread. The >>>>> property is well >>>>> encapsulated. Future ObjectMonitor implementations have to deal with >>>>> recursion too. They are free in >>>>> choosing a way to do that as long as that property is taken into >>>>> account. This is hardly a >>>>> limitation. >>>> >>>> I do think this badly breaks encapsulation as you have to add a callout >>>> from the guts of the ObjectMonitor code to reach into the thread to get >>>> this lock count adjustment. 
I understand why you have had to do this but >>>> I would much rather see a change to the EA optimisation strategy so that >>>> this is not needed. >>>> >>>>> Note also that the property is a straight forward extension of the >>>>> existing concept of deferred >>>>> local updates. It is embedded into the structure holding them. So not >>>>> even the footprint of a >>>>> JavaThread is enlarged if no deferred updates are generated. >>>>> >>>>> ?? > --- >>>>> ?? > >>>>> ?? > src/hotspot/share/runtime/thread.cpp >>>>> ?? > >>>>> ?? > Can you please explain why >>>>> JavaThread::wait_for_object_deoptimization >>>>> ?? > has to be handcrafted in this way rather than using proper >>>>> transitions. >>>>> ?? > >>>>> >>>>> I wrote wait_for_object_deoptimization taking >>>>> JavaThread::java_suspend_self_with_safepoint_check >>>>> as template. So in short: for the same reasons :) >>>>> >>>>> Threads reach both methods as part of thread state transitions, >>>>> therefore special handling is >>>>> required to change thread state on top of ongoing transitions. >>>>> >>>>> ?? > We got rid of "deopt suspend" some time ago and it is disturbing >>>>> to see >>>>> ?? > it being added back (effectively). This seems like it may be >>>>> something >>>>> ?? > that handshakes could be used for. >>>>> >>>>> Deopt suspend used to be something rather different with a similar >>>>> name[1]. It is not being added back. >>>> >>>> I stand corrected. Despite comments in the code to the contrary >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of >>>> cleanup in this area 13 years ago :) >>>> >>>>> >>>>> I'm actually duplicating the existing external suspend mechanism, >>>>> because a thread can be suspended >>>>> at most once. And hey, and don't like that either! 
But it seems not >>>>> unlikely that the duplicate can >>>>> be removed together with the original and the new type of handshakes >>>>> that will be used for >>>>> thread suspend can be used for object deoptimization too. See today's >>>>> discussion in JDK-8227745 [2]. >>>> >>>> I hope that discussion bears some fruit, at the moment it seems not to >>>> be possible to use handshakes here. :( >>>> >>>> The external suspend mechanism is a royal pain in the proverbial that we >>>> have to carefully live with. The idea that we're duplicating that for >>>> use in another fringe area of functionality does not thrill me at all. >>>> >>>> To be clear, I understand the problem that exists and that you wish to >>>> solve, but for the runtime parts I balk at the complexity cost of >>>> solving it. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Thanks, Richard. >>>>> >>>>> [1] Deopt suspend was something like an async. handshake for >>>>> architectures with register windows, >>>>> ???? where patching the return pc for deoptimization of a compiled >>>>> frame was racy if the owner thread >>>>> ???? was in native code. Instead a "deopt" suspend flag was set on >>>>> which the thread patched its own >>>>> ???? frame upon return from native. So no thread was suspended. It got >>>>> its name only from the name of >>>>> ???? the flags. >>>>> >>>>> [2] Discussion about using handshakes to sync. with the target thread: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK- >> 8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.syste >> m.issuetabpanels:comment-tabpanel#comment-14306727 >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: David Holmes >>>>> Sent: Freitag, 13. 
Dezember 2019 00:56 >>>>> To: Reingruber, Richard ; >>>>> serviceability-dev at openjdk.java.net; >>>>> hotspot-compiler-dev at openjdk.java.net; >>>>> hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>>> Performance in the Presence of JVMTI Agents >>>>> >>>>> Hi Richard, >>>>> >>>>> Some further queries/concerns: >>>>> >>>>> src/hotspot/share/runtime/objectMonitor.cpp >>>>> >>>>> Can you please explain the changes to ObjectMonitor::wait: >>>>> >>>>> !?? _recursions = save????? // restore the old recursion count >>>>> !???????????????? + jt->get_and_reset_relock_count_after_wait(); // >>>>> increased by the deferred relock count >>>>> >>>>> what is the "deferred relock count"? I gather it relates to >>>>> >>>>> "The code was extended to be able to deoptimize objects of a frame that >>>>> is not the top frame and to let another thread than the owning thread do >>>>> it." >>>>> >>>>> which I don't like the sound of at all when it comes to ObjectMonitor >>>>> state. So I'd like to understand in detail exactly what is going on here >>>>> and why.? This is a very intrusive change that seems to badly break >>>>> encapsulation and impacts future changes to ObjectMonitor that are under >>>>> investigation. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/runtime/thread.cpp >>>>> >>>>> Can you please explain why JavaThread::wait_for_object_deoptimization >>>>> has to be handcrafted in this way rather than using proper transitions. >>>>> >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see >>>>> it being added back (effectively). This seems like it may be something >>>>> that handshakes could be used for. >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>> On 12/12/2019 7:02 am, David Holmes wrote: >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> ??? > Most of the details here are in areas I can comment on in detail, >>>>>>> but I >>>>>>> ??? 
> did take an initial general look at things. >>>>>>> >>>>>>> Thanks for taking the time! >>>>>> >>>>>> Apologies the above should read: >>>>>> >>>>>> "Most of the details here are in areas I *can't* comment on in detail >>>>>> ..." >>>>>> >>>>>> David >>>>>> >>>>>>> ??? > The only thing that jumped out at me is that I think the >>>>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. >>>>>>> ??? > >>>>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } >>>>>>> >>>>>>> Yes, it should. Will add the method like above. >>>>>>> >>>>>>> ??? > Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>>>>> Without >>>>>>> ??? > active testing this will just bit-rot. >>>>>>> >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger >>>>>>> workload. I will add a minimal test >>>>>>> to keep it fresh. >>>>>>> >>>>>>> ??? > Also on the tests I don't understand your @requires clause: >>>>>>> ??? > >>>>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled >> & >>>>>>> ??? > (vm.opt.TieredCompilation != true)) >>>>>>> ??? > >>>>>>> ??? > This seems to require that TieredCompilation is disabled, but >>>>>>> tiered is >>>>>>> ??? > our normal mode of operation. ?? >>>>>>> ??? > >>>>>>> >>>>>>> I removed the clause. I guess I wanted to target the tests towards the >>>>>>> code they are supposed to >>>>>>> test, and it's easier to analyze failures w/o tiered compilation and >>>>>>> with just one compiler thread. >>>>>>> >>>>>>> Additionally I will make use of >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>>>>>> >>>>>>> Thanks, >>>>>>> Richard. >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: David Holmes >>>>>>> Sent: Mittwoch, 11. 
Dezember 2019 08:03 >>>>>>> To: Reingruber, Richard ; >>>>>>> serviceability-dev at openjdk.java.net; >>>>>>> hotspot-compiler-dev at openjdk.java.net; >>>>>>> hotspot-runtime-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>>>>> Performance in the Presence of JVMTI Agents >>>>>>> >>>>>>> Hi Richard, >>>>>>> >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I would like to get reviews please for >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>>>>>> >>>>>>>> Corresponding RFE: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>>>>>> >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>>>>>> >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>>>>>> issues (thanks!). In addition the >>>>>>>> change is being tested at SAP since I posted the first RFR some >>>>>>>> months ago. >>>>>>>> >>>>>>>> The intention of this enhancement is to benefit performance wise from >>>>>>>> escape analysis even if JVMTI >>>>>>>> agents request capabilities that allow them to access local variable >>>>>>>> values. E.g. if you start-up >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>>>>>> escape analysis is disabled right >>>>>>>> from the beginning, well before a debugger attaches -- if ever one >>>>>>>> should do so. With the >>>>>>>> enhancement, escape analysis will remain enabled until and after a >>>>>>>> debugger attaches. EA based >>>>>>>> optimizations are reverted just before an agent acquires the >>>>>>>> reference to an object. In the JBS item >>>>>>>> you'll find more details. >>>>>>> >>>>>>> Most of the details here are in areas I can comment on in detail, but I >>>>>>> did take an initial general look at things. 
>>>>>>> >>>>>>> The only thing that jumped out at me is that I think the >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread. >>>>>>> >>>>>>> +? bool is_hidden_from_external_view() const { return true; } >>>>>>> >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>>>>> Without >>>>>>> active testing this will just bit-rot. >>>>>>> >>>>>>> Also on the tests I don't understand your @requires clause: >>>>>>> >>>>>>> ??? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>>>> (vm.opt.TieredCompilation != true)) >>>>>>> >>>>>>> This seems to require that TieredCompilation is disabled, but tiered is >>>>>>> our normal mode of operation. ?? >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Thanks, >>>>>>>> Richard. >>>>>>>> >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 >>>>>>>> >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patc >> h >>>>>>>> >>>>>>>> >>>>>>>> From kevin.walls at oracle.com Tue Mar 3 22:44:14 2020 From: kevin.walls at oracle.com (Kevin Walls) Date: Tue, 3 Mar 2020 22:44:14 +0000 Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate enough In-Reply-To: <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com> References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com> <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com> Message-ID: <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com> Thanks David - Yes there are situations where hs_err fails, and few people are sadder than me when that happens 8-) , so I was thinking about how scared to be by the comment. With the safety net of the error handler for the steps of the hs_err file (which works well, we see it invoked frequently), it looks reasonable to use %f as we might do other slightly questionable things for a signal handler. 
Corrupting locale information or floating point state might possibly cause problems, but if I cause a fake crash in print_date_and_time the error handler recovers and the report continues. Thinking about printing with two ints, seconds and fractions: I don't see anything already that returns such a time in two components in the JVM, so we might implement a new form of os::javaTimeNanos() or similar that returns the two parts, and do that on each platform. I didn't yet come up with anything to do in os::print_date_and_time() which will take the fractional part of the double, and print just the fraction as an int, without using any library / %f facilities. If you're still concerned I could revisit these or some other idea. Genuine laugh out loud moment for me, I backported the elapsed time logging from 6u4 to 5u19? (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007). (I said before jdk5 was created, I should have said before it was in mercurial.) Thanks Kevin On 03/03/2020 01:11, David Holmes wrote: > Hi Kevin, > > On 2/03/2020 8:48 pm, Kevin Walls wrote: >> Oops, and with the bug ID in the title and JBS link: >> https://bugs.openjdk.java.net/browse/JDK-8240295 >> >> >> On 02/03/2020 10:47, Kevin Walls wrote: >>> Hi, >>> >>> (s11y and runtime opinions both relevant) >>> >>> A few times in the last month I've really wanted to compare the >>> Events logged in the hs_err file, and the time of the JVM's crash. >>> >>> "elapsed time" in hs_err is only accurate to one second, and has >>> been since before jdk5 was created. >>> >>> The diff below changes the format string and uses the non-rounded >>> time value (I don't see a need to change the other integer >>> arithmetic here), and we can enjoy hs_errs with detail like: >>> >>> ... >>> Time: Mon Mar? 2 09:57:13 2020 UTC elapsed time: 5.490135 seconds >>> (0d 0h 0m 5s) >>> ... 
>>> >>> Thanks >>> Kevin >>> >>> >>> /jdk/open$ hg diff >>> diff --git a/src/hotspot/share/runtime/os.cpp >>> b/src/hotspot/share/runtime/os.cpp >>> --- a/src/hotspot/share/runtime/os.cpp >>> +++ b/src/hotspot/share/runtime/os.cpp >>> @@ -1016,9 +1016,8 @@ >>> ?? } >>> >>> ?? double t = os::elapsedTime(); >>> -? // NOTE: It tends to crash after a SEGV if we want to >>> printf("%f",...) in >>> -? //?????? Linux. Must be a bug in glibc ? Workaround is to round >>> "t" to int >>> -? //?????? before printf. We lost some precision, but who cares? >>> +? // NOTE: a crash using printf("%f",...) on Linux was historically >>> noted here >>> +? //?????? (before the jdk5 repo was created). > > Just because it is old doesn't mean it no longer applies. printf is > not async-signal-safe - we know that but we try to use it anyway. > Maybe %f is even less async-signal-safe? > > This may get through testing okay but cause problems with real crashes > in the field. > > What about breaking the time up into two ints: seconds and nanos? > > Cheers, > David > ----- > >>> ?? int eltime = (int)t;? // elapsed time in seconds >>> >>> ?? // print elapsed time in a human-readable format: >>> @@ -1029,7 +1028,7 @@ >>> ?? int elmins = (eltime - day_secs - hour_secs) / secs_per_min; >>> ?? int minute_secs = elmins * secs_per_min; >>> ?? int elsecs = (eltime - day_secs - hour_secs - minute_secs); >>> -? st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", >>> eltime, eldays, elhours, elmins, elsecs); >>> +? 
st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, >>> eldays, elhours, elmins, elsecs); >>>  } >>> >>> From ramkumar.sunderbabu at oracle.com Wed Mar 4 05:24:33 2020 From: ramkumar.sunderbabu at oracle.com (Ramkumar Sunderbabu) Date: Tue, 3 Mar 2020 21:24:33 -0800 (PST) Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test javax/management/loading/MletParserLocaleTest.java reduce default timeout In-Reply-To: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default> References: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default> Message-ID: Request to look into the change. It is très simple. Thanks, Ram From: Ramkumar Sunderbabu Sent: Tuesday, March 3, 2020 2:22 PM To: serviceability-dev at openjdk.java.net Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test javax/management/loading/MletParserLocaleTest.java reduce default timeout Hi all, Please review this patch. Removed "timeout=5" from the Tests so that default timeout is used. JBS: https://bugs.openjdk.java.net/browse/JDK-8153430 Webrev: http://cr.openjdk.java.net/~rsunderbabu/8153430/webrev.00/ Testing: Locally tested with "-Xcomp" option on a linux-64 machine. Thanks, Ram -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Wed Mar 4 06:10:12 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 3 Mar 2020 22:10:12 -0800 Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test javax/management/loading/MletParserLocaleTest.java reduce default timeout In-Reply-To: References: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default> Message-ID: <0653e5d4-893b-ce78-f0cf-5905a659bb10@oracle.com> An HTML attachment was scrubbed...
URL: From alexey.menkov at oracle.com Thu Mar 5 00:30:41 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 4 Mar 2020 16:30:41 -0800 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy Message-ID: Hi all, please review the fix for https://bugs.openjdk.java.net/browse/JDK-8240340 webrev: http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ changes: - assertThreadState method: don't re-read thread state throwing exception (as we got weird error like "Thread WaitingThread is at WAITING state but is expected to be in Thread.State = WAITING"); - added proper test shutdown on error (made all threads "daemon", interrupt waiting thread if CheckerThread throws exception); - if CheckerThread detects error, propagate the exception to main thread; - fixed LockFreeLogger class - it should work for logging from several threads, but it doesn't. I prefer to simplify it just to keep ConcurrentLinkedQueue. LockFreeLogger is also used by ThreadMXBeanStateTest test, but only by a single thread. --alex From david.holmes at oracle.com Thu Mar 5 00:57:35 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 5 Mar 2020 10:57:35 +1000 Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate enough In-Reply-To: <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com> References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com> <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com> <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com> Message-ID: On 4/03/2020 8:44 am, Kevin Walls wrote: > > Thanks David - > > Yes there are situations where hs_err fails, and few people are sadder > than me > when that happens 8-) , so I was thinking about how scared to be by the > comment. > > With the safety net of the error handler for the steps of the hs_err file > (which works well, we see it invoked frequently), it looks reasonable to > use > %f as we might do other slightly questionable things for a signal handler. 
> > Corrupting locale information or floating point state might possibly cause > problems, but if I cause a fake crash in print_date_and_time the error > handler recovers and the report continues. That is good to know. > Thinking about printing with two ints, seconds and fractions: > I don't see anything already that returns such a time in two components > in the > JVM, so we might implement a new form of os::javaTimeNanos() or similar > that > returns the two parts, and do that on each platform. I was thinking of something simple/crude ... > I didn't yet come up with anything to do in os::print_date_and_time() > which will take the fractional part of the double, and print just the > fraction as an int, without using any library / %f facilities. ... just using e.g. (untested) double t = os::elapsedTime(); int secs = (int) t; int micros = (int)((t - secs) * 100000); printf("%d.%d", secs, micros); with appropriate width specifiers to get the formatting right. Cheers, David > > If you're still concerned I could revisit these or some other idea. > > Genuine laugh out loud moment for me, I backported the elapsed time > logging from > 6u4 to 5u19? (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007). > (I said before jdk5 was created, I should have said before it was in > mercurial.) > > Thanks > Kevin > > > On 03/03/2020 01:11, David Holmes wrote: >> Hi Kevin, >> >> On 2/03/2020 8:48 pm, Kevin Walls wrote: >>> Oops, and with the bug ID in the title and JBS link: >>> https://bugs.openjdk.java.net/browse/JDK-8240295 >>> >>> >>> On 02/03/2020 10:47, Kevin Walls wrote: >>>> Hi, >>>> >>>> (s11y and runtime opinions both relevant) >>>> >>>> A few times in the last month I've really wanted to compare the >>>> Events logged in the hs_err file, and the time of the JVM's crash. >>>> >>>> "elapsed time" in hs_err is only accurate to one second, and has >>>> been since before jdk5 was created. 
>>>> >>>> The diff below changes the format string and uses the non-rounded >>>> time value (I don't see a need to change the other integer >>>> arithmetic here), and we can enjoy hs_errs with detail like: >>>> >>>> ... >>>> Time: Mon Mar? 2 09:57:13 2020 UTC elapsed time: 5.490135 seconds >>>> (0d 0h 0m 5s) >>>> ... >>>> >>>> Thanks >>>> Kevin >>>> >>>> >>>> /jdk/open$ hg diff >>>> diff --git a/src/hotspot/share/runtime/os.cpp >>>> b/src/hotspot/share/runtime/os.cpp >>>> --- a/src/hotspot/share/runtime/os.cpp >>>> +++ b/src/hotspot/share/runtime/os.cpp >>>> @@ -1016,9 +1016,8 @@ >>>> ?? } >>>> >>>> ?? double t = os::elapsedTime(); >>>> -? // NOTE: It tends to crash after a SEGV if we want to >>>> printf("%f",...) in >>>> -? //?????? Linux. Must be a bug in glibc ? Workaround is to round >>>> "t" to int >>>> -? //?????? before printf. We lost some precision, but who cares? >>>> +? // NOTE: a crash using printf("%f",...) on Linux was historically >>>> noted here >>>> +? //?????? (before the jdk5 repo was created). >> >> Just because it is old doesn't mean it no longer applies. printf is >> not async-signal-safe - we know that but we try to use it anyway. >> Maybe %f is even less async-signal-safe? >> >> This may get through testing okay but cause problems with real crashes >> in the field. >> >> What about breaking the time up into two ints: seconds and nanos? >> >> Cheers, >> David >> ----- >> >>>> ?? int eltime = (int)t;? // elapsed time in seconds >>>> >>>> ?? // print elapsed time in a human-readable format: >>>> @@ -1029,7 +1028,7 @@ >>>> ?? int elmins = (eltime - day_secs - hour_secs) / secs_per_min; >>>> ?? int minute_secs = elmins * secs_per_min; >>>> ?? int elsecs = (eltime - day_secs - hour_secs - minute_secs); >>>> -? st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", >>>> eltime, eldays, elhours, elmins, elsecs); >>>> +? 
st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, >>>> eldays, elhours, elmins, elsecs); >>>>  } >>>> >>>> From david.holmes at oracle.com Thu Mar 5 01:50:11 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 5 Mar 2020 11:50:11 +1000 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: References: Message-ID: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> Hi Alex, On 5/03/2020 10:30 am, Alex Menkov wrote: > Hi all, > > please review the fix for > https://bugs.openjdk.java.net/browse/JDK-8240340 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ > > changes: > - assertThreadState method: don't re-read thread state throwing > exception (as we got weird error like "Thread WaitingThread is at > WAITING state but is expected to be in Thread.State = WAITING"); > - added proper test shutdown on error (made all threads "daemon", > interrupt waiting thread if CheckerThread throws exception); > - if CheckerThread detects error, propagate the exception to main thread; The test changes seem fine. > - fixed LockFreeLogger class - it should work for logging from several > threads, but it doesn't. I prefer to simplify it just to keep > ConcurrentLinkedQueue. > LockFreeLogger is also used by ThreadMXBeanStateTest test, but only by a > single thread. I don't understand your changes here as you've completely changed the intended design of the logger. The original accumulates log entries per-thread and then spits them all out (though I'm not clear on the exact ordering - I don't know how to read that stream stuff). The new code just creates a single queue of log records interleaving entries from different threads. The simple logger may be all that is needed but it seems quite different to the intent of the original.
Thanks, David > --alex From kevin.walls at oracle.com Thu Mar 5 10:00:24 2020 From: kevin.walls at oracle.com (Kevin Walls) Date: Thu, 5 Mar 2020 10:00:24 +0000 Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate enough In-Reply-To: References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com> <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com> <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com> Message-ID: Thanks - I had tried some ideas in the simple fashion, and we can use %06d formatting.... OK maybe such formatting is not as "bad" as %f... (glibc parses the int width specified without allocation.? We provide the output buffer, I don't think we will cause? vfprintf code to alloca or malloc.) I can offer a second version below that uses %d only.? Testing alongside %f in the same line, it retains the same value and position, e.g. Time: Thu Mar? 5 08:57:50 2020 UTC elapsed time: f: 2.001065 int: 2.001065 (raw int: 1065) seconds (0d 0h 0m 2s) Output example from the hg diff below (not from the same run): Time: Thu Mar? 5 09:28:01 2020 UTC elapsed time: 2.000611 seconds (0d 0h 0m 2s) Thanks! Kevin $ hg diff diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp --- a/src/hotspot/share/runtime/os.cpp +++ b/src/hotspot/share/runtime/os.cpp @@ -1016,10 +1016,9 @@ ?? } ?? double t = os::elapsedTime(); -? // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in -? //?????? Linux. Must be a bug in glibc ? Workaround is to round "t" to int -? //?????? before printf. We lost some precision, but who cares? +? // NOTE: a crash using printf("%f",...) on Linux was historically noted here. ?? int eltime = (int)t;? // elapsed time in seconds +? int eltimeFraction = (int) ((t - eltime) * 1000000); ?? // print elapsed time in a human-readable format: ?? int eldays = eltime / secs_per_day; @@ -1029,7 +1028,7 @@ ?? int elmins = (eltime - day_secs - hour_secs) / secs_per_min; ?? int minute_secs = elmins * secs_per_min; ?? 
int elsecs = (eltime - day_secs - hour_secs - minute_secs); -? st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs); +? st->print_cr(" elapsed time: %d.%06d seconds (%dd %dh %dm %ds)", eltime, eltimeFraction, eldays, elhours, elmins, elsecs); ?} On 05/03/2020 00:57, David Holmes wrote: > On 4/03/2020 8:44 am, Kevin Walls wrote: >> >> Thanks David - >> >> Yes there are situations where hs_err fails, and few people are >> sadder than me >> when that happens 8-) , so I was thinking about how scared to be by >> the comment. >> >> With the safety net of the error handler for the steps of the hs_err >> file >> (which works well, we see it invoked frequently), it looks reasonable >> to use >> %f as we might do other slightly questionable things for a signal >> handler. >> >> Corrupting locale information or floating point state might possibly >> cause >> problems, but if I cause a fake crash in print_date_and_time the error >> handler recovers and the report continues. > > That is good to know. > >> Thinking about printing with two ints, seconds and fractions: >> I don't see anything already that returns such a time in two >> components in the >> JVM, so we might implement a new form of os::javaTimeNanos() or >> similar that >> returns the two parts, and do that on each platform. > > I was thinking of something simple/crude ... > >> I didn't yet come up with anything to do in os::print_date_and_time() >> which will take the fractional part of the double, and print just the >> fraction as an int, without using any library / %f facilities. > > ... just using e.g. (untested) > > double t = os::elapsedTime(); > int secs =? (int) t; > int micros =? (int)((t - secs) * 100000); > printf("%d.%d", secs, micros); > > with appropriate width specifiers to get the formatting right. > > Cheers, > David > >> >> If you're still concerned I could revisit these or some other idea. 
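For readers following the arithmetic, the integer-only formatting discussed in this thread can be sketched as a small standalone program. This is only an illustration under stated assumptions, not the HotSpot change itself: format_elapsed and format_elapsed_naive are hypothetical helper names, and the sketch uses std::snprintf and std::string for convenience, which says nothing about what is safe inside a crash handler. It shows why the scale factor for microseconds is 1000000 and why the %06d width specifier matters.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of HotSpot): print elapsed seconds as
// "<secs>.<micros>" using only integer conversions, in the spirit of the
// "%d.%06d" version of the patch.
std::string format_elapsed(double t) {
  assert(t >= 0.0);                           // elapsed time is assumed non-negative
  int secs = (int)t;                          // whole seconds (truncated)
  int micros = (int)((t - secs) * 1000000);   // fraction scaled to microseconds (truncated)
  char buf[32];
  // %06d zero-pads the fraction so that 15625 microseconds prints as "015625".
  std::snprintf(buf, sizeof(buf), "%d.%06d", secs, micros);
  return std::string(buf);
}

// Same computation without a width specifier, shown only for contrast:
// leading zeros of the fraction are lost, so the printed value is wrong.
std::string format_elapsed_naive(double t) {
  assert(t >= 0.0);
  int secs = (int)t;
  int micros = (int)((t - secs) * 1000000);
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%d.%d", secs, micros);
  return std::string(buf);
}
```

With an input of 0.015625 seconds, format_elapsed yields "0.015625" while format_elapsed_naive yields "0.15625". Because the cast truncates rather than rounds, the last digit can differ by one from what %f would print for values that are not exactly representable.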
>> >> Genuine laugh out loud moment for me, I backported the elapsed time >> logging from >> 6u4 to 5u19? (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007). >> (I said before jdk5 was created, I should have said before it was in >> mercurial.) >> >> Thanks >> Kevin >> >> >> On 03/03/2020 01:11, David Holmes wrote: >>> Hi Kevin, >>> >>> On 2/03/2020 8:48 pm, Kevin Walls wrote: >>>> Oops, and with the bug ID in the title and JBS link: >>>> https://bugs.openjdk.java.net/browse/JDK-8240295 >>>> >>>> >>>> On 02/03/2020 10:47, Kevin Walls wrote: >>>>> Hi, >>>>> >>>>> (s11y and runtime opinions both relevant) >>>>> >>>>> A few times in the last month I've really wanted to compare the >>>>> Events logged in the hs_err file, and the time of the JVM's crash. >>>>> >>>>> "elapsed time" in hs_err is only accurate to one second, and has >>>>> been since before jdk5 was created. >>>>> >>>>> The diff below changes the format string and uses the non-rounded >>>>> time value (I don't see a need to change the other integer >>>>> arithmetic here), and we can enjoy hs_errs with detail like: >>>>> >>>>> ... >>>>> Time: Mon Mar? 2 09:57:13 2020 UTC elapsed time: 5.490135 seconds >>>>> (0d 0h 0m 5s) >>>>> ... >>>>> >>>>> Thanks >>>>> Kevin >>>>> >>>>> >>>>> /jdk/open$ hg diff >>>>> diff --git a/src/hotspot/share/runtime/os.cpp >>>>> b/src/hotspot/share/runtime/os.cpp >>>>> --- a/src/hotspot/share/runtime/os.cpp >>>>> +++ b/src/hotspot/share/runtime/os.cpp >>>>> @@ -1016,9 +1016,8 @@ >>>>> ?? } >>>>> >>>>> ?? double t = os::elapsedTime(); >>>>> -? // NOTE: It tends to crash after a SEGV if we want to >>>>> printf("%f",...) in >>>>> -? //?????? Linux. Must be a bug in glibc ? Workaround is to round >>>>> "t" to int >>>>> -? //?????? before printf. We lost some precision, but who cares? >>>>> +? // NOTE: a crash using printf("%f",...) on Linux was >>>>> historically noted here >>>>> +? //?????? (before the jdk5 repo was created). 
>>> >>> Just because it is old doesn't mean it no longer applies. printf is >>> not async-signal-safe - we know that but we try to use it anyway. >>> Maybe %f is even less async-signal-safe? >>> >>> This may get through testing okay but cause problems with real >>> crashes in the field. >>> >>> What about breaking the time up into two ints: seconds and nanos? >>> >>> Cheers, >>> David >>> ----- >>> >>>>> ?? int eltime = (int)t;? // elapsed time in seconds >>>>> >>>>> ?? // print elapsed time in a human-readable format: >>>>> @@ -1029,7 +1028,7 @@ >>>>> ?? int elmins = (eltime - day_secs - hour_secs) / secs_per_min; >>>>> ?? int minute_secs = elmins * secs_per_min; >>>>> ?? int elsecs = (eltime - day_secs - hour_secs - minute_secs); >>>>> -? st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", >>>>> eltime, eldays, elhours, elmins, elsecs); >>>>> +? st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, >>>>> eldays, elhours, elmins, elsecs); >>>>> ?} >>>>> >>>>> From david.holmes at oracle.com Thu Mar 5 10:38:58 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 5 Mar 2020 20:38:58 +1000 Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate enough In-Reply-To: References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com> <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com> <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com> Message-ID: <20045e23-c736-5289-866e-9df5a09101a8@oracle.com> Thanks Kevin. I think this is the less risky change and achieves the goal. David On 5/03/2020 8:00 pm, Kevin Walls wrote: > Thanks - > > I had tried some ideas in the simple fashion, and we can use %06d > formatting.... OK maybe such formatting is not as "bad" as %f... > > (glibc parses the int width specified without allocation.? We provide > the output buffer, I don't think we will cause? vfprintf code to alloca > or malloc.) > > I can offer a second version below that uses %d only.? 
Testing alongside > %f in the same line, it retains the same value and position, e.g. > > Time: Thu Mar? 5 08:57:50 2020 UTC elapsed time: f: 2.001065 int: > 2.001065 (raw int: 1065) seconds (0d 0h 0m 2s) > > Output example from the hg diff below (not from the same run): > > Time: Thu Mar? 5 09:28:01 2020 UTC elapsed time: 2.000611 seconds (0d 0h > 0m 2s) > > > Thanks! > Kevin > > > $ hg diff > diff --git a/src/hotspot/share/runtime/os.cpp > b/src/hotspot/share/runtime/os.cpp > --- a/src/hotspot/share/runtime/os.cpp > +++ b/src/hotspot/share/runtime/os.cpp > @@ -1016,10 +1016,9 @@ > ?? } > > ?? double t = os::elapsedTime(); > -? // NOTE: It tends to crash after a SEGV if we want to > printf("%f",...) in > -? //?????? Linux. Must be a bug in glibc ? Workaround is to round "t" > to int > -? //?????? before printf. We lost some precision, but who cares? > +? // NOTE: a crash using printf("%f",...) on Linux was historically > noted here. > ?? int eltime = (int)t;? // elapsed time in seconds > +? int eltimeFraction = (int) ((t - eltime) * 1000000); > > ?? // print elapsed time in a human-readable format: > ?? int eldays = eltime / secs_per_day; > @@ -1029,7 +1028,7 @@ > ?? int elmins = (eltime - day_secs - hour_secs) / secs_per_min; > ?? int minute_secs = elmins * secs_per_min; > ?? int elsecs = (eltime - day_secs - hour_secs - minute_secs); > -? st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, > eldays, elhours, elmins, elsecs); > +? st->print_cr(" elapsed time: %d.%06d seconds (%dd %dh %dm %ds)", > eltime, eltimeFraction, eldays, elhours, elmins, elsecs); > ?} > > > > On 05/03/2020 00:57, David Holmes wrote: >> On 4/03/2020 8:44 am, Kevin Walls wrote: >>> >>> Thanks David - >>> >>> Yes there are situations where hs_err fails, and few people are >>> sadder than me >>> when that happens 8-) , so I was thinking about how scared to be by >>> the comment. 
>>> >>> With the safety net of the error handler for the steps of the hs_err >>> file >>> (which works well, we see it invoked frequently), it looks reasonable >>> to use >>> %f as we might do other slightly questionable things for a signal >>> handler. >>> >>> Corrupting locale information or floating point state might possibly >>> cause >>> problems, but if I cause a fake crash in print_date_and_time the error >>> handler recovers and the report continues. >> >> That is good to know. >> >>> Thinking about printing with two ints, seconds and fractions: >>> I don't see anything already that returns such a time in two >>> components in the >>> JVM, so we might implement a new form of os::javaTimeNanos() or >>> similar that >>> returns the two parts, and do that on each platform. >> >> I was thinking of something simple/crude ... >> >>> I didn't yet come up with anything to do in os::print_date_and_time() >>> which will take the fractional part of the double, and print just the >>> fraction as an int, without using any library / %f facilities. >> >> ... just using e.g. (untested) >> >> double t = os::elapsedTime(); >> int secs =? (int) t; >> int micros =? (int)((t - secs) * 100000); >> printf("%d.%d", secs, micros); >> >> with appropriate width specifiers to get the formatting right. >> >> Cheers, >> David >> >>> >>> If you're still concerned I could revisit these or some other idea. >>> >>> Genuine laugh out loud moment for me, I backported the elapsed time >>> logging from >>> 6u4 to 5u19? (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007). >>> (I said before jdk5 was created, I should have said before it was in >>> mercurial.) 
>>> >>> Thanks >>> Kevin >>> >>> >>> On 03/03/2020 01:11, David Holmes wrote: >>>> Hi Kevin, >>>> >>>> On 2/03/2020 8:48 pm, Kevin Walls wrote: >>>>> Oops, and with the bug ID in the title and JBS link: >>>>> https://bugs.openjdk.java.net/browse/JDK-8240295 >>>>> >>>>> >>>>> On 02/03/2020 10:47, Kevin Walls wrote: >>>>>> Hi, >>>>>> >>>>>> (s11y and runtime opinions both relevant) >>>>>> >>>>>> A few times in the last month I've really wanted to compare the >>>>>> Events logged in the hs_err file, and the time of the JVM's crash. >>>>>> >>>>>> "elapsed time" in hs_err is only accurate to one second, and has >>>>>> been since before jdk5 was created. >>>>>> >>>>>> The diff below changes the format string and uses the non-rounded >>>>>> time value (I don't see a need to change the other integer >>>>>> arithmetic here), and we can enjoy hs_errs with detail like: >>>>>> >>>>>> ... >>>>>> Time: Mon Mar? 2 09:57:13 2020 UTC elapsed time: 5.490135 seconds >>>>>> (0d 0h 0m 5s) >>>>>> ... >>>>>> >>>>>> Thanks >>>>>> Kevin >>>>>> >>>>>> >>>>>> /jdk/open$ hg diff >>>>>> diff --git a/src/hotspot/share/runtime/os.cpp >>>>>> b/src/hotspot/share/runtime/os.cpp >>>>>> --- a/src/hotspot/share/runtime/os.cpp >>>>>> +++ b/src/hotspot/share/runtime/os.cpp >>>>>> @@ -1016,9 +1016,8 @@ >>>>>> ?? } >>>>>> >>>>>> ?? double t = os::elapsedTime(); >>>>>> -? // NOTE: It tends to crash after a SEGV if we want to >>>>>> printf("%f",...) in >>>>>> -? //?????? Linux. Must be a bug in glibc ? Workaround is to round >>>>>> "t" to int >>>>>> -? //?????? before printf. We lost some precision, but who cares? >>>>>> +? // NOTE: a crash using printf("%f",...) on Linux was >>>>>> historically noted here >>>>>> +? //?????? (before the jdk5 repo was created). >>>> >>>> Just because it is old doesn't mean it no longer applies. printf is >>>> not async-signal-safe - we know that but we try to use it anyway. >>>> Maybe %f is even less async-signal-safe? 
>>>> >>>> This may get through testing okay but cause problems with real >>>> crashes in the field. >>>> >>>> What about breaking the time up into two ints: seconds and nanos? >>>> >>>> Cheers, >>>> David >>>> ----- >>>> >>>>>> ?? int eltime = (int)t;? // elapsed time in seconds >>>>>> >>>>>> ?? // print elapsed time in a human-readable format: >>>>>> @@ -1029,7 +1028,7 @@ >>>>>> ?? int elmins = (eltime - day_secs - hour_secs) / secs_per_min; >>>>>> ?? int minute_secs = elmins * secs_per_min; >>>>>> ?? int elsecs = (eltime - day_secs - hour_secs - minute_secs); >>>>>> -? st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", >>>>>> eltime, eldays, elhours, elmins, elsecs); >>>>>> +? st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, >>>>>> eldays, elhours, elmins, elsecs); >>>>>> ?} >>>>>> >>>>>> From daniel.fuchs at oracle.com Thu Mar 5 14:27:36 2020 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Thu, 5 Mar 2020 14:27:36 +0000 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> Message-ID: Hi Alexander, Fixes to JMX & management agent are reviewed on the seviceability-dev (added in to:) these days. best regards, -- daniel On 05/03/2020 13:17, Alexander Scherbatiy wrote: > Hello, > > Could you review a small enhancement where the test CustomLauncherTest > is updated to build binary launcher file from launcher.c file. > The file launcher.c is renamed to exelauncher.c to follow the name > convention for executable test files building by jdk make system. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 > Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 > > The changes for obsolete binary files from > sun/management/jmxremote/bootstrap/linux-* and solaris-* are not > included into the webrev. 
They needs to be removed manually. > > The test is passed on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and > Solaris x64 11.4 systems. > > The test is excluded from Windows and Mac Os X systems. > > Thanks, > Alexander. From alexey.menkov at oracle.com Thu Mar 5 18:54:20 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 5 Mar 2020 10:54:20 -0800 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> References: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> Message-ID: <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> Hi David, Thanks you for the review. On 03/04/2020 17:50, David Holmes wrote: > Hi Alex, > > On 5/03/2020 10:30 am, Alex Menkov wrote: >> Hi all, >> >> please review the fix for >> https://bugs.openjdk.java.net/browse/JDK-8240340 >> webrev: >> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ >> >> changes: >> - assertThreadState method: don't re-read thread state throwing >> exception (as we got weird error like "Thread WaitingThread is at >> WAITING state but is expected to be in Thread.State = WAITING"); >> - added proper test shutdown on error (made all threads "daemon", >> interrupt waiting thread if CheckerThread throws exception); >> - if CheckerThread detects error, propagate the exception to main thread; > > The test changes seem fine. > >> - fixed LockFreeLogger class - it should work for logging from several >> threads, but it doesn't. I prefer to simplify it just to keep >> ConcurrentLinkedQueue. >> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only by >> a single thread. > > I don't understand your changes here as you've completely changed the > intended design of the logger. The original accumulates log entries > per-thread and then spits them all out (though I'm not clear on the > exact ordering - I don't how to read that stream stuff). 
The new code
> just creates a single queue of log records interleaving entries from
> different threads. The simple logger may be all that is needed but it
> seems quite different to the intent of the original.

While testing my changes to the test I discovered that something is wrong with the logger: it printed only part of the records. So I had to look at the LockFreeLogger class, and I don't understand how it was supposed to work.

About ordering in the cumulative log: each record has an Integer which is used to sort log entries from all threads (i.e. records from different threads are printed in the order in which log() was called).

Looking at the allRecords/records stuff, I don't understand how it should be used. To get logs from different threads in one logger, we need one instance. So we create a LockFreeLogger (in the main thread), and the ctor creates the ThreadLocal records and registers them in allRecords. Logging from the main thread works fine, but if any other thread tries to log, its first log() call creates its own ThreadLocal records (via records.get()) and the log records from this thread go there. But these ThreadLocal records are not registered in allRecords, so this logging won't be included in the final log.

Looks like we need to change log() to something like

    Map recs = records.get();
    if (recs.isEmpty()) {
        allRecords.add(recs);
    }
    recs.put(id, String.format(format, params));

But all this stuff does exactly the same as a simple ConcurrentLinkedQueue (i.e. a lock-free ordered list). At least I don't see any other rationale for it.
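To illustrate the simplification described above, here is a minimal sketch of a logger that keeps every record in one ConcurrentLinkedQueue shared by all threads. It is not the actual test-library LockFreeLogger: the class name QueueLoggerSketch and the size() helper are made up for this illustration, and only the log(format, params) shape follows the call discussed in this thread.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch, not the real LockFreeLogger: one queue shared by all threads.
final class QueueLoggerSketch {
    // ConcurrentLinkedQueue is a lock-free FIFO queue, so records keep insertion order.
    private final ConcurrentLinkedQueue<String> records = new ConcurrentLinkedQueue<>();

    // Safe to call from any thread; no per-thread registration is needed.
    void log(String format, Object... params) {
        records.add(String.format(format, params));
    }

    // Number of records logged so far (helper added for illustration only).
    int size() {
        return records.size();
    }

    // All records, concatenated in the order they were added to the queue.
    @Override
    public String toString() {
        return String.join("", records);
    }

    public static void main(String[] args) throws InterruptedException {
        QueueLoggerSketch logger = new QueueLoggerSketch();
        Thread worker = new Thread(() -> logger.log("worker: %d%n", 1));
        worker.start();
        worker.join();
        logger.log("main: %d%n", 2);
        System.out.print(logger);
    }
}
```

Because there is a single queue, a record logged by any thread ends up in the final output, which is the case the ThreadLocal-based design was reported to miss.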
--alex > > Thanks, > David > >> --alex From kevin.walls at oracle.com Thu Mar 5 21:01:53 2020 From: kevin.walls at oracle.com (Kevin Walls) Date: Thu, 5 Mar 2020 21:01:53 +0000 Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate enough In-Reply-To: <20045e23-c736-5289-866e-9df5a09101a8@oracle.com> References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com> <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com> <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com> <20045e23-c736-5289-866e-9df5a09101a8@oracle.com> Message-ID: Great, thanks David. On 05/03/2020 10:38, David Holmes wrote: > Thanks Kevin. I think this is the less risky change and achieves the > goal. > > David > > On 5/03/2020 8:00 pm, Kevin Walls wrote: >> Thanks - >> >> I had tried some ideas in the simple fashion, and we can use %06d >> formatting.... OK maybe such formatting is not as "bad" as %f... >> >> (glibc parses the int width specified without allocation.? We provide >> the output buffer, I don't think we will cause? vfprintf code to >> alloca or malloc.) >> >> I can offer a second version below that uses %d only.? Testing >> alongside %f in the same line, it retains the same value and >> position, e.g. >> >> Time: Thu Mar? 5 08:57:50 2020 UTC elapsed time: f: 2.001065 int: >> 2.001065 (raw int: 1065) seconds (0d 0h 0m 2s) >> >> Output example from the hg diff below (not from the same run): >> >> Time: Thu Mar? 5 09:28:01 2020 UTC elapsed time: 2.000611 seconds (0d >> 0h 0m 2s) >> >> >> Thanks! >> Kevin >> >> >> $ hg diff >> diff --git a/src/hotspot/share/runtime/os.cpp >> b/src/hotspot/share/runtime/os.cpp >> --- a/src/hotspot/share/runtime/os.cpp >> +++ b/src/hotspot/share/runtime/os.cpp >> @@ -1016,10 +1016,9 @@ >> ??? } >> >> ??? double t = os::elapsedTime(); >> -? // NOTE: It tends to crash after a SEGV if we want to >> printf("%f",...) in >> -? //?????? Linux. Must be a bug in glibc ? Workaround is to round >> "t" to int >> -? //?????? before printf. 
We lost some precision, but who cares? >> +? // NOTE: a crash using printf("%f",...) on Linux was historically >> noted here. >> ??? int eltime = (int)t;? // elapsed time in seconds >> +? int eltimeFraction = (int) ((t - eltime) * 1000000); >> >> ??? // print elapsed time in a human-readable format: >> ??? int eldays = eltime / secs_per_day; >> @@ -1029,7 +1028,7 @@ >> ??? int elmins = (eltime - day_secs - hour_secs) / secs_per_min; >> ??? int minute_secs = elmins * secs_per_min; >> ??? int elsecs = (eltime - day_secs - hour_secs - minute_secs); >> -? st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", >> eltime, eldays, elhours, elmins, elsecs); >> +? st->print_cr(" elapsed time: %d.%06d seconds (%dd %dh %dm %ds)", >> eltime, eltimeFraction, eldays, elhours, elmins, elsecs); >> ??} >> >> >> >> On 05/03/2020 00:57, David Holmes wrote: >>> On 4/03/2020 8:44 am, Kevin Walls wrote: >>>> >>>> Thanks David - >>>> >>>> Yes there are situations where hs_err fails, and few people are >>>> sadder than me >>>> when that happens 8-) , so I was thinking about how scared to be by >>>> the comment. >>>> >>>> With the safety net of the error handler for the steps of the >>>> hs_err file >>>> (which works well, we see it invoked frequently), it looks >>>> reasonable to use >>>> %f as we might do other slightly questionable things for a signal >>>> handler. >>>> >>>> Corrupting locale information or floating point state might >>>> possibly cause >>>> problems, but if I cause a fake crash in print_date_and_time the error >>>> handler recovers and the report continues. >>> >>> That is good to know. >>> >>>> Thinking about printing with two ints, seconds and fractions: >>>> I don't see anything already that returns such a time in two >>>> components in the >>>> JVM, so we might implement a new form of os::javaTimeNanos() or >>>> similar that >>>> returns the two parts, and do that on each platform. >>> >>> I was thinking of something simple/crude ... 
>>> >>>> I didn't yet come up with anything to do in os::print_date_and_time() >>>> which will take the fractional part of the double, and print just >>>> the fraction as an int, without using any library / %f facilities. >>> >>> ... just using e.g. (untested) >>> >>> double t = os::elapsedTime(); >>> int secs =? (int) t; >>> int micros =? (int)((t - secs) * 100000); >>> printf("%d.%d", secs, micros); >>> >>> with appropriate width specifiers to get the formatting right. >>> >>> Cheers, >>> David >>> >>>> >>>> If you're still concerned I could revisit these or some other idea. >>>> >>>> Genuine laugh out loud moment for me, I backported the elapsed time >>>> logging from >>>> 6u4 to 5u19 (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007). >>>> (I said before jdk5 was created, I should have said before it was >>>> in mercurial.) >>>> >>>> Thanks >>>> Kevin >>>> >>>> >>>> On 03/03/2020 01:11, David Holmes wrote: >>>>> Hi Kevin, >>>>> >>>>> On 2/03/2020 8:48 pm, Kevin Walls wrote: >>>>>> Oops, and with the bug ID in the title and JBS link: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8240295 >>>>>> >>>>>> >>>>>> On 02/03/2020 10:47, Kevin Walls wrote: >>>>>>> Hi, >>>>>>> >>>>>>> (s11y and runtime opinions both relevant) >>>>>>> >>>>>>> A few times in the last month I've really wanted to compare the >>>>>>> Events logged in the hs_err file, and the time of the JVM's crash. >>>>>>> >>>>>>> "elapsed time" in hs_err is only accurate to one second, and has >>>>>>> been since before jdk5 was created. >>>>>>> >>>>>>> The diff below changes the format string and uses the >>>>>>> non-rounded time value (I don't see a need to change the other >>>>>>> integer arithmetic here), and we can enjoy hs_errs with detail >>>>>>> like: >>>>>>> >>>>>>> ... >>>>>>> Time: Mon Mar? 2 09:57:13 2020 UTC elapsed time: 5.490135 >>>>>>> seconds (0d 0h 0m 5s) >>>>>>> ... 
>>>>>>> >>>>>>> Thanks >>>>>>> Kevin >>>>>>> >>>>>>> >>>>>>> /jdk/open$ hg diff >>>>>>> diff --git a/src/hotspot/share/runtime/os.cpp >>>>>>> b/src/hotspot/share/runtime/os.cpp >>>>>>> --- a/src/hotspot/share/runtime/os.cpp >>>>>>> +++ b/src/hotspot/share/runtime/os.cpp >>>>>>> @@ -1016,9 +1016,8 @@ >>>>>>> ?? } >>>>>>> >>>>>>> ?? double t = os::elapsedTime(); >>>>>>> -? // NOTE: It tends to crash after a SEGV if we want to >>>>>>> printf("%f",...) in >>>>>>> -? //?????? Linux. Must be a bug in glibc ? Workaround is to >>>>>>> round "t" to int >>>>>>> -? //?????? before printf. We lost some precision, but who cares? >>>>>>> +? // NOTE: a crash using printf("%f",...) on Linux was >>>>>>> historically noted here >>>>>>> +? //?????? (before the jdk5 repo was created). >>>>> >>>>> Just because it is old doesn't mean it no longer applies. printf >>>>> is not async-signal-safe - we know that but we try to use it >>>>> anyway. Maybe %f is even less async-signal-safe? >>>>> >>>>> This may get through testing okay but cause problems with real >>>>> crashes in the field. >>>>> >>>>> What about breaking the time up into two ints: seconds and nanos? >>>>> >>>>> Cheers, >>>>> David >>>>> ----- >>>>> >>>>>>> ?? int eltime = (int)t;? // elapsed time in seconds >>>>>>> >>>>>>> ?? // print elapsed time in a human-readable format: >>>>>>> @@ -1029,7 +1028,7 @@ >>>>>>> ?? int elmins = (eltime - day_secs - hour_secs) / secs_per_min; >>>>>>> ?? int minute_secs = elmins * secs_per_min; >>>>>>> ?? int elsecs = (eltime - day_secs - hour_secs - minute_secs); >>>>>>> -? st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", >>>>>>> eltime, eldays, elhours, elmins, elsecs); >>>>>>> +? 
st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)",
>>>>>>> t, eldays, elhours, elmins, elsecs);
>>>>>>>  }
>>>>>>>
>>>>>>>

From daniil.x.titov at oracle.com Fri Mar 6 01:15:12 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 05 Mar 2020 17:15:12 -0800
Subject: 8196751: Add jhsdb option to specify debug server RMI connector port
In-Reply-To: <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
Message-ID:

Hi Yasumasa, Serguei and Alex,

Please review a new version of the fix [1] that addresses your comments. In addition to the RMI connector port option, the new version introduces two more options to specify the RMI registry port and the RMI connector host name. Currently, these last two settings can be specified using system properties, but system properties have the following disadvantages compared to command line options:
- It's hard to know about them: they are not listed in the tool's help.
- They have long names that are hard to remember.
- It is easy to mistype them on the command line, and you will not get any warning about it.

The CSR [2] was also updated and needs to be reviewed.

Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
[2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
[3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751

Thank you,
Daniil

On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote:

Hi Daniil,

- SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
- SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. But you can use same port number as RMI registry (1099). It is same as relation between jmxremote.port and jmxremote.rmi.port. Thanks, Yasumasa On 2020/02/24 13:21, Daniil Titov wrote: > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > New CSR [3] was created for this change and it needs to be reviewed as well. > > Man pages for jhsdb will be updated in a separate issue. > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > // delegate to the actual SA debug server. > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > but I would prefer to address it in a separate issue. > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > container and connecting to it with the GUI debugger. > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > Thank you, > Daniil > > From suenaga at oss.nttdata.com Fri Mar 6 08:30:05 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Fri, 6 Mar 2020 17:30:05 +0900 Subject: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> Message-ID: Hi Daniil, - SALauncher.java - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. - SADebugDTest.java - Please add bug ID to @bug. - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. Thanks, Yasumasa On 2020/03/06 10:15, Daniil Titov wrote: > Hi Yasumasa, Serguei and Alex, > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these > last two settings could be specified using the system properties but the system properties have the following disadvantages > comparing to the command line options: > - It?s hard to know about them: they are not listed in tool?s help. > - They have long names that hard to remember > - It is easy to mistype them in the command line and you will not get any warning about it. > > The CSR [2] was also updated and needs to be reviewed. 
> > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > Thank you, > Daniil > > ?On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply. > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > But you can use same port number as RMI registry (1099). > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > Thanks, > > Yasumasa > > > On 2020/02/24 13:21, Daniil Titov wrote: > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > Man pages for jhsdb will be updated in a separate issue. > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > // delegate to the actual SA debug server. > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. 
> > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher
> > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated
> > but I would prefer to address it in a separate issue.
> >
> > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
> > container and connecting to it with the GUI debugger.
> > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> >
> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
> > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
> > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
> >
> > Thank you,
> > Daniil
> >
> >
>
>
>

From chiroito107 at gmail.com Fri Mar 6 15:24:08 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Sat, 7 Mar 2020 00:24:08 +0900
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To:
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
Message-ID:

Hi Serguei,

Could you review this again, please?

Regards,
Chihiro

On 2020/02/27 22:11, Chihiro Ito wrote:
>
> Hi Ralf,
>
> Thank you for your advice.
>
> 1.
> The comment of serializePropertiesToByteArray in VMSupport is "The stream written to the byte array is ISO 8859-1 encoded.".
> But the previous implementation does not keep this. I think we need to implement the encoding as ISO 8859-1.
>
> 2.
> According to the help, the feature of VM.system_properties is just "Print system properties". Users should not use this output for loading. Users use it when they want to take a quick look at the system properties.
>
> Regards,
> Chihiro
>
>
> On 2020/02/26 18:53, Schmelter, Ralf wrote:
>>
>> Hi Chihiro,
>>
>> I have two remarks:
>>
>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create an ISO Latin 1 String, it really only will create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then sent to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters. But an ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test.
>>
>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as:
>> C\:\\test\\new
>> And now it is:
>> C:\test\new
>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters.
>>
>> Best regards,
>> Ralf
>>
>>
>> From: serviceability-dev On Behalf Of Chihiro Ito
>> Sent: Tuesday, 25 February 2020 04:45
>> To: serguei.spitsyn at oracle.com
>> Cc: serviceability-dev at openjdk.java.net
>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
>>
>> Hi Serguei,
>>
>> Thanks for your review and advice.
>>
>> I modified these.
>> Could you review this again, please?
>>
>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
>>
>> Regards,
>> Chihiro
>>

From serguei.spitsyn at oracle.com Fri Mar 6 18:32:30 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Mar 2020 10:32:30 -0800
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To:
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
Message-ID: <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>

An HTML attachment was scrubbed...
URL:

From daniil.x.titov at oracle.com Fri Mar 6 18:38:09 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 06 Mar 2020 10:38:09 -0800
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port
In-Reply-To:
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
Message-ID: <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>

Hi Yasumasa,

> - checkBasicOptions() is needed? I think you can remove this method and embed it in caller.

I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side effect (it throws an exception if the parameters are incorrect) and ignores its return value might look confusing. Thus, I found it more appropriate to wrap it inside a method with a more relevant name.

> SADebugDTest
> - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.

We cannot use primitives there since these local variables are captured in a lambda expression and are required to be final. The other option is to use some other wrapper for them, but I don't see any obvious benefit in it compared to the array.
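The point about captured locals can be sketched as follows. This is a hypothetical illustration, not the SADebugDTest code: the class LambdaCaptureSketch, the helper forEachLine, and the "Port already in use" marker text are all made up here. Only the single-element array idiom itself is what the reply above describes.

```java
import java.util.function.Consumer;

// Hypothetical illustration; not the SADebugDTest code.
final class LambdaCaptureSketch {
    // Made-up helper that passes each line of some process output to a callback.
    static void forEachLine(String output, Consumer<String> callback) {
        for (String line : output.split("\n")) {
            callback.accept(line);
        }
    }

    static boolean sawPortInUse(String output) {
        // A plain "boolean portInUse" could not be assigned inside the lambda:
        // captured locals must be final or effectively final.
        // The array reference is final, while its single element stays mutable.
        final boolean[] portInUse = { false };
        forEachLine(output, line -> {
            if (line.contains("Port already in use")) { // made-up marker text
                portInUse[0] = true;
            }
        });
        return portInUse[0];
    }

    public static void main(String[] args) {
        System.out.println(sawPortInUse("starting\nPort already in use: 1099"));
        System.out.println(sawPortInUse("starting\nready"));
    }
}
```

A wrapper such as java.util.concurrent.atomic.AtomicBoolean would be one example of the "other wrapper" option; for a flag that is only written and read on one thread, the array does the same job with less ceremony.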
I will include your other suggestion in the new version of the webrev. Thanks! Daniil ?On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: Hi Daniil, - SALauncher.java - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. - SADebugDTest.java - Please add bug ID to @bug. - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. Thanks, Yasumasa On 2020/03/06 10:15, Daniil Titov wrote: > Hi Yasumasa, Serguei and Alex, > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these > last two settings could be specified using the system properties but the system properties have the following disadvantages > comparing to the command line options: > - It?s hard to know about them: they are not listed in tool?s help. > - They have long names that hard to remember > - It is easy to mistype them in the command line and you will not get any warning about it. > > The CSR [2] was also updated and needs to be reviewed. > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > Thank you, > Daniil > > ?On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply. > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > But you can use same port number as RMI registry (1099). > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > Thanks, > > Yasumasa > > > On 2020/02/24 13:21, Daniil Titov wrote: > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > Man pages for jhsdb will be updated in a separate issue. > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > // delegate to the actual SA debug server. > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > and sun.jvm.hotspot.DebugServer and delegating the call. 
With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > but I would prefer to address it in a separate issue. > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > container and connecting to it with the GUI debugger. > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > Thank you, > > Daniil > > > > > > > From suenaga at oss.nttdata.com Sat Mar 7 02:15:03 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Sat, 7 Mar 2020 11:15:03 +0900 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> Message-ID: Hi Daniil, On 2020/03/07 3:38, Daniil Titov wrote: > Hi Yasumasa, > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect ( it throws an exception if parameters are incorrect) and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name . Ok, but I prefer to leave comment it. > > SADebugDTest > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. 
> The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. If you do not think this error check, test code is more simply. > I will include your other suggestion in the new version of the webrev. Sorry, I have one more comment: > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. Shutdown hook is already registered in c'tor of HotSpotAgent. It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. Thanks, Yasumasa > Thanks! > Daniil > > ?On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > > - SALauncher.java > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > - SADebugDTest.java > - Please add bug ID to @bug. > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > Thanks, > > Yasumasa > > > On 2020/03/06 10:15, Daniil Titov wrote: > > Hi Yasumasa, Serguei and Alex, > > > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these > > last two settings could be specified using the system properties but the system properties have the following disadvantages > > comparing to the command line options: > > - It?s hard to know about them: they are not listed in tool?s help. 
> > - They have long names that hard to remember > > - It is easy to mistype them in the command line and you will not get any warning about it. > > > > The CSR [2] was also updated and needs to be reviewed. > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > Thank you, > > Daniil > > > > ?On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > > > Hi Daniil, > > > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply. > > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > > But you can use same port number as RMI registry (1099). > > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > > > > Thanks, > > > > Yasumasa > > > > > > On 2020/02/24 13:21, Daniil Titov wrote: > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). 
> > > > > > // delegate to the actual SA debug server. > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > but I would prefer to address it in a separate issue. > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > container and connecting to it with the GUI debugger. > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > > Thank you, > > > Daniil > > > > > > > > > > > > > > > From chris.plummer at oracle.com Sat Mar 7 04:42:53 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 6 Mar 2020 20:42:53 -0800 Subject: RFR(XS): 8240691: serviceability/sa/ClhsdbCDSJstackPrintAll.java and serviceability/sa/ClhsdbCDSCore.java should be excluded with ZGC In-Reply-To: <8c057888-5a75-c9c7-cb55-9352d07b3013@oracle.com> References: <86878457-3386-ea0e-f23e-7ad1bdff64a6@oracle.com> <038c6af5-f7e4-cf93-9c22-729f1db03e05@oracle.com> <8c057888-5a75-c9c7-cb55-9352d07b3013@oracle.com> Message-ID: An HTML attachment was scrubbed... 
URL: From suenaga at oss.nttdata.com Sat Mar 7 07:03:59 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Sat, 7 Mar 2020 16:03:59 +0900 Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com> References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com> Message-ID: Hi Chihiro, I'm also ok with webrev.05 after updating copyright year. Yasumasa On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote: > Hi Chichiro, > > I'm okay with the fix. > Could you, please, update the copyright date in || src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push? > > Thanks, > Serguei > > > On 3/6/20 07:24, Chihiro Ito wrote: >> Hi Serguei, >> >> Could you review this again, please? >> >> Regards, >> Chihiro >> >> >> 2020?2?27?(?) 22:11 Chihiro Ito: >>> Hi Ralf, >>> >>> Thank you for your advice. >>> >>> 1. >>> The comment of serializePropertiesToByteArray in VMSupport is "The stream written to the byte array is ISO 8859-1 encoded.". >>> But the previous implementation does not keep this. I think we need to implement encode by ISO 8859-1. >>> >>> 2. >>> According to help, the feature of VM.system_properties is just "Print system properties". The users should not use this output for loading. The users use it when they want to see System Properties soon. >>> >>> Regards, >>> Chihiro >>> >>> >>> 2020?2?26?(?) 18:53 Schmelter, Ralf: >>>> Hi Chihiro, >>>> >>>> I have two remarks: >>>> >>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. 
While the Properties.store() method claims to create ISO Latin 1 String, it really only will create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then send to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters. But a ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test. >>>> >>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as: >>>> C\:\\test\\new >>>> And now it is: >>>> C:\test\new >>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters. >>>> >>>> Best regards, >>>> Ralf >>>> >>>> >>>> From: serviceability-dev On Behalf Of Chihiro Ito >>>> Sent: Dienstag, 25. Februar 2020 04:45 >>>> To:serguei.spitsyn at oracle.com >>>> Cc:serviceability-dev at openjdk.java.net >>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows >>>> >>>> Hi Serguei, >>>> >>>> Thanks for your review and advice. >>>> >>>> I modified these. >>>> Could you review this again, please? >>>> >>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/ >>>> >>>> Regards, >>>> Chihiro >>>> > From serguei.spitsyn at oracle.com Sat Mar 7 07:17:59 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Mar 2020 23:17:59 -0800 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: References: Message-ID: Hi Alex, It looks good to me. 
Thanks, Serguei On 3/4/20 16:30, Alex Menkov wrote: > Hi all, > > please review the fix for > https://bugs.openjdk.java.net/browse/JDK-8240340 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ > > changes: > - assertThreadState method: don't re-read thread state throwing > exception (as we got weird error like "Thread WaitingThread is at > WAITING state but is expected to be in Thread.State = WAITING"); > - added proper test shutdown on error (made all threads "daemon", > interrupt waiting thread if CheckerThread throws exception); > - if CheckerThread detects error, propagate the exception to main thread; > - fixed LockFreeLogger class - it should work for logging from several > threads, but it doesn't. I prefer to simplify it just to keep > ConcurrentLinkedQueue. > LockFreeLogger is also used by ThreadMXBeanStateTest test, but only by > a single thread. > > --alex From serguei.spitsyn at oracle.com Sat Mar 7 07:28:17 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Mar 2020 23:28:17 -0800 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> References: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> Message-ID: <2cf61bd9-da4d-2477-6604-d104118c4e03@oracle.com> Hi David and Alex, My understanding is that previous implementation collected logs separately for each thread in TLS, and at the end, merged and sorted out the output by log id. So, the result is that all messages are serialized at the end. Alex changed the implementation but the result is the same - all log messages are serialized. There are two tests which use the LockFreeLogger. Another one is: test/jdk/java/lang/Thread/ThreadStateController.java . Does the ThreadStateController.java work okay after the fix? 
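A minimal sketch of the simplified design described above, assuming nothing beyond a single ConcurrentLinkedQueue shared by all logging threads. The class name QueueLoggerSketch and its methods are hypothetical and are not the code from the webrev; the point is only that one shared lock-free queue already yields a single serialized log, so no per-thread maps need to be registered or merged afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical stand-in for a queue-backed test logger; not the LockFreeLogger source. */
public class QueueLoggerSketch {
    // One lock-free queue shared by every thread; iteration follows insertion order.
    private final ConcurrentLinkedQueue<String> records = new ConcurrentLinkedQueue<>();

    /** Formats the message on the calling thread and appends it to the shared queue. */
    public void log(String format, Object... params) {
        records.add(String.format(format, params));
    }

    /** Returns a copy of the records in the order they were appended. */
    public List<String> snapshot() {
        return new ArrayList<>(records);
    }

    @Override
    public String toString() {
        return String.join("", records);
    }

    public static void main(String[] args) throws InterruptedException {
        QueueLoggerSketch logger = new QueueLoggerSketch();
        logger.log("main: start%n");
        // A second thread logs into the same instance; its record is not lost.
        Thread worker = new Thread(() -> logger.log("worker: value=%d%n", 42));
        worker.start();
        worker.join();
        logger.log("main: done%n");
        System.out.print(logger);
    }
}
```

Because every thread appends to the same queue, a record logged from a non-main thread cannot go missing the way an unregistered ThreadLocal map could.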
Thanks, Serguei On 3/5/20 10:54, Alex Menkov wrote: > Hi David, > > Thanks you for the review. > > On 03/04/2020 17:50, David Holmes wrote: >> Hi Alex, >> >> On 5/03/2020 10:30 am, Alex Menkov wrote: >>> Hi all, >>> >>> please review the fix for >>> https://bugs.openjdk.java.net/browse/JDK-8240340 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ >>> >>> >>> changes: >>> - assertThreadState method: don't re-read thread state throwing >>> exception (as we got weird error like "Thread WaitingThread is at >>> WAITING state but is expected to be in Thread.State = WAITING"); >>> - added proper test shutdown on error (made all threads "daemon", >>> interrupt waiting thread if CheckerThread throws exception); >>> - if CheckerThread detects error, propagate the exception to main >>> thread; >> >> The test changes seem fine. >> >>> - fixed LockFreeLogger class - it should work for logging from >>> several threads, but it doesn't. I prefer to simplify it just to >>> keep ConcurrentLinkedQueue. >>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only >>> by a single thread. >> >> I don't understand your changes here as you've completely changed the >> intended design of the logger. The original accumulates log entries >> per-thread and then spits them all out (though I'm not clear on the >> exact ordering - I don't how to read that stream stuff). The new code >> just creates a single queue of log records interleaving entries from >> different threads. The simple logger may be all that is needed but it >> seems quite different to the intent of the original. > > Testing changes in the test I discovered that there is something wrong > with the logger - it printed only part of the records, so I have to > look at the LockFreeLogger class and I don't understand how it was > supposed to work. > About ordering in cumulative log: each record has Integer which used > to sort log entries from all threads (i.e. 
records from different
> threads are printed at the order which log() was called).
> Looking at allRecords/records stuff I don't understand how it should
> be used. To get logs from different threads in one logger, we needs
> one instance. So we create LockFreeLogger (in main thread) and ctor
> creates ThreadLocal record and register it in allRecords. Logging from
> main thread works fine, but if any other thread tries to log, 1st
> log() call creates its own ThreadLocal records (by records.get()) and
> log records from this thread go there. But this ThreadLocal records is
> not registered in allRecords, so this logging won't be included in
> final log.
> Looks like we need to change log() to something like
>
> Map recs = records.get();
> if (recs.isEmpty()) {
>     allRecords.add(recs);
> }
> recs.put(id, String.format(format, params));
>
> But all this stuff do exactly the same as simple ConcurrentLinkedQueue
> (i.e. lock free ordered list).
> At least I don't see other rationale in the stuff.
>
> --alex
>
>>
>> Thanks,
>> David
>>
>>> --alex

From serguei.spitsyn at oracle.com Sat Mar 7 07:32:33 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Mar 2020 23:32:33 -0800
Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test javax/management/loading/MletParserLocaleTest.java reduce default timeout
In-Reply-To: <0653e5d4-893b-ce78-f0cf-5905a659bb10@oracle.com>
References: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default> <0653e5d4-893b-ce78-f0cf-5905a659bb10@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL: From serguei.spitsyn at oracle.com Sat Mar 7 07:42:33 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Mar 2020 23:42:33 -0800 Subject: RFR(XS): 8240691: serviceability/sa/ClhsdbCDSJstackPrintAll.java and serviceability/sa/ClhsdbCDSCore.java should be excluded with ZGC In-Reply-To: References: <86878457-3386-ea0e-f23e-7ad1bdff64a6@oracle.com> <038c6af5-f7e4-cf93-9c22-729f1db03e05@oracle.com> <8c057888-5a75-c9c7-cb55-9352d07b3013@oracle.com> Message-ID: <42b7ac87-a09c-f703-f1a2-7bde6fb75f76@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Sat Mar 7 07:53:21 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Mar 2020 23:53:21 -0800 Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate enough In-Reply-To: <20045e23-c736-5289-866e-9df5a09101a8@oracle.com> References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com> <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com> <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com> <20045e23-c736-5289-866e-9df5a09101a8@oracle.com> Message-ID: <80fc495b-5169-2ec8-559c-8ba8c9f4939d@oracle.com> Hi Kevin, This looks okay to me as well. Thanks, Serguei On 3/5/20 02:38, David Holmes wrote: > Thanks Kevin. I think this is the less risky change and achieves the > goal. > > David > > On 5/03/2020 8:00 pm, Kevin Walls wrote: >> Thanks - >> >> I had tried some ideas in the simple fashion, and we can use %06d >> formatting.... OK maybe such formatting is not as "bad" as %f... >> >> (glibc parses the int width specified without allocation.? We provide >> the output buffer, I don't think we will cause? vfprintf code to >> alloca or malloc.) >> >> I can offer a second version below that uses %d only.? Testing >> alongside %f in the same line, it retains the same value and >> position, e.g. >> >> Time: Thu Mar? 
5 08:57:50 2020 UTC elapsed time: f: 2.001065 int: 2.001065 (raw int: 1065) seconds (0d 0h 0m 2s)
>>
>> Output example from the hg diff below (not from the same run):
>>
>> Time: Thu Mar  5 09:28:01 2020 UTC elapsed time: 2.000611 seconds (0d 0h 0m 2s)
>>
>>
>> Thanks!
>> Kevin
>>
>>
>> $ hg diff
>> diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp
>> --- a/src/hotspot/share/runtime/os.cpp
>> +++ b/src/hotspot/share/runtime/os.cpp
>> @@ -1016,10 +1016,9 @@
>>    }
>>
>>    double t = os::elapsedTime();
>> -  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
>> -  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
>> -  //       before printf. We lost some precision, but who cares?
>> +  // NOTE: a crash using printf("%f",...) on Linux was historically noted here.
>>    int eltime = (int)t;  // elapsed time in seconds
>> +  int eltimeFraction = (int) ((t - eltime) * 1000000);
>>
>>    // print elapsed time in a human-readable format:
>>    int eldays = eltime / secs_per_day;
>> @@ -1029,7 +1028,7 @@
>>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>    int minute_secs = elmins * secs_per_min;
>>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs);
>> +  st->print_cr(" elapsed time: %d.%06d seconds (%dd %dh %dm %ds)", eltime, eltimeFraction, eldays, elhours, elmins, elsecs);
>>  }
>>
>>
>>
>> On 05/03/2020 00:57, David Holmes wrote:
>>> On 4/03/2020 8:44 am, Kevin Walls wrote:
>>>>
>>>> Thanks David -
>>>>
>>>> Yes there are situations where hs_err fails, and few people are sadder than me
>>>> when that happens 8-) , so I was thinking about how scared to be by the comment.
>>>> >>>> With the safety net of the error handler for the steps of the >>>> hs_err file >>>> (which works well, we see it invoked frequently), it looks >>>> reasonable to use >>>> %f as we might do other slightly questionable things for a signal >>>> handler. >>>> >>>> Corrupting locale information or floating point state might >>>> possibly cause >>>> problems, but if I cause a fake crash in print_date_and_time the error >>>> handler recovers and the report continues. >>> >>> That is good to know. >>> >>>> Thinking about printing with two ints, seconds and fractions: >>>> I don't see anything already that returns such a time in two >>>> components in the >>>> JVM, so we might implement a new form of os::javaTimeNanos() or >>>> similar that >>>> returns the two parts, and do that on each platform. >>> >>> I was thinking of something simple/crude ... >>> >>>> I didn't yet come up with anything to do in os::print_date_and_time() >>>> which will take the fractional part of the double, and print just >>>> the fraction as an int, without using any library / %f facilities. >>> >>> ... just using e.g. (untested) >>> >>> double t = os::elapsedTime(); >>> int secs =? (int) t; >>> int micros =? (int)((t - secs) * 100000); >>> printf("%d.%d", secs, micros); >>> >>> with appropriate width specifiers to get the formatting right. >>> >>> Cheers, >>> David >>> >>>> >>>> If you're still concerned I could revisit these or some other idea. >>>> >>>> Genuine laugh out loud moment for me, I backported the elapsed time >>>> logging from >>>> 6u4 to 5u19 (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007). >>>> (I said before jdk5 was created, I should have said before it was >>>> in mercurial.) 
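The "two ints" idea discussed above can be sketched as follows. This is only an illustration in Java (String.format stands in for the C++ st->print_cr call, and the class name ElapsedTimeSketch is hypothetical); it says nothing about async-signal safety. It assumes the fraction is wanted in microseconds, which means a scale factor of 1,000,000 and a zero-padded width of six, as in the %d.%06d version of the diff posted in this thread.

```java
/** Hypothetical illustration of printing elapsed seconds with integer conversions only. */
public class ElapsedTimeSketch {

    /** Formats t (in seconds) as "S.UUUUUU", truncating to whole microseconds. */
    static String format(double t) {
        int secs = (int) t;                           // whole seconds, truncated
        int micros = (int) ((t - secs) * 1_000_000);  // fractional part scaled to microseconds
        // %06d keeps leading zeros, so 1953 microseconds prints as ".001953" rather than ".1953".
        return String.format("%d.%06d", secs, micros);
    }

    public static void main(String[] args) {
        System.out.println(format(5.5));          // prints 5.500000
        System.out.println(format(2.001953125));  // prints 2.001953
    }
}
```

Without the width specifier the second value would read as 2.1953, which is why the padding matters as much as the scale factor. The fraction is truncated rather than rounded, matching the integer cast in the diff.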
>>>> >>>> Thanks >>>> Kevin >>>> >>>> >>>> On 03/03/2020 01:11, David Holmes wrote: >>>>> Hi Kevin, >>>>> >>>>> On 2/03/2020 8:48 pm, Kevin Walls wrote: >>>>>> Oops, and with the bug ID in the title and JBS link: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8240295 >>>>>> >>>>>> >>>>>> On 02/03/2020 10:47, Kevin Walls wrote: >>>>>>> Hi, >>>>>>> >>>>>>> (s11y and runtime opinions both relevant) >>>>>>> >>>>>>> A few times in the last month I've really wanted to compare the >>>>>>> Events logged in the hs_err file, and the time of the JVM's crash. >>>>>>> >>>>>>> "elapsed time" in hs_err is only accurate to one second, and has >>>>>>> been since before jdk5 was created. >>>>>>> >>>>>>> The diff below changes the format string and uses the >>>>>>> non-rounded time value (I don't see a need to change the other >>>>>>> integer arithmetic here), and we can enjoy hs_errs with detail >>>>>>> like: >>>>>>> >>>>>>> ... >>>>>>> Time: Mon Mar? 2 09:57:13 2020 UTC elapsed time: 5.490135 >>>>>>> seconds (0d 0h 0m 5s) >>>>>>> ... >>>>>>> >>>>>>> Thanks >>>>>>> Kevin >>>>>>> >>>>>>> >>>>>>> /jdk/open$ hg diff >>>>>>> diff --git a/src/hotspot/share/runtime/os.cpp >>>>>>> b/src/hotspot/share/runtime/os.cpp >>>>>>> --- a/src/hotspot/share/runtime/os.cpp >>>>>>> +++ b/src/hotspot/share/runtime/os.cpp >>>>>>> @@ -1016,9 +1016,8 @@ >>>>>>> ?? } >>>>>>> >>>>>>> ?? double t = os::elapsedTime(); >>>>>>> -? // NOTE: It tends to crash after a SEGV if we want to >>>>>>> printf("%f",...) in >>>>>>> -? //?????? Linux. Must be a bug in glibc ? Workaround is to >>>>>>> round "t" to int >>>>>>> -? //?????? before printf. We lost some precision, but who cares? >>>>>>> +? // NOTE: a crash using printf("%f",...) on Linux was >>>>>>> historically noted here >>>>>>> +? //?????? (before the jdk5 repo was created). >>>>> >>>>> Just because it is old doesn't mean it no longer applies. printf >>>>> is not async-signal-safe - we know that but we try to use it >>>>> anyway. 
Maybe %f is even less async-signal-safe? >>>>> >>>>> This may get through testing okay but cause problems with real >>>>> crashes in the field. >>>>> >>>>> What about breaking the time up into two ints: seconds and nanos? >>>>> >>>>> Cheers, >>>>> David >>>>> ----- >>>>> >>>>>>> ?? int eltime = (int)t;? // elapsed time in seconds >>>>>>> >>>>>>> ?? // print elapsed time in a human-readable format: >>>>>>> @@ -1029,7 +1028,7 @@ >>>>>>> ?? int elmins = (eltime - day_secs - hour_secs) / secs_per_min; >>>>>>> ?? int minute_secs = elmins * secs_per_min; >>>>>>> ?? int elsecs = (eltime - day_secs - hour_secs - minute_secs); >>>>>>> -? st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", >>>>>>> eltime, eldays, elhours, elmins, elsecs); >>>>>>> +? st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", >>>>>>> t, eldays, elhours, elmins, elsecs); >>>>>>> ?} >>>>>>> >>>>>>> From kevin.walls at oracle.com Sat Mar 7 07:57:48 2020 From: kevin.walls at oracle.com (Kevin Walls) Date: Fri, 6 Mar 2020 23:57:48 -0800 (PST) Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate enough In-Reply-To: <80fc495b-5169-2ec8-559c-8ba8c9f4939d@oracle.com> References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com> <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com> <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com> <20045e23-c736-5289-866e-9df5a09101a8@oracle.com> <80fc495b-5169-2ec8-559c-8ba8c9f4939d@oracle.com> Message-ID: <67d1cbc7-8102-09ac-7322-9f545089a45c@oracle.com> Great, thanks! On 07/03/2020 07:53, serguei.spitsyn at oracle.com wrote: > Hi Kevin, > > This looks okay to me as well. 
> > Thanks, > Serguei From chiroito107 at gmail.com Sat Mar 7 14:13:01 2020 From: chiroito107 at gmail.com (Chihiro Ito) Date: Sat, 7 Mar 2020 23:13:01 +0900 Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com> Message-ID: Hi Serguei and Yasumasa, I update the copyright year and created the change set. Could you sponsor this, please? Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/ Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset Regards, Chihiro 2020?3?7?(?) 16:03 Yasumasa Suenaga : > > Hi Chihiro, > > I'm also ok with webrev.05 after updating copyright year. > > > Yasumasa > > > On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote: > > Hi Chichiro, > > > > I'm okay with the fix. > > Could you, please, update the copyright date in || src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push? > > > > Thanks, > > Serguei > > > > > > On 3/6/20 07:24, Chihiro Ito wrote: > >> Hi Serguei, > >> > >> Could you review this again, please? > >> > >> Regards, > >> Chihiro > >> > >> > >> 2020?2?27?(?) 22:11 Chihiro Ito: > >>> Hi Ralf, > >>> > >>> Thank you for your advice. > >>> > >>> 1. > >>> The comment of serializePropertiesToByteArray in VMSupport is "The stream written to the byte array is ISO 8859-1 encoded.". > >>> But the previous implementation does not keep this. I think we need to implement encode by ISO 8859-1. > >>> > >>> 2. > >>> According to help, the feature of VM.system_properties is just "Print system properties". The users should not use this output for loading. 
The users use it when they want to see System Properties soon. > >>> > >>> Regards, > >>> Chihiro > >>> > >>> > >>> 2020?2?26?(?) 18:53 Schmelter, Ralf: > >>>> Hi Chihiro, > >>>> > >>>> I have two remarks: > >>>> > >>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create ISO Latin 1 String, it really only will create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then send to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters. But a ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test. > >>>> > >>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as: > >>>> C\:\\test\\new > >>>> And now it is: > >>>> C:\test\new > >>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters. > >>>> > >>>> Best regards, > >>>> Ralf > >>>> > >>>> > >>>> From: serviceability-dev On Behalf Of Chihiro Ito > >>>> Sent: Dienstag, 25. Februar 2020 04:45 > >>>> To:serguei.spitsyn at oracle.com > >>>> Cc:serviceability-dev at openjdk.java.net > >>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows > >>>> > >>>> Hi Serguei, > >>>> > >>>> Thanks for your review and advice. > >>>> > >>>> I modified these. > >>>> Could you review this again, please? 
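The escaping behaviour Ralf describes can be reproduced with a small standalone sketch. It uses only java.util.Properties; the class name PropertiesEscapeSketch and its helper methods are hypothetical and unrelated to VMSupport. It shows that store() escapes backslashes and colons and writes non-ASCII characters as \uXXXX, and that loading the unescaped form turns \t and \n into control characters.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Hypothetical demo of Properties.store()/load() escaping; not the VMSupport code. */
public class PropertiesEscapeSketch {

    /** Stores a single property and returns the text that store() produced. */
    static String store(String key, String value) {
        Properties p = new Properties();
        p.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            p.store(out, null); // also writes a leading date comment line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Parses properties-file text with Properties.load(). */
    static Properties load(String text) {
        Properties p = new Properties();
        try {
            p.load(new ByteArrayInputStream(text.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        // Escaped form: the output contains path=c\:\\test\\new and name=\u00DC
        System.out.print(store("path", "c:\\test\\new"));
        System.out.print(store("name", "\u00DC"));

        // Loading the unescaped form reads \t and \n as tab and newline.
        String misread = load("path=c:\\test\\new").getProperty("path");
        System.out.println(misread.equals("c:\test\new")); // prints true
    }
}
```

The escaped output survives a store()/load() round trip, while the unescaped form is fine for reading by eye but cannot be loaded back reliably, which is the trade-off being discussed here.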
> >>>> > >>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/ > >>>> > >>>> Regards, > >>>> Chihiro > >>>> > > From chiroito107 at gmail.com Sun Mar 8 13:05:21 2020 From: chiroito107 at gmail.com (Chihiro Ito) Date: Sun, 8 Mar 2020 22:05:21 +0900 Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com> Message-ID: Hi, I'm sorry. I included "JDK-" in the changeset title. I removed it and updated it. Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset Regards, Chihiro 2020?3?7?(?) 23:13 Chihiro Ito : > > Hi Serguei and Yasumasa, > > I update the copyright year and created the change set. > > Could you sponsor this, please? > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/ > Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset > > Regards, > Chihiro > > > 2020?3?7?(?) 16:03 Yasumasa Suenaga : > > > > > > Hi Chihiro, > > > > I'm also ok with webrev.05 after updating copyright year. > > > > > > Yasumasa > > > > > > On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote: > > > Hi Chichiro, > > > > > > I'm okay with the fix. > > > Could you, please, update the copyright date in || src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push? > > > > > > Thanks, > > > Serguei > > > > > > > > > On 3/6/20 07:24, Chihiro Ito wrote: > > >> Hi Serguei, > > >> > > >> Could you review this again, please? > > >> > > >> Regards, > > >> Chihiro > > >> > > >> > > >> 2020?2?27?(?) 22:11 Chihiro Ito: > > >>> Hi Ralf, > > >>> > > >>> Thank you for your advice. > > >>> > > >>> 1. 
> > >>> The comment of serializePropertiesToByteArray in VMSupport is "The stream written to the byte array is ISO 8859-1 encoded.". > > >>> But the previous implementation does not keep this. I think we need to implement encode by ISO 8859-1. > > >>> > > >>> 2. > > >>> According to help, the feature of VM.system_properties is just "Print system properties". The users should not use this output for loading. The users use it when they want to see System Properties soon. > > >>> > > >>> Regards, > > >>> Chihiro > > >>> > > >>> > > >>> 2020?2?26?(?) 18:53 Schmelter, Ralf: > > >>>> Hi Chihiro, > > >>>> > > >>>> I have two remarks: > > >>>> > > >>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create ISO Latin 1 String, it really only will create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then send to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters. But a ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test. > > >>>> > > >>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as: > > >>>> C\:\\test\\new > > >>>> And now it is: > > >>>> C:\test\new > > >>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters. > > >>>> > > >>>> Best regards, > > >>>> Ralf > > >>>> > > >>>> > > >>>> From: serviceability-dev On Behalf Of Chihiro Ito > > >>>> Sent: Dienstag, 25. 
Februar 2020 04:45 > > >>>> To:serguei.spitsyn at oracle.com > > >>>> Cc:serviceability-dev at openjdk.java.net > > >>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows > > >>>> > > >>>> Hi Serguei, > > >>>> > > >>>> Thanks for your review and advice. > > >>>> > > >>>> I modified these. > > >>>> Could you review this again, please? > > >>>> > > >>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/ > > >>>> > > >>>> Regards, > > >>>> Chihiro > > >>>> > > > From david.holmes at oracle.com Mon Mar 9 04:15:58 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Mar 2020 14:15:58 +1000 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> References: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> Message-ID: <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com> Hi Alex, On 6/03/2020 4:54 am, Alex Menkov wrote: > Hi David, > > Thanks you for the review. > > On 03/04/2020 17:50, David Holmes wrote: >> Hi Alex, >> >> On 5/03/2020 10:30 am, Alex Menkov wrote: >>> Hi all, >>> >>> please review the fix for >>> https://bugs.openjdk.java.net/browse/JDK-8240340 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ >>> >>> >>> changes: >>> - assertThreadState method: don't re-read thread state throwing >>> exception (as we got weird error like "Thread WaitingThread is at >>> WAITING state but is expected to be in Thread.State = WAITING"); >>> - added proper test shutdown on error (made all threads "daemon", >>> interrupt waiting thread if CheckerThread throws exception); >>> - if CheckerThread detects error, propagate the exception to main >>> thread; >> >> The test changes seem fine. >> >>> - fixed LockFreeLogger class - it should work for logging from >>> several threads, but it doesn't. 
I prefer to simplify it just to keep >>> ConcurrentLinkedQueue. >>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only >>> by a single thread. >> >> I don't understand your changes here as you've completely changed the >> intended design of the logger. The original accumulates log entries >> per-thread and then spits them all out (though I'm not clear on the >> exact ordering - I don't how to read that stream stuff). The new code >> just creates a single queue of log records interleaving entries from >> different threads. The simple logger may be all that is needed but it >> seems quite different to the intent of the original. > > Testing changes in the test I discovered that there is something wrong > with the logger - it printed only part of the records, so I have to look > at the LockFreeLogger class and I don't understand how it was supposed > to work. > About ordering in cumulative log: each record has Integer which used to > sort log entries from all threads (i.e. records from different threads > are printed at the order which log() was called). > Looking at allRecords/records stuff I don't understand how it should be > used. To get logs from different threads in one logger, we needs one > instance. So we create LockFreeLogger (in main thread) and ctor creates > ThreadLocal record and register it in allRecords. Logging from main > thread works fine, but if any other thread tries to log, 1st log() call > creates its own ThreadLocal records (by records.get()) and log records > from this thread go there. But this ThreadLocal records is not > registered in allRecords, so this logging won't be included in final log. > Looks like we need to change log() to something like > > Map recs = records.get(); > if (recs.isEmpty()) { > ??? allRecords.add(recs); > } > recs.put(id, String.format(format, params)); Yep good catch - this logger was completely broken. > But all this stuff do exactly the same as simple ConcurrentLinkedQueue > (i.e. 
lock free ordered list). > At least I don't see other rationale in the stuff. I'm not certain of intent with the original but I'd always want to see log entries in chronological order - which is what we now clearly have. Thanks, David > --alex > >> >> Thanks, >> David >> >>> --alex From david.holmes at oracle.com Mon Mar 9 04:19:27 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Mar 2020 14:19:27 +1000 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com> References: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com> Message-ID: <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com> P.S. Forgot to note however that you need to update the documentation for the logger now as the mention of "per-thread logs" makes no sense now. Also in the spirit of not using @author, and because this is no longer the code created by Jaroslav, please delete the @author line. Thanks, David On 9/03/2020 2:15 pm, David Holmes wrote: > Hi Alex, > > On 6/03/2020 4:54 am, Alex Menkov wrote: >> Hi David, >> >> Thanks you for the review. 
>> >> On 03/04/2020 17:50, David Holmes wrote: >>> Hi Alex, >>> >>> On 5/03/2020 10:30 am, Alex Menkov wrote: >>>> Hi all, >>>> >>>> please review the fix for >>>> https://bugs.openjdk.java.net/browse/JDK-8240340 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ >>>> >>>> >>>> changes: >>>> - assertThreadState method: don't re-read thread state throwing >>>> exception (as we got weird error like "Thread WaitingThread is at >>>> WAITING state but is expected to be in Thread.State = WAITING"); >>>> - added proper test shutdown on error (made all threads "daemon", >>>> interrupt waiting thread if CheckerThread throws exception); >>>> - if CheckerThread detects error, propagate the exception to main >>>> thread; >>> >>> The test changes seem fine. >>> >>>> - fixed LockFreeLogger class - it should work for logging from >>>> several threads, but it doesn't. I prefer to simplify it just to >>>> keep ConcurrentLinkedQueue. >>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only >>>> by a single thread. >>> >>> I don't understand your changes here as you've completely changed the >>> intended design of the logger. The original accumulates log entries >>> per-thread and then spits them all out (though I'm not clear on the >>> exact ordering - I don't how to read that stream stuff). The new code >>> just creates a single queue of log records interleaving entries from >>> different threads. The simple logger may be all that is needed but it >>> seems quite different to the intent of the original. >> >> Testing changes in the test I discovered that there is something wrong >> with the logger - it printed only part of the records, so I have to >> look at the LockFreeLogger class and I don't understand how it was >> supposed to work. >> About ordering in cumulative log: each record has Integer which used >> to sort log entries from all threads (i.e. 
records from different >> threads are printed at the order which log() was called). >> Looking at allRecords/records stuff I don't understand how it should >> be used. To get logs from different threads in one logger, we needs >> one instance. So we create LockFreeLogger (in main thread) and ctor >> creates ThreadLocal record and register it in allRecords. Logging from >> main thread works fine, but if any other thread tries to log, 1st >> log() call creates its own ThreadLocal records (by records.get()) and >> log records from this thread go there. But this ThreadLocal records is >> not registered in allRecords, so this logging won't be included in >> final log. >> Looks like we need to change log() to something like >> >> Map recs = records.get(); >> if (recs.isEmpty()) { >> allRecords.add(recs); >> } >> recs.put(id, String.format(format, params)); > > Yep good catch - this logger was completely broken. > >> But all this stuff do exactly the same as simple ConcurrentLinkedQueue >> (i.e. lock free ordered list). >> At least I don't see other rationale in the stuff. > > I'm not certain of intent with the original but I'd always want to see > log entries in chronological order - which is what we now clearly have.
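[Editorial illustration] The single-queue design discussed above can be sketched roughly as follows. This is a minimal sketch and not the LockFreeLogger from the webrev: the class name QueueLoggerSketch is hypothetical, and it only assumes what the thread states, namely that one ConcurrentLinkedQueue shared by all threads replaces the per-thread maps.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch, not the LockFreeLogger class from the webrev.
final class QueueLoggerSketch {
    // One lock-free queue shared by every thread; entries keep the order
    // in which they were enqueued.
    private final ConcurrentLinkedQueue<String> records = new ConcurrentLinkedQueue<>();

    // Safe to call from any thread. There is no per-thread state that
    // would have to be registered first, which is where the old design broke.
    void log(String format, Object... params) {
        records.add(String.format(format, params));
    }

    // Concatenates everything logged so far, in insertion order.
    @Override
    public String toString() {
        return String.join("", records);
    }
}
```

With this shape a record logged by a worker thread cannot be lost, because there is no thread-local container that might go unregistered.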
> > Thanks, > David > >> --alex >> >>> >>> Thanks, >>> David >>> >>>> --alex From rkennke at redhat.com Mon Mar 9 12:39:03 2020 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 9 Mar 2020 13:39:03 +0100 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> Message-ID: <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> Hello all, Can I please get reviews of this change? In the meantime, we've done more testing and also field-/torture-testing by a customer who is happy now. :-) Thanks, Roman > Hi Serguei, > > Thanks for reviewing! > > I updated the patch to reflect your suggestions, very good! > It also includes a fix to allow re-connecting an agent after disconnect, > namely move setup of the trackingEnv and deletedSignatureBag to > _activate() to ensure have those structures after re-connect. > > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ > > Let me know what you think! > Roman > >> Hi Roman, >> >> Thank you for taking care about this scalability issue! >> >> I have a couple of quick comments. >> >> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >> >> 72 /* >> 73 * Lock to protect deletedSignatureBag >> 74 */ >> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >> 78 * A bag containing all the deleted classes' signatures. Must be >> accessed under >> 79 * deletedTagLock, >> 80 */ >> 81 struct bag* deletedSignatureBag; >> >> ? The comments contradict to each other. >> ? 
I guess, the lock name at line 79 has to be deletedSignatureLock >> instead of deletedTagLock. >> ? Also, comma at the end must be replaced with dot. >> >> >> 101 // Tag not found? Ignore. >> 102 if (klass == NULL) { >> 103 debugMonitorExit(deletedSignatureLock); >> 104 return; >> 105 } >> 106 >> 107 // Scan linked-list. >> 108 jlong found_tag = klass->klass_tag; >> 109 while (klass != NULL && found_tag != tag) { >> 110 klass_ptr = &klass->next; >> 111 klass = *klass_ptr; >> 112 found_tag = klass->klass_tag; >> 113 } >> 114 >> 115 // Tag not found? Ignore. >> 116 if (found_tag != tag) { >> 117 debugMonitorExit(deletedSignatureLock); >> 118 return; >> 119 } >> >> >> ?The code above can be simplified, so that the lines 101-105 are not >> needed anymore. >> ?It can be something like this: >> >> // Scan linked-list. >> while (klass != NULL && klass->klass_tag != tag) { >> klass_ptr = &klass->next; >> klass = *klass_ptr; >> } >> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore. >> debugMonitorExit(deletedSignatureLock); >> return; >> } >> >> It will take more time when I get a chance to look at the rest. >> >> >> Thanks, >> Serguei >> >> >> >> >> On 12/21/19 13:24, Roman Kennke wrote: >>> Here comes an update that resolves some races that happen when >>> disconnecting an agent. In particular, we need to take the lock on >>> basically every operation, and also need to check whether or not >>> class-tracking is active and return an appropriate result (e.g. an empty >>> list) when we're not. >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>> >>> Thanks, >>> Roman >>> >>> >>>> So, here comes the O(1) implementation: >>>> >>>> - Whenever a class is 'prepared', it is registered with a tag, and we >>>> set-up a listener to get notified when it is unloaded. >>>> - Prepared classes are kept in a datastructure that is a table, which >>>> each entry being the head of a linked-list of KlassNode*. 
The table is >>>> indexed by tag % slot-count, and then simply prepend the new KlassNode*. >>>> This is O(1) operation. >>>> - When we get notified of unloading a class, we look up the signature of >>>> the reported tag in that table, and remember it in a bag. The KlassNode* >>>> is then unlinked from the table and deallocated. This is ~O(1) operation >>>> too, depending on the depth of the table. In my testcase which hammered >>>> the code with class-loads and unloads, I usually see depths of like 2-3, >>>> but not usually more. It should be ok. >>>> - when processUnloads() gets called, we simply hand out that bag, and >>>> allocate a new one. >>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the >>>> signatures and KlassNode* etc when debug agent gets detached and/or >>>> re-attached (was missing before). >>>> - I also added locks around data-structure-manipulation (was missing >>>> before). >>>> - Also, I only activate this whole process when an actual listener gets >>>> registered on EI_GC_FINISH. This seems to happen right when attaching a >>>> jdb, not sure why jdb does that though. This may be something to improve >>>> in the future? >>>> >>>> In my tests, the performance of class-tracking itself looks really good. >>>> The bottleneck now is clearly actual synthesizing the class-unload >>>> events. I don't see how this can be helped when the debug agent asks for it? >>>> >>>> Updated webrev: >>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>> >>>> Please let me know what you think of it. >>>> >>>> Thanks, >>>> Roman >>>> >>>> >>>>> Alright, the perfectionist in me got me. I am implementing the even more >>>>> efficient ~O(1) class tracking. Please hold off reviewing for now. >>>>> >>>>> Thanks,Roman >>>>> >>>>> Hi Chris, >>>>>>> I'll have a look at this, although it might not be for a few days. 
In >>>>>>> the meantime, maybe you can describe your new implementation in >>>>>>> classTrack.c so it's easier to look through the changes. >>>>>> Sure. >>>>>> >>>>>> The purpose of this class-tracking is to be able to determine the >>>>>> signatures of unloaded classes when GC/class-unloading happened, so that >>>>>> we can generate the appropriate JDWP event. >>>>>> >>>>>> The current implementation does so by maintaining a table of currently >>>>>> prepared classes by building that table when classTrack is initialized, >>>>>> and then add new classes whenever a class gets loaded. When unloading >>>>>> occurs, that cache is rebuilt into a new table, and compared with the >>>>>> old table, and whatever is in the old, but not in the new table gets >>>>>> returned. The problem is that when GCs happen frequently and/or many >>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount) >>>>>> complexity. >>>>>> >>>>>> The new implementation keeps a linked-list of prepared classes, and also >>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an >>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes >>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the >>>>>> prepared-classes-list) and its signature put in the list that gets returned. >>>>>> >>>>>> The implementation is not perfect. In order to determine whether or not >>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is >>>>>> therefore still O(unloadedClassCount). The assumption here is that >>>>>> unloadedClassCount << classCount. In my experiments this seems to be >>>>>> true, and also reasonable to expect. 
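[Editorial illustration] The tag-indexed table that the later webrevs in this thread describe (prepend on class prepare, unlink on unload, hand out the bag in processUnloads()) might look roughly like this. It is only a sketch under stated assumptions: the real code is C in classTrack.c, and the class ClassTrackSketch, its method names, and the use of synchronized in place of a raw monitor are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java rendering of the idea; the actual patch is C code.
final class ClassTrackSketch {
    private static final class KlassNode {
        final long tag;
        final String signature;
        KlassNode next;

        KlassNode(long tag, String signature, KlassNode next) {
            this.tag = tag;
            this.signature = signature;
            this.next = next;
        }
    }

    private final KlassNode[] table;
    private List<String> deletedSignatures = new ArrayList<>();

    ClassTrackSketch(int slotCount) {
        table = new KlassNode[slotCount];
    }

    private int slotFor(long tag) {
        return (int) Math.floorMod(tag, (long) table.length);
    }

    // Class prepared: prepend a node to the chain selected by tag % slot count.
    synchronized void addPreparedClass(long tag, String signature) {
        int slot = slotFor(tag);
        table[slot] = new KlassNode(tag, signature, table[slot]);
    }

    // Class unloaded: unlink its node and remember the signature.
    // The cost is the length of one chain, not the number of loaded classes.
    synchronized void classUnloaded(long tag) {
        int slot = slotFor(tag);
        KlassNode prev = null;
        KlassNode node = table[slot];
        while (node != null && node.tag != tag) {
            prev = node;
            node = node.next;
        }
        if (node == null) {
            return; // tag not found, ignore
        }
        if (prev == null) {
            table[slot] = node.next;
        } else {
            prev.next = node.next;
        }
        deletedSignatures.add(node.signature);
    }

    // Hand out the collected signatures and start a fresh bag.
    synchronized List<String> processUnloads() {
        List<String> result = deletedSignatures;
        deletedSignatures = new ArrayList<>();
        return result;
    }
}
```

The loop checks node != null before reading node.tag, which is the same simplification Serguei suggests for the linked-list scan in his review comments.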
>>>>>> >>>>>> (I have some ideas how to improve the implementation to ~O(1) but it >>>>>> would be considerably more complex: have to maintain a (hash)table that >>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the >>>>>> unloaded-signatures list there, but I don't currently see that it's >>>>>> worth the effort). >>>>>> >>>>>> In addition to all that, this process is only activated when there's an >>>>>> actual listener registered for EI_GC_FINISH. >>>>>> >>>>>> Thanks, >>>>>> Roman >>>>>> >>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>> Hello all, >>>>>>>> >>>>>>>> Issue: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>> >>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids >>>>>>>> throwing away the class cache on GC, and instead keeps track of >>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>> >>>>>>>> In addition to that, it avoids this whole dance until an agent >>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>> >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>> >>>>>>>> Testing: manual testing of provided test scenarios and timing. >>>>>>>> >>>>>>>> Eg with the testcase provided here: >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>> >>>>>>>> I am getting those numbers: >>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>> patched: no debug: 85s with debug: 95s >>>>>>>> >>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>> >>>>>>>> Can I please get a review? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Roman >>>>>>>> >> > -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From alexey.menkov at oracle.com Mon Mar 9 18:34:31 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 9 Mar 2020 11:34:31 -0700 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: <2cf61bd9-da4d-2477-6604-d104118c4e03@oracle.com> References: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> <2cf61bd9-da4d-2477-6604-d104118c4e03@oracle.com> Message-ID: <8b3eba03-b290-677a-70b0-99595ec19eb8@oracle.com> On 03/06/2020 23:28, serguei.spitsyn at oracle.com wrote: > Hi David and Alex, > > My understanding is that previous implementation collected logs > separately for each thread in TLS, and at the end, merged and sorted out > the output by log id. > So, the result is that all messages are serialized at the end. > Alex changed the implementation but the result is the same - all log > messages are serialized. > > There are two tests which use the LockFreeLogger. > Another one is: test/jdk/java/lang/Thread/ThreadStateController.java . > Does the ThreadStateController.java work okay after the fix? ThreadStateController is an utility class used only by ThreadMXBeanStateTest.java. ThreadMXBeanStateTest.java is problem-listed, but I verified that logging works in the test. --alex > > Thanks, > Serguei > > > On 3/5/20 10:54, Alex Menkov wrote: >> Hi David, >> >> Thanks you for the review. 
>> >> On 03/04/2020 17:50, David Holmes wrote: >>> Hi Alex, >>> >>> On 5/03/2020 10:30 am, Alex Menkov wrote: >>>> Hi all, >>>> >>>> please review the fix for >>>> https://bugs.openjdk.java.net/browse/JDK-8240340 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ >>>> >>>> >>>> changes: >>>> - assertThreadState method: don't re-read thread state throwing >>>> exception (as we got weird error like "Thread WaitingThread is at >>>> WAITING state but is expected to be in Thread.State = WAITING"); >>>> - added proper test shutdown on error (made all threads "daemon", >>>> interrupt waiting thread if CheckerThread throws exception); >>>> - if CheckerThread detects error, propagate the exception to main >>>> thread; >>> >>> The test changes seem fine. >>> >>>> - fixed LockFreeLogger class - it should work for logging from >>>> several threads, but it doesn't. I prefer to simplify it just to >>>> keep ConcurrentLinkedQueue. >>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only >>>> by a single thread. >>> >>> I don't understand your changes here as you've completely changed the >>> intended design of the logger. The original accumulates log entries >>> per-thread and then spits them all out (though I'm not clear on the >>> exact ordering - I don't how to read that stream stuff). The new code >>> just creates a single queue of log records interleaving entries from >>> different threads. The simple logger may be all that is needed but it >>> seems quite different to the intent of the original. >> >> Testing changes in the test I discovered that there is something wrong >> with the logger - it printed only part of the records, so I have to >> look at the LockFreeLogger class and I don't understand how it was >> supposed to work. >> About ordering in cumulative log: each record has Integer which used >> to sort log entries from all threads (i.e. 
records from different >> threads are printed at the order which log() was called). >> Looking at allRecords/records stuff I don't understand how it should >> be used. To get logs from different threads in one logger, we needs >> one instance. So we create LockFreeLogger (in main thread) and ctor >> creates ThreadLocal record and register it in allRecords. Logging from >> main thread works fine, but if any other thread tries to log, 1st >> log() call creates its own ThreadLocal records (by records.get()) and >> log records from this thread go there. But this ThreadLocal records is >> not registered in allRecords, so this logging won't be included in >> final log. >> Looks like we need to change log() to something like >> >> Map recs = records.get(); >> if (recs.isEmpty()) { >> allRecords.add(recs); >> } >> recs.put(id, String.format(format, params)); >> >> But all this stuff do exactly the same as simple ConcurrentLinkedQueue >> (i.e. lock free ordered list). >> At least I don't see other rationale in the stuff. >> >> --alex >> >>> >>> Thanks, >>> David >>> >>>> --alex > From alexey.menkov at oracle.com Mon Mar 9 19:15:11 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 9 Mar 2020 12:15:11 -0700 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com> References: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com> <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com> Message-ID: Updated webrev: http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev.02/ Changes are in LockFreeLogger comments only. --alex On 03/08/2020 21:19, David Holmes wrote: > P.S. > > Forgot to note however that you need to update the documentation for the > logger now as the mention of "per-thread logs" makes no sense now.
Also > in the spirit of not using @author, and because this is no longer the > code created by Jaroslav, please delete the @author line. > > Thanks, > David > > On 9/03/2020 2:15 pm, David Holmes wrote: >> Hi Alex, >> >> On 6/03/2020 4:54 am, Alex Menkov wrote: >>> Hi David, >>> >>> Thanks you for the review. >>> >>> On 03/04/2020 17:50, David Holmes wrote: >>>> Hi Alex, >>>> >>>> On 5/03/2020 10:30 am, Alex Menkov wrote: >>>>> Hi all, >>>>> >>>>> please review the fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8240340 >>>>> webrev: >>>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ >>>>> >>>>> >>>>> changes: >>>>> - assertThreadState method: don't re-read thread state throwing >>>>> exception (as we got weird error like "Thread WaitingThread is at >>>>> WAITING state but is expected to be in Thread.State = WAITING"); >>>>> - added proper test shutdown on error (made all threads "daemon", >>>>> interrupt waiting thread if CheckerThread throws exception); >>>>> - if CheckerThread detects error, propagate the exception to main >>>>> thread; >>>> >>>> The test changes seem fine. >>>> >>>>> - fixed LockFreeLogger class - it should work for logging from >>>>> several threads, but it doesn't. I prefer to simplify it just to >>>>> keep ConcurrentLinkedQueue. >>>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only >>>>> by a single thread. >>>> >>>> I don't understand your changes here as you've completely changed >>>> the intended design of the logger. The original accumulates log >>>> entries per-thread and then spits them all out (though I'm not clear >>>> on the exact ordering - I don't how to read that stream stuff). The >>>> new code just creates a single queue of log records interleaving >>>> entries from different threads. The simple logger may be all that is >>>> needed but it seems quite different to the intent of the original. 
>>> >>> Testing changes in the test I discovered that there is something >>> wrong with the logger - it printed only part of the records, so I >>> have to look at the LockFreeLogger class and I don't understand how >>> it was supposed to work. >>> About ordering in cumulative log: each record has Integer which used >>> to sort log entries from all threads (i.e. records from different >>> threads are printed at the order which log() was called). >>> Looking at allRecords/records stuff I don't understand how it should >>> be used. To get logs from different threads in one logger, we needs >>> one instance. So we create LockFreeLogger (in main thread) and ctor >>> creates ThreadLocal record and register it in allRecords. Logging >>> from main thread works fine, but if any other thread tries to log, >>> 1st log() call creates its own ThreadLocal records (by records.get()) >>> and log records from this thread go there. But this ThreadLocal >>> records is not registered in allRecords, so this logging won't be >>> included in final log. >>> Looks like we need to change log() to something like >>> >>> Map recs = records.get(); >>> if (recs.isEmpty()) { >>> allRecords.add(recs); >>> } >>> recs.put(id, String.format(format, params)); >> >> Yep good catch - this logger was completely broken. >> >>> But all this stuff do exactly the same as simple >>> ConcurrentLinkedQueue (i.e. lock free ordered list). >>> At least I don't see other rationale in the stuff. >> >> I'm not certain of intent with the original but I'd always want to see >> log entries in chronological order - which is what we now clearly have.
>> >> Thanks, >> David >> >>> --alex >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> --alex From david.holmes at oracle.com Mon Mar 9 23:09:21 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 10 Mar 2020 09:09:21 +1000 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: References: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com> <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com> Message-ID: Looks good. Thanks, David On 10/03/2020 5:15 am, Alex Menkov wrote: > > Updated webrev: > http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev.02/ > > > Changes are in LockFreeLogger comments only. > > --alex > > On 03/08/2020 21:19, David Holmes wrote: >> P.S. >> >> Forgot to note however that you need to update the documentation for >> the logger now as the mention of "per-thread logs" makes no sense now. >> Also in the spirit of not using @author, and because this is no longer >> the code created by Jaroslav, please delete the @author line. >> >> Thanks, >> David >> >> On 9/03/2020 2:15 pm, David Holmes wrote: >>> Hi Alex, >>> >>> On 6/03/2020 4:54 am, Alex Menkov wrote: >>>> Hi David, >>>> >>>> Thanks you for the review. 
>>>> >>>> On 03/04/2020 17:50, David Holmes wrote: >>>>> Hi Alex, >>>>> >>>>> On 5/03/2020 10:30 am, Alex Menkov wrote: >>>>>> Hi all, >>>>>> >>>>>> please review the fix for >>>>>> https://bugs.openjdk.java.net/browse/JDK-8240340 >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ >>>>>> >>>>>> >>>>>> changes: >>>>>> - assertThreadState method: don't re-read thread state throwing >>>>>> exception (as we got weird error like "Thread WaitingThread is at >>>>>> WAITING state but is expected to be in Thread.State = WAITING"); >>>>>> - added proper test shutdown on error (made all threads "daemon", >>>>>> interrupt waiting thread if CheckerThread throws exception); >>>>>> - if CheckerThread detects error, propagate the exception to main >>>>>> thread; >>>>> >>>>> The test changes seem fine. >>>>> >>>>>> - fixed LockFreeLogger class - it should work for logging from >>>>>> several threads, but it doesn't. I prefer to simplify it just to >>>>>> keep ConcurrentLinkedQueue. >>>>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but >>>>>> only by a single thread. >>>>> >>>>> I don't understand your changes here as you've completely changed >>>>> the intended design of the logger. The original accumulates log >>>>> entries per-thread and then spits them all out (though I'm not >>>>> clear on the exact ordering - I don't how to read that stream >>>>> stuff). The new code just creates a single queue of log records >>>>> interleaving entries from different threads. The simple logger may >>>>> be all that is needed but it seems quite different to the intent of >>>>> the original. >>>> >>>> Testing changes in the test I discovered that there is something >>>> wrong with the logger - it printed only part of the records, so I >>>> have to look at the LockFreeLogger class and I don't understand how >>>> it was supposed to work. 
>>>> About ordering in cumulative log: each record has Integer which used >>>> to sort log entries from all threads (i.e. records from different >>>> threads are printed at the order which log() was called). >>>> Looking at allRecords/records stuff I don't understand how it should >>>> be used. To get logs from different threads in one logger, we needs >>>> one instance. So we create LockFreeLogger (in main thread) and ctor >>>> creates ThreadLocal record and register it in allRecords. Logging >>>> from main thread works fine, but if any other thread tries to log, >>>> 1st log() call creates its own ThreadLocal records (by >>>> records.get()) and log records from this thread go there. But this >>>> ThreadLocal records is not registered in allRecords, so this logging >>>> won't be included in final log. >>>> Looks like we need to change log() to something like >>>> >>>> Map recs = records.get(); >>>> if (recs.isEmpty()) { >>>> allRecords.add(recs); >>>> } >>>> recs.put(id, String.format(format, params)); >>> >>> Yep good catch - this logger was completely broken. >>> >>>> But all this stuff do exactly the same as simple >>>> ConcurrentLinkedQueue (i.e. lock free ordered list). >>>> At least I don't see other rationale in the stuff. >>> >>> I'm not certain of intent with the original but I'd always want to >>> see log entries in chronological order - which is what we now clearly >>> have.
>>> >>> Thanks, >>> David >>> >>>> --alex >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> --alex From chris.plummer at oracle.com Tue Mar 10 02:29:53 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 9 Mar 2020 19:29:53 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available Message-ID: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> Hi, Please help review the following: https://bugs.openjdk.java.net/browse/JDK-8238268 http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/ I'll try to give enough background first to make it easier to understand the changes. On OSX you must run SA tests that attach to a live process as root or using sudo. For example: ? sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java Whether running as root or under sudo, the check to allow the test to run is done with: ??? private static boolean canAttachOSX() { ????????? return userName.equals("root"); ??? } Any test using "@requires vm.hasSAandCanAttach" must pass this check via Platform.shouldSAAttach(), which for OSX returns: ???????????? return canAttachOSX() && !isSignedOSX(); So if running as root the "@requires vm.hasSAandCanAttach" passes, otherwise it does not. However, using a root login to run tests is not a very desirable, nor is issuing a "sudo make run-test" (any created file ends up with root ownership). Because of this support was previously added for just running the attaching process using sudo, not the entire test. This was only done for the 20 or so tests that use ClhsdbLauncher. These tests use "@requires vm.hasSA", and then while running the test will do a "sudo" check if canAttachOSX() returns false: ??????? if (!Platform.shouldSAAttach()) { ??????????? if (Platform.isOSX()) { ??????????????? if (Platform.isSignedOSX()) { ??????????????????? throw new SkippedException("SA attach not expected to work. JDK is signed."); ??????????????? 
} else if (SATestUtils.canAddPrivileges()) { ??????????????????? needPrivileges = true; ??????????????? } ??????????? } ??????????? if (!needPrivileges)? { ?????????????? // Skip the test if we don't have enough permissions to attach ?????????????? // and cannot add privileges. ?????????????? throw new SkippedException( ?????????????????? "SA attach not expected to work. Insufficient privileges."); ?????????? } ??????? } So basically it does a runtime check of vm.hasSAandCanAttach, and if it fails then checks if running with sudo will work. This allows for either a passwordless sudo to be used when running clhsdb, or for the user to be prompted for the sudo password (note I've remove support for the latter with my changes). That brings us to the CR that is being fixed. ClhsdbLauncher tests support sudo and will therefore run with our CI testing on OSX, but the 25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and therefore are never run with our CI OSX testing. The changes in this webrev fix that. There are two possible approaches to the fix. One is having the check for sudo be done as part of the vm.hasSAandCanAttach evaluation. The other approach is to do the check in the test at runtime similar to how ClhsdbLauncher currently does. This would mean just using "@requires vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach". I chose the later because there is an advantage to throwing SkippedException rather than just silently skipping the test using @requires. The advantage is that mdash tells you how many tests were skipped, and when you hover over the reason you can see the SkippedException message, which will differentiate between reasons like the JDK was signed or there are insufficient privileges. If all the checking was done by the vm.hasSAandCanAttach evaluation, you would not know why the test wasn't run. The "support" related changes made are all in the following 3 files. 
The rest of the changes are in the tests: test/jtreg-ext/requires/VMProps.java test/lib/jdk/test/lib/Platform.java test/lib/jdk/test/lib/SA/SATestUtils.java You'll noticed that one change I made to the sudo support in SATestUtils.canAddPrivileges() is to make sudo non-interactive, which means no password prompt. So that means either the user does not require a password, or the credentials have been cached. Otherwise the sudo check will fail. On most platforms if you execute a sudo command, the credentials are cached for 5 minutes. So if your user is not setup for passwordless sudo, then a sudo command can be issued before running the tests, and will likely remain cached until the test is run. The reason for using passwordless is because prompting in the middle of running tests can be confusing (you usually walk way once launching the tests and miss the prompt anyway), and avoids unnecessary delays in automated testing due to waiting for the password prompt to timeout (it used to wait 1 minute). There are essentially 3 types of tests that SA Attach to a process, each needing a slightly different fix: 1. Tests that directly launch a jdk.hotspot.agent class, such as TestClassDump.java. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if needed.They also need to switch from using hasSAandCanAttach to using hasSA. 2. Tests that launch command line tools such has jhsdb. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.createProcessBuilder() to create a process that will be launched using sudo if necessary.They also need to switch from using hasSAandCanAttach to using hasSA. 3. Tests that use ClhsdbLauncher. They already use hasSA instead of hasSAandCanAttach, and rely on ClhsdbLauncher to do check at runtime if attaching will work, so for the most part all the these tests are unchanged. 
   ClhsdbLauncher was modified to take advantage of the new
   SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk()
   APIs.

Some tests required special handling:

test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java
test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java
 - These two tests SA Attach to a core file, not to a process, so only
   need hasSA, not hasSAandCanAttach. No other changes were needed.

test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java
 - The output should never be null. If the test was skipped due to lack
   of privileges, you would never get to this section of the test.

test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java
test/hotspot/jtreg/serviceability/sa/TestIntConstant.java
test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java
test/hotspot/jtreg/serviceability/sa/TestType.java
test/hotspot/jtreg/serviceability/sa/TestUniverse.java
 - These are ClhsdbLauncher tests, so they should have been using hasSA
   instead of hasSAandCanAttach in the first place. No other changes
   were needed.

test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java
test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java
test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java
 - These tests used to "@require mac" but seem to run fine on OSX, so I
   removed this requirement.

test/jdk/sun/tools/jhsdb/BasicLauncherTest.java
 - This test had a runtime check to not run on OSX due to not having core
   file stack walking support. However, this test always attaches to a
   process, not a core file, and seems to run just fine on OSX.

test/jdk/sun/tools/jstack/DeadlockDetectionTest.java
 - I changed the test to throw a SkippedException if it gets the
   unexpected error code rather than just println.

And a few other miscellaneous changes not already covered:

test/lib/jdk/test/lib/Platform.java
 - Made canPtraceAttachLinux() public so it can be called from
   SATestUtils.
 - vm.hasSAandCanAttach is now gone.

thanks,

Chris

From chris.plummer at oracle.com  Tue Mar 10 04:35:10 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 9 Mar 2020 21:35:10 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
Message-ID: <980425b9-cf93-3a5a-b10d-459c0d0d692a@oracle.com>

I'll have a look at it, although it might not be for a couple of days.

Chris

On 3/9/20 5:39 AM, Roman Kennke wrote:
> Hello all,
>
> Can I please get reviews of this change? In the meantime, we've done
> more testing and also field-/torture-testing by a customer who is happy
> now. :-)
>
> Thanks,
> Roman
>
>
>> Hi Serguei,
>>
>> Thanks for reviewing!
>>
>> I updated the patch to reflect your suggestions, very good!
>> It also includes a fix to allow re-connecting an agent after disconnect,
>> namely move setup of the trackingEnv and deletedSignatureBag to
>> _activate() to ensure we have those structures after re-connect.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>
>> Let me know what you think!
>> Roman
>>
>>> Hi Roman,
>>>
>>> Thank you for taking care of this scalability issue!
>>>
>>> I have a couple of quick comments.
>>>
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>
>>>   72 /*
>>>   73  * Lock to protect deletedSignatureBag
>>>   74  */
>>>   75 static jrawMonitorID deletedSignatureLock;
>>>   76
>>>   77 /*
>>>   78  * A bag containing all the deleted classes' signatures. Must be accessed under
>>>   79  * deletedTagLock,
>>>   80  */
>>>   81 struct bag* deletedSignatureBag;
>>>
>>>   The comments contradict each other.
>>>   I guess the lock name at line 79 has to be deletedSignatureLock
>>>   instead of deletedTagLock.
>>>   Also, the comma at the end must be replaced with a dot.
>>>
>>>
>>>  101     // Tag not found? Ignore.
>>>  102     if (klass == NULL) {
>>>  103       debugMonitorExit(deletedSignatureLock);
>>>  104       return;
>>>  105     }
>>>  106
>>>  107     // Scan linked-list.
>>>  108     jlong found_tag = klass->klass_tag;
>>>  109     while (klass != NULL && found_tag != tag) {
>>>  110       klass_ptr = &klass->next;
>>>  111       klass = *klass_ptr;
>>>  112       found_tag = klass->klass_tag;
>>>  113     }
>>>  114
>>>  115     // Tag not found? Ignore.
>>>  116     if (found_tag != tag) {
>>>  117       debugMonitorExit(deletedSignatureLock);
>>>  118       return;
>>>  119     }
>>>
>>>
>>>  The code above can be simplified, so that the lines 101-105 are not
>>>  needed anymore.
>>>  It can be something like this:
>>>
>>>     // Scan linked-list.
>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>       klass_ptr = &klass->next;
>>>       klass = *klass_ptr;
>>>     }
>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>       debugMonitorExit(deletedSignatureLock);
>>>       return;
>>>     }
>>>
>>> It will take more time before I get a chance to look at the rest.
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>
>>>
>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>> Here comes an update that resolves some races that happen when
>>>> disconnecting an agent. In particular, we need to take the lock on
>>>> basically every operation, and also need to check whether or not
>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>> list) when we're not.
>>>>
>>>> Updated webrev:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>
>>>>> So, here comes the O(1) implementation:
>>>>>
>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>> set up a listener to get notified when it is unloaded.
>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>>> indexed by tag % slot-count, and we then simply prepend the new KlassNode*.
>>>>> This is an O(1) operation.
>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>> is then unlinked from the table and deallocated. This is a ~O(1) operation
>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>> but not usually more. It should be ok.
>>>>> - When processUnloads() gets called, we simply hand out that bag, and
>>>>> allocate a new one.
>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>> signatures and KlassNode* etc when the debug agent gets detached and/or
>>>>> re-attached (was missing before).
>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>> before).
>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>> in the future?
>>>>>
>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>> The bottleneck now is clearly the actual synthesizing of the class-unload
>>>>> events. I don't see how this can be helped when the debug agent asks for it?
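
[The table described in the list above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the code in the
webrev: the type Tag, SLOT_COUNT, and the function names track_prepare()
and track_unload() are hypothetical, and the locking and the
deleted-signature bag that the message mentions are left out.]

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef long long Tag;          /* stand-in for jlong; hypothetical */

typedef struct KlassNode {
    Tag tag;                    /* tag the class was registered with */
    char *signature;            /* owned copy of the class signature */
    struct KlassNode *next;     /* next node in the same slot */
} KlassNode;

#define SLOT_COUNT 1024         /* hypothetical slot count */

static KlassNode *table[SLOT_COUNT];

static size_t slot_of(Tag tag) {
    return (size_t)((unsigned long long)tag % SLOT_COUNT);
}

/* Register a prepared class: prepend to its slot, O(1).
 * Returns 0 on success, -1 if allocation fails. */
int track_prepare(Tag tag, const char *signature) {
    assert(signature != NULL);
    KlassNode *node = malloc(sizeof(*node));
    if (node == NULL) {
        return -1;
    }
    size_t len = strlen(signature) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return -1;
    }
    memcpy(node->signature, signature, len);
    node->tag = tag;
    size_t slot = slot_of(tag);
    node->next = table[slot];
    table[slot] = node;
    return 0;
}

/* Handle an unload notification: unlink the node for `tag` and hand back
 * its signature (the caller frees it). Returns NULL if the tag is unknown.
 * The cost is bounded by the depth of one slot. */
char *track_unload(Tag tag) {
    KlassNode **link = &table[slot_of(tag)];
    while (*link != NULL && (*link)->tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return NULL;            /* tag not found: ignore */
    }
    KlassNode *node = *link;
    char *signature = node->signature;
    *link = node->next;
    free(node);
    return signature;
}
```

[Walking a pointer to the link rather than the node avoids a separate
"previous" pointer and means a NULL node is never dereferenced while
scanning.]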
>>>>>
>>>>> Updated webrev:
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>
>>>>> Please let me know what you think of it.
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>
>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>
>>>>>> Thanks, Roman
>>>>>>
>>>>>> Hi Chris,
>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>> Sure.
>>>>>>>
>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>
>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>>> and then add new classes whenever a class gets loaded. When unloading
>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>> complexity.
>>>>>>>
>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>>>>>
>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>> true, and also reasonable to expect.
>>>>>>>
>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>> worth the effort).
>>>>>>>
>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> Issue:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>
>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>
>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>
>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>
>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>
>>>>>>>>> I am getting those numbers:
>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>
>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>
>>>>>>>>> Can I please get a review?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>

From serguei.spitsyn at oracle.com  Tue Mar 10 09:54:02 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 10 Mar 2020 02:54:02 -0700
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To:
References: <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
Message-ID: <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>

Hi Chihiro,

Yes, I'll sponsor it.
Thank you for the update.

Thanks,
Serguei

On 3/8/20 06:05, Chihiro Ito wrote:
> Hi,
>
> I'm sorry. I included "JDK-" in the changeset title. I removed it and
> updated it.
>
> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>
> Regards,
> Chihiro
>
> On 2020/03/07 23:13, Chihiro Ito wrote:
>> Hi Serguei and Yasumasa,
>>
>> I updated the copyright year and created the change set.
>>
>> Could you sponsor this, please?
>>
>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>
>> Regards,
>> Chihiro
>>
>>
>> On 2020/03/07 16:03, Yasumasa Suenaga wrote:
>>
>>
>>> Hi Chihiro,
>>>
>>> I'm also ok with webrev.05 after updating copyright year.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote:
>>>> Hi Chihiro,
>>>>
>>>> I'm okay with the fix.
>>>> Could you, please, update the copyright date in
>>>> src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push?
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 3/6/20 07:24, Chihiro Ito wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> Could you review this again, please?
>>>>>
>>>>> Regards,
>>>>> Chihiro
>>>>>
>>>>>
>>>>> On 2020/02/27 22:11, Chihiro Ito wrote:
>>>>>> Hi Ralf,
>>>>>>
>>>>>> Thank you for your advice.
>>>>>>
>>>>>> 1.
>>>>>> The comment of serializePropertiesToByteArray in VMSupport is "The
>>>>>> stream written to the byte array is ISO 8859-1 encoded.".
>>>>>> But the previous implementation does not keep this. I think we need
>>>>>> to implement ISO 8859-1 encoding.
>>>>>>
>>>>>> 2.
>>>>>> According to help, the feature of VM.system_properties is just "Print
>>>>>> system properties". The users should not use this output for loading.
>>>>>> The users use it when they want to see System Properties quickly.
>>>>>>
>>>>>> Regards,
>>>>>> Chihiro
>>>>>>
>>>>>>
>>>>>> On 2020/02/26 18:53, Schmelter, Ralf wrote:
>>>>>>> Hi Chihiro,
>>>>>>>
>>>>>>> I have two remarks:
>>>>>>>
>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work with the
>>>>>>> code. While the Properties.store() method claims to create ISO Latin 1
>>>>>>> String, it really only will create printable ASCII characters (apart
>>>>>>> from the comment, but it is ASCII too in this case). See
>>>>>>> Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e
>>>>>>> and then printed as \uxxxx. This is important, since the bytes of the
>>>>>>> ByteArrayOutputStream are then sent to the jcmd. And jcmd expects
>>>>>>> UTF-8 encoded strings, which is OK if we only used ASCII characters.
>>>>>>> But an ISO Latin 1 character >= 0x80 will break the encoding. Just try
>>>>>>> using \u00DC in your test.
>>>>>>>
>>>>>>> 2. Your change makes it impossible to load the output with
>>>>>>> properties.load(). The old output could be loaded, since it was a
>>>>>>> valid properties file. But yours is not. For example, consider the
>>>>>>> filename c:\test\new. Formerly it would be encoded as:
>>>>>>> C\:\\test\\new
>>>>>>> And now it is:
>>>>>>> C:\test\new
>>>>>>> But the properties code would see "\n" as the newline character in
>>>>>>> your encoding. In fact you cannot differentiate between \n, \t, \f and
>>>>>>> \r originally being one or two characters.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Ralf
>>>>>>>
>>>>>>>
>>>>>>> From: serviceability-dev On Behalf Of Chihiro Ito
>>>>>>> Sent: Tuesday, 25 February 2020 04:45
>>>>>>> To: serguei.spitsyn at oracle.com
>>>>>>> Cc: serviceability-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
>>>>>>>
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> Thanks for your review and advice.
>>>>>>>
>>>>>>> I modified these.
>>>>>>> Could you review this again, please?
>>>>>>>
>>>>>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
>>>>>>>
>>>>>>> Regards,
>>>>>>> Chihiro
>>>>>>>

From kevin.walls at oracle.com  Tue Mar 10 09:58:57 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Tue, 10 Mar 2020 09:58:57 +0000
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

The changes build OK for me in the latest jdk, and things still work.
I have not yet seen the dwarf usage in action: I've tried a couple of
different systems and so far have not reproduced the problem, i.e.
jstack has not failed on native frames.

I may need more recent basic libraries. I will look again for a system
where the problem happens and get back to you, as I really want to run
the changes.

I have mostly minor other comments which don't need a new webrev, some
just comments for the future:

src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:

DW_CFA_nop - shouldn't this continue instead of return?
(It may "never" happen, but a nop could appear within some other
instructions?)

DW_CFA_remember_state: a minor typo in the comment,
"DW_CFA_remenber_state".

We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not
DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in
these tables never increase by 4-byte amounts; that would mean a lot of
code on one line. 8-)
So maybe it's never used in practice. If you think it's unnecessary, no
problem; maybe add a comment, or add it for robustness.

General-purpose methods like read_leb128(), get_entry_length(),
get_decoded_value() specifically update the _buf pointer in this
DwarfParser.

DwarfParser::process_dwarf() moves _buf.
It calls process_cie() which reads, moves _buf and restores it to the
original position, then we read augmentation_length from where _buf is.
I'm not sure if that's wrong, or if I just need to read again about the
CIE/etc layout.

I don't really want to suggest making the code pass around a current
_buf for the invocation of these general purpose methods, but just
wanted to comment that if these get used more widely that might become
necessary.

Similarly in future, if this DWARF support code became used more widely,
it might want to move to an OS-neutral directory? It's odd to label it
as Linux-specific.

src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
Thanks for changing "can_parsable" which was in the earlier version. 8-)

These are just comments to mainly say it looks good, and somebody else
out there has read it.
I will look for a system that shows the problem, and get back to you
again!

Many thanks
Kevin

On 27/02/2020 05:13, Yasumasa Suenaga wrote:
> Hi all,
>
> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and
> 8239462 changes (they updated copyright year).
> So I modified webrev (only copyright year changes) to be able to apply
> to current jdk/jdk.
> Could you review it?
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>
> I need one more reviewer to push.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>> PING: Could you review it?
>>
>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>
>> This change has been already reviewed by Serguei.
>> I need one more reviewer to push.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>> PING: Could you review this change?
>>>
>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>
>>> I believe this change helps troubleshooters with postmortem
>>> analysis.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>> PING: Could you review it?
>>>>
>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>
>>>> I updated webrev. I discussed with Serguei off list, and I
>>>> refactored webrev.02.
>>>> It has passed tests on submit repo
>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> Thanks for your comment!
>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as
>>>>> Dmitry said.
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>
>>>>> This change has passed all tests on submit repo
>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> This is a nice move in general.
>>>>>> Thank you for working on this!
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>
>>>>>>
>>>>>>  96 long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>  97 if (libptr == 0L) { // Java frame
>>>>>>  98   Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>  99   if (rbp == null) {
>>>>>> 100     return null;
>>>>>> 101   }
>>>>>> 102   return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>> 103 } else { // Native frame
>>>>>> 104   DwarfParser dwarf;
>>>>>> 105   try {
>>>>>> 106     dwarf = new DwarfParser(libptr);
>>>>>> 107   } catch (DebuggerException e) {
>>>>>> 108     Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>> 109     if (rbp == null) {
>>>>>> 110       return null;
>>>>>> 111     }
>>>>>> 112     return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>> 113   }
>>>>>> 114   dwarf.processDwarf(pc);
>>>>>> 115   Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>> 116                  !dwarf.isBPOffsetAvailable())
>>>>>> 117                     ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>> 118                     : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>> 119                              .addOffsetTo(dwarf.getCFAOffset());
>>>>>> 120   if (cfa == null) {
>>>>>> 121     return null;
>>>>>> 122   }
>>>>>> 123   return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>> 124 }
>>>>>>
>>>>>>
>>>>>> I'd suggest simplifying the logic by refactoring to something
>>>>>> like below:
>>>>>>
>>>>>>            long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>            Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>            DwarfParser dwarf = null;
>>>>>>
>>>>>>            if (libptr != 0L) { // Native frame
>>>>>>              try {
>>>>>>                dwarf = new DwarfParser(libptr);
>>>>>>                dwarf.processDwarf(pc);
>>>>>>                cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>                       !dwarf.isBPOffsetAvailable())
>>>>>>                          ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>                          : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>                                   .addOffsetTo(dwarf.getCFAOffset());
>>>>>>              } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>              }
>>>>>>            }
>>>>>>            if (cfa == null) {
>>>>>>              return null;
>>>>>>            }
>>>>>>            return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>
>>>>>>
>>>>>>  58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>
>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>
>>>>>>  77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>
>>>>>>    Extra space after '-' sign.
>>>>>>
>>>>>>  71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>
>>>>>>    It feels like the logic has to be somehow refactored/simplified, as
>>>>>>    several typical fragments appear in slightly different contexts.
>>>>>>    But it is not easy to understand what it is.
>>>>>>    Could you, please, add some comments to key places explaining
>>>>>>    this logic.
>>>>>>    Then I'll check if it is possible to make it a little bit
>>>>>>    simpler.
>>>>>>
>>>>>> 109 private CFrame javaSender(ThreadContext context) {
>>>>>> 110   Address nextCFA;
>>>>>> 111   Address nextPC;
>>>>>> 112
>>>>>> 113   nextPC = getNextPC(false);
>>>>>> 114   if (nextPC == null) {
>>>>>> 115     return null;
>>>>>> 116   }
>>>>>> 117
>>>>>> 118   DwarfParser nextDwarf = null;
>>>>>> 119   long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>> 120   if (libptr != 0L) { // Native frame
>>>>>> 121     try {
>>>>>> 122       nextDwarf = new DwarfParser(libptr);
>>>>>> 123     } catch (DebuggerException e) {
>>>>>> 124       nextCFA = getNextCFA(null, context);
>>>>>> 125       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>> 126     }
>>>>>> 127     nextDwarf.processDwarf(nextPC);
>>>>>> 128   }
>>>>>> 129
>>>>>> 130   nextCFA = getNextCFA(nextDwarf, context);
>>>>>> 131   return (nextCFA == null) ? null
>>>>>> 132                            : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>> 133 }
>>>>>>
>>>>>>   The above can be simplified if a DebuggerException can not be
>>>>>>   thrown from processDwarf(nextPC):
>>>>>>       private CFrame javaSender(ThreadContext context) {
>>>>>>         Address nextPC = getNextPC(false);
>>>>>>         if (nextPC == null) {
>>>>>>           return null;
>>>>>>         }
>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>         DwarfParser nextDwarf = null;
>>>>>>
>>>>>>         if (libptr != 0L) { // Native frame
>>>>>>           try {
>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>           }
>>>>>>         }
>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>       }
>>>>>>
>>>>>> 135 public CFrame sender(ThreadProxy thread) {
>>>>>> 136   ThreadContext context = thread.getContext();
>>>>>> 137
>>>>>> 138   if (dwarf == null) { // Java frame
>>>>>> 139     return javaSender(context);
>>>>>> 140   }
>>>>>> 141
>>>>>> 142   Address nextPC = getNextPC(true);
>>>>>> 143   if (nextPC == null) {
>>>>>> 144     return null;
>>>>>> 145   }
>>>>>> 146
>>>>>> 147   Address nextCFA;
>>>>>> 148   DwarfParser nextDwarf = dwarf;
>>>>>> 149   if (!dwarf.isIn(nextPC)) {
>>>>>> 150     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>> 151     if (libptr == 0L) {
>>>>>> 152       // Next frame might be Java frame
>>>>>> 153       nextCFA = getNextCFA(null, context);
>>>>>> 154       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>> 155     }
>>>>>> 156     try {
>>>>>> 157       nextDwarf = new DwarfParser(libptr);
>>>>>> 158     } catch (DebuggerException e) {
>>>>>> 159       nextCFA = getNextCFA(null, context);
>>>>>> 160       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>> 161     }
>>>>>> 162   }
>>>>>> 163
>>>>>> 164   nextDwarf.processDwarf(nextPC);
>>>>>> 165   nextCFA = getNextCFA(nextDwarf, context);
>>>>>> 166   return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>> 167 }
>>>>>>
>>>>>>   This one can be also simplified a little:
>>>>>>
>>>>>>       public CFrame sender(ThreadProxy thread) {
>>>>>>         ThreadContext context = thread.getContext();
>>>>>>
>>>>>>         if (dwarf == null) { // Java frame
>>>>>>           return javaSender(context);
>>>>>>         }
>>>>>>         Address nextPC = getNextPC(true);
>>>>>>         if (nextPC == null) {
>>>>>>           return null;
>>>>>>         }
>>>>>>         DwarfParser nextDwarf = null;
>>>>>>         if (!dwarf.isIn(nextPC)) {
>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>           if (libptr != 0L) {
>>>>>>             try {
>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>             }
>>>>>>           }
>>>>>>         }
>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>       }
>>>>>>
>>>>>> Finally, it looks like just one method could replace both
>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>
>>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>>         ThreadContext context = thread.getContext();
>>>>>>         Address nextPC = getNextPC(false);
>>>>>>         if (nextPC == null) {
>>>>>>           return null;
>>>>>>         }
>>>>>>         DwarfParser nextDwarf = null;
>>>>>>
>>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>           if (libptr != 0L) {
>>>>>>             try {
>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>             }
>>>>>>           }
>>>>>>         }
>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>       }
>>>>>>
>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I refactored LinuxAMD64CFrame.java. It works fine in
>>>>>>> serviceability/sa tests and
>>>>>>> all tests on submit repo
>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>> Could you review new webrev?
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>
>>>>>>> The diff from previous webrev is here:
>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Please review this change:
>>>>>>>>
>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>    webrev:
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>
>>>>>>>>
>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application
>>>>>>>> Binary Interface AMD64
>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in
>>>>>>>> .eh_frame or .debug_frame
>>>>>>>> for stack unwinding.
>>>>>>>>
>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default
>>>>>>>> since GCC 4.6, so system
>>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>>
>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base
>>>>>>>> pointer register (RBP).
>>>>>>>> So it might be lack of stack frames.
>>>>>>>>
>>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>

From suenaga at oss.nttdata.com  Tue Mar 10 12:36:38 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 10 Mar 2020 21:36:38 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
Message-ID: <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>

Hi Kevin,

Thanks for your comment!

On 2020/03/10 18:58, Kevin Walls wrote:
> Hi Yasumasa ,
>
> The changes build OK for me in the latest jdk, and things still work.
> I have not yet seen the dwarf usage in action: I've tried a couple of
> different systems and so far have not reproduced the problem, i.e.
> jstack has not failed on native frames.
>
> I may need more recent basic libraries, will look again for somewhere
> where the problem happens and get back to you as I really want to run
> the changes.

You can see the problem with JShell.
Some Java frames would not be seen in mixed jstack.

> I have mostly minor other comments which don't need a new webrev, some
> just comments for the future:
>
> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>
> DW_CFA_nop - shouldn't this continue instead of return?
> (It may "never" happen, but a nop could appear within some other instructions?)

DW_CFA_nop is used for padding, so we can ignore it (return immediately).

> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".

I will fix it.

> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, would this mean a lot of code on one line. 8-)
> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness.

I will add DW_CFA_advance_loc4.

> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>
> DwarfParser::process_dwarf() moves _buf.
> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>
> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.

I referred to the GDB and binutils sources when creating this patch.
They seem to process the data in a similar way, because the DWARF instructions have to be evaluated one by one to obtain the values that apply to the specified PC.

> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
> OS-neutral directory? It's odd to label it as Linux-specific.

Windows, at least, does not use DWARF; it uses a different mechanism.

https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019

I'm not sure whether other platforms (Solaris, macOS) use DWARF.
If they do, I can move the DWARF-related code to a posix directory.

> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp: > Thanks for changing "can_parsable" which was in the earlier version. 8-) > > > These are just comments to mainly say it looks good, and somebody else out there has read it. > I will look for a system that shows the problem, and get back to you again! Thanks, Yasumasa > Many thanks > Kevin > > On 27/02/2020 05:13, Yasumasa Suenaga wrote: >> Hi all, >> >> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year). >> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk. >> Could you review it? >> >> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/ >> >> I need one more reviewer to push. >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/02/17 13:07, Yasumasa Suenaga wrote: >>> PING: Could you review it? >>> >>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>> >>> This change has been already reviewed by Serguei. >>> I need one more reviewer to push. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/02/03 1:37, Yasumasa Suenaga wrote: >>>> PING: Could you reveiw this change? >>>> >>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>> >>>> I believe this change helps troubleshooter to fight to postmortem analysis. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote: >>>>> PING: Could you review it? >>>>> >>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>> >>>>> I updated webrev. I discussed with Serguei in off list, and I refactored webrev.02 . >>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549). 
>>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote: >>>>>> Hi Serguei, >>>>>> >>>>>> Thanks for your comment! >>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev. >>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said. >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/ >>>>>> >>>>>> This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487). >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> This is nice move in general. >>>>>>> Thank you for working on this! >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>>>> >>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? 
context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 } >>>>>>> >>>>>>> >>>>>>> I'd suggest to simplify the logic by refactoring to something like below: >>>>>>> >>>>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc); >>>>>>> ?????????? Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame >>>>>>> ?????????? DwarfParser dwarf = null; >>>>>>> >>>>>>> ?????????? if (libptr != 0L) { // Native frame >>>>>>> ???????????? try { >>>>>>> ?????????????? dwarf = new DwarfParser(libptr); >>>>>>> ?????????????? dwarf.processDwarf(pc); >>>>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && >>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>> ???????????????????????????????? ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>> ???????????????????????????????? : context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>> >>>>>>> ??????????? } catch (DebuggerException e) { // bail out to Java frame case >>>>>>> ??????????? } >>>>>>> ????????? } >>>>>>> ????????? if (cfa == null) { >>>>>>> ??????????? return null; >>>>>>> ????????? } >>>>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>> >>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA() >>>>>>> >>>>>>> ?? Better to rename 'ofs' => 'offs'. >>>>>>> >>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA()); >>>>>>> >>>>>>> ?? Extra space after '-' sign. >>>>>>> >>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) { >>>>>>> >>>>>>> ?? 
It feels like the logic has to be somehow refactored/simplified as >>>>>>> ?? several typical fragments appears in slightly different contexts. >>>>>>> ?? But it is not easy to understand what it is. >>>>>>> ?? Could you, please, add some comments to key places explaining this logic. >>>>>>> ?? Then I'll check if it is possible to make it a little bit simpler. >>>>>>> >>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 } >>>>>>> >>>>>>> ??The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC): >>>>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>> ??????? if (nextPC == null) { >>>>>>> ????????? return null; >>>>>>> ??????? } >>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>> >>>>>>> ??????? if (libptr != 0L) { // Native frame >>>>>>> ????????? try { >>>>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>>>> ??????????? nextDwarf.processDwarf(nextPC); >>>>>>> ????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>> ????????? } >>>>>>> ??????? } >>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>> ??????? return (nextCFA == null) ? 
null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>> ????? } >>>>>>> >>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? null : new >>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 167 } >>>>>>> >>>>>>> ??This one can be also simplified a little: >>>>>>> >>>>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>> >>>>>>> ??????? if (dwarf == null) { // Java frame >>>>>>> ????????? return javaSender(context); >>>>>>> ??????? } >>>>>>> ??????? Address nextPC = getNextPC(true); >>>>>>> ??????? if (nextPC == null) { >>>>>>> ????????? return null; >>>>>>> ??????? } >>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>> ??????? if (!dwarf.isIn(nextPC)) { >>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>> ????????? if (libptr != 0L) { >>>>>>> ??????????? try { >>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>> ??????????? 
} >>>>>>> ????????? } >>>>>>> ??????? } >>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>> ????? } >>>>>>> >>>>>>> Finally, it looks like just one method could replace both >>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context): >>>>>>> >>>>>>> ????? private CFrame commonSender(ThreadProxy thread) { >>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>> ??????? if (nextPC == null) { >>>>>>> ????????? return null; >>>>>>> ??????? } >>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>> >>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) { >>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>> ????????? if (libptr != 0L) { >>>>>>> ??????????? try { >>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>> ??????????? } >>>>>>> ????????? } >>>>>>> ??????? } >>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>> ????? } >>>>>>> >>>>>>> I'm still reviewing the dwarf parser files. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and >>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). >>>>>>>> Could you review new webrev? 
>>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ >>>>>>>> >>>>>>>> The diff from previous webrev is here: >>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Please review this change: >>>>>>>>> >>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/ >>>>>>>>> >>>>>>>>> >>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64 >>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame >>>>>>>>> for stack unwinding. >>>>>>>>> >>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system >>>>>>>>> library (e.g. libc) might be compiled with this feature. >>>>>>>>> >>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP). >>>>>>>>> So it might be lack of stack frames. >>>>>>>>> >>>>>>>>> I guess JDK-8219201 is caused by same issue. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf >>>>>>> From serguei.spitsyn at oracle.com Tue Mar 10 22:55:53 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 10 Mar 2020 15:55:53 -0700 Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is buggy In-Reply-To: References: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com> <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com> <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com> <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com> Message-ID: <7c4b037c-384d-2137-f42d-9f31390c9f15@oracle.com> Hi Alex, The update looks good. 
Thanks,
Serguei

On 3/9/20 12:15, Alex Menkov wrote:
>
> Updated webrev:
> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev.02/
>
>
> Changes are in LockFreeLogger comments only.
>
> --alex
>
> On 03/08/2020 21:19, David Holmes wrote:
>> P.S.
>>
>> Forgot to note however that you need to update the documentation for
>> the logger now as the mention of "per-thread logs" makes no sense
>> now. Also in the spirit of not using @author, and because this is no
>> longer the code created by Jaroslav, please delete the @author line.
>>
>> Thanks,
>> David
>>
>> On 9/03/2020 2:15 pm, David Holmes wrote:
>>> Hi Alex,
>>>
>>> On 6/03/2020 4:54 am, Alex Menkov wrote:
>>>> Hi David,
>>>>
>>>> Thank you for the review.
>>>>
>>>> On 03/04/2020 17:50, David Holmes wrote:
>>>>> Hi Alex,
>>>>>
>>>>> On 5/03/2020 10:30 am, Alex Menkov wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> please review the fix for
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240340
>>>>>> webrev:
>>>>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/
>>>>>>
>>>>>>
>>>>>> changes:
>>>>>> - assertThreadState method: don't re-read the thread state when
>>>>>> throwing the exception (as we got a weird error like "Thread WaitingThread is at
>>>>>> WAITING state but is expected to be in Thread.State = WAITING");
>>>>>> - added proper test shutdown on error (made all threads "daemon",
>>>>>> interrupt the waiting thread if CheckerThread throws an exception);
>>>>>> - if CheckerThread detects an error, propagate the exception to the main
>>>>>> thread;
>>>>>
>>>>> The test changes seem fine.
>>>>>
>>>>>> - fixed LockFreeLogger class - it should work for logging from
>>>>>> several threads, but it doesn't. I prefer to simplify it just to
>>>>>> keep ConcurrentLinkedQueue.
>>>>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but
>>>>>> only by a single thread.
>>>>>
>>>>> I don't understand your changes here as you've completely changed
>>>>> the intended design of the logger.
>>>>> The original accumulates log
>>>>> entries per-thread and then spits them all out (though I'm not
>>>>> clear on the exact ordering - I don't know how to read that stream
>>>>> stuff). The new code just creates a single queue of log records
>>>>> interleaving entries from different threads. The simple logger may
>>>>> be all that is needed but it seems quite different to the intent
>>>>> of the original.
>>>>
>>>> While testing the changes in the test I discovered that something is
>>>> wrong with the logger - it printed only part of the records, so I
>>>> had to look at the LockFreeLogger class, and I don't understand how
>>>> it was supposed to work.
>>>> About ordering in the cumulative log: each record has an Integer that
>>>> is used to sort log entries from all threads (i.e. records from
>>>> different threads are printed in the order in which log() was called).
>>>> Looking at the allRecords/records stuff I don't understand how it
>>>> should be used. To get logs from different threads in one logger,
>>>> we need one instance. So we create a LockFreeLogger (in the main thread)
>>>> and the ctor creates the ThreadLocal records and registers it in allRecords.
>>>> Logging from the main thread works fine, but if any other thread tries
>>>> to log, the 1st log() call creates its own ThreadLocal records (by
>>>> records.get()) and log records from this thread go there. But this
>>>> ThreadLocal records is not registered in allRecords, so this
>>>> logging won't be included in the final log.
>>>> Looks like we need to change log() to something like
>>>>
>>>> Map recs = records.get();
>>>> if (recs.isEmpty()) {
>>>>     allRecords.add(recs);
>>>> }
>>>> recs.put(id, String.format(format, params));
>>>
>>> Yep good catch - this logger was completely broken.
>>>
>>>> But all this stuff does exactly the same as a simple
>>>> ConcurrentLinkedQueue (i.e. a lock-free ordered list).
>>>> At least I don't see any other rationale for it.
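For readers following the thread, the simplification discussed above (one shared ConcurrentLinkedQueue instead of per-thread maps that have to be registered) can be sketched roughly as follows. This is a minimal illustration, not the code from the webrev: the class name SimpleQueueLogger and its two methods are hypothetical.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; NOT the LockFreeLogger from the webrev.
class SimpleQueueLogger {
    // A single lock-free queue shared by all threads. Entries are kept in
    // insertion order, so no per-thread registration step can be forgotten.
    private final ConcurrentLinkedQueue<String> records = new ConcurrentLinkedQueue<>();

    // Formats the message and appends it; safe to call from any thread.
    void log(String format, Object... params) {
        records.add(String.format(format, params));
    }

    // Concatenates all entries in the order in which they were added.
    @Override
    public String toString() {
        return String.join("", records);
    }
}
```

With a single queue, entries logged by any thread end up in the final output, which is the property the per-thread variant lost when a thread's records were never added to allRecords.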
>>>
>>> I'm not certain of intent with the original but I'd always want to
>>> see log entries in chronological order - which is what we now
>>> clearly have.
>>>
>>> Thanks,
>>> David
>>>
>>>> --alex
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> --alex

From kevin.walls at oracle.com Tue Mar 10 23:53:21 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Tue, 10 Mar 2020 16:53:21 -0700 (PDT)
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
Message-ID: <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>

Hi -

In testing I wasn't seeing any of the Dwarf code triggered.

With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" message for lots of / maybe all the libraries...

src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c

   if (fill_instr_info(newlib)) {
     if (!read_eh_frame(ph, newlib)) {

fill_instr_info is failing, and we never get to read_eh_frame().

output like:

libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so

(similar for all libraries).

fill_instr_info fails if:

 if ((lib->exec_start == 0L) || (lib->exec_end == 0L))

...but isn't exec_start relative to the library address? It's the value of ph->p_vaddr and it is often zero.

I added some booleans and did:

      if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
        lib->exec_start = ph->p_vaddr;
        found_start = true;
      }

(similarly for end) and only failed if:

  if (!found_start || !found_end) {
    return false;

...and now it's better.

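The idea behind the change described above is that 0 is a legitimate p_vaddr, so it cannot double as a "not found yet" marker; a separate flag is needed. A rough Java illustration of that idea follows. It is only a sketch under stated assumptions: the real code is C in libproc_impl.c, and the Segment class, the findExecRange name, and the use of a single flag instead of two are all hypothetical.

```java
// Hypothetical sketch of "track found-ness separately from the value";
// NOT the actual libproc_impl.c code.
final class ExecRangeSketch {
    // Stand-in for the fields of an ELF program header used here.
    static final class Segment {
        final long vaddr;
        final long memsz;
        final boolean executable;

        Segment(long vaddr, long memsz, boolean executable) {
            this.vaddr = vaddr;
            this.memsz = memsz;
            this.executable = executable;
        }
    }

    // Returns {start, end} covering all executable segments,
    // or null if there is no executable segment at all.
    static long[] findExecRange(Segment[] segments) {
        boolean found = false; // explicit flag: 0 is a valid address
        long start = 0L;
        long end = 0L;
        for (Segment s : segments) {
            if (!s.executable) {
                continue;
            }
            long segEnd = s.vaddr + s.memsz;
            if (!found || s.vaddr < start) {
                start = s.vaddr;
            }
            if (!found || segEnd > end) {
                end = segEnd;
            }
            found = true;
        }
        return found ? new long[] {start, end} : null;
    }
}
```

A segment with vaddr = 0x0 and memsz = 0xaba4, as in the debug output above, is then reported as a valid range rather than as "no executable section".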
I go from:

----------------- 3306 -----------------
0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d

to:

----------------- 31127 -----------------
0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
0x00007fa2857a74ed      ContinueInNewThread + 0x4d
0x00007fa2857a8c49      JLI_Launch + 0x1529
0x000055af1b78db1c      main + 0x11c

Thanks
Kevin

On 10/03/2020 12:36, Yasumasa Suenaga wrote:
> Hi Kevin,
>
> Thanks for your comment!
>
> On 2020/03/10 18:58, Kevin Walls wrote:
>> Hi Yasumasa ,
>>
>> The changes build OK for me in the latest jdk, and things still work.
>> I have not yet seen the dwarf usage in action: I've tried a couple of
>> different systems and so far have not reproduced the problem, i.e.
>> jstack has not failed on native frames.
>>
>> I may need more recent basic libraries, will look again for somewhere
>> where the problem happens and get back to you as I really want to run
>> the changes.
>
> You can see the problem with JShell.
> Some Java frames would not be seen in mixed jstack.
>
>
>> I have mostly minor other comments which don't need a new webrev,
>> some just comments for the future:
>>
>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>
>> DW_CFA_nop - shouldn't this continue instead of return?
>> (It may "never" happen, but a nop could appear within some other
>> instructions?)
>
> DW_CFA_nop is used for padding, so we can ignore (return immediately) it.
>
>
>> DW_CFA_remember_state: a minor typo in the comment,
>> "DW_CFA_remenber_state".
>
> I will fix it.
>
>
>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not
>> DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in
>> these tables never increase by 4-byte amounts, would this mean a lot
>> of code on one line. 8-)
>> So maybe it's never used in practice, if you think it's unnecessary
>> no problem, maybe a comment, or add it for robustness.
>
> I will add DW_CFA_advance_loc4.
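As background for the DW_CFA_advance_loc discussion above: in the DWARF call frame instruction encoding, DW_CFA_advance_loc carries its delta in the low six bits of the opcode byte, while the _loc1, _loc2 and _loc4 variants are followed by a 1-, 2- or 4-byte unsigned delta; in each case the delta is scaled by the CIE's code alignment factor. The sketch below illustrates only that decoding step. It is written in Java for illustration and is not the dwarf.cpp implementation; the class and method names are hypothetical, and the caller is assumed to have set the buffer's byte order to match the target (little-endian on AMD64).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch of decoding the advance_loc family; NOT dwarf.cpp.
final class AdvanceLocSketch {
    static final int DW_CFA_ADVANCE_LOC  = 0x40; // high two bits 01, delta in low six bits
    static final int DW_CFA_ADVANCE_LOC1 = 0x02; // followed by a 1-byte delta
    static final int DW_CFA_ADVANCE_LOC2 = 0x03; // followed by a 2-byte delta
    static final int DW_CFA_ADVANCE_LOC4 = 0x04; // followed by a 4-byte delta

    // Reads one instruction at the buffer's current position and returns the
    // amount by which the location advances, or -1 for any opcode this sketch
    // does not interpret (DW_CFA_nop included).
    static long advanceDelta(ByteBuffer buf, long codeAlignmentFactor) {
        int op = buf.get() & 0xff;
        if ((op & 0xc0) == DW_CFA_ADVANCE_LOC) {
            return (op & 0x3f) * codeAlignmentFactor;
        }
        switch (op) {
            case DW_CFA_ADVANCE_LOC1:
                return (buf.get() & 0xffL) * codeAlignmentFactor;
            case DW_CFA_ADVANCE_LOC2:
                return (buf.getShort() & 0xffffL) * codeAlignmentFactor;
            case DW_CFA_ADVANCE_LOC4:
                return (buf.getInt() & 0xffffffffL) * codeAlignmentFactor;
            default:
                return -1L;
        }
    }

    // Convenience wrapper for little-endian input such as AMD64 .eh_frame data.
    static long advanceDeltaLE(byte[] bytes, long codeAlignmentFactor) {
        return advanceDelta(ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN), codeAlignmentFactor);
    }
}
```

Handling the 4-byte form costs one extra case, which fits the "add it for robustness" suggestion even if compilers rarely emit it.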
> > >> General-purpose methods like read_leb128(), get_entry_length(), >> get_decoded_value() specifically update the _buf pointer in this >> DwarfParser. >> >> DwarfParser::process_dwarf() moves _buf. >> It calls process_cie() which reads, moves _buf and restores it to the >> original position, then we read augmentation_length from where _buf is. >> I'm not sure if that's wrong, or if I just need to read again about >> the CIE/etc layout. >> >> I don't really want to suggest making the code pass around a current >> _buf for the invocation of these general purpose methods, but just >> wanted to comment that if these get used more widely that might >> become necessary. > > I saw GDB and binutils source for creating this patch. > They seems to process similar code because we need to calculate DWARF > instructions one-by-one to get the value which relates to specified PC. > > >> Similarly in future, if this DWARF support code became used more >> widely, it might want to move to an >> OS-neutral directory?? It's odd to label it as Linux-specific. > > Windows does not use DWARF at least, it uses another feature. > > https://urldefense.com/v3/__https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaTtALtpRg$ > > I'm not sure other platforms (Solaris, macOS) uses DWARF. > If DWARF is used in them, I can move DWARF related code to posix > directory. > > >> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp: >> Thanks for changing "can_parsable" which was in the earlier version. 8-) >> >> >> These are just comments to mainly say it looks good, and somebody >> else out there has read it. >> I will look for a system that shows the problem, and get back to you >> again! 
> > > Thanks, > > Yasumasa > > >> Many thanks >> Kevin >> >> On 27/02/2020 05:13, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and >>> 8239462 changes (they updated copyright year). >>> So I modified webrev (only copyright year changes) to be able to >>> apply to current jdk/jdk. >>> Could you review it? >>> >>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/ >>> >>> I need one more reviewer to push. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/02/17 13:07, Yasumasa Suenaga wrote: >>>> PING: Could you review it? >>>> >>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>> ?? webrev: >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>> >>>> This change has been already reviewed by Serguei. >>>> I need one more reviewer to push. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote: >>>>> PING: Could you reveiw this change? >>>>> >>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>> ?? webrev: >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>> >>>>> I believe this change helps troubleshooter to fight to postmortem >>>>> analysis. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote: >>>>>> PING: Could you review it? >>>>>> >>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>> ?? webrev: >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>> >>>>>> I updated webrev. I discussed with Serguei in off list, and I >>>>>> refactored webrev.02 . >>>>>> It has passed tests on submit repo >>>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549). >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> Thanks for your comment! 
>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev. >>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as >>>>>>> Dmitry said. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/ >>>>>>> >>>>>>> This change has been passed all tests on submit repo >>>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487). >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> This is nice move in general. >>>>>>>> Thank you for working on this! >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>>>>> >>>>>>>> >>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == >>>>>>>> 0L) { // Java frame 98 Address rbp = >>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if >>>>>>>> (rbp == null) { 100 return null; 101 } 102 return new >>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native >>>>>>>> frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new >>>>>>>> DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 >>>>>>>> Address rbp = >>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if >>>>>>>> (rbp == null) { 110 return null; 111 } 112 return new >>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 >>>>>>>> dwarf.processDwarf(pc); 115 Address cfa = >>>>>>>> ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 >>>>>>>> !dwarf.isBPOffsetAvailable()) 117 ? 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : >>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) 119 >>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 >>>>>>>> return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, >>>>>>>> pc, dwarf); 124 } >>>>>>>> >>>>>>>> >>>>>>>> I'd suggest to simplify the logic by refactoring to something >>>>>>>> like below: >>>>>>>> >>>>>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc); >>>>>>>> ?????????? Address cfa = >>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java >>>>>>>> frame >>>>>>>> ?????????? DwarfParser dwarf = null; >>>>>>>> >>>>>>>> ?????????? if (libptr != 0L) { // Native frame >>>>>>>> ???????????? try { >>>>>>>> ?????????????? dwarf = new DwarfParser(libptr); >>>>>>>> ?????????????? dwarf.processDwarf(pc); >>>>>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == >>>>>>>> AMD64ThreadContext.RBP) && >>>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>>> ???????????????????????????????? ? >>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>>> ???????????????????????????????? : >>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>>> >>>>>>>> ??????????? } catch (DebuggerException e) { // bail out to Java >>>>>>>> frame case >>>>>>>> ??????????? } >>>>>>>> ????????? } >>>>>>>> ????????? if (cfa == null) { >>>>>>>> ??????????? return null; >>>>>>>> ????????? } >>>>>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>> >>>>>>>> >>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA() >>>>>>>> >>>>>>>> ?? Better to rename 'ofs' => 'offs'. 
>>>>>>>> >>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- >>>>>>>> nextDwarf.getBasePointerOffsetFromCFA()); >>>>>>>> >>>>>>>> ?? Extra space after '-' sign. >>>>>>>> >>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, >>>>>>>> ThreadContext context) { >>>>>>>> >>>>>>>> ?? It feels like the logic has to be somehow >>>>>>>> refactored/simplified as >>>>>>>> ?? several typical fragments appears in slightly different >>>>>>>> contexts. >>>>>>>> ?? But it is not easy to understand what it is. >>>>>>>> ?? Could you, please, add some comments to key places >>>>>>>> explaining this logic. >>>>>>>> ?? Then I'll check if it is possible to make it a little bit >>>>>>>> simpler. >>>>>>>> >>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 >>>>>>>> Address nextCFA; 111 Address nextPC; 112 113 nextPC = >>>>>>>> getNextPC(false); 114 if (nextPC == null) { 115 return null; >>>>>>>> 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = >>>>>>>> dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // >>>>>>>> Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); >>>>>>>> 123 } catch (DebuggerException e) { 124 nextCFA = >>>>>>>> getNextCFA(null, context); 125 return (nextCFA == null) ? null >>>>>>>> : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 >>>>>>>> nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = >>>>>>>> getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? >>>>>>>> null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, >>>>>>>> nextDwarf); 133 } >>>>>>>> >>>>>>>> ??The above can be simplified if a DebuggerException can not be >>>>>>>> thrown from processDwarf(nextPC): >>>>>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>> ??????? if (nextPC == null) { >>>>>>>> ????????? return null; >>>>>>>> ??????? } >>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>> ??????? 
DwarfParser nextDwarf = null; >>>>>>>> >>>>>>>> ??????? if (libptr != 0L) { // Native frame >>>>>>>> ????????? try { >>>>>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>>>>> ??????????? nextDwarf.processDwarf(nextPC); >>>>>>>> ????????? } catch (DebuggerException e) { // Bail out to Java >>>>>>>> frame >>>>>>>> ????????? } >>>>>>>> ??????? } >>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>> ??????? return (nextCFA == null) ? null : new >>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>> ????? } >>>>>>>> >>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 >>>>>>>> ThreadContext context = thread.getContext(); 137 138 if (dwarf >>>>>>>> == null) { // Java frame 139 return javaSender(context); 140 } >>>>>>>> 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == >>>>>>>> null) { 144 return null; 145 } 146 147 Address nextCFA; 148 >>>>>>>> DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { >>>>>>>> 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if >>>>>>>> (libptr == 0L) { 152 // Next frame might be Java frame 153 >>>>>>>> nextCFA = getNextCFA(null, context); 154 return (nextCFA == >>>>>>>> null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, >>>>>>>> null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); >>>>>>>> 158 } catch (DebuggerException e) { 159 nextCFA = >>>>>>>> getNextCFA(null, context); 160 return (nextCFA == null) ? null >>>>>>>> : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } >>>>>>>> 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = >>>>>>>> getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? >>>>>>>> null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>> 167 } >>>>>>>> >>>>>>>> ??This one can be also simplified a little: >>>>>>>> >>>>>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>> >>>>>>>> ??????? 
if (dwarf == null) { // Java frame >>>>>>>> ????????? return javaSender(context); >>>>>>>> ??????? } >>>>>>>> ??????? Address nextPC = getNextPC(true); >>>>>>>> ??????? if (nextPC == null) { >>>>>>>> ????????? return null; >>>>>>>> ??????? } >>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>> ??????? if (!dwarf.isIn(nextPC)) { >>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>> ????????? if (libptr != 0L) { >>>>>>>> ??????????? try { >>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java >>>>>>>> frame >>>>>>>> ??????????? } >>>>>>>> ????????? } >>>>>>>> ??????? } >>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>> ??????? return (nextCFA == null) ? null : new >>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>> ????? } >>>>>>>> >>>>>>>> Finally, it looks like just one method could replace both >>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context): >>>>>>>> >>>>>>>> ????? private CFrame commonSender(ThreadProxy thread) { >>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>> ??????? if (nextPC == null) { >>>>>>>> ????????? return null; >>>>>>>> ??????? } >>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>> >>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) { >>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>> ????????? if (libptr != 0L) { >>>>>>>> ??????????? try { >>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java >>>>>>>> frame >>>>>>>> ??????????? } >>>>>>>> ????????? } >>>>>>>> ??????? } >>>>>>>> ??????? 
Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>> ??????? return (nextCFA == null) ? null : new >>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>> ????? } >>>>>>>> >>>>>>>> I'm still reviewing the dwarf parser files. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in >>>>>>>>> serviceability/sa tests and >>>>>>>>> all tests on submit repo >>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). >>>>>>>>> Could you review new webrev? >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ >>>>>>>>> >>>>>>>>> The diff from previous webrev is here: >>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> Please review this change: >>>>>>>>>> >>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>> ?? webrev: >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V >>>>>>>>>> Application Binary Interface AMD64 >>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF >>>>>>>>>> in .eh_frame or .debug_frame >>>>>>>>>> for stack unwinding. >>>>>>>>>> >>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default >>>>>>>>>> since GCC 4.6, so system >>>>>>>>>> library (e.g. libc) might be compiled with this feature. >>>>>>>>>> >>>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base >>>>>>>>>> pointer register (RBP). >>>>>>>>>> So it might be lack of stack frames. >>>>>>>>>> >>>>>>>>>> I guess JDK-8219201 is caused by same issue. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> [1] >>>>>>>>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf >>>>>>>>> >>>>>>>> From serguei.spitsyn at oracle.com Wed Mar 11 01:07:51 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 10 Mar 2020 18:07:51 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> Message-ID: <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Wed Mar 11 01:57:01 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 10 Mar 2020 18:57:01 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> Message-ID: An HTML attachment was scrubbed...
URL: From suenaga at oss.nttdata.com Wed Mar 11 02:07:48 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Wed, 11 Mar 2020 11:07:48 +0900 Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF In-Reply-To: <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com> References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com> <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com> <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com> Message-ID: Hi Kevin, I guess the first program header in the libraries on your machine has the exec flag (you can check it with `readelf -l`). So I tweaked my patch (initial value of exec_start and exec_end set to -1) in a new webrev. http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/ This webrev contains the fixes for your comments (typo and DW_CFA_advance_loc4). Thanks, Yasumasa On 2020/03/11 8:53, Kevin Walls wrote: > Hi - > > In testing I wasn't seeing any of the Dwarf code triggered. > > With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries... > > src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c > >    if (fill_instr_info(newlib)) { >      if (!read_eh_frame(ph, newlib)) { > > fill_instr_info is failing, and we never get to read_eh_frame(). > > output like: > > libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4 > libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so > > (similar for all libraries). > > fill_instr_info fails if: > >  if ((lib->exec_start == 0L) || (lib->exec_end == 0L)) > > ...but isn't exec_start relative to the library address? It's the value of ph->p_vaddr and it is often zero. > > I added some booleans and did: > > 185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) { > 186        
lib->exec_start = ph->p_vaddr; > 187         found_start = true; > 188       } > > (similarly for end) and only failed if: > > 201   if (!found_start || !found_end) { > 202     return false; > > ...and now it's better.  I go from: > > ----------------- 3306 ----------------- > 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d > > to: > > ----------------- 31127 ----------------- > 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d > 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad > 0x00007fa2857a74ed      ContinueInNewThread + 0x4d > 0x00007fa2857a8c49      JLI_Launch + 0x1529 > 0x000055af1b78db1c      main + 0x11c > > > Thanks > Kevin > > > > > On 10/03/2020 12:36, Yasumasa Suenaga wrote: > >> Hi Kevin, >> >> Thanks for your comment! >> >> On 2020/03/10 18:58, Kevin Walls wrote: >>> Hi Yasumasa, >>> >>> The changes build OK for me in the latest jdk, and things still work. >>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames. >>> >>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes. >> >> You can see the problem with JShell. >> Some Java frames would not be seen in mixed jstack. >> >> >>> I have mostly minor other comments which don't need a new webrev, some just comments for the future: >>> >>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp: >>> >>> DW_CFA_nop - shouldn't this continue instead of return? >>> (It may "never" happen, but a nop could appear within some other instructions?) >> >> DW_CFA_nop is used for padding, so we can ignore (return immediately) it. >> >> >>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state". >> >> I will fix it. >> >> >>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4. 
I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, would this mean a lot of code on one line. 8-) >>> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness. >> >> I will add DW_CFA_advance_loc4. >> >> >>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser. >>> >>> DwarfParser::process_dwarf() moves _buf. >>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is. >>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout. >>> >>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary. >> >> I looked at the GDB and binutils sources when creating this patch. >> They seem to process similar code because we need to calculate DWARF instructions one-by-one to get the value which relates to the specified PC. >> >> >>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an >>> OS-neutral directory? It's odd to label it as Linux-specific. >> >> Windows does not use DWARF at least, it uses another feature. >> >> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019 >> I'm not sure other platforms (Solaris, macOS) use DWARF. >> If DWARF is used in them, I can move DWARF related code to a posix directory. >> >> >>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp: >>> Thanks for changing "can_parsable" which was in the earlier version. 
8-) >>> >>> >>> These are just comments to mainly say it looks good, and somebody else out there has read it. >>> I will look for a system that shows the problem, and get back to you again! >> >> >> Thanks, >> >> Yasumasa >> >> >>> Many thanks >>> Kevin >>> >>> On 27/02/2020 05:13, Yasumasa Suenaga wrote: >>>> Hi all, >>>> >>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year). >>>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk. >>>> Could you review it? >>>> >>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/ >>>> >>>> I need one more reviewer to push. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote: >>>>> PING: Could you review it? >>>>> >>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>> >>>>> This change has already been reviewed by Serguei. >>>>> I need one more reviewer to push. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote: >>>>>> PING: Could you review this change? >>>>>> >>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>> >>>>>> I believe this change helps troubleshooters with postmortem analysis. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote: >>>>>>> PING: Could you review it? >>>>>>> >>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>> >>>>>>> I updated webrev. I discussed it with Serguei off list, and I refactored webrev.02. >>>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549). 
>>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote: >>>>>>>> Hi Serguei, >>>>>>>> >>>>>>>> Thanks for your comment! >>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev. >>>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/ >>>>>>>> >>>>>>>> This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487). >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> This is nice move in general. >>>>>>>>> Thank you for working on this! >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>>>>>> >>>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? 
context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 } >>>>>>>>> >>>>>>>>> >>>>>>>>> I'd suggest to simplify the logic by refactoring to something like below: >>>>>>>>> >>>>>>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc); >>>>>>>>> ?????????? Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame >>>>>>>>> ?????????? DwarfParser dwarf = null; >>>>>>>>> >>>>>>>>> ?????????? if (libptr != 0L) { // Native frame >>>>>>>>> ???????????? try { >>>>>>>>> ?????????????? dwarf = new DwarfParser(libptr); >>>>>>>>> ?????????????? dwarf.processDwarf(pc); >>>>>>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && >>>>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>>>> ???????????????????????????????? ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>>>> ???????????????????????????????? : context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>>>> >>>>>>>>> ??????????? } catch (DebuggerException e) { // bail out to Java frame case >>>>>>>>> ??????????? } >>>>>>>>> ????????? } >>>>>>>>> ????????? if (cfa == null) { >>>>>>>>> ??????????? return null; >>>>>>>>> ????????? } >>>>>>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>>> >>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA() >>>>>>>>> >>>>>>>>> ?? Better to rename 'ofs' => 'offs'. >>>>>>>>> >>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA()); >>>>>>>>> >>>>>>>>> ?? Extra space after '-' sign. 
>>>>>>>>> >>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) { >>>>>>>>> >>>>>>>>> ?? It feels like the logic has to be somehow refactored/simplified as >>>>>>>>> ?? several typical fragments appears in slightly different contexts. >>>>>>>>> ?? But it is not easy to understand what it is. >>>>>>>>> ?? Could you, please, add some comments to key places explaining this logic. >>>>>>>>> ?? Then I'll check if it is possible to make it a little bit simpler. >>>>>>>>> >>>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 } >>>>>>>>> >>>>>>>>> ??The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC): >>>>>>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>> ????????? return null; >>>>>>>>> ??????? } >>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>> >>>>>>>>> ??????? if (libptr != 0L) { // Native frame >>>>>>>>> ????????? try { >>>>>>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>> ??????????? nextDwarf.processDwarf(nextPC); >>>>>>>>> ????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>> ????????? 
} >>>>>>>>> ??????? } >>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>> ????? } >>>>>>>>> >>>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? null : new >>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 167 } >>>>>>>>> >>>>>>>>> ??This one can be also simplified a little: >>>>>>>>> >>>>>>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>> >>>>>>>>> ??????? if (dwarf == null) { // Java frame >>>>>>>>> ????????? return javaSender(context); >>>>>>>>> ??????? } >>>>>>>>> ??????? Address nextPC = getNextPC(true); >>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>> ????????? return null; >>>>>>>>> ??????? } >>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>> ??????? if (!dwarf.isIn(nextPC)) { >>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>> ??????????? try { >>>>>>>>> ????????????? 
nextDwarf = new DwarfParser(libptr); >>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>> ??????????? } >>>>>>>>> ????????? } >>>>>>>>> ??????? } >>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>> ????? } >>>>>>>>> >>>>>>>>> Finally, it looks like just one method could replace both >>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context): >>>>>>>>> >>>>>>>>> ????? private CFrame commonSender(ThreadProxy thread) { >>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>> ????????? return null; >>>>>>>>> ??????? } >>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>> >>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) { >>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>> ??????????? try { >>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>> ??????????? } >>>>>>>>> ????????? } >>>>>>>>> ??????? } >>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>> ????? } >>>>>>>>> >>>>>>>>> I'm still reviewing the dwarf parser files. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I refactored LinuxAMD64CFrame.java . 
It works fine in serviceability/sa tests and >>>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). >>>>>>>>>> Could you review new webrev? >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ >>>>>>>>>> >>>>>>>>>> The diff from previous webrev is here: >>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> Please review this change: >>>>>>>>>>> >>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64 >>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame >>>>>>>>>>> for stack unwinding. >>>>>>>>>>> >>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system >>>>>>>>>>> library (e.g. libc) might be compiled with this feature. >>>>>>>>>>> >>>>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP). >>>>>>>>>>> So it might be lack of stack frames. >>>>>>>>>>> >>>>>>>>>>> I guess JDK-8219201 is caused by same issue. 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf >>>>>>>>>> >>>>>>>>> From serguei.spitsyn at oracle.com Wed Mar 11 02:25:44 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 10 Mar 2020 19:25:44 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> Message-ID: <7c6ae898-f4f1-f353-99a6-47d00162bda9@oracle.com> An HTML attachment was scrubbed... URL: From suenaga at oss.nttdata.com Wed Mar 11 05:52:16 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Wed, 11 Mar 2020 14:52:16 +0900 Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF In-Reply-To: References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com> <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com> <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com> Message-ID: Hi Kevin, I saw 2 errors on the submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475). So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448). The last change on the submit repo is here: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/ Could you share the details from the submit repo? Thanks, Yasumasa On 2020/03/11 11:07, Yasumasa Suenaga wrote: > Hi Kevin, > > I guess the first program header in the libraries on your machine has the exec flag (you can check it with `readelf -l`).
> So I tweaked my patch (initial value of exec_start and exec_end set to -1) in a new webrev. > >   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/ > > This webrev contains the fixes for your comments (typo and DW_CFA_advance_loc4). > > > Thanks, > > Yasumasa > > > On 2020/03/11 8:53, Kevin Walls wrote: >> Hi - >> >> In testing I wasn't seeing any of the Dwarf code triggered. >> >> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries... >> >> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c >> >>     if (fill_instr_info(newlib)) { >>       if (!read_eh_frame(ph, newlib)) { >> >> fill_instr_info is failing, and we never get to read_eh_frame(). >> >> output like: >> >> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4 >> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so >> >> (similar for all libraries). >> >> fill_instr_info fails if: >> >>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L)) >> >> ...but isn't exec_start relative to the library address? It's the value of ph->p_vaddr and it is often zero. >> >> I added some booleans and did: >> >> 185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) { >> 186         lib->exec_start = ph->p_vaddr; >> 187         found_start = true; >> 188       } >> >> (similarly for end) and only failed if: >> >> 201   if (!found_start || !found_end) { >> 202     return false; >> >> ...and now it's better.  I go from: >> >> ----------------- 3306 ----------------- >> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d >> >> to: >> >> ----------------- 31127 ----------------- >> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d >> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad >> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d >> 0x00007fa2857a8c49      JLI_Launch + 0x1529 >> 0x000055af1b78db1c     
main + 0x11c >> >> >> Thanks >> Kevin >> >> >> >> >> On 10/03/2020 12:36, Yasumasa Suenaga wrote: >> >>> Hi Kevin, >>> >>> Thanks for your comment! >>> >>> On 2020/03/10 18:58, Kevin Walls wrote: >>>> Hi Yasumasa, >>>> >>>> The changes build OK for me in the latest jdk, and things still work. >>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames. >>>> >>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes. >>> >>> You can see the problem with JShell. >>> Some Java frames would not be seen in mixed jstack. >>> >>> >>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future: >>>> >>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp: >>>> >>>> DW_CFA_nop - shouldn't this continue instead of return? >>>> (It may "never" happen, but a nop could appear within some other instructions?) >>> >>> DW_CFA_nop is used for padding, so we can ignore (return immediately) it. >>> >>> >>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state". >>> >>> I will fix it. >>> >>> >>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, would this mean a lot of code on one line. 8-) >>>> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness. >>> >>> I will add DW_CFA_advance_loc4. >>> >>> >>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser. >>>> >>>> DwarfParser::process_dwarf() moves _buf.
>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is. >>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout. >>>> >>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary. >>> >>> I looked at the GDB and binutils sources when creating this patch. >>> They seem to process similar code because we need to calculate DWARF instructions one-by-one to get the value which relates to the specified PC. >>> >>> >>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an >>>> OS-neutral directory? It's odd to label it as Linux-specific. >>> >>> Windows does not use DWARF at least, it uses another feature. >>> >>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019 >>> I'm not sure other platforms (Solaris, macOS) use DWARF. >>> If DWARF is used in them, I can move DWARF related code to a posix directory. >>> >>> >>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp: >>>> Thanks for changing "can_parsable" which was in the earlier version. 8-) >>>> >>>> >>>> These are just comments to mainly say it looks good, and somebody else out there has read it. >>>> I will look for a system that shows the problem, and get back to you again! >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> Many thanks >>>> Kevin >>>> >>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote: >>>>> Hi all, >>>>> >>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year). >>>>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk. >>>>> Could you review it?
>>>>>
>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>
>>>>> I need one more reviewer to push.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>> PING: Could you review it?
>>>>>>
>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>
>>>>>> This change has been already reviewed by Serguei.
>>>>>> I need one more reviewer to push.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>> PING: Could you review this change?
>>>>>>>
>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>
>>>>>>> I believe this change helps troubleshooter to fight to postmortem analysis.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>> PING: Could you review it?
>>>>>>>>
>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>
>>>>>>>> I updated webrev. I discussed with Serguei in off list, and I refactored webrev.02.
>>>>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>> Hi Serguei,
>>>>>>>>>
>>>>>>>>> Thanks for your comment!
>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>
>>>>>>>>> This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> This is a nice move in general.
>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>>>>
>>>>>>>>>>  96   long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>  97   if (libptr == 0L) { // Java frame
>>>>>>>>>>  98     Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>  99     if (rbp == null) {
>>>>>>>>>> 100       return null;
>>>>>>>>>> 101     }
>>>>>>>>>> 102     return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>> 103   } else { // Native frame
>>>>>>>>>> 104     DwarfParser dwarf;
>>>>>>>>>> 105     try {
>>>>>>>>>> 106       dwarf = new DwarfParser(libptr);
>>>>>>>>>> 107     } catch (DebuggerException e) {
>>>>>>>>>> 108       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>> 109       if (rbp == null) {
>>>>>>>>>> 110         return null;
>>>>>>>>>> 111       }
>>>>>>>>>> 112       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>> 113     }
>>>>>>>>>> 114     dwarf.processDwarf(pc);
>>>>>>>>>> 115     Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>> 116                    !dwarf.isBPOffsetAvailable())
>>>>>>>>>> 117                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>> 118                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>> 119                                .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>> 120     if (cfa == null) {
>>>>>>>>>> 121       return null;
>>>>>>>>>> 122     }
>>>>>>>>>> 123     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>> 124   }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'd suggest to simplify the logic by refactoring to something like below:
>>>>>>>>>>
>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>>     DwarfParser dwarf = null;
>>>>>>>>>>
>>>>>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>>>>>       try {
>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>         dwarf.processDwarf(pc);
>>>>>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>>>>>                   ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>                   : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>                            .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>       }
>>>>>>>>>>     }
>>>>>>>>>>     if (cfa == null) {
>>>>>>>>>>       return null;
>>>>>>>>>>     }
>>>>>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>>
>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>
>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>
>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>
>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>
>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>
>>>>>>>>>>    It feels like the logic has to be somehow refactored/simplified as
>>>>>>>>>>    several typical fragments appear in slightly different contexts.
>>>>>>>>>>    But it is not easy to understand what it is.
>>>>>>>>>>    Could you, please, add some comments to key places explaining this logic.
>>>>>>>>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>>>>>>>>
>>>>>>>>>> 109   private CFrame javaSender(ThreadContext context) {
>>>>>>>>>> 110     Address nextCFA;
>>>>>>>>>> 111     Address nextPC;
>>>>>>>>>> 112
>>>>>>>>>> 113     nextPC = getNextPC(false);
>>>>>>>>>> 114     if (nextPC == null) {
>>>>>>>>>> 115       return null;
>>>>>>>>>> 116     }
>>>>>>>>>> 117
>>>>>>>>>> 118     DwarfParser nextDwarf = null;
>>>>>>>>>> 119     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>> 120     if (libptr != 0L) { // Native frame
>>>>>>>>>> 121       try {
>>>>>>>>>> 122         nextDwarf = new DwarfParser(libptr);
>>>>>>>>>> 123       } catch (DebuggerException e) {
>>>>>>>>>> 124         nextCFA = getNextCFA(null, context);
>>>>>>>>>> 125         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>> 126       }
>>>>>>>>>> 127       nextDwarf.processDwarf(nextPC);
>>>>>>>>>> 128     }
>>>>>>>>>> 129
>>>>>>>>>> 130     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>> 131     return (nextCFA == null) ? null
>>>>>>>>>> 132                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>> 133   }
>>>>>>>>>>
>>>>>>>>>>   The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC):
>>>>>>>>>>
>>>>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>
>>>>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>>>>         try {
>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>           nextDwarf.processDwarf(nextPC);
>>>>>>>>>>         } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>         }
>>>>>>>>>>       }
>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> 135   public CFrame sender(ThreadProxy thread) {
>>>>>>>>>> 136     ThreadContext context = thread.getContext();
>>>>>>>>>> 137
>>>>>>>>>> 138     if (dwarf == null) { // Java frame
>>>>>>>>>> 139       return javaSender(context);
>>>>>>>>>> 140     }
>>>>>>>>>> 141
>>>>>>>>>> 142     Address nextPC = getNextPC(true);
>>>>>>>>>> 143     if (nextPC == null) {
>>>>>>>>>> 144       return null;
>>>>>>>>>> 145     }
>>>>>>>>>> 146
>>>>>>>>>> 147     Address nextCFA;
>>>>>>>>>> 148     DwarfParser nextDwarf = dwarf;
>>>>>>>>>> 149     if (!dwarf.isIn(nextPC)) {
>>>>>>>>>> 150       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>> 151       if (libptr == 0L) {
>>>>>>>>>> 152         // Next frame might be Java frame
>>>>>>>>>> 153         nextCFA = getNextCFA(null, context);
>>>>>>>>>> 154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>> 155       }
>>>>>>>>>> 156       try {
>>>>>>>>>> 157         nextDwarf = new DwarfParser(libptr);
>>>>>>>>>> 158       } catch (DebuggerException e) {
>>>>>>>>>> 159         nextCFA = getNextCFA(null, context);
>>>>>>>>>> 160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>> 161       }
>>>>>>>>>> 162     }
>>>>>>>>>> 163
>>>>>>>>>> 164     nextDwarf.processDwarf(nextPC);
>>>>>>>>>> 165     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>> 166     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>> 167   }
>>>>>>>>>>
>>>>>>>>>>   This one can be also simplified a little:
>>>>>>>>>>
>>>>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>
>>>>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>>>>         return javaSender(context);
>>>>>>>>>>       }
>>>>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>           try {
>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>           }
>>>>>>>>>>         }
>>>>>>>>>>       }
>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>>>
>>>>>>>>>>     private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>
>>>>>>>>>>       if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>           try {
>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>           }
>>>>>>>>>>         }
>>>>>>>>>>       }
>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I refactored LinuxAMD64CFrame.java. It works fine in serviceability/sa tests and
>>>>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>> Could you review new webrev?
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>
>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>
>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>>>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>
>>>>>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP).
>>>>>>>>>>>> So some stack frames might be missing from its output.
>>>>>>>>>>>>
>>>>>>>>>>>> I guess JDK-8219201 is caused by same issue.
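The difference between the two unwinding strategies described in the original request above can be illustrated with a toy model. This is only a sketch under stated assumptions: `UnwindSketch`, `CfaRule` and the addresses are hypothetical and are not part of the SA code or the webrev. It assumes the usual x86-64 layout, in which a frame-pointer walk reads the return address at RBP + 8, while a CFI-based walk first computes the CFA as "register + offset" and then reads the return address at CFA - 8.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of one unwinding step on x86-64; hypothetical names, not the SA code. */
final class UnwindSketch {
    /** A CFA rule as call frame information would yield it for one PC. */
    static final class CfaRule {
        final String register;          // register the CFA is based on
        final long cfaOffset;           // CFA = value(register) + cfaOffset
        final long returnAddressOffset; // return address slot relative to the CFA

        CfaRule(String register, long cfaOffset, long returnAddressOffset) {
            this.register = register;
            this.cfaOffset = cfaOffset;
            this.returnAddressOffset = returnAddressOffset;
        }
    }

    /** Frame-pointer walk: assumes RBP heads a frame chain, return address at RBP + 8. */
    static long returnAddressViaRbp(Map<Long, Long> memory, long rbp) {
        return memory.get(rbp + 8);
    }

    /** CFI-based walk: compute the CFA from the rule, then read relative to it. */
    static long returnAddressViaCfa(Map<Long, Long> memory, Map<String, Long> regs, CfaRule rule) {
        long cfa = regs.get(rule.register) + rule.cfaOffset;
        return memory.get(cfa + rule.returnAddressOffset);
    }

    public static void main(String[] args) {
        Map<Long, Long> memory = new HashMap<>();
        memory.put(0x7018L, 0x401234L); // return address into the direct caller
        memory.put(0x7108L, 0x405678L); // return address stored in the caller's own frame

        Map<String, Long> regs = new HashMap<>();
        regs.put("rsp", 0x7000L);
        regs.put("rbp", 0x7100L); // still the caller's frame pointer: the leaf never set RBP up

        CfaRule rule = new CfaRule("rsp", 0x20, -8);
        // The RBP walk lands in the caller's caller, so the direct caller is lost.
        System.out.println(Long.toHexString(returnAddressViaRbp(memory, regs.get("rbp")))); // 405678
        // The CFA rule finds the direct caller.
        System.out.println(Long.toHexString(returnAddressViaCfa(memory, regs, rule)));      // 401234
    }
}
```

In this model the function at the top of the stack was built without a frame pointer, so RBP still belongs to its caller; the frame-pointer walk therefore skips one frame, which is the kind of gap the request describes.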
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>
>>>>>>>>>>

From david.holmes at oracle.com Wed Mar 11 05:59:29 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 11 Mar 2020 15:59:29 +1000
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
Message-ID: <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>

Hi Yasumasa,

Partial hs_err info below.

David
-----

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800
#
# JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-11-0447267.suenaga.source)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C [libsaproc.so+0x487c] DwarfParser::process_dwarf(unsigned long)+0x2c
#
# Core dump will be written.
Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798) # # If you would like to submit a bug report, please visit: # https://bugreport.java.com/bugreport/crash.jsp # The crash happened outside the Java Virtual Machine in native code. # See problematic frame for where to report the bug. # --------------- S U M M A R Y ------------ Command Line: -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug -Xms8m -Djdk.module.main=jdk.hotspot.agent 
jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770 Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 0m 3s) --------------- T H R E A D --------------- Current thread (0x00007fdf5c032000): JavaThread "main" [_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)] Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000], sp=0x00007fdf63b9d190, free space=1020k Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) C [libsaproc.so+0x487c] DwarfParser::process_dwarf(unsigned long)+0x2c j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal j 
sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal v ~StubRoutines::call_stub V [libjvm.so+0xc2291c] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac V [libjvm.so+0xd31970] jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370 V [libjvm.so+0xd36202] jni_CallStaticVoidMethod+0x222 C [libjli.so+0x4bed] JavaMain+0xbcd C [libjli.so+0x80a9] ThreadJavaMain+0x9 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal j 
sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal j sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal v ~StubRoutines::call_stub siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79 Register to memory mapping: RAX=0x00007f7e4dfe3229 is an unknown value RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62 RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00 RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000 RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000 RSI=0x0000000000000004 is an unknown value RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00 R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01 R10=0x00000000ffffffff is an unknown value R11=0x000000000100527a is an unknown value R12=0x00007fded5076b79 is an unknown value R13=0x00007f7da2f8e68a is an unknown value R14=0x00007f7dbdf62b1d is an unknown value R15=0x00007fdf5c032000 is a thread Registers: RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85 RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080 R8 
=0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000 RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004 TRAPNO=0x000000000000000e Top of Stack: (sp=0x00007fdf63b9d190) 0x00007fdf63b9d190: 00007fdf209d0980 0000000000000000 0x00007fdf63b9d1a0: 00007fdf209d0980 00007fdf63b9d258 0x00007fdf63b9d1b0: 00007fdf63b9d228 00007fdf44778dbe 0x00007fdf63b9d1c0: 000000000146c380 00007fdf5c032000 Instructions: (pc=0x00007fdf2000e87c) 0x00007fdf2000e77c: 89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00 0x00007fdf2000e78c: 00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80 0x00007fdf2000e79c: 78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42 0x00007fdf2000e7ac: 01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48 0x00007fdf2000e7bc: 89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43 0x00007fdf2000e7cc: 3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43 0x00007fdf2000e7dc: 30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff 0x00007fdf2000e7ec: ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e 0x00007fdf2000e7fc: 41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2 0x00007fdf2000e80c: ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90 0x00007fdf2000e81c: 0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76 0x00007fdf2000e82c: 1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48 0x00007fdf2000e83c: 83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9 0x00007fdf2000e84c: 31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56 0x00007fdf2000e85c: 41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b 0x00007fdf2000e86c: a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08 0x00007fdf2000e87c: 41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0 0x00007fdf2000e88c: 75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d 0x00007fdf2000e89c: 8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc 0x00007fdf2000e8ac: 48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03 0x00007fdf2000e8bc: b2 18 11 00 
00 e8 5a ff ff ff 89 c0 4c 01 f0 4c
0x00007fdf2000e8cc: 39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8
0x00007fdf2000e8dc: 90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01
0x00007fdf2000e8ec: 0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48
0x00007fdf2000e8fc: 83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48
0x00007fdf2000e90c: 09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea
0x00007fdf2000e91c: 4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b
0x00007fdf2000e92c: 31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00
0x00007fdf2000e93c: 00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00
0x00007fdf2000e94c: 00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff
0x00007fdf2000e95c: 48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89
0x00007fdf2000e96c: 8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff

On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote:
> Hi Kevin,
>
> I saw 2 errors on submit repo
> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
> So I tweaked my patch, but I saw the crash again
> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).
>
>   Last change on submit repo is here:
>     http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/
>
> Can you share details on submit repo?
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/03/11 11:07, Yasumasa Suenaga wrote:
>> Hi Kevin,
>>
>> I guess first program header in the libraries which are on your
>> machine has exec flag (you can check it with `readelf -l`).
>> So I tweaked my patch (initial value of exec_start and exec_end set to
>> -1) in new webrev.
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
>>
>> This webrev contains the fix for your comment (typo and
>> DW_CFA_advance_loc4).
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/11 8:53, Kevin Walls wrote:
>>> Hi -
>>>
>>> In testing I wasn't seeing any of the Dwarf code triggered.
>>>
>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable
>>> section in" for lots of / maybe all the libraries...
>>>
>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>>
>>>     if (fill_instr_info(newlib)) {
>>>       if (!read_eh_frame(ph, newlib)) {
>>>
>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>
>>> output like:
>>>
>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>> libsaproc DEBUG: Could not find executable section in
>>> /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>
>>> (similar for all libraries).
>>>
>>> fill_instr fails if:
>>>
>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>
>>> ...but isn't exec_start relative to the library address? It's the
>>> value of ph->vaddr and it is often zero.
>>>
>>> I added some booleans and did:
>>>
>>>       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>         lib->exec_start = ph->p_vaddr;
>>>         found_start = true;
>>>       }
>>>
>>> (similarly for end) and only failed if:
>>>
>>>   if (!found_start || !found_end) {
>>>     return false;
>>>
>>> ...and now it's better. I go from:
>>>
>>> ----------------- 3306 -----------------
>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>
>>> to:
>>>
>>> ----------------- 31127 -----------------
>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>> 0x000055af1b78db1c      main + 0x11c
>>>
>>>
>>> Thanks
>>> Kevin
>>>
>>>
>>>
>>>
>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>
>>>> Hi Kevin,
>>>>
>>>> Thanks for your comment!
>>>>
>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> The changes build OK for me in the latest jdk, and things still work.
>>>>> I have not yet seen the dwarf usage in action: I've tried a couple
>>>>> of different systems and so far have not reproduced the problem,
>>>>> i.e. jstack has not failed on native frames.
>>>>>
>>>>> I may need more recent basic libraries, will look again for
>>>>> somewhere where the problem happens and get back to you as I really
>>>>> want to run the changes.
>>>>
>>>> You can see the problem with JShell.
>>>> Some Java frames would not be seen in mixed jstack.
>>>>
>>>>
>>>>> I have mostly minor other comments which don't need a new webrev,
>>>>> some just comments for the future:
>>>>>
>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>
>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>> (It may "never" happen, but a nop could appear within some other
>>>>> instructions?)
>>>>
>>>> DW_CFA_nop is used for padding, so we can ignore (return
>>>> immediately) it.
>>>>
>>>>
>>>>> DW_CFA_remember_state: a minor typo in the comment,
>>>>> "DW_CFA_remenber_state".
>>>>
>>>> I will fix it.
>>>>
>>>>
>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not
>>>>> DW_CFA_advance_loc4. I thought that was odd, but maybe addresses
>>>>> in these tables never increase by 4-byte amounts, would this mean a
>>>>> lot of code on one line. 8-)
>>>>> So maybe it's never used in practice, if you think it's unnecessary
>>>>> no problem, maybe a comment, or add it for robustness.
>>>>
>>>> I will add DW_CFA_advance_loc4.
>>>>
>>>>
>>>>> General-purpose methods like read_leb128(), get_entry_length(),
>>>>> get_decoded_value() specifically update the _buf pointer in this
>>>>> DwarfParser.
>>>>>
>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>> It calls process_cie() which reads, moves _buf and restores it to
>>>>> the original position, then we read augmentation_length from where
>>>>> _buf is.
>>>>> I'm not sure if that's wrong, or if I just need to read again about
>>>>> the CIE/etc layout.
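Kevin's remark above about helpers such as read_leb128() moving a shared `_buf` can be made concrete with a small sketch. It is written in Java rather than the C++ of dwarf.cpp, and `Leb128Sketch` with its method names is hypothetical; it is not the patch's implementation. It assumes the standard DWARF LEB128 encoding (7 payload bits per byte, high bit set on every byte except the last) and passes the cursor explicitly as a `ByteBuffer`, so that "read and then restore the position" becomes "read from a duplicate".

```java
import java.nio.ByteBuffer;

/** Illustrative LEB128 readers that advance only the buffer they are handed. */
final class Leb128Sketch {
    /** Decodes an unsigned LEB128 value and leaves buf positioned after it. */
    static long readUnsigned(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();                         // consumes one byte, advancing the cursor
            result |= (long) (b & 0x7f) << shift;  // low 7 bits carry the payload
            shift += 7;
        } while ((b & 0x80) != 0);                 // high bit set: more bytes follow
        return result;                             // encodings longer than 64 bits are not handled
    }

    /** Decodes a signed LEB128 value and leaves buf positioned after it. */
    static long readSigned(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();
            result |= (long) (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        if (shift < 64 && (b & 0x40) != 0) {
            result |= -1L << shift;                // sign-extend from the last payload bit
        }
        return result;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {(byte) 0xe5, (byte) 0x8e, 0x26, 0x7f});

        // Peeking through a duplicate leaves the caller's cursor where it was.
        long peeked = readUnsigned(buf.duplicate());
        System.out.println(peeked + " at position " + buf.position()); // 624485 at position 0

        System.out.println(readUnsigned(buf)); // 624485
        System.out.println(readSigned(buf));   // -1
    }
}
```

With the cursor passed in, a caller decides whether a read should move its own position or only a temporary copy, which is one way to avoid the save-and-restore pattern questioned above.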
>>>>>
>>>>> I don't really want to suggest making the code pass around a
>>>>> current _buf for the invocation of these general purpose methods,
>>>>> but just wanted to comment that if these get used more widely that
>>>>> might become necessary.
>>>>
>>>> I saw GDB and binutils source for creating this patch.
>>>> They seem to process similar code because we need to calculate
>>>> DWARF instructions one-by-one to get the value which relates to
>>>> specified PC.
>>>>
>>>>
>>>>> Similarly in future, if this DWARF support code became used more
>>>>> widely, it might want to move to an
>>>>> OS-neutral directory? It's odd to label it as Linux-specific.
>>>>
>>>> Windows does not use DWARF at least, it uses another feature.
>>>>
>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>
>>>> I'm not sure other platforms (Solaris, macOS) use DWARF.
>>>> If DWARF is used in them, I can move DWARF related code to posix
>>>> directory.
>>>>
>>>>
>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>> Thanks for changing "can_parsable" which was in the earlier
>>>>> version. 8-)
>>>>>
>>>>>
>>>>> These are just comments to mainly say it looks good, and somebody
>>>>> else out there has read it.
>>>>> I will look for a system that shows the problem, and get back to
>>>>> you again!
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> Many thanks
>>>>> Kevin
>>>>>
>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and
>>>>>> 8239462 changes (they updated copyright year).
>>>>>> So I modified webrev (only copyright year changes) to be able to
>>>>>> apply to current jdk/jdk.
>>>>>> Could you review it?
>>>>>>
>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>
>>>>>> I need one more reviewer to push.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>> PING: Could you review it?
>>>>>>>
>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>
>>>>>>> This change has been already reviewed by Serguei.
>>>>>>> I need one more reviewer to push.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>> PING: Could you review this change?
>>>>>>>>
>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>
>>>>>>>> I believe this change helps troubleshooter to fight to
>>>>>>>> postmortem analysis.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>> PING: Could you review it?
>>>>>>>>>
>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>
>>>>>>>>> I updated webrev. I discussed with Serguei in off list, and I
>>>>>>>>> refactored webrev.02.
>>>>>>>>> It has passed tests on submit repo
>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi Serguei,
>>>>>>>>>>
>>>>>>>>>> Thanks for your comment!
>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as
>>>>>>>>>> Dmitry said.
>>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/ >>>>>>>>>> >>>>>>>>>> This change has been passed all tests on submit repo >>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487). >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>> >>>>>>>>>>> This is nice move in general. >>>>>>>>>>> Thank you for working on this! >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr >>>>>>>>>>> == 0L) { // Java frame 98 Address rbp = >>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if >>>>>>>>>>> (rbp == null) { 100 return null; 101 } 102 return new >>>>>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native >>>>>>>>>>> frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new >>>>>>>>>>> DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 >>>>>>>>>>> Address rbp = >>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if >>>>>>>>>>> (rbp == null) { 110 return null; 111 } 112 return new >>>>>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 >>>>>>>>>>> dwarf.processDwarf(pc); 115 Address cfa = >>>>>>>>>>> ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 >>>>>>>>>>> !dwarf.isBPOffsetAvailable()) 117 ? 
>>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : >>>>>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) 119 >>>>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { >>>>>>>>>>> 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, >>>>>>>>>>> cfa, pc, dwarf); 124 } >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'd suggest to simplify the logic by refactoring to something >>>>>>>>>>> like below: >>>>>>>>>>> >>>>>>>>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc); >>>>>>>>>>> ?????????? Address cfa = >>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java >>>>>>>>>>> frame >>>>>>>>>>> ?????????? DwarfParser dwarf = null; >>>>>>>>>>> >>>>>>>>>>> ?????????? if (libptr != 0L) { // Native frame >>>>>>>>>>> ???????????? try { >>>>>>>>>>> ?????????????? dwarf = new DwarfParser(libptr); >>>>>>>>>>> ?????????????? dwarf.processDwarf(pc); >>>>>>>>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == >>>>>>>>>>> AMD64ThreadContext.RBP) && >>>>>>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>>>>>> ???????????????????????????????? ? >>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>>>>>> ???????????????????????????????? : >>>>>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>>>>>> >>>>>>>>>>> ??????????? } catch (DebuggerException e) { // bail out to >>>>>>>>>>> Java frame case >>>>>>>>>>> ??????????? } >>>>>>>>>>> ????????? } >>>>>>>>>>> ????????? if (cfa == null) { >>>>>>>>>>> ??????????? return null; >>>>>>>>>>> ????????? } >>>>>>>>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA() >>>>>>>>>>> >>>>>>>>>>> ?? 
Better to rename 'ofs' => 'offs'. >>>>>>>>>>> >>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- >>>>>>>>>>> nextDwarf.getBasePointerOffsetFromCFA()); >>>>>>>>>>> >>>>>>>>>>> ?? Extra space after '-' sign. >>>>>>>>>>> >>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, >>>>>>>>>>> ThreadContext context) { >>>>>>>>>>> >>>>>>>>>>> ?? It feels like the logic has to be somehow >>>>>>>>>>> refactored/simplified as >>>>>>>>>>> ?? several typical fragments appears in slightly different >>>>>>>>>>> contexts. >>>>>>>>>>> ?? But it is not easy to understand what it is. >>>>>>>>>>> ?? Could you, please, add some comments to key places >>>>>>>>>>> explaining this logic. >>>>>>>>>>> ?? Then I'll check if it is possible to make it a little bit >>>>>>>>>>> simpler. >>>>>>>>>>> >>>>>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 >>>>>>>>>>> Address nextCFA; 111 Address nextPC; 112 113 nextPC = >>>>>>>>>>> getNextPC(false); 114 if (nextPC == null) { 115 return null; >>>>>>>>>>> 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = >>>>>>>>>>> dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // >>>>>>>>>>> Native frame 121 try { 122 nextDwarf = new >>>>>>>>>>> DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 >>>>>>>>>>> nextCFA = getNextCFA(null, context); 125 return (nextCFA == >>>>>>>>>>> null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, >>>>>>>>>>> null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 >>>>>>>>>>> 130 nextCFA = getNextCFA(nextDwarf, context); 131 return >>>>>>>>>>> (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, >>>>>>>>>>> nextCFA, nextPC, nextDwarf); 133 } >>>>>>>>>>> >>>>>>>>>>> ??The above can be simplified if a DebuggerException can not >>>>>>>>>>> be thrown from processDwarf(nextPC): >>>>>>>>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>> ????????? 
return null; >>>>>>>>>>> ??????? } >>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>> >>>>>>>>>>> ??????? if (libptr != 0L) { // Native frame >>>>>>>>>>> ????????? try { >>>>>>>>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>> ??????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>> ????????? } catch (DebuggerException e) { // Bail out to Java >>>>>>>>>>> frame >>>>>>>>>>> ????????? } >>>>>>>>>>> ??????? } >>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>> ??????? return (nextCFA == null) ? null : new >>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>> ????? } >>>>>>>>>>> >>>>>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 >>>>>>>>>>> ThreadContext context = thread.getContext(); 137 138 if >>>>>>>>>>> (dwarf == null) { // Java frame 139 return >>>>>>>>>>> javaSender(context); 140 } 141 142 Address nextPC = >>>>>>>>>>> getNextPC(true); 143 if (nextPC == null) { 144 return null; >>>>>>>>>>> 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = >>>>>>>>>>> dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = >>>>>>>>>>> dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 >>>>>>>>>>> // Next frame might be Java frame 153 nextCFA = >>>>>>>>>>> getNextCFA(null, context); 154 return (nextCFA == null) ? >>>>>>>>>>> null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 >>>>>>>>>>> } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } >>>>>>>>>>> catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, >>>>>>>>>>> context); 160 return (nextCFA == null) ? null : new >>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 >>>>>>>>>>> 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = >>>>>>>>>>> getNextCFA(nextDwarf, context); 166 return (nextCFA == null) >>>>>>>>>>> ? 
null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, >>>>>>>>>>> nextDwarf); 167 } >>>>>>>>>>> >>>>>>>>>>> ??This one can be also simplified a little: >>>>>>>>>>> >>>>>>>>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>> >>>>>>>>>>> ??????? if (dwarf == null) { // Java frame >>>>>>>>>>> ????????? return javaSender(context); >>>>>>>>>>> ??????? } >>>>>>>>>>> ??????? Address nextPC = getNextPC(true); >>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>> ????????? return null; >>>>>>>>>>> ??????? } >>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>> ??????? if (!dwarf.isIn(nextPC)) { >>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>> ??????????? try { >>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to >>>>>>>>>>> Java frame >>>>>>>>>>> ??????????? } >>>>>>>>>>> ????????? } >>>>>>>>>>> ??????? } >>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>> ??????? return (nextCFA == null) ? null : new >>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>> ????? } >>>>>>>>>>> >>>>>>>>>>> Finally, it looks like just one method could replace both >>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext >>>>>>>>>>> context): >>>>>>>>>>> >>>>>>>>>>> ????? private CFrame commonSender(ThreadProxy thread) { >>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>> ????????? return null; >>>>>>>>>>> ??????? } >>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>> >>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) { >>>>>>>>>>> ????????? 
long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>> ??????????? try { >>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to >>>>>>>>>>> Java frame >>>>>>>>>>> ??????????? } >>>>>>>>>>> ????????? } >>>>>>>>>>> ??????? } >>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>> ??????? return (nextCFA == null) ? null : new >>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>> ????? } >>>>>>>>>>> >>>>>>>>>>> I'm still reviewing the dwarf parser files. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in >>>>>>>>>>>> serviceability/sa tests and >>>>>>>>>>>> all tests on submit repo >>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). >>>>>>>>>>>> Could you review new webrev? >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ >>>>>>>>>>>> >>>>>>>>>>>> The diff from previous webrev is here: >>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>> >>>>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>>>> ?? 
webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V
>>>>>>>>>>>>> Application Binary Interface AMD64
>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF
>>>>>>>>>>>>> in .eh_frame or .debug_frame
>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by
>>>>>>>>>>>>> default since GCC 4.6, so system
>>>>>>>>>>>>> libraries (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However `jhsdb jstack --mixed` does not do so; it uses the base
>>>>>>>>>>>>> pointer register (RBP).
>>>>>>>>>>>>> So some stack frames might be missing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>

From suenaga at oss.nttdata.com Wed Mar 11 06:03:59 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Mar 2020 15:03:59 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com> <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com> <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com> <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
Message-ID: <75c8f8ac-4557-03ab-77eb-f2383aa2b5f1@oss.nttdata.com>

Thanks David! Can you share the native backtrace?
(Did /opt/core.sh collect it?) Yasumasa On 2020/03/11 14:59, David Holmes wrote: > Hi Yasumasa, > > Partial hs_err info below. > > David > ----- > > # > # A fatal error has been detected by the Java Runtime Environment: > # > #? SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800 > # > # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-11-0447267.suenaga.source) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) > # Problematic frame: > # C? [libsaproc.so+0x487c]? DwarfParser::process_dwarf(unsigned long)+0x2c > # > # Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798) > # > # If you would like to submit a bug report, please visit: > #?? https://bugreport.java.com/bugreport/crash.jsp > # The crash happened outside the Java Virtual Machine in native code. > # See problematic frame for where to report the bug. > # > > ---------------? 
S U M M A R Y ------------ > > Command Line: > -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar > -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug -Xms8m -Djdk.module.main=jdk.hotspot.agent jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770 > > Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 0m 3s) > > ---------------? T H R E A D? --------------- > > Current thread (0x00007fdf5c032000):? JavaThread "main" [_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)] > > Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000],? sp=0x00007fdf63b9d190, free space=1020k > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > C? [libsaproc.so+0x487c]? DwarfParser::process_dwarf(unsigned long)+0x2c > j? 
sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal > v? ~StubRoutines::call_stub > V? [libjvm.so+0xc2291c]? JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac > V? [libjvm.so+0xd31970]? 
jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370 > V? [libjvm.so+0xd36202]? jni_CallStaticVoidMethod+0x222 > C? [libjli.so+0x4bed]? JavaMain+0xbcd > C? [libjli.so+0x80a9]? ThreadJavaMain+0x9 > > Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) > j? sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal > j? 
sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal > v? ~StubRoutines::call_stub > > siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79 > > Register to memory mapping: > > RAX=0x00007f7e4dfe3229 is an unknown value > RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 > RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62 > RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00 > RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000 > RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000 > RSI=0x0000000000000004 is an unknown value > RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 > R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00 > R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01 > R10=0x00000000ffffffff is an unknown value > R11=0x000000000100527a is an unknown value > R12=0x00007fded5076b79 is an unknown value > R13=0x00007f7da2f8e68a is an unknown value > R14=0x00007f7dbdf62b1d is an unknown value > R15=0x00007fdf5c032000 is a thread > > > Registers: > RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85 > RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080 > R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a > R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000 > RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004 > ? TRAPNO=0x000000000000000e > > Top of Stack: (sp=0x00007fdf63b9d190) > 0x00007fdf63b9d190:?? 
00007fdf209d0980 0000000000000000 > 0x00007fdf63b9d1a0:?? 00007fdf209d0980 00007fdf63b9d258 > 0x00007fdf63b9d1b0:?? 00007fdf63b9d228 00007fdf44778dbe > 0x00007fdf63b9d1c0:?? 000000000146c380 00007fdf5c032000 > > Instructions: (pc=0x00007fdf2000e87c) > 0x00007fdf2000e77c:?? 89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00 > 0x00007fdf2000e78c:?? 00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80 > 0x00007fdf2000e79c:?? 78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42 > 0x00007fdf2000e7ac:?? 01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48 > 0x00007fdf2000e7bc:?? 89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43 > 0x00007fdf2000e7cc:?? 3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43 > 0x00007fdf2000e7dc:?? 30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff > 0x00007fdf2000e7ec:?? ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e > 0x00007fdf2000e7fc:?? 41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2 > 0x00007fdf2000e80c:?? ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90 > 0x00007fdf2000e81c:?? 0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76 > 0x00007fdf2000e82c:?? 1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48 > 0x00007fdf2000e83c:?? 83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9 > 0x00007fdf2000e84c:?? 31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56 > 0x00007fdf2000e85c:?? 41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b > 0x00007fdf2000e86c:?? a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08 > 0x00007fdf2000e87c:?? 41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0 > 0x00007fdf2000e88c:?? 75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d > 0x00007fdf2000e89c:?? 8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc > 0x00007fdf2000e8ac:?? 48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03 > 0x00007fdf2000e8bc:?? b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c > 0x00007fdf2000e8cc:?? 39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8 > 0x00007fdf2000e8dc:?? 90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01 > 0x00007fdf2000e8ec:?? 
0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48 > 0x00007fdf2000e8fc:?? 83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48 > 0x00007fdf2000e90c:?? 09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea > 0x00007fdf2000e91c:?? 4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b > 0x00007fdf2000e92c:?? 31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00 > 0x00007fdf2000e93c:?? 00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00 > 0x00007fdf2000e94c:?? 00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff > 0x00007fdf2000e95c:?? 48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89 > 0x00007fdf2000e96c:?? 8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff > > > On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote: >> Hi Kevin, >> >> I saw 2 errors on submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475). >> So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448). >> >> ?? Last change on submit repo is here: >> ???? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/ >> >> Can you share details on submit repo? >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/03/11 11:07, Yasumasa Suenaga wrote: >>> Hi Kevin, >>> >>> I guess first program header in the libraries which are on your machine has exec flag (you can check it with `readelf -l`). >>> So I tweaked my patch (initial value of exec_start and exec_end set to -1) in new webrev. >>> >>> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/ >>> >>> This webrev contains the fix for your comment (typo and DW_CFA_advance_loc4). >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/03/11 8:53, Kevin Walls wrote: >>>> Hi - >>>> >>>> In testing I wasn't seeing any of the Dwarf code triggered. >>>> >>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries... >>>> >>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c >>>> >>>> ??? if (fill_instr_info(newlib)) { >>>> ????? 
if (!read_eh_frame(ph, newlib)) {
>>>>
>>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>>
>>>> output like:
>>>>
>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>>> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>>
>>>> (similar for all libraries).
>>>>
>>>> fill_instr_info fails if:
>>>>
>>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>>
>>>> ...but isn't exec_start relative to the library address? It's the value of ph->p_vaddr and it is often zero.
>>>>
>>>> I added some booleans and did:
>>>>
>>>>       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>>         lib->exec_start = ph->p_vaddr;
>>>>         found_start = true;
>>>>       }
>>>>
>>>> (similarly for end) and only failed if:
>>>>
>>>>   if (!found_start || !found_end) {
>>>>     return false;
>>>>
>>>> ...and now it's better. I go from:
>>>>
>>>> ----------------- 3306 -----------------
>>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>
>>>> to:
>>>>
>>>> ----------------- 31127 -----------------
>>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>>> 0x000055af1b78db1c      main + 0x11c
>>>>
>>>>
>>>> Thanks
>>>> Kevin
>>>>
>>>>
>>>>
>>>>
>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>
>>>>> Hi Kevin,
>>>>>
>>>>> Thanks for your comment!
>>>>>
>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> The changes build OK for me in the latest jdk, and things still work.
>>>>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
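The exec_start check discussed in this thread treats a zero address as "not found", although a zero p_vaddr is a legitimate start for an executable segment. The following is a minimal sketch of the alternative of tracking that with an explicit flag. It is not the code from the webrev: sketch_phdr, sketch_range, SKETCH_PF_X and sketch_find_exec_range are hypothetical, simplified stand-ins for the ELF program header and the lib_info fields mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ELF program header. */
typedef struct {
  uint32_t flags;  /* permission bits; SKETCH_PF_X marks an executable segment */
  uint64_t vaddr;  /* segment start, relative to the library load address */
  uint64_t memsz;  /* segment size in memory */
} sketch_phdr;

#define SKETCH_PF_X 0x1u

/* Hypothetical stand-in for the exec_start / exec_end pair. */
typedef struct {
  uint64_t exec_start;
  uint64_t exec_end;
} sketch_range;

/*
 * Computes the lowest start and highest end over all executable segments.
 * A separate "found" flag is used instead of comparing against 0, so a
 * segment whose vaddr is 0 is still accepted.
 * Returns false only if no executable segment exists; *out is then untouched.
 */
static bool sketch_find_exec_range(const sketch_phdr *phdrs, size_t count,
                                   sketch_range *out) {
  assert(out != NULL);
  bool found = false;
  for (size_t i = 0; i < count; i++) {
    if ((phdrs[i].flags & SKETCH_PF_X) == 0) {
      continue;  /* not executable, ignore */
    }
    uint64_t start = phdrs[i].vaddr;
    uint64_t end = phdrs[i].vaddr + phdrs[i].memsz;
    if (!found || start < out->exec_start) {
      out->exec_start = start;
    }
    if (!found || end > out->exec_end) {
      out->exec_end = end;
    }
    found = true;
  }
  return found;
}
```

With the values from the debug output quoted above (vaddr = 0x0, memsz = 0xaba4), this sketch reports a valid range starting at 0 rather than failing. Initialising the fields to -1, as the later webrev.05 message describes, is another way to avoid using 0 as the sentinel.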
>>>>>>
>>>>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.
>>>>>
>>>>> You can see the problem with JShell.
>>>>> Some Java frames would not be seen in mixed jstack.
>>>>>
>>>>>
>>>>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
>>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>
>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>> (It may "never" happen, but a nop could appear within some other instructions?)
>>>>>
>>>>> DW_CFA_nop is used for padding, so we can ignore it (return immediately).
>>>>>
>>>>>
>>>>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".
>>>>>
>>>>> I will fix it.
>>>>>
>>>>>
>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts; that would mean a lot of code on one line. 8-)
>>>>>> So maybe it's never used in practice. If you think it's unnecessary, no problem; maybe add a comment, or add it for robustness.
>>>>>
>>>>> I will add DW_CFA_advance_loc4.
>>>>>
>>>>>
>>>>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>>>>>>
>>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
>>>>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>>>>>>
>>>>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.
>>>>>
>>>>> I looked at the GDB and binutils sources when creating this patch.
>>>>> They seem to process it in a similar way, because we need to evaluate DWARF instructions one by one to get the value which relates to the specified PC.
>>>>>
>>>>>
>>>>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
>>>>>> OS-neutral directory? It's odd to label it as Linux-specific.
>>>>>
>>>>> Windows at least does not use DWARF; it uses another feature.
>>>>>
>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>> If DWARF is used on them, I can move the DWARF-related code to the posix directory.
>>>>>
>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>>>>>
>>>>>>
>>>>>> These are just comments to mainly say it looks good, and somebody else out there has read it.
>>>>>> I will look for a system that shows the problem, and get back to you again!
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> Many thanks
>>>>>> Kevin
>>>>>>
>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> webrev.03 cannot be applied to current jdk/jdk due to the 8239224 and 8239462 changes (they updated the copyright year).
>>>>>>> So I modified the webrev (only copyright year changes) so that it applies to current jdk/jdk.
>>>>>>> Could you review it?
>>>>>>>
>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>>
>>>>>>> I need one more reviewer to push.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>>> PING: Could you review it?
>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>
>>>>>>>> This change has already been reviewed by Serguei.
>>>>>>>> I need one more reviewer to push.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>>> PING: Could you review this change?
>>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>
>>>>>>>>> I believe this change helps troubleshooters with postmortem analysis.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>
>>>>>>>>>> I updated webrev. I discussed it with Serguei off-list, and I refactored webrev.02.
>>>>>>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your comment!
>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry said.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> This is nice move in general. >>>>>>>>>>>> Thank you for working on this! >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>>>>>>>>> >>>>>>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 } >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I'd suggest to simplify the logic by refactoring to something like below: >>>>>>>>>>>> >>>>>>>>>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc); >>>>>>>>>>>> ?????????? Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame >>>>>>>>>>>> ?????????? DwarfParser dwarf = null; >>>>>>>>>>>> >>>>>>>>>>>> ?????????? if (libptr != 0L) { // Native frame >>>>>>>>>>>> ???????????? try { >>>>>>>>>>>> ?????????????? 
dwarf = new DwarfParser(libptr); >>>>>>>>>>>> ?????????????? dwarf.processDwarf(pc); >>>>>>>>>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && >>>>>>>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>>>>>>> ???????????????????????????????? ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>>>>>>> ???????????????????????????????? : context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>>>>>>> >>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // bail out to Java frame case >>>>>>>>>>>> ??????????? } >>>>>>>>>>>> ????????? } >>>>>>>>>>>> ????????? if (cfa == null) { >>>>>>>>>>>> ??????????? return null; >>>>>>>>>>>> ????????? } >>>>>>>>>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>>>>>> >>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA() >>>>>>>>>>>> >>>>>>>>>>>> ?? Better to rename 'ofs' => 'offs'. >>>>>>>>>>>> >>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA()); >>>>>>>>>>>> >>>>>>>>>>>> ?? Extra space after '-' sign. >>>>>>>>>>>> >>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) { >>>>>>>>>>>> >>>>>>>>>>>> ?? It feels like the logic has to be somehow refactored/simplified as >>>>>>>>>>>> ?? several typical fragments appears in slightly different contexts. >>>>>>>>>>>> ?? But it is not easy to understand what it is. >>>>>>>>>>>> ?? Could you, please, add some comments to key places explaining this logic. >>>>>>>>>>>> ?? Then I'll check if it is possible to make it a little bit simpler. 
>>>>>>>>>>>> >>>>>>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 } >>>>>>>>>>>> >>>>>>>>>>>> ??The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC): >>>>>>>>>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>> ????????? return null; >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>> >>>>>>>>>>>> ??????? if (libptr != 0L) { // Native frame >>>>>>>>>>>> ????????? try { >>>>>>>>>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>> ??????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>>> ????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>> ????????? } >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>> ????? 
} >>>>>>>>>>>> >>>>>>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? null : new >>>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 167 } >>>>>>>>>>>> >>>>>>>>>>>> ??This one can be also simplified a little: >>>>>>>>>>>> >>>>>>>>>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>>> >>>>>>>>>>>> ??????? if (dwarf == null) { // Java frame >>>>>>>>>>>> ????????? return javaSender(context); >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? Address nextPC = getNextPC(true); >>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>> ????????? return null; >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>> ??????? if (!dwarf.isIn(nextPC)) { >>>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>>> ??????????? try { >>>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>>> ??????????? 
} catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>> ??????????? } >>>>>>>>>>>> ????????? } >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>> ????? } >>>>>>>>>>>> >>>>>>>>>>>> Finally, it looks like just one method could replace both >>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context): >>>>>>>>>>>> >>>>>>>>>>>> ????? private CFrame commonSender(ThreadProxy thread) { >>>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>> ????????? return null; >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>> >>>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) { >>>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>>> ??????????? try { >>>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>> ??????????? } >>>>>>>>>>>> ????????? } >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>> ????? } >>>>>>>>>>>> >>>>>>>>>>>> I'm still reviewing the dwarf parser files. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java . 
It works fine in serviceability/sa tests and
>>>>>>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>> Could you review the new webrev?
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>>>>>>>>>>> libraries (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not do so; it uses the base pointer register (RBP).
>>>>>>>>>>>>>> So some stack frames might be missing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>

From david.holmes at oracle.com Wed Mar 11 06:20:52 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 11 Mar 2020 16:20:52 +1000
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <75c8f8ac-4557-03ab-77eb-f2383aa2b5f1@oss.nttdata.com>
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com> <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com> <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com> <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com> <75c8f8ac-4557-03ab-77eb-f2383aa2b5f1@oss.nttdata.com>
Message-ID: <89d854f1-6c28-3976-ba3f-33e3e8cb6012@oracle.com>

On 11/03/2020 4:03 pm, Yasumasa Suenaga wrote:
> Thanks David!
>
> Can you share native backtrace?
> (Did /opt/core.sh collect it?)

There is a core file but I can't process it, sorry.

David
-----

>
> Yasumasa
>
>
> On 2020/03/11 14:59, David Holmes wrote:
>> Hi Yasumasa,
>>
>> Partial hs_err info below.
>>
>> David
>> -----
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800
>> #
>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build
>> 15-internal+0-2020-03-11-0447267.suenaga.source)
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
>> 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing,
>> tiered, compressed oops, g1 gc, linux-amd64)
>> # Problematic frame:
>> # C  [libsaproc.so+0x487c]
DwarfParser::process_dwarf(unsigned >> long)+0x2c >> # >> # Core dump will be written. Default location: Core dumps may be >> processed with "/opt/core.sh %p" (or dumping to >> /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798) >> >> # >> # If you would like to submit a bug report, please visit: >> #?? https://bugreport.java.com/bugreport/crash.jsp >> # The crash happened outside the Java Virtual Machine in native code. >> # See problematic frame for where to report the bug. >> # >> >> ---------------? S U M M A R Y ------------ >> >> Command Line: >> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar >> 
-Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug >> -Xms8m -Djdk.module.main=jdk.hotspot.agent >> jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770 >> >> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d >> 0h 0m 3s) >> >> ---------------? T H R E A D? --------------- >> >> Current thread (0x00007fdf5c032000):? JavaThread "main" >> [_thread_in_native, id=29800, >> stack(0x00007fdf63a9e000,0x00007fdf63b9f000)] >> >> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000], >> sp=0x00007fdf63b9d190, free space=1020k >> Native frames: (J=compiled Java code, A=aot compiled Java code, >> j=interpreted, Vv=VM code, C=native code) >> C? [libsaproc.so+0x487c]? DwarfParser::process_dwarf(unsigned long)+0x2c >> j >> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal >> j? 
sun.jvm.hotspot.tools.Tool.startInternal()V+87 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 >> jdk.hotspot.agent at 15-internal >> v? ~StubRoutines::call_stub >> V? [libjvm.so+0xc2291c]? JavaCalls::call_helper(JavaValue*, >> methodHandle const&, JavaCallArguments*, Thread*)+0x6ac >> V? [libjvm.so+0xd31970]? jni_invoke_static(JNIEnv_*, JavaValue*, >> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) >> [clone .isra.140] [clone .constprop.263]+0x370 >> V? [libjvm.so+0xd36202]? jni_CallStaticVoidMethod+0x222 >> C? [libjli.so+0x4bed]? JavaMain+0xbcd >> C? [libjli.so+0x80a9]? 
ThreadJavaMain+0x9 >> >> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) >> j >> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 >> jdk.hotspot.agent at 15-internal >> v? 
~StubRoutines::call_stub >> >> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: >> 0x00007fded5076b79 >> >> Register to memory mapping: >> >> RAX=0x00007f7e4dfe3229 is an unknown value >> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 >> d4 de 7f 00 00 >> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 >> 72 2f 6c 69 62 >> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00 >> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: >> 0x00007fdf5c032000 >> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: >> 0x00007fdf5c032000 >> RSI=0x0000000000000004 is an unknown value >> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 >> d4 de 7f 00 00 >> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 >> 00 00 00 00 00 >> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 >> 01 78 10 01 >> R10=0x00000000ffffffff is an unknown value >> R11=0x000000000100527a is an unknown value >> R12=0x00007fded5076b79 is an unknown value >> R13=0x00007f7da2f8e68a is an unknown value >> R14=0x00007f7dbdf62b1d is an unknown value >> R15=0x00007fdf5c032000 is a thread >> >> >> Registers: >> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, >> RCX=0x00007fded4072380, RDX=0x00007fded4076b85 >> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, >> RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080 >> R8 =0x000000000146c380, R9 =0x00007fded4076b79, >> R10=0x00000000ffffffff, R11=0x000000000100527a >> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, >> R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000 >> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, >> CSGSFS=0x002b000000000033, ERR=0x0000000000000004 >> ?? TRAPNO=0x000000000000000e >> >> Top of Stack: (sp=0x00007fdf63b9d190) >> 0x00007fdf63b9d190:?? 00007fdf209d0980 0000000000000000 >> 0x00007fdf63b9d1a0:?? 00007fdf209d0980 00007fdf63b9d258 >> 0x00007fdf63b9d1b0:?? 
00007fdf63b9d228 00007fdf44778dbe >> 0x00007fdf63b9d1c0:?? 000000000146c380 00007fdf5c032000 >> >> Instructions: (pc=0x00007fdf2000e87c) >> 0x00007fdf2000e77c:?? 89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00 >> 0x00007fdf2000e78c:?? 00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80 >> 0x00007fdf2000e79c:?? 78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42 >> 0x00007fdf2000e7ac:?? 01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48 >> 0x00007fdf2000e7bc:?? 89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43 >> 0x00007fdf2000e7cc:?? 3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43 >> 0x00007fdf2000e7dc:?? 30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff >> 0x00007fdf2000e7ec:?? ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e >> 0x00007fdf2000e7fc:?? 41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2 >> 0x00007fdf2000e80c:?? ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90 >> 0x00007fdf2000e81c:?? 0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76 >> 0x00007fdf2000e82c:?? 1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48 >> 0x00007fdf2000e83c:?? 83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9 >> 0x00007fdf2000e84c:?? 31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56 >> 0x00007fdf2000e85c:?? 41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b >> 0x00007fdf2000e86c:?? a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08 >> 0x00007fdf2000e87c:?? 41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0 >> 0x00007fdf2000e88c:?? 75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d >> 0x00007fdf2000e89c:?? 8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc >> 0x00007fdf2000e8ac:?? 48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03 >> 0x00007fdf2000e8bc:?? b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c >> 0x00007fdf2000e8cc:?? 39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8 >> 0x00007fdf2000e8dc:?? 90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01 >> 0x00007fdf2000e8ec:?? 0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48 >> 0x00007fdf2000e8fc:?? 83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48 >> 0x00007fdf2000e90c:?? 
09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea >> 0x00007fdf2000e91c:?? 4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b >> 0x00007fdf2000e92c:?? 31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00 >> 0x00007fdf2000e93c:?? 00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00 >> 0x00007fdf2000e94c:?? 00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff >> 0x00007fdf2000e95c:?? 48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89 >> 0x00007fdf2000e96c:?? 8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff >> >> >> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote: >>> Hi Kevin, >>> >>> I saw 2 errors on submit repo >>> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475). >>> So I tweaked my patch, but I saw the crash again >>> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448). >>> >>> ?? Last change on submit repo is here: >>> ???? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/ >>> >>> Can you share details on submit repo? >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/03/11 11:07, Yasumasa Suenaga wrote: >>>> Hi Kevin, >>>> >>>> I guess first program header in the libraries which are on your >>>> machine has exec flag (you can check it with `readelf -l`). >>>> So I tweaked my patch (initial value of exec_start and exec_end set >>>> to -1) in new webrev. >>>> >>>> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/ >>>> >>>> This webrev contains the fix for your comment (typo and >>>> DW_CFA_advance_loc4). >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/03/11 8:53, Kevin Walls wrote: >>>>> Hi - >>>>> >>>>> In testing I wasn't seeing any of the Dwarf code triggered. >>>>> >>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable >>>>> section in" for lots of / maybe all the libraries... >>>>> >>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c >>>>> >>>>> ??? if (fill_instr_info(newlib)) { >>>>> ????? 
if (!read_eh_frame(ph, newlib)) { >>>>> >>>>> fill_instr_info is failing, and we never get to read_eh_frame(). >>>>> >>>>> output like: >>>>> >>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4 >>>>> libsaproc DEBUG: Could not find executable section in >>>>> /lib/x86_64-linux-gnu/libnss_nis-2.27.so >>>>> >>>>> (similar for all libraries). >>>>> >>>>> fill_instr fails if: >>>>> >>>>> ??if ((lib->exec_start == 0L) || (lib->exec_end == 0L)) >>>>> >>>>> ...but isn't exec_start relative to the library address? It's the >>>>> value of ph->vaddr and it is often zero. >>>>> >>>>> I added some booleans and did: >>>>> >>>>> 185?????? if ((lib->exec_start == 0L) || (lib->exec_start > >>>>> ph->p_vaddr)) { >>>>> 186???????? lib->exec_start = ph->p_vaddr; >>>>> 187???????? found_start =true; >>>>> 188?????? } >>>>> >>>>> (similarly for end) and only failed if: >>>>> >>>>> 201?? if (!found_start || !found_end) { >>>>> 202???? return false; >>>>> >>>>> ...and now it's better. ? I go from: >>>>> >>>>> ----------------- 3306 ----------------- >>>>> 0x00007f75824acd2d????? __GI___pthread_timedjoin_ex + 0x17d >>>>> >>>>> to: >>>>> >>>>> ----------------- 31127 ----------------- >>>>> 0x00007fa284d78d2d????? __GI___pthread_timedjoin_ex + 0x17d >>>>> 0x00007fa2857aaf2d????? CallJavaMainInNewThread + 0xad >>>>> 0x00007fa2857a74ed????? ContinueInNewThread + 0x4d >>>>> 0x00007fa2857a8c49????? JLI_Launch + 0x1529 >>>>> 0x000055af1b78db1c????? main + 0x11c >>>>> >>>>> >>>>> Thanks >>>>> Kevin >>>>> >>>>> >>>>> >>>>> >>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote: >>>>> >>>>>> Hi Kevin, >>>>>> >>>>>> Thanks for your comment! >>>>>> >>>>>> On 2020/03/10 18:58, Kevin Walls wrote: >>>>>>> Hi Yasumasa , >>>>>>> >>>>>>> The changes build OK for me in the latest jdk, and things still >>>>>>> work. >>>>>>> I have not yet seen the dwarf usage in action: I've tried a >>>>>>> couple of different systems and so far have not reproduced the >>>>>>> problem, i.e. 
jstack has not failed on native frames.
>>>>>>>
>>>>>>> I may need more recent basic libraries, will look again for
>>>>>>> somewhere where the problem happens and get back to you as I
>>>>>>> really want to run the changes.
>>>>>>
>>>>>> You can see the problem with JShell.
>>>>>> Some Java frames would not be seen in mixed jstack.
>>>>>>
>>>>>>
>>>>>>> I have mostly minor other comments which don't need a new webrev,
>>>>>>> some just comments for the future:
>>>>>>>
>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>>
>>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>>> (It may "never" happen, but a nop could appear within some other
>>>>>>> instructions?)
>>>>>>
>>>>>> DW_CFA_nop is used for padding, so we can ignore (return
>>>>>> immediately) it.
>>>>>>
>>>>>>
>>>>>>> DW_CFA_remember_state: a minor typo in the comment,
>>>>>>> "DW_CFA_remenber_state".
>>>>>>
>>>>>> I will fix it.
>>>>>>
>>>>>>
>>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not
>>>>>>> DW_CFA_advance_loc4. I thought that was odd, but maybe addresses
>>>>>>> in these tables never increase by 4-byte amounts, would this mean
>>>>>>> a lot of code on one line. 8-)
>>>>>>> So maybe it's never used in practice, if you think it's
>>>>>>> unnecessary no problem, maybe a comment, or add it for robustness.
>>>>>>
>>>>>> I will add DW_CFA_advance_loc4.
>>>>>>
>>>>>>
>>>>>>> General-purpose methods like read_leb128(), get_entry_length(),
>>>>>>> get_decoded_value() specifically update the _buf pointer in this
>>>>>>> DwarfParser.
>>>>>>>
>>>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>>>> It calls process_cie() which reads, moves _buf and restores it to
>>>>>>> the original position, then we read augmentation_length from
>>>>>>> where _buf is.
>>>>>>> I'm not sure if that's wrong, or if I just need to read again
>>>>>>> about the CIE/etc layout.
>>>>>>>
>>>>>>> I don't really want to suggest making the code pass around a
>>>>>>> current _buf for the invocation of these general purpose methods,
>>>>>>> but just wanted to comment that if these get used more widely
>>>>>>> that might become necessary.
>>>>>>
>>>>>> I referred to the GDB and binutils sources when creating this patch.
>>>>>> They seem to use similar code, because we need to evaluate the
>>>>>> DWARF instructions one by one to get the values that relate to the
>>>>>> specified PC.
>>>>>>
>>>>>>
>>>>>>> Similarly in future, if this DWARF support code became used more
>>>>>>> widely, it might want to move to an
>>>>>>> OS-neutral directory? It's odd to label it as Linux-specific.
>>>>>>
>>>>>> At least Windows does not use DWARF; it uses another feature.
>>>>>>
>>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>>>
>>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>>> If they do, I can move the DWARF-related code to a posix
>>>>>> directory.
>>>>>>
>>>>>>
>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>>> Thanks for changing "can_parsable" which was in the earlier
>>>>>>> version. 8-)
>>>>>>>
>>>>>>>
>>>>>>> These are just comments to mainly say it looks good, and somebody
>>>>>>> else out there has read it.
>>>>>>> I will look for a system that shows the problem, and get back to
>>>>>>> you again!
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> Many thanks
>>>>>>> Kevin
>>>>>>>
>>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224
>>>>>>>> and 8239462 changes (they updated copyright year).
>>>>>>>> So I modified webrev (only copyright year changes) to be able to
>>>>>>>> apply to current jdk/jdk.
>>>>>>>> Could you review it?
>>>>>>>>
>>>>>>>>
http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/ >>>>>>>> >>>>>>>> I need one more reviewer to push. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote: >>>>>>>>> PING: Could you review it? >>>>>>>>> >>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>> ?? webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>>>> >>>>>>>>> This change has been already reviewed by Serguei. >>>>>>>>> I need one more reviewer to push. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote: >>>>>>>>>> PING: Could you reveiw this change? >>>>>>>>>> >>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>> ?? webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>>>>> >>>>>>>>>> I believe this change helps troubleshooter to fight to >>>>>>>>>> postmortem analysis. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote: >>>>>>>>>>> PING: Could you review it? >>>>>>>>>>> >>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>> ?? webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>>>>>> >>>>>>>>>>> I updated webrev. I discussed with Serguei in off list, and I >>>>>>>>>>> refactored webrev.02 . >>>>>>>>>>> It has passed tests on submit repo >>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for your comment! >>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev. 
>>>>>>>>>>>> Also I fixed libproc_impl.c to free lib->eh_frame.data, as
>>>>>>>>>>>> Dmitry said.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>> This change has passed all tests on submit repo
>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is a nice move in general.
>>>>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  96 long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>  97 if (libptr == 0L) { // Java frame
>>>>>>>>>>>>>  98   Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>>  99   if (rbp == null) {
>>>>>>>>>>>>> 100     return null;
>>>>>>>>>>>>> 101   }
>>>>>>>>>>>>> 102   return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>> 103 } else { // Native frame
>>>>>>>>>>>>> 104   DwarfParser dwarf;
>>>>>>>>>>>>> 105   try {
>>>>>>>>>>>>> 106     dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>> 107   } catch (DebuggerException e) {
>>>>>>>>>>>>> 108     Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>> 109     if (rbp == null) {
>>>>>>>>>>>>> 110       return null;
>>>>>>>>>>>>> 111     }
>>>>>>>>>>>>> 112     return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>> 113   }
>>>>>>>>>>>>> 114   dwarf.processDwarf(pc);
>>>>>>>>>>>>> 115   Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>> 116                  !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>> 117     ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>> 118     : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>> 119              .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>> 120   if (cfa == null) {
>>>>>>>>>>>>> 121     return null;
>>>>>>>>>>>>> 122   }
>>>>>>>>>>>>> 123   return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>> 124 }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd suggest simplifying the logic by refactoring to
>>>>>>>>>>>>> something like below:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>>>>>     DwarfParser dwarf = null;
>>>>>>>>>>>>>
>>>>>>>>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>       try {
>>>>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>         dwarf.processDwarf(pc);
>>>>>>>>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>>             ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>>             : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>>                      .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     if (cfa == null) {
>>>>>>>>>>>>>       return null;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>>>>
>>>>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>>>>
>>>>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>    It feels like the logic has to be somehow
>>>>>>>>>>>>>    refactored/simplified as
>>>>>>>>>>>>>    several typical fragments appear in slightly different
>>>>>>>>>>>>>    contexts.
>>>>>>>>>>>>>    But it is not easy to understand what it is.
>>>>>>>>>>>>>    Could you, please, add some comments to key places
>>>>>>>>>>>>>    explaining this logic.
>>>>>>>>>>>>>    Then I'll check if it is possible to make it a little
>>>>>>>>>>>>>    bit simpler.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 109 private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>> 110   Address nextCFA;
>>>>>>>>>>>>> 111   Address nextPC;
>>>>>>>>>>>>> 112
>>>>>>>>>>>>> 113   nextPC = getNextPC(false);
>>>>>>>>>>>>> 114   if (nextPC == null) {
>>>>>>>>>>>>> 115     return null;
>>>>>>>>>>>>> 116   }
>>>>>>>>>>>>> 117
>>>>>>>>>>>>> 118   DwarfParser nextDwarf = null;
>>>>>>>>>>>>> 119   long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>> 120   if (libptr != 0L) { // Native frame
>>>>>>>>>>>>> 121     try {
>>>>>>>>>>>>> 122       nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>> 123     } catch (DebuggerException e) {
>>>>>>>>>>>>> 124       nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>> 125       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>> 126     }
>>>>>>>>>>>>> 127     nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>> 128   }
>>>>>>>>>>>>> 129
>>>>>>>>>>>>> 130   nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>> 131   return (nextCFA == null) ? null
>>>>>>>>>>>>> 132                            : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>> 133 }
>>>>>>>>>>>>>
>>>>>>>>>>>>>   The above can be simplified if a DebuggerException can
>>>>>>>>>>>>>   not be thrown from processDwarf(nextPC):
>>>>>>>>>>>>>
>>>>>>>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>>>
>>>>>>>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>           nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>         } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> 135 public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>> 136   ThreadContext context = thread.getContext();
>>>>>>>>>>>>> 137
>>>>>>>>>>>>> 138   if (dwarf == null) { // Java frame
>>>>>>>>>>>>> 139     return javaSender(context);
>>>>>>>>>>>>> 140   }
>>>>>>>>>>>>> 141
>>>>>>>>>>>>> 142   Address nextPC = getNextPC(true);
>>>>>>>>>>>>> 143   if (nextPC == null) {
>>>>>>>>>>>>> 144     return null;
>>>>>>>>>>>>> 145   }
>>>>>>>>>>>>> 146
>>>>>>>>>>>>> 147   Address nextCFA;
>>>>>>>>>>>>> 148   DwarfParser nextDwarf = dwarf;
>>>>>>>>>>>>> 149   if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>> 150     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>> 151     if (libptr == 0L) {
>>>>>>>>>>>>> 152       // Next frame might be Java frame
>>>>>>>>>>>>> 153       nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>> 154       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>> 155     }
>>>>>>>>>>>>> 156     try {
>>>>>>>>>>>>> 157       nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>> 158     } catch (DebuggerException e) {
>>>>>>>>>>>>> 159       nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>> 160       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>> 161     }
>>>>>>>>>>>>> 162   }
>>>>>>>>>>>>> 163
>>>>>>>>>>>>> 164   nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>> 165   nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>> 166   return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>> 167 }
>>>>>>>>>>>>>
>>>>>>>>>>>>>   This one can also be simplified a little:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>>>>
>>>>>>>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>>>>>>>         return javaSender(context);
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>>>>           try {
>>>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>           }
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext
>>>>>>>>>>>>> context):
>>>>>>>>>>>>>
>>>>>>>>>>>>>     private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>>>
>>>>>>>>>>>>>       if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>>>>           try {
>>>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>           }
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java. It works fine in
>>>>>>>>>>>>>> serviceability/sa tests and
>>>>>>>>>>>>>> all tests on submit repo
>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>>> Could you review new webrev?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>>   webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V
>>>>>>>>>>>>>>> Application Binary Interface AMD64
>>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use
>>>>>>>>>>>>>>> DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by
>>>>>>>>>>>>>>> default since GCC 4.6, so system
>>>>>>>>>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses
>>>>>>>>>>>>>>> base pointer register (RBP).
>>>>>>>>>>>>>>> So some stack frames might be missing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>

From suenaga at oss.nttdata.com Wed Mar 11 06:48:26 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Mar 2020 15:48:26 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <89d854f1-6c28-3976-ba3f-33e3e8cb6012@oracle.com>
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com> <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com> <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com> <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com> <75c8f8ac-4557-03ab-77eb-f2383aa2b5f1@oss.nttdata.com> <89d854f1-6c28-3976-ba3f-33e3e8cb6012@oracle.com>
Message-ID:

On 2020/03/11 15:20, David Holmes wrote:
> On 11/03/2020 4:03 pm, Yasumasa Suenaga wrote:
>> Thanks David!
>>
>> Can you share native backtrace?
>> (Did /opt/core.sh collect it?)
>
> There is a core file but I can't process it, sorry.

Can you share the entire hs_err log and libsaproc.so from this test?
I cannot reproduce the crash on my laptop.


Yasumasa


> David
> -----
>
>>
>> Yasumasa
>>
>>
>> On 2020/03/11 14:59, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> Partial hs_err info below.
>>>
>>> David
>>> -----
>>>
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> #
SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800 >>> # >>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-11-0447267.suenaga.source) >>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>> # Problematic frame: >>> # C? [libsaproc.so+0x487c]? DwarfParser::process_dwarf(unsigned long)+0x2c >>> # >>> # Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798) >>> # >>> # If you would like to submit a bug report, please visit: >>> #?? https://bugreport.java.com/bugreport/crash.jsp >>> # The crash happened outside the Java Virtual Machine in native code. >>> # See problematic frame for where to report the bug. >>> # >>> >>> ---------------? 
S U M M A R Y ------------ >>> >>> Command Line: >>> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar >>> -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug -Xms8m -Djdk.module.main=jdk.hotspot.agent jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770 >>> >>> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 0m 3s) >>> >>> ---------------? T H R E A D? --------------- >>> >>> Current thread (0x00007fdf5c032000):? JavaThread "main" [_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)] >>> >>> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000], sp=0x00007fdf63b9d190, free space=1020k >>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >>> C? [libsaproc.so+0x487c]? 
DwarfParser::process_dwarf(unsigned long)+0x2c >>> j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal >>> v? ~StubRoutines::call_stub >>> V? [libjvm.so+0xc2291c]? 
JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac >>> V? [libjvm.so+0xd31970]? jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370 >>> V? [libjvm.so+0xd36202]? jni_CallStaticVoidMethod+0x222 >>> C? [libjli.so+0x4bed]? JavaMain+0xbcd >>> C? [libjli.so+0x80a9]? ThreadJavaMain+0x9 >>> >>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) >>> j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal >>> j? 
sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal >>> v? ~StubRoutines::call_stub >>> >>> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79 >>> >>> Register to memory mapping: >>> >>> RAX=0x00007f7e4dfe3229 is an unknown value >>> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 >>> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62 >>> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00 >>> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000 >>> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000 >>> RSI=0x0000000000000004 is an unknown value >>> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 >>> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00 >>> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01 >>> R10=0x00000000ffffffff is an unknown value >>> R11=0x000000000100527a is an unknown value >>> R12=0x00007fded5076b79 is an unknown value >>> R13=0x00007f7da2f8e68a is an unknown value >>> R14=0x00007f7dbdf62b1d is an unknown value >>> R15=0x00007fdf5c032000 is a thread >>> >>> >>> Registers: >>> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85 >>> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080 >>> R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a >>> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, 
R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000 >>> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004 >>> ?? TRAPNO=0x000000000000000e >>> >>> Top of Stack: (sp=0x00007fdf63b9d190) >>> 0x00007fdf63b9d190:?? 00007fdf209d0980 0000000000000000 >>> 0x00007fdf63b9d1a0:?? 00007fdf209d0980 00007fdf63b9d258 >>> 0x00007fdf63b9d1b0:?? 00007fdf63b9d228 00007fdf44778dbe >>> 0x00007fdf63b9d1c0:?? 000000000146c380 00007fdf5c032000 >>> >>> Instructions: (pc=0x00007fdf2000e87c) >>> 0x00007fdf2000e77c:?? 89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00 >>> 0x00007fdf2000e78c:?? 00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80 >>> 0x00007fdf2000e79c:?? 78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42 >>> 0x00007fdf2000e7ac:?? 01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48 >>> 0x00007fdf2000e7bc:?? 89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43 >>> 0x00007fdf2000e7cc:?? 3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43 >>> 0x00007fdf2000e7dc:?? 30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff >>> 0x00007fdf2000e7ec:?? ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e >>> 0x00007fdf2000e7fc:?? 41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2 >>> 0x00007fdf2000e80c:?? ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90 >>> 0x00007fdf2000e81c:?? 0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76 >>> 0x00007fdf2000e82c:?? 1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48 >>> 0x00007fdf2000e83c:?? 83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9 >>> 0x00007fdf2000e84c:?? 31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56 >>> 0x00007fdf2000e85c:?? 41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b >>> 0x00007fdf2000e86c:?? a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08 >>> 0x00007fdf2000e87c:?? 41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0 >>> 0x00007fdf2000e88c:?? 75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d >>> 0x00007fdf2000e89c:?? 8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc >>> 0x00007fdf2000e8ac:?? 
48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03
>>> 0x00007fdf2000e8bc:   b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c
>>> 0x00007fdf2000e8cc:   39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8
>>> 0x00007fdf2000e8dc:   90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01
>>> 0x00007fdf2000e8ec:   0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48
>>> 0x00007fdf2000e8fc:   83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48
>>> 0x00007fdf2000e90c:   09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea
>>> 0x00007fdf2000e91c:   4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b
>>> 0x00007fdf2000e92c:   31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00
>>> 0x00007fdf2000e93c:   00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00
>>> 0x00007fdf2000e94c:   00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff
>>> 0x00007fdf2000e95c:   48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89
>>> 0x00007fdf2000e96c:   8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff
>>>
>>>
>>> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote:
>>>> Hi Kevin,
>>>>
>>>> I saw 2 errors on submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
>>>> So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).
>>>>
>>>>   Last change on submit repo is here:
>>>>     http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/
>>>>
>>>> Can you share details on submit repo?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/11 11:07, Yasumasa Suenaga wrote:
>>>>> Hi Kevin,
>>>>>
>>>>> I guess the first program header in the libraries on your machine has the exec flag (you can check it with `readelf -l`).
>>>>> So I tweaked my patch (initial value of exec_start and exec_end set to -1) in new webrev.
>>>>>
>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
>>>>>
>>>>> This webrev contains the fix for your comment (typo and DW_CFA_advance_loc4).
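The exec_start/exec_end problem in this part of the thread comes from using 0 to mean "not found" when an executable segment can legitimately start at vaddr 0. A minimal sketch of the sentinel approach mentioned for webrev.05 follows. It is not the libproc_impl.c code: Segment, ExecRange, kUnset and find_exec_range are hypothetical names, and the program headers are modelled as a plain vector rather than read from an ELF file.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the fields of an ELF program header that
// matter here (p_vaddr, p_memsz, and whether PF_X is set in p_flags).
struct Segment {
  uint64_t vaddr;
  uint64_t memsz;
  bool executable;
};

struct ExecRange {
  uint64_t start;
  uint64_t end;
};

// "Unset" sentinel, i.e. -1 as an unsigned value. Unlike 0, it cannot be
// confused with a segment that really begins at vaddr 0.
constexpr uint64_t kUnset = UINT64_MAX;

// Compute the smallest range covering all executable segments.
// Returns false only when no executable segment exists at all.
bool find_exec_range(const std::vector<Segment>& segs, ExecRange* out) {
  uint64_t start = kUnset;
  uint64_t end = kUnset;
  for (const Segment& s : segs) {
    if (!s.executable) {
      continue;
    }
    if (start == kUnset || s.vaddr < start) {
      start = s.vaddr;
    }
    const uint64_t seg_end = s.vaddr + s.memsz;
    if (end == kUnset || seg_end > end) {
      end = seg_end;
    }
  }
  if (start == kUnset || end == kUnset) {
    return false;
  }
  out->start = start;
  out->end = end;
  return true;
}

// Usage: the values from the LIBSAPROC_DEBUG output quoted in the thread
// (vaddr = 0x0, memsz = 0xaba4) are accepted instead of being rejected.
void check_exec_range_example() {
  ExecRange r{};
  assert(find_exec_range({{0x0, 0xaba4, true}}, &r));
  assert(r.start == 0x0 && r.end == 0xaba4);
}
```

The found_start/found_end booleans described in the thread are an equivalent alternative to the sentinel; either way the point is to stop overloading 0.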
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/11 8:53, Kevin Walls wrote:
>>>>>> Hi -
>>>>>>
>>>>>> In testing I wasn't seeing any of the Dwarf code triggered.
>>>>>>
>>>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries...
>>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>>>>>
>>>>>>     if (fill_instr_info(newlib)) {
>>>>>>       if (!read_eh_frame(ph, newlib)) {
>>>>>>
>>>>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>>>>
>>>>>> output like:
>>>>>>
>>>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>>>>> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>>>>
>>>>>> (similar for all libraries).
>>>>>>
>>>>>> fill_instr fails if:
>>>>>>
>>>>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>>>>
>>>>>> ...but isn't exec_start relative to the library address? It's the value of ph->vaddr and it is often zero.
>>>>>>
>>>>>> I added some booleans and did:
>>>>>>
>>>>>> 185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>>>> 186         lib->exec_start = ph->p_vaddr;
>>>>>> 187         found_start = true;
>>>>>> 188       }
>>>>>>
>>>>>> (similarly for end) and only failed if:
>>>>>>
>>>>>> 201   if (!found_start || !found_end) {
>>>>>> 202     return false;
>>>>>>
>>>>>> ...and now it's better. I go from:
>>>>>>
>>>>>> ----------------- 3306 -----------------
>>>>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>>>
>>>>>> to:
>>>>>>
>>>>>> ----------------- 31127 -----------------
>>>>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>>>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>>>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>>>>> 0x000055af1b78db1c      main + 0x11c
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Kevin
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>>>
>>>>>>> Hi Kevin,
>>>>>>>
>>>>>>> Thanks for your comment!
>>>>>>>
>>>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> The changes build OK for me in the latest jdk, and things still work.
>>>>>>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
>>>>>>>>
>>>>>>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.
>>>>>>>
>>>>>>> You can see the problem with JShell.
>>>>>>> Some Java frames would not be seen in mixed jstack.
>>>>>>>
>>>>>>>
>>>>>>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
>>>>>>>>
>>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>>>
>>>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>>>> (It may "never" happen, but a nop could appear within some other instructions?)
>>>>>>>
>>>>>>> DW_CFA_nop is used for padding, so we can ignore (return immediately) it.
>>>>>>>
>>>>>>>
>>>>>>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".
>>>>>>>
>>>>>>> I will fix it.
>>>>>>>
>>>>>>>
>>>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, would this mean a lot of code on one line. 8-)
>>>>>>>> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness.
>>>>>>>
>>>>>>> I will add DW_CFA_advance_loc4.
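To make the DW_CFA_advance_loc family concrete, here is a minimal sketch of decoding all four variants, including the 4-byte form being added. It is not the dwarf.cpp code: apply_advance_loc is a hypothetical name, operands are assumed to be little-endian (as on Linux AMD64), and every other call frame instruction is simply reported as "not handled".

```cpp
#include <cassert>
#include <cstdint>

// Opcode values from the DWARF call frame instruction encoding.
constexpr uint8_t DW_CFA_advance_loc  = 0x40;  // high 2 bits; delta in low 6 bits
constexpr uint8_t DW_CFA_nop          = 0x00;
constexpr uint8_t DW_CFA_advance_loc1 = 0x02;  // 1-byte delta operand
constexpr uint8_t DW_CFA_advance_loc2 = 0x03;  // 2-byte delta operand
constexpr uint8_t DW_CFA_advance_loc4 = 0x04;  // 4-byte delta operand

// Hypothetical helper: if the instruction at p is one of the advance_loc
// forms, add (delta * code_align) to *loc, move p past the instruction
// and return true. Otherwise leave p and *loc untouched and return false.
bool apply_advance_loc(const unsigned char*& p, uint64_t code_align, uint64_t* loc) {
  const unsigned char op = p[0];
  uint64_t delta = 0;
  if ((op & 0xc0) == DW_CFA_advance_loc) {
    delta = op & 0x3f;
    p += 1;
  } else if (op == DW_CFA_advance_loc1) {
    delta = p[1];
    p += 2;
  } else if (op == DW_CFA_advance_loc2) {
    delta = static_cast<uint64_t>(p[1]) | (static_cast<uint64_t>(p[2]) << 8);
    p += 3;
  } else if (op == DW_CFA_advance_loc4) {
    delta = static_cast<uint64_t>(p[1]) |
            (static_cast<uint64_t>(p[2]) << 8) |
            (static_cast<uint64_t>(p[3]) << 16) |
            (static_cast<uint64_t>(p[4]) << 24);
    p += 5;
  } else {
    return false;  // e.g. DW_CFA_nop or any instruction not modelled here
  }
  *loc += delta * code_align;
  return true;
}

// Usage: one instruction of each form, followed by a nop.
void check_advance_loc_example() {
  const unsigned char insns[] = {
      0x44,                          // advance_loc, delta 4
      0x02, 0x10,                    // advance_loc1, delta 0x10
      0x03, 0x00, 0x01,              // advance_loc2, delta 0x0100
      0x04, 0x00, 0x00, 0x01, 0x00,  // advance_loc4, delta 0x00010000
      DW_CFA_nop};
  const unsigned char* p = insns;
  uint64_t loc = 0;
  while (apply_advance_loc(p, 1, &loc)) {
  }
  assert(loc == 4 + 0x10 + 0x100 + 0x10000);
  assert(*p == DW_CFA_nop);
}
```

The 4-byte form is only needed when consecutive rows are more than 65535 code-alignment units apart, which is why it is rare in practice but cheap to support.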
>>>>>>>
>>>>>>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>>>>>>>>
>>>>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
>>>>>>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>>>>>>>>
>>>>>>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.
>>>>>>>
>>>>>>> I referred to the GDB and binutils sources when creating this patch.
>>>>>>> They seem to use similar code, because we need to evaluate the DWARF instructions one by one to get the value that relates to the specified PC.
>>>>>>>
>>>>>>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
>>>>>>>> OS-neutral directory? It's odd to label it as Linux-specific.
>>>>>>>
>>>>>>> Windows at least does not use DWARF; it uses another feature.
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>>>> If DWARF is used on them, I can move the DWARF-related code to the posix directory.
>>>>>>>
>>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>>>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>>>>>>>
>>>>>>>> These are just comments to mainly say it looks good, and somebody else out there has read it.
>>>>>>>> I will look for a system that shows the problem, and get back to you again!
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>> Many thanks
>>>>>>>> Kevin
>>>>>>>>
>>>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> webrev.03 cannot be applied to current jdk/jdk due to the 8239224 and 8239462 changes (they updated the copyright year).
>>>>>>>>> So I modified the webrev (copyright year changes only) so that it applies to current jdk/jdk.
>>>>>>>>> Could you review it?
>>>>>>>>>
>>>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>>>>
>>>>>>>>> I need one more reviewer to push.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>
>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>
>>>>>>>>>> This change has already been reviewed by Serguei.
>>>>>>>>>> I need one more reviewer to push.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>>>>> PING: Could you review this change?
>>>>>>>>>>>
>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>>
>>>>>>>>>>> I believe this change helps troubleshooters with postmortem analysis.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>>>
>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>>>
>>>>>>>>>>>> I updated the webrev. I discussed it with Serguei off list, and I refactored webrev.02.
>>>>>>>>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for your comment! >>>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev. >>>>>>>>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said. >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/ >>>>>>>>>>>>> >>>>>>>>>>>>> This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487). >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>> >>>>>>>>>>>>>> This is nice move in general. >>>>>>>>>>>>>> Thank you for working on this! >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? 
context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'd suggest to simplify the logic by refactoring to something like below: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc); >>>>>>>>>>>>>> ?????????? Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame >>>>>>>>>>>>>> ?????????? DwarfParser dwarf = null; >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?????????? if (libptr != 0L) { // Native frame >>>>>>>>>>>>>> ???????????? try { >>>>>>>>>>>>>> ?????????????? dwarf = new DwarfParser(libptr); >>>>>>>>>>>>>> ?????????????? dwarf.processDwarf(pc); >>>>>>>>>>>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && >>>>>>>>>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>>>>>>>>> ???????????????????????????????? ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>>>>>>>>> ???????????????????????????????? : context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // bail out to Java frame case >>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ????????? if (cfa == null) { >>>>>>>>>>>>>> ??????????? return null; >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA() >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? Better to rename 'ofs' => 'offs'. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA()); >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? Extra space after '-' sign. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) { >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? It feels like the logic has to be somehow refactored/simplified as >>>>>>>>>>>>>> ?? several typical fragments appears in slightly different contexts. >>>>>>>>>>>>>> ?? But it is not easy to understand what it is. >>>>>>>>>>>>>> ?? Could you, please, add some comments to key places explaining this logic. >>>>>>>>>>>>>> ?? Then I'll check if it is possible to make it a little bit simpler. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 } >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC): >>>>>>>>>>>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>>>> ????????? return null; >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??????? 
if (libptr != 0L) { // Native frame >>>>>>>>>>>>>> ????????? try { >>>>>>>>>>>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>>> ??????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>>>>> ????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>>>> ????? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? null : new >>>>>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 167 } >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??This one can be also simplified a little: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??????? if (dwarf == null) { // Java frame >>>>>>>>>>>>>> ????????? return javaSender(context); >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? 
Address nextPC = getNextPC(true); >>>>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>>>> ????????? return null; >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>>>> ??????? if (!dwarf.isIn(nextPC)) { >>>>>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>>>>> ??????????? try { >>>>>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>>>> ????? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> Finally, it looks like just one method could replace both >>>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context): >>>>>>>>>>>>>> >>>>>>>>>>>>>> ????? private CFrame commonSender(ThreadProxy thread) { >>>>>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>>>> ????????? return null; >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) { >>>>>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>>>>> ??????????? try { >>>>>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>>>> ??????????? 
} >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>>>> ????? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm still reviewing the dwarf parser files. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and >>>>>>>>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). >>>>>>>>>>>>>>> Could you review new webrev? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The diff from previous webrev is here: >>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64 >>>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame >>>>>>>>>>>>>>>> for stack unwinding. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system >>>>>>>>>>>>>>>> library (e.g. libc) might be compiled with this feature. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP). >>>>>>>>>>>>>>>> So it might be lack of stack frames. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I guess JDK-8219201 is caused by same issue. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> [1] https://urldefense.com/v3/__https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaQusmjOqA$ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> From suenaga at oss.nttdata.com Wed Mar 11 09:49:12 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Wed, 11 Mar 2020 18:49:12 +0900 Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF In-Reply-To: <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com> References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com> <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com> <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com> <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com> Message-ID: Hi, Thanks David and Ioi for sharing the status. I've fixed the problem in new webrev (mach5-one-ysuenaga-JDK-8234624-6-20200311-0827-9367344): http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.06/ Diff from webrev.05 is here: http://hg.openjdk.java.net/jdk/submit/rev/e3d12785f087 Thanks, Yasumasa On 2020/03/11 14:59, David Holmes wrote: > Hi Yasumasa, > > Partial hs_err info below. > > David > ----- > > # > # A fatal error has been detected by the Java Runtime Environment: > # > #? 
SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800 > # > # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-11-0447267.suenaga.source) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) > # Problematic frame: > # C? [libsaproc.so+0x487c]? DwarfParser::process_dwarf(unsigned long)+0x2c > # > # Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798) > # > # If you would like to submit a bug report, please visit: > #?? https://bugreport.java.com/bugreport/crash.jsp > # The crash happened outside the Java Virtual Machine in native code. > # See problematic frame for where to report the bug. > # > > ---------------? 
S U M M A R Y ------------ > > Command Line: > -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar > -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug -Xms8m -Djdk.module.main=jdk.hotspot.agent jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770 > > Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 0m 3s) > > ---------------? T H R E A D? --------------- > > Current thread (0x00007fdf5c032000):? JavaThread "main" [_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)] > > Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000],? sp=0x00007fdf63b9d190, free space=1020k > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > C? [libsaproc.so+0x487c]? DwarfParser::process_dwarf(unsigned long)+0x2c > j? 
sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal > v? ~StubRoutines::call_stub > V? [libjvm.so+0xc2291c]? JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac > V? [libjvm.so+0xd31970]? 
jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370 > V? [libjvm.so+0xd36202]? jni_CallStaticVoidMethod+0x222 > C? [libjli.so+0x4bed]? JavaMain+0xbcd > C? [libjli.so+0x80a9]? ThreadJavaMain+0x9 > > Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) > j? sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal > j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal > j? 
sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal > j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal > v? ~StubRoutines::call_stub > > siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79 > > Register to memory mapping: > > RAX=0x00007f7e4dfe3229 is an unknown value > RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 > RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62 > RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00 > RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000 > RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000 > RSI=0x0000000000000004 is an unknown value > RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 > R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00 > R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01 > R10=0x00000000ffffffff is an unknown value > R11=0x000000000100527a is an unknown value > R12=0x00007fded5076b79 is an unknown value > R13=0x00007f7da2f8e68a is an unknown value > R14=0x00007f7dbdf62b1d is an unknown value > R15=0x00007fdf5c032000 is a thread > > > Registers: > RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85 > RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080 > R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a > R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000 > RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004 > ? TRAPNO=0x000000000000000e > > Top of Stack: (sp=0x00007fdf63b9d190) > 0x00007fdf63b9d190:?? 
00007fdf209d0980 0000000000000000 > 0x00007fdf63b9d1a0:?? 00007fdf209d0980 00007fdf63b9d258 > 0x00007fdf63b9d1b0:?? 00007fdf63b9d228 00007fdf44778dbe > 0x00007fdf63b9d1c0:?? 000000000146c380 00007fdf5c032000 > > Instructions: (pc=0x00007fdf2000e87c) > 0x00007fdf2000e77c:?? 89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00 > 0x00007fdf2000e78c:?? 00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80 > 0x00007fdf2000e79c:?? 78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42 > 0x00007fdf2000e7ac:?? 01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48 > 0x00007fdf2000e7bc:?? 89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43 > 0x00007fdf2000e7cc:?? 3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43 > 0x00007fdf2000e7dc:?? 30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff > 0x00007fdf2000e7ec:?? ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e > 0x00007fdf2000e7fc:?? 41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2 > 0x00007fdf2000e80c:?? ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90 > 0x00007fdf2000e81c:?? 0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76 > 0x00007fdf2000e82c:?? 1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48 > 0x00007fdf2000e83c:?? 83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9 > 0x00007fdf2000e84c:?? 31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56 > 0x00007fdf2000e85c:?? 41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b > 0x00007fdf2000e86c:?? a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08 > 0x00007fdf2000e87c:?? 41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0 > 0x00007fdf2000e88c:?? 75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d > 0x00007fdf2000e89c:?? 8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc > 0x00007fdf2000e8ac:?? 48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03 > 0x00007fdf2000e8bc:?? b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c > 0x00007fdf2000e8cc:?? 39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8 > 0x00007fdf2000e8dc:?? 90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01 > 0x00007fdf2000e8ec:?? 
0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48 > 0x00007fdf2000e8fc:?? 83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48 > 0x00007fdf2000e90c:?? 09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea > 0x00007fdf2000e91c:?? 4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b > 0x00007fdf2000e92c:?? 31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00 > 0x00007fdf2000e93c:?? 00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00 > 0x00007fdf2000e94c:?? 00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff > 0x00007fdf2000e95c:?? 48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89 > 0x00007fdf2000e96c:?? 8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff > > > On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote: >> Hi Kevin, >> >> I saw 2 errors on submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475). >> So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448). >> >> ?? Last change on submit repo is here: >> ???? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/ >> >> Can you share details on submit repo? >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/03/11 11:07, Yasumasa Suenaga wrote: >>> Hi Kevin, >>> >>> I guess first program header in the libraries which are on your machine has exec flag (you can check it with `readelf -l`). >>> So I tweaked my patch (initial value of exec_start and exec_end set to -1) in new webrev. >>> >>> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/ >>> >>> This webrev contains the fix for your comment (typo and DW_CFA_advance_loc4). >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/03/11 8:53, Kevin Walls wrote: >>>> Hi - >>>> >>>> In testing I wasn't seeing any of the Dwarf code triggered. >>>> >>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries... >>>> >>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c >>>> >>>> ??? if (fill_instr_info(newlib)) { >>>> ????? 
if (!read_eh_frame(ph, newlib)) {
>>>>
>>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>>
>>>> output like:
>>>>
>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>>> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>>
>>>> (similar for all libraries).
>>>>
>>>> fill_instr fails if:
>>>>
>>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>>
>>>> ...but isn't exec_start relative to the library address? It's the value of ph->vaddr and it is often zero.
>>>>
>>>> I added some booleans and did:
>>>>
>>>>       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>>         lib->exec_start = ph->p_vaddr;
>>>>         found_start = true;
>>>>       }
>>>>
>>>> (similarly for end) and only failed if:
>>>>
>>>>   if (!found_start || !found_end) {
>>>>     return false;
>>>>
>>>> ...and now it's better. I go from:
>>>>
>>>> ----------------- 3306 -----------------
>>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>
>>>> to:
>>>>
>>>> ----------------- 31127 -----------------
>>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>>> 0x000055af1b78db1c      main + 0x11c
>>>>
>>>> Thanks
>>>> Kevin
>>>>
>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>
>>>>> Hi Kevin,
>>>>>
>>>>> Thanks for your comment!
>>>>>
>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> The changes build OK for me in the latest jdk, and things still work.
>>>>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
>>>>>> >>>>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes. >>>>> >>>>> You can see the problem with JShell. >>>>> Some Java frames would not be seen in mixed jstack. >>>>> >>>>> >>>>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future: >>>>>> >>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp: >>>>>> >>>>>> DW_CFA_nop - shouldn't this continue instead of return? >>>>>> (It may "never" happen, but a nop could appear within some other instructions?) >>>>> >>>>> DW_CFA_nop is used for padding, so we can ignore (return immediately) it. >>>>> >>>>> >>>>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state". >>>>> >>>>> I will fix it. >>>>> >>>>> >>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4.? I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, would this mean a lot of code on one line. 8-) >>>>>> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness. >>>>> >>>>> I will add DW_CFA_advance_loc4. >>>>> >>>>> >>>>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser. >>>>>> >>>>>> DwarfParser::process_dwarf() moves _buf. >>>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is. >>>>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout. >>>>>> >>>>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary. 
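The alternative Kevin describes above, passing the current read position into the general-purpose helpers instead of having them move a shared `_buf`, could look roughly like the sketch below. It is not the code in webrev.04. The `Cursor` type and this `read_uleb128` signature are hypothetical names chosen for illustration, and only the unsigned LEB128 case is shown.

```cpp
#include <cstdint>

// Hypothetical cursor: the caller owns the read position, so a helper
// cannot move a parser-wide pointer behind the caller's back.
struct Cursor {
  const unsigned char* pos;  // next byte to read
  const unsigned char* end;  // one past the last readable byte
};

// Decode one unsigned LEB128 value starting at cur.pos.
// On success, stores the value in out, advances cur.pos past the encoded
// bytes and returns true. On failure (buffer ends mid-value, or the
// encoding is longer than 10 bytes) returns false and leaves cur untouched.
bool read_uleb128(Cursor& cur, uint64_t& out) {
  uint64_t result = 0;
  unsigned shift = 0;
  const unsigned char* p = cur.pos;
  while (p < cur.end) {
    if (shift >= 64) {
      return false;  // more than 10 bytes: does not fit in 64 bits
    }
    unsigned char b = *p++;
    result |= static_cast<uint64_t>(b & 0x7f) << shift;  // low 7 bits carry data
    shift += 7;
    if ((b & 0x80) == 0) {  // high bit clear marks the last byte
      cur.pos = p;
      out = result;
      return true;
    }
  }
  return false;  // ran out of input before the terminating byte
}
```

With this shape, a caller that wants to peek at a CIE and then resume at the FDE simply keeps two `Cursor` values, so no save-and-restore of a member pointer is needed.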
>>>>>
>>>>> I looked at the GDB and binutils sources when creating this patch.
>>>>> They seem to process similar code because we need to calculate DWARF instructions one-by-one to get the value which relates to the specified PC.
>>>>>
>>>>>
>>>>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
>>>>>> OS-neutral directory? It's odd to label it as Linux-specific.
>>>>>
>>>>> Windows does not use DWARF at least, it uses another feature.
>>>>>
>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>> If DWARF is used in them, I can move DWARF related code to posix directory.
>>>>>
>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>>>>>
>>>>>>
>>>>>> These are just comments to mainly say it looks good, and somebody else out there has read it.
>>>>>> I will look for a system that shows the problem, and get back to you again!
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> Many thanks
>>>>>> Kevin
>>>>>>
>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year).
>>>>>>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk.
>>>>>>> Could you review it?
>>>>>>>
>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>>
>>>>>>> I need one more reviewer to push.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>>> PING: Could you review it?
>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>
>>>>>>>> This change has been already reviewed by Serguei.
>>>>>>>> I need one more reviewer to push.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>>> PING: Could you review this change?
>>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>
>>>>>>>>> I believe this change helps troubleshooters with postmortem analysis.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>
>>>>>>>>>> I updated webrev. I discussed with Serguei off-list, and I refactored webrev.02.
>>>>>>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your comment!
>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry said.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> This is nice move in general. >>>>>>>>>>>> Thank you for working on this! >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>>>>>>>>> >>>>>>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 } >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I'd suggest to simplify the logic by refactoring to something like below: >>>>>>>>>>>> >>>>>>>>>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc); >>>>>>>>>>>> ?????????? Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame >>>>>>>>>>>> ?????????? DwarfParser dwarf = null; >>>>>>>>>>>> >>>>>>>>>>>> ?????????? if (libptr != 0L) { // Native frame >>>>>>>>>>>> ???????????? try { >>>>>>>>>>>> ?????????????? 
dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>                dwarf.processDwarf(pc);
>>>>>>>>>>>>                cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>                       !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>                          ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>                          : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>                                   .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>
>>>>>>>>>>>>              } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>>>              }
>>>>>>>>>>>>            }
>>>>>>>>>>>>            if (cfa == null) {
>>>>>>>>>>>>              return null;
>>>>>>>>>>>>            }
>>>>>>>>>>>>            return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>>>>
>>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>>>
>>>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>>>
>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>>>
>>>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>>>
>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>>>
>>>>>>>>>>>>    It feels like the logic has to be somehow refactored/simplified as
>>>>>>>>>>>>    several typical fragments appear in slightly different contexts.
>>>>>>>>>>>>    But it is not easy to understand what it is.
>>>>>>>>>>>>    Could you please add some comments to key places explaining this logic?
>>>>>>>>>>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>>>>>>>>>> >>>>>>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 } >>>>>>>>>>>> >>>>>>>>>>>> ??The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC): >>>>>>>>>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>> ????????? return null; >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>> >>>>>>>>>>>> ??????? if (libptr != 0L) { // Native frame >>>>>>>>>>>> ????????? try { >>>>>>>>>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>> ??????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>>> ????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>> ????????? } >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>> ????? 
} >>>>>>>>>>>> >>>>>>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? null : new >>>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 167 } >>>>>>>>>>>> >>>>>>>>>>>> ??This one can be also simplified a little: >>>>>>>>>>>> >>>>>>>>>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>>> >>>>>>>>>>>> ??????? if (dwarf == null) { // Java frame >>>>>>>>>>>> ????????? return javaSender(context); >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? Address nextPC = getNextPC(true); >>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>> ????????? return null; >>>>>>>>>>>> ??????? } >>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>> ??????? if (!dwarf.isIn(nextPC)) { >>>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>>> ??????????? try { >>>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>>>>>>>>> ??????????? 
} catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>             }
>>>>>>>>>>>>           }
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>>>>>
>>>>>>>>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>           return null;
>>>>>>>>>>>>         }
>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>
>>>>>>>>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>             try {
>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>             }
>>>>>>>>>>>>           }
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java .
It works fine in serviceability/sa tests and
>>>>>>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>> Could you review the new webrev?
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>>>>>>>>>>> libraries (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP).
>>>>>>>>>>>>>> So some stack frames might be missing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
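To make the difference between RBP-based and DWARF-based unwinding concrete, here is a minimal sketch of a single unwind step once the call frame information for a PC has been evaluated. It is not the patch's `DwarfParser`. The `Registers`, `CfaRule`, `Frame` and `unwind_step` names are hypothetical, and the rule values in the usage note are made up for illustration. The only facts relied on are the x86-64 DWARF register numbers (6 = RBP, 7 = RSP) and the DWARF definition of the CFA as the stack pointer value at the call site in the previous frame.

```cpp
#include <cstdint>

// DWARF register numbers for x86-64 as assigned by the System V AMD64 ABI.
constexpr int DWARF_RBP = 6;
constexpr int DWARF_RSP = 7;

// Hypothetical register snapshot, indexed by DWARF register number (0..16).
struct Registers {
  uint64_t value[17];
};

// Hypothetical result of evaluating the CFI instructions for one PC:
// "CFA = cfa_reg + cfa_offset", with the return address stored at
// "CFA + ra_offset".
struct CfaRule {
  int cfa_reg;
  int64_t cfa_offset;
  int64_t ra_offset;
};

struct Frame {
  uint64_t cfa;                  // canonical frame address of the current frame
  uint64_t return_address_slot;  // address that holds the caller's PC
  uint64_t caller_sp;            // caller's stack pointer at the call site
};

// One unwind step. Note that RBP is only consulted if the rule names it, so
// the step still works when the code was built with -fomit-frame-pointer and
// RBP holds an unrelated value.
Frame unwind_step(const Registers& regs, const CfaRule& rule) {
  uint64_t cfa = regs.value[rule.cfa_reg] + static_cast<uint64_t>(rule.cfa_offset);
  Frame f;
  f.cfa = cfa;
  f.return_address_slot = cfa + static_cast<uint64_t>(rule.ra_offset);
  f.caller_sp = cfa;  // by definition of the CFA
  return f;
}
```

For example, with RSP = 0x7ffd1000 and an assumed rule of "CFA = RSP + 32, return address at CFA - 8", the step yields a CFA of 0x7ffd1020 and a return address slot at 0x7ffd1018, without reading RBP at all.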
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> [1] https://urldefense.com/v3/__https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaQusmjOqA$ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> From kevin.walls at oracle.com Wed Mar 11 15:31:31 2020 From: kevin.walls at oracle.com (Kevin Walls) Date: Wed, 11 Mar 2020 15:31:31 +0000 Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF In-Reply-To: References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com> <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com> <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com> <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com> Message-ID: Hi - OK great, it checks for a zero-length dwarf entry. I did a rebuild and tested it locally it works here, so all looks good to me. We may in future want to work on src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/amd64/BsdAMD64CFrame.java to use Dwarf similarly, that's why I mentioned the platform-neutral directory name, but I have no issue with that happening in the future. Thanks Kevin On 11/03/2020 09:49, Yasumasa Suenaga wrote: > Hi, > > Thanks David and Ioi for sharing the status. > I've fixed the problem in new webrev > (mach5-one-ysuenaga-JDK-8234624-6-20200311-0827-9367344): > > ? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.06/ > > Diff from webrev.05 is here: > > ? http://hg.openjdk.java.net/jdk/submit/rev/e3d12785f087 > > > Thanks, > > Yasumasa > > > On 2020/03/11 14:59, David Holmes wrote: >> Hi Yasumasa, >> >> Partial hs_err info below. >> >> David >> ----- >> >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> #? 
SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800 >> # >> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug >> build 15-internal+0-2020-03-11-0447267.suenaga.source) >> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >> 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, >> tiered, compressed oops, g1 gc, linux-amd64) >> # Problematic frame: >> # C? [libsaproc.so+0x487c]? DwarfParser::process_dwarf(unsigned >> long)+0x2c >> # >> # Core dump will be written. Default location: Core dumps may be >> processed with "/opt/core.sh %p" (or dumping to >> /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798) >> # >> # If you would like to submit a bug report, please visit: >> # >> https://urldefense.com/v3/__https://bugreport.java.com/bugreport/crash.jsp__;!!GqivPVa7Brio!OghfqRRRHbAZloG3aVJ244OPPTcCQOwYIl_vm6vU_toLb9qFzTUirVBEHn2tfDp26A$ >> # The crash happened outside the Java Virtual Machine in native code. >> # See problematic frame for where to report the bug. >> # >> >> ---------------? 
S U M M A R Y ------------ >> >> Command Line: >> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar >> -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug >> -Xms8m -Djdk.module.main=jdk.hotspot.agent >> jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770 >> >> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d >> 0h 0m 3s) >> >> ---------------? T H R E A D? --------------- >> >> Current thread (0x00007fdf5c032000):? JavaThread "main" >> [_thread_in_native, id=29800, >> stack(0x00007fdf63a9e000,0x00007fdf63b9f000)] >> >> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000], >> sp=0x00007fdf63b9d190, free space=1020k >> Native frames: (J=compiled Java code, A=aot compiled Java code, >> j=interpreted, Vv=VM code, C=native code) >> C? [libsaproc.so+0x487c]? 
DwarfParser::process_dwarf(unsigned long)+0x2c >> j >> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 >> jdk.hotspot.agent at 15-internal >> j sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 >> jdk.hotspot.agent at 15-internal >> j sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 >> jdk.hotspot.agent at 15-internal >> v? ~StubRoutines::call_stub >> V? [libjvm.so+0xc2291c]? 
JavaCalls::call_helper(JavaValue*, >> methodHandle const&, JavaCallArguments*, Thread*)+0x6ac >> V? [libjvm.so+0xd31970]? jni_invoke_static(JNIEnv_*, JavaValue*, >> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) >> [clone .isra.140] [clone .constprop.263]+0x370 >> V? [libjvm.so+0xd36202]? jni_CallStaticVoidMethod+0x222 >> C? [libjli.so+0x4bed]? JavaMain+0xbcd >> C? [libjli.so+0x80a9]? ThreadJavaMain+0x9 >> >> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) >> j >> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 >> jdk.hotspot.agent at 15-internal >> j >> sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 >> jdk.hotspot.agent at 15-internal >> j? 
sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 >> jdk.hotspot.agent at 15-internal >> j sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 >> jdk.hotspot.agent at 15-internal >> j sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 >> jdk.hotspot.agent at 15-internal >> j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 >> jdk.hotspot.agent at 15-internal >> v? ~StubRoutines::call_stub >> >> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: >> 0x00007fded5076b79 >> >> Register to memory mapping: >> >> RAX=0x00007f7e4dfe3229 is an unknown value >> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 >> d4 de 7f 00 00 >> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 >> 72 2f 6c 69 62 >> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00 >> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: >> 0x00007fdf5c032000 >> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: >> 0x00007fdf5c032000 >> RSI=0x0000000000000004 is an unknown value >> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 >> d4 de 7f 00 00 >> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 >> 00 00 00 00 00 >> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 >> 01 78 10 01 >> R10=0x00000000ffffffff is an unknown value >> R11=0x000000000100527a is an unknown value >> R12=0x00007fded5076b79 is an unknown value >> R13=0x00007f7da2f8e68a is an unknown value >> R14=0x00007f7dbdf62b1d is an unknown value >> R15=0x00007fdf5c032000 is a thread >> >> >> Registers: >> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, >> RCX=0x00007fded4072380, RDX=0x00007fded4076b85 >> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, >> RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080 >> R8 =0x000000000146c380, R9 
=0x00007fded4076b79, >> R10=0x00000000ffffffff, R11=0x000000000100527a >> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, >> R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000 >> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, >> CSGSFS=0x002b000000000033, ERR=0x0000000000000004 >> ?? TRAPNO=0x000000000000000e >> >> Top of Stack: (sp=0x00007fdf63b9d190) >> 0x00007fdf63b9d190:?? 00007fdf209d0980 0000000000000000 >> 0x00007fdf63b9d1a0:?? 00007fdf209d0980 00007fdf63b9d258 >> 0x00007fdf63b9d1b0:?? 00007fdf63b9d228 00007fdf44778dbe >> 0x00007fdf63b9d1c0:?? 000000000146c380 00007fdf5c032000 >> >> Instructions: (pc=0x00007fdf2000e87c) >> 0x00007fdf2000e77c:?? 89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00 >> 0x00007fdf2000e78c:?? 00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80 >> 0x00007fdf2000e79c:?? 78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42 >> 0x00007fdf2000e7ac:?? 01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48 >> 0x00007fdf2000e7bc:?? 89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43 >> 0x00007fdf2000e7cc:?? 3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43 >> 0x00007fdf2000e7dc:?? 30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff >> 0x00007fdf2000e7ec:?? ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e >> 0x00007fdf2000e7fc:?? 41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2 >> 0x00007fdf2000e80c:?? ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90 >> 0x00007fdf2000e81c:?? 0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76 >> 0x00007fdf2000e82c:?? 1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48 >> 0x00007fdf2000e83c:?? 83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9 >> 0x00007fdf2000e84c:?? 31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56 >> 0x00007fdf2000e85c:?? 41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b >> 0x00007fdf2000e86c:?? a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08 >> 0x00007fdf2000e87c:?? 41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0 >> 0x00007fdf2000e88c:?? 75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d >> 0x00007fdf2000e89c:?? 
8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc >> 0x00007fdf2000e8ac:?? 48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03 >> 0x00007fdf2000e8bc:?? b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c >> 0x00007fdf2000e8cc:?? 39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8 >> 0x00007fdf2000e8dc:?? 90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01 >> 0x00007fdf2000e8ec:?? 0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48 >> 0x00007fdf2000e8fc:?? 83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48 >> 0x00007fdf2000e90c:?? 09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea >> 0x00007fdf2000e91c:?? 4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b >> 0x00007fdf2000e92c:?? 31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00 >> 0x00007fdf2000e93c:?? 00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00 >> 0x00007fdf2000e94c:?? 00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff >> 0x00007fdf2000e95c:?? 48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89 >> 0x00007fdf2000e96c:?? 8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff >> >> >> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote: >>> Hi Kevin, >>> >>> I saw 2 errors on submit repo >>> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475). >>> So I tweaked my patch, but I saw the crash again >>> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448). >>> >>> ?? Last change on submit repo is here: >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/ >>> >>> Can you share details on submit repo? >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/03/11 11:07, Yasumasa Suenaga wrote: >>>> Hi Kevin, >>>> >>>> I guess first program header in the libraries which are on your >>>> machine has exec flag (you can check it with `readelf -l`). >>>> So I tweaked my patch (initial value of exec_start and exec_end set >>>> to -1) in new webrev. >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/ >>>> >>>> This webrev contains the fix for your comment (typo and >>>> DW_CFA_advance_loc4). 
>>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/03/11 8:53, Kevin Walls wrote: >>>>> Hi - >>>>> >>>>> In testing I wasn't seeing any of the Dwarf code triggered. >>>>> >>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find >>>>> executable section in" for lots of / maybe all the libraries... >>>>> >>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c >>>>> >>>>> ??? if (fill_instr_info(newlib)) { >>>>> ????? if (!read_eh_frame(ph, newlib)) { >>>>> >>>>> fill_instr_info is failing, and we never get to read_eh_frame(). >>>>> >>>>> output like: >>>>> >>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4 >>>>> libsaproc DEBUG: Could not find executable section in >>>>> /lib/x86_64-linux-gnu/libnss_nis-2.27.so >>>>> >>>>> (similar for all libraries). >>>>> >>>>> fill_instr fails if: >>>>> >>>>> ??if ((lib->exec_start == 0L) || (lib->exec_end == 0L)) >>>>> >>>>> ...but isn't exec_start relative to the library address? It's the >>>>> value of ph->vaddr and it is often zero. >>>>> >>>>> I added some booleans and did: >>>>> >>>>> 185?????? if ((lib->exec_start == 0L) || (lib->exec_start > >>>>> ph->p_vaddr)) { >>>>> 186???????? lib->exec_start = ph->p_vaddr; >>>>> 187???????? found_start =true; >>>>> 188?????? } >>>>> >>>>> (similarly for end) and only failed if: >>>>> >>>>> 201?? if (!found_start || !found_end) { >>>>> 202???? return false; >>>>> >>>>> ...and now it's better. ? I go from: >>>>> >>>>> ----------------- 3306 ----------------- >>>>> 0x00007f75824acd2d????? __GI___pthread_timedjoin_ex + 0x17d >>>>> >>>>> to: >>>>> >>>>> ----------------- 31127 ----------------- >>>>> 0x00007fa284d78d2d????? __GI___pthread_timedjoin_ex + 0x17d >>>>> 0x00007fa2857aaf2d????? CallJavaMainInNewThread + 0xad >>>>> 0x00007fa2857a74ed????? ContinueInNewThread + 0x4d >>>>> 0x00007fa2857a8c49????? JLI_Launch + 0x1529 >>>>> 0x000055af1b78db1c????? 
main + 0x11c
>>>>>
>>>>>
>>>>> Thanks
>>>>> Kevin
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>>
>>>>>> Hi Kevin,
>>>>>>
>>>>>> Thanks for your comment!
>>>>>>
>>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>>> Hi Yasumasa ,
>>>>>>>
>>>>>>> The changes build OK for me in the latest jdk, and things still
>>>>>>> work.
>>>>>>> I have not yet seen the dwarf usage in action: I've tried a
>>>>>>> couple of different systems and so far have not reproduced the
>>>>>>> problem, i.e. jstack has not failed on native frames.
>>>>>>>
>>>>>>> I may need more recent basic libraries, will look again for
>>>>>>> somewhere where the problem happens and get back to you as I
>>>>>>> really want to run the changes.
>>>>>>
>>>>>> You can see the problem with JShell.
>>>>>> Some Java frames would not be seen in mixed jstack.
>>>>>>
>>>>>>
>>>>>>> I have mostly minor other comments which don't need a new
>>>>>>> webrev, some just comments for the future:
>>>>>>>
>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>>
>>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>>> (It may "never" happen, but a nop could appear within some other
>>>>>>> instructions?)
>>>>>>
>>>>>> DW_CFA_nop is used for padding, so we can ignore (return
>>>>>> immediately) it.
>>>>>>
>>>>>>
>>>>>>> DW_CFA_remember_state: a minor typo in the comment,
>>>>>>> "DW_CFA_remenber_state".
>>>>>>
>>>>>> I will fix it.
>>>>>>
>>>>>>
>>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not
>>>>>>> DW_CFA_advance_loc4.  I thought that was odd, but maybe
>>>>>>> addresses in these tables never increase by 4-byte amounts,
>>>>>>> would this mean a lot of code on one line. 8-)
>>>>>>> So maybe it's never used in practice, if you think it's
>>>>>>> unnecessary no problem, maybe a comment, or add it for robustness.
>>>>>>
>>>>>> I will add DW_CFA_advance_loc4.
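For readers following the DW_CFA_advance_loc4 point, here is a minimal sketch of how the four advance_loc variants might be decoded together. It is not the code from dwarf.cpp: `apply_advance_loc` and its parameters are hypothetical names chosen for illustration. The opcode values are the standard DWARF call frame encodings, and operands are assumed to be little-endian, as on x86_64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Standard DWARF call frame opcode values.
constexpr std::uint8_t DW_CFA_advance_loc1 = 0x02;  // 1-byte delta operand
constexpr std::uint8_t DW_CFA_advance_loc2 = 0x03;  // 2-byte delta operand
constexpr std::uint8_t DW_CFA_advance_loc4 = 0x04;  // 4-byte delta operand
constexpr std::uint8_t DW_CFA_advance_loc  = 0x40;  // high two bits; delta in low six bits

// Hypothetical helper (assumption, not from the webrev): if p[0] is one of
// the advance_loc opcodes, adds delta * code_align to *loc and returns the
// number of bytes consumed. Returns 0 for any other opcode or if the buffer
// is too short to hold the operand.
std::size_t apply_advance_loc(const std::uint8_t* p, std::size_t avail,
                              std::uint64_t code_align, std::uint64_t* loc) {
  assert(p != nullptr && loc != nullptr);
  if (avail == 0) {
    return 0;
  }
  const std::uint8_t op = p[0];
  std::size_t operand_size = 0;
  std::uint64_t delta = 0;
  if ((op & 0xC0) == DW_CFA_advance_loc) {
    delta = op & 0x3F;  // delta is packed into the opcode byte itself
  } else if (op == DW_CFA_advance_loc1) {
    operand_size = 1;
  } else if (op == DW_CFA_advance_loc2) {
    operand_size = 2;
  } else if (op == DW_CFA_advance_loc4) {
    operand_size = 4;
  } else {
    return 0;  // not an advance_loc instruction
  }
  if (avail < 1 + operand_size) {
    return 0;  // truncated instruction
  }
  // Assemble the operand byte by byte, least significant byte first.
  for (std::size_t i = 0; i < operand_size; i++) {
    delta |= static_cast<std::uint64_t>(p[1 + i]) << (8 * i);
  }
  *loc += delta * code_align;
  return 1 + operand_size;
}
```

Handling all four variants in one place keeps the "advance the current location" step identical regardless of operand width, which is the robustness argument made above.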
>>>>>> >>>>>> >>>>>>> General-purpose methods like read_leb128(), get_entry_length(), >>>>>>> get_decoded_value() specifically update the _buf pointer in this >>>>>>> DwarfParser. >>>>>>> >>>>>>> DwarfParser::process_dwarf() moves _buf. >>>>>>> It calls process_cie() which reads, moves _buf and restores it >>>>>>> to the original position, then we read augmentation_length from >>>>>>> where _buf is. >>>>>>> I'm not sure if that's wrong, or if I just need to read again >>>>>>> about the CIE/etc layout. >>>>>>> >>>>>>> I don't really want to suggest making the code pass around a >>>>>>> current _buf for the invocation of these general purpose >>>>>>> methods, but just wanted to comment that if these get used more >>>>>>> widely that might become necessary. >>>>>> >>>>>> I saw GDB and binutils source for creating this patch. >>>>>> They seems to process similar code because we need to calculate >>>>>> DWARF instructions one-by-one to get the value which relates to >>>>>> specified PC. >>>>>> >>>>>> >>>>>>> Similarly in future, if this DWARF support code became used more >>>>>>> widely, it might want to move to an >>>>>>> OS-neutral directory?? It's odd to label it as Linux-specific. >>>>>> >>>>>> Windows does not use DWARF at least, it uses another feature. >>>>>> >>>>>> https://urldefense.com/v3/__https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaTtALtpRg$ >>>>>> >>>>>> I'm not sure other platforms (Solaris, macOS) uses DWARF. >>>>>> If DWARF is used in them, I can move DWARF related code to posix >>>>>> directory. >>>>>> >>>>>> >>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp: >>>>>>> Thanks for changing "can_parsable" which was in the earlier >>>>>>> version. 8-) >>>>>>> >>>>>>> >>>>>>> These are just comments to mainly say it looks good, and >>>>>>> somebody else out there has read it. 
>>>>>>> I will look for a system that shows the problem, and get back to >>>>>>> you again! >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>>> Many thanks >>>>>>> Kevin >>>>>>> >>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 >>>>>>>> and 8239462 changes (they updated copyright year). >>>>>>>> So I modified webrev (only copyright year changes) to be able >>>>>>>> to apply to current jdk/jdk. >>>>>>>> Could you review it? >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/ >>>>>>>> >>>>>>>> I need one more reviewer to push. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote: >>>>>>>>> PING: Could you review it? >>>>>>>>> >>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>> ?? webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>>>> >>>>>>>>> This change has been already reviewed by Serguei. >>>>>>>>> I need one more reviewer to push. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote: >>>>>>>>>> PING: Could you reveiw this change? >>>>>>>>>> >>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>> ?? webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>>>>> >>>>>>>>>> I believe this change helps troubleshooter to fight to >>>>>>>>>> postmortem analysis. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote: >>>>>>>>>>> PING: Could you review it? >>>>>>>>>>> >>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>> ?? 
webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>>>>>> >>>>>>>>>>> I updated webrev. I discussed with Serguei in off list, and >>>>>>>>>>> I refactored webrev.02 . >>>>>>>>>>> It has passed tests on submit repo >>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for your comment! >>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new >>>>>>>>>>>> webrev. >>>>>>>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c >>>>>>>>>>>> as Dmitry said. >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/ >>>>>>>>>>>> >>>>>>>>>>>> This change has been passed all tests on submit repo >>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> This is nice move in general. >>>>>>>>>>>>> Thank you for working on this! 
>>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if >>>>>>>>>>>>> (libptr == 0L) { // Java frame 98 Address rbp = >>>>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 >>>>>>>>>>>>> if (rbp == null) { 100 return null; 101 } 102 return new >>>>>>>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // >>>>>>>>>>>>> Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = >>>>>>>>>>>>> new DwarfParser(libptr); 107 } catch (DebuggerException e) >>>>>>>>>>>>> { 108 Address rbp = >>>>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 >>>>>>>>>>>>> if (rbp == null) { 110 return null; 111 } 112 return new >>>>>>>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 >>>>>>>>>>>>> dwarf.processDwarf(pc); 115 Address cfa = >>>>>>>>>>>>> ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 >>>>>>>>>>>>> !dwarf.isBPOffsetAvailable()) 117 ? >>>>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : >>>>>>>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) 119 >>>>>>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { >>>>>>>>>>>>> 121 return null; 122 } 123 return new >>>>>>>>>>>>> LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 } >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I'd suggest to simplify the logic by refactoring to >>>>>>>>>>>>> something like below: >>>>>>>>>>>>> >>>>>>>>>>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc); >>>>>>>>>>>>> ?????????? Address cfa = >>>>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); // >>>>>>>>>>>>> Java frame >>>>>>>>>>>>> ?????????? DwarfParser dwarf = null; >>>>>>>>>>>>> >>>>>>>>>>>>> ?????????? if (libptr != 0L) { // Native frame >>>>>>>>>>>>> ???????????? try { >>>>>>>>>>>>> ?????????????? 
dwarf = new DwarfParser(libptr); >>>>>>>>>>>>> ?????????????? dwarf.processDwarf(pc); >>>>>>>>>>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == >>>>>>>>>>>>> AMD64ThreadContext.RBP) && >>>>>>>>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>>>>>>>> ???????????????????????????????? ? >>>>>>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>>>>>>>> ???????????????????????????????? : >>>>>>>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>>>>>>>> >>>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // bail out to >>>>>>>>>>>>> Java frame case >>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>> ????????? } >>>>>>>>>>>>> ????????? if (cfa == null) { >>>>>>>>>>>>> ??????????? return null; >>>>>>>>>>>>> ????????? } >>>>>>>>>>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 58 long ofs = useDwarf ? >>>>>>>>>>>>> dwarf.getReturnAddressOffsetFromCFA() >>>>>>>>>>>>> >>>>>>>>>>>>> ?? Better to rename 'ofs' => 'offs'. >>>>>>>>>>>>> >>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- >>>>>>>>>>>>> nextDwarf.getBasePointerOffsetFromCFA()); >>>>>>>>>>>>> >>>>>>>>>>>>> ?? Extra space after '-' sign. >>>>>>>>>>>>> >>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, >>>>>>>>>>>>> ThreadContext context) { >>>>>>>>>>>>> >>>>>>>>>>>>> ?? It feels like the logic has to be somehow >>>>>>>>>>>>> refactored/simplified as >>>>>>>>>>>>> ?? several typical fragments appears in slightly different >>>>>>>>>>>>> contexts. >>>>>>>>>>>>> ?? But it is not easy to understand what it is. >>>>>>>>>>>>> ?? Could you, please, add some comments to key places >>>>>>>>>>>>> explaining this logic. >>>>>>>>>>>>> ?? 
Then I'll check if it is possible to make it a little >>>>>>>>>>>>> bit simpler. >>>>>>>>>>>>> >>>>>>>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 >>>>>>>>>>>>> Address nextCFA; 111 Address nextPC; 112 113 nextPC = >>>>>>>>>>>>> getNextPC(false); 114 if (nextPC == null) { 115 return >>>>>>>>>>>>> null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long >>>>>>>>>>>>> libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr >>>>>>>>>>>>> != 0L) { // Native frame 121 try { 122 nextDwarf = new >>>>>>>>>>>>> DwarfParser(libptr); 123 } catch (DebuggerException e) { >>>>>>>>>>>>> 124 nextCFA = getNextCFA(null, context); 125 return >>>>>>>>>>>>> (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, >>>>>>>>>>>>> nextCFA, nextPC, null); 126 } 127 >>>>>>>>>>>>> nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = >>>>>>>>>>>>> getNextCFA(nextDwarf, context); 131 return (nextCFA == >>>>>>>>>>>>> null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, >>>>>>>>>>>>> nextPC, nextDwarf); 133 } >>>>>>>>>>>>> >>>>>>>>>>>>> ??The above can be simplified if a DebuggerException can >>>>>>>>>>>>> not be thrown from processDwarf(nextPC): >>>>>>>>>>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>>> ????????? return null; >>>>>>>>>>>>> ??????? } >>>>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>>> >>>>>>>>>>>>> ??????? if (libptr != 0L) { // Native frame >>>>>>>>>>>>> ????????? try { >>>>>>>>>>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>> nextDwarf.processDwarf(nextPC); >>>>>>>>>>>>> ????????? } catch (DebuggerException e) { // Bail out to >>>>>>>>>>>>> Java frame >>>>>>>>>>>>> ????????? } >>>>>>>>>>>>> ??????? } >>>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>>> ??????? return (nextCFA == null) ? 
null : new >>>>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>>> ????? } >>>>>>>>>>>>> >>>>>>>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 >>>>>>>>>>>>> ThreadContext context = thread.getContext(); 137 138 if >>>>>>>>>>>>> (dwarf == null) { // Java frame 139 return >>>>>>>>>>>>> javaSender(context); 140 } 141 142 Address nextPC = >>>>>>>>>>>>> getNextPC(true); 143 if (nextPC == null) { 144 return >>>>>>>>>>>>> null; 145 } 146 147 Address nextCFA; 148 DwarfParser >>>>>>>>>>>>> nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long >>>>>>>>>>>>> libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr >>>>>>>>>>>>> == 0L) { 152 // Next frame might be Java frame 153 nextCFA >>>>>>>>>>>>> = getNextCFA(null, context); 154 return (nextCFA == null) >>>>>>>>>>>>> ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); >>>>>>>>>>>>> 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>> 158 } catch (DebuggerException e) { 159 nextCFA = >>>>>>>>>>>>> getNextCFA(null, context); 160 return (nextCFA == null) ? >>>>>>>>>>>>> null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); >>>>>>>>>>>>> 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 >>>>>>>>>>>>> nextCFA = getNextCFA(nextDwarf, context); 166 return >>>>>>>>>>>>> (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, >>>>>>>>>>>>> nextCFA, nextPC, nextDwarf); 167 } >>>>>>>>>>>>> >>>>>>>>>>>>> ??This one can be also simplified a little: >>>>>>>>>>>>> >>>>>>>>>>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>>>> >>>>>>>>>>>>> ??????? if (dwarf == null) { // Java frame >>>>>>>>>>>>> ????????? return javaSender(context); >>>>>>>>>>>>> ??????? } >>>>>>>>>>>>> ??????? Address nextPC = getNextPC(true); >>>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>>> ????????? return null; >>>>>>>>>>>>> ??????? } >>>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>>> ??????? 
if (!dwarf.isIn(nextPC)) { >>>>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>>>> ??????????? try { >>>>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>> nextDwarf.processDwarf(nextPC); >>>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to >>>>>>>>>>>>> Java frame >>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>> ????????? } >>>>>>>>>>>>> ??????? } >>>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new >>>>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>>> ????? } >>>>>>>>>>>>> >>>>>>>>>>>>> Finally, it looks like just one method could replace both >>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext >>>>>>>>>>>>> context): >>>>>>>>>>>>> >>>>>>>>>>>>> ????? private CFrame commonSender(ThreadProxy thread) { >>>>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>>> ????????? return null; >>>>>>>>>>>>> ??????? } >>>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>>> >>>>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) { >>>>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>>>> ??????????? try { >>>>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>> nextDwarf.processDwarf(nextPC); >>>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to >>>>>>>>>>>>> Java frame >>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>> ????????? } >>>>>>>>>>>>> ??????? } >>>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>>> ??????? return (nextCFA == null) ? 
null : new >>>>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>>> ????? } >>>>>>>>>>>>> >>>>>>>>>>>>> I'm still reviewing the dwarf parser files. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in >>>>>>>>>>>>>> serviceability/sa tests and >>>>>>>>>>>>>> all tests on submit repo >>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). >>>>>>>>>>>>>> Could you review new webrev? >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> The diff from previous webrev is here: >>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>>>>>> ?? webrev: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V >>>>>>>>>>>>>>> Application Binary Interface AMD64 >>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use >>>>>>>>>>>>>>> DWARF in .eh_frame or .debug_frame >>>>>>>>>>>>>>> for stack unwinding. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by >>>>>>>>>>>>>>> default since GCC 4.6, so system >>>>>>>>>>>>>>> library (e.g. libc) might be compiled with this feature. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses >>>>>>>>>>>>>>> base pointer register (RBP). 
>>>>>>>>>>>>>>> So it might be lack of stack frames.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://urldefense.com/v3/__https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaQusmjOqA$
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>

From serguei.spitsyn at oracle.com Wed Mar 11 19:35:44 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Mar 2020 12:35:44 -0700
Subject: RFR: 8240881: several tests are failing due to encoding failures
Message-ID:

An HTML attachment was scrubbed...
URL:

From daniel.daugherty at oracle.com Wed Mar 11 19:49:08 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 11 Mar 2020 15:49:08 -0400
Subject: RFR: 8240881: several tests are failing due to encoding failures
In-Reply-To:
References:
Message-ID:

On 3/11/20 3:35 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix of:
> https://bugs.openjdk.java.net/browse/JDK-8240881
>
> Webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/8240881-encoding-antidelta/

Thumbs up! This is a trivial changeset so you may push with a
single (R)eviewer.

Your anti-delta matches mine and I compared mine to the parent of
the push for JDK-8222489.

My Mach5 Tier[56] job set is not quite finished, but I haven't seen
any signs of the failures yet.

Dan

>
>
> Summary:
>   JDK-8240881 is a regression caused by the fix of:
> https://bugs.openjdk.java.net/browse/JDK-8222489
>      Webrev: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>      Changeset:
> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>
>   The suggested fix is the JDK-8240881 anti-delta.
>
>   As a reviewer and sponsor, I apologize for this regression.
>   The change impact occurred bigger than expected.
>
> Testing:
>   The mac
>
> Thanks,
> Serguei
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ioi.lam at oracle.com Wed Mar 11 19:49:55 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 11 Mar 2020 12:49:55 -0700
Subject: RFR: 8240881: several tests are failing due to encoding failures
In-Reply-To:
References:
Message-ID: <681add21-344f-fc59-4784-5b1c0f0bb851@oracle.com>

Looks good to me.

Thanks
- Ioi

On 3/11/20 12:35 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix of:
> https://bugs.openjdk.java.net/browse/JDK-8240881
>
> Webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/8240881-encoding-antidelta/
>
>
> Summary:
>   JDK-8240881 is a regression caused by the fix of:
> https://bugs.openjdk.java.net/browse/JDK-8222489
>      Webrev: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>      Changeset:
> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>
>   The suggested fix is the JDK-8240881 anti-delta.
>   As a reviewer and sponsor, I apologize for this regression.
>   The change impact occurred bigger than expected.
>
> Testing:
>   The mac
>
> Thanks,
> Serguei
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Wed Mar 11 19:52:38 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Mar 2020 12:52:38 -0700
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To: <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>
References: <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com> <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>
Message-ID: <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com>

Hi Chihiro,

I've tested and pushed your fix but the impact of fix was underestimated.
The fix caused several regressions and the following bug was filed:
  https://bugs.openjdk.java.net/browse/JDK-8240881

Now, I'm working on removing the fix of JDK-8222489 with the anti-delta.
You can find and review my RFR posted on the serviceability-dev mailing list:
  RFR: 8240881: several tests are failing due to encoding failures

You can file another bug as a replacement of JDK-8222489.
I will help you with the information about test regressions caused by it.

Thanks,
Serguei


On 3/10/20 02:54, serguei.spitsyn at oracle.com wrote:
> Hi Chihiro,
>
> Yes, I'll sponsor it.
> Thank you for the update.
>
> Thanks,
> Serguei
>
>
> On 3/8/20 06:05, Chihiro Ito wrote:
>> Hi,
>>
>> I'm sorry. I included "JDK-" in the changeset title. I removed it and
>> updated it.
>>
>> Change set :
>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>
>> Regards,
>> Chihiro
>>
>> On 2020/03/07 (Sat) 23:13, Chihiro Ito wrote:
>>> Hi Serguei and Yasumasa,
>>>
>>> I update the copyright year and created the change set.
>>>
>>> Could you sponsor this, please?
>>> >>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/ >>> Change set : >>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset >>> >>> Regards, >>> Chihiro >>> >>> >>> 2020?3?7?(?) 16:03 Yasumasa Suenaga : >>> >>> >>>> Hi Chihiro, >>>> >>>> I'm also ok with webrev.05 after updating copyright year. >>>> >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote: >>>>> Hi Chichiro, >>>>> >>>>> I'm okay with the fix. >>>>> Could you, please, update the copyright date in || >>>>> src/java.base/share/classes/jdk/internal/vm/VMSupport.java before >>>>> push? >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 3/6/20 07:24, Chihiro Ito wrote: >>>>>> Hi Serguei, >>>>>> >>>>>> Could you review this again, please? >>>>>> >>>>>> Regards, >>>>>> Chihiro >>>>>> >>>>>> >>>>>> 2020?2?27?(?) 22:11 Chihiro Ito: >>>>>>> Hi Ralf, >>>>>>> >>>>>>> Thank you for your advice. >>>>>>> >>>>>>> 1. >>>>>>> The comment of serializePropertiesToByteArray in VMSupport is >>>>>>> "The stream written to the byte array is ISO 8859-1 encoded.". >>>>>>> But the previous implementation does not keep this. I think we >>>>>>> need to implement encode by ISO 8859-1. >>>>>>> >>>>>>> 2. >>>>>>> According to help, the feature of VM.system_properties is just >>>>>>> "Print system properties". The users should not use this output >>>>>>> for loading. The users use it when they want to see System >>>>>>> Properties soon. >>>>>>> >>>>>>> Regards, >>>>>>> Chihiro >>>>>>> >>>>>>> >>>>>>> 2020?2?26?(?) 18:53 Schmelter, Ralf: >>>>>>>> Hi Chihiro, >>>>>>>> >>>>>>>> I have two remarks: >>>>>>>> >>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work >>>>>>>> with the code. While the Properties.store() method claims to >>>>>>>> create ISO Latin 1 String, it really only will create printable >>>>>>>> ASCII characters (apart from the comment, but it is ASCII too >>>>>>>> in this case). 
See Properties.saveConvert, where the char is >>>>>>>> checked for < 0x20 or > 0x7e and then printed as \uxxxx. This >>>>>>>> is important, since the bytes of the ByteArrayOutputStream are >>>>>>>> then send to the jcmd. And jcmd expects UTF-8 encoded strings, >>>>>>>> which is OK if we only used ASCII characters. But a ISO Latin 1 >>>>>>>> character >= 0x80 will break the encoding. Just try using >>>>>>>> \u00DC in your test. >>>>>>>> >>>>>>>> 2. Your change makes it impossible to load the output with >>>>>>>> properties.load(). The old output could be loaded, since it was >>>>>>>> a valid properties file. But yours is not. For example, >>>>>>>> consider the filename c:\test\new. Formerly it would be encoded >>>>>>>> as: >>>>>>>> C\:\\test\\new >>>>>>>> And now it is: >>>>>>>> C:\test\new >>>>>>>> But the properties code would see "\n" as the newline character >>>>>>>> in your encoding. In fact you cannot differentiate between \n, >>>>>>>> \t, \f and \r originally being one or two characters. >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Ralf >>>>>>>> >>>>>>>> >>>>>>>> From: >>>>>>>> serviceability-dev >>>>>>>> On Behalf Of Chihiro Ito >>>>>>>> Sent: Dienstag, 25. Februar 2020 04:45 >>>>>>>> To:serguei.spitsyn at oracle.com >>>>>>>> Cc:serviceability-dev at openjdk.java.net >>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives >>>>>>>> unusable paths on Windows >>>>>>>> >>>>>>>> Hi Serguei, >>>>>>>> >>>>>>>> Thanks for your review and advice. >>>>>>>> >>>>>>>> I modified these. >>>>>>>> Could you review this again, please? >>>>>>>> >>>>>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/ >>>>>>>> >>>>>>>> Regards, >>>>>>>> Chihiro >>>>>>>> > From daniel.daugherty at oracle.com Wed Mar 11 19:58:11 2020 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Wed, 11 Mar 2020 15:58:11 -0400 Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com> References: <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com> <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com> <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com> Message-ID: <319bc269-a35b-210e-faca-d0cae54e0d51@oracle.com> The replacement bug should be filed with this description: ??? [REDO] 8222489 jcmd VM.system_properties gives unusable paths on Windows and should be linked to the original bug also. Dan On 3/11/20 3:52 PM, serguei.spitsyn at oracle.com wrote: > Hi Chihiro, > > I've tested and pushed your fix but the impact of fix was underestimated. > The fix caused several regressions and the following bug was filed: > ? https://bugs.openjdk.java.net/browse/JDK-8240881 > > Now, I'm working on removing the fix of JDK-8222489 with the anti-delta. > You can find and review my RFR posted on the serviceability-dev > mailing list: > ? RFR: 8240881: several tests are failing due to encoding failures > > You can file another bug as a replacement of JDK-8222489. > I will help you with the information about test regressions caused by it. > > Thanks, > Serguei > > > On 3/10/20 02:54, serguei.spitsyn at oracle.com wrote: >> Hi Chihiro, >> >> Yes, I'll sponsor it. >> Thank you for the update. >> >> Thanks, >> Serguei >> >> >> On 3/8/20 06:05, Chihiro Ito wrote: >>> Hi, >>> >>> I'm sorry. I included "JDK-" in the changeset title. I removed it and >>> updated it. >>> >>> Change set : >>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset >>> >>> Regards, >>> Chihiro >>> >>> 2020?3?7?(?) 23:13 Chihiro Ito : >>>> Hi Serguei and Yasumasa, >>>> >>>> I update the copyright year and created the change set. >>>> >>>> Could you sponsor this, please? 
>>>> >>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/ >>>> Change set : >>>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset >>>> >>>> Regards, >>>> Chihiro >>>> >>>> >>>> 2020?3?7?(?) 16:03 Yasumasa Suenaga : >>>> >>>> >>>>> Hi Chihiro, >>>>> >>>>> I'm also ok with webrev.05 after updating copyright year. >>>>> >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Chichiro, >>>>>> >>>>>> I'm okay with the fix. >>>>>> Could you, please, update the copyright date in || >>>>>> src/java.base/share/classes/jdk/internal/vm/VMSupport.java before >>>>>> push? >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 3/6/20 07:24, Chihiro Ito wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> Could you review this again, please? >>>>>>> >>>>>>> Regards, >>>>>>> Chihiro >>>>>>> >>>>>>> >>>>>>> 2020?2?27?(?) 22:11 Chihiro Ito: >>>>>>>> Hi Ralf, >>>>>>>> >>>>>>>> Thank you for your advice. >>>>>>>> >>>>>>>> 1. >>>>>>>> The comment of serializePropertiesToByteArray in VMSupport is >>>>>>>> "The stream written to the byte array is ISO 8859-1 encoded.". >>>>>>>> But the previous implementation does not keep this. I think we >>>>>>>> need to implement encode by ISO 8859-1. >>>>>>>> >>>>>>>> 2. >>>>>>>> According to help, the feature of VM.system_properties is just >>>>>>>> "Print system properties". The users should not use this output >>>>>>>> for loading. The users use it when they want to see System >>>>>>>> Properties soon. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Chihiro >>>>>>>> >>>>>>>> >>>>>>>> 2020?2?26?(?) 18:53 Schmelter, Ralf: >>>>>>>>> Hi Chihiro, >>>>>>>>> >>>>>>>>> I have two remarks: >>>>>>>>> >>>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work >>>>>>>>> with the code. 
While the Properties.store() method claims to >>>>>>>>> create ISO Latin 1 String, it really only will create >>>>>>>>> printable ASCII characters (apart from the comment, but it is >>>>>>>>> ASCII too in this case). See Properties.saveConvert, where the >>>>>>>>> char is checked for < 0x20 or > 0x7e and then printed as >>>>>>>>> \uxxxx. This is important, since the bytes of the >>>>>>>>> ByteArrayOutputStream are then send to the jcmd. And jcmd >>>>>>>>> expects UTF-8 encoded strings, which is OK if we only used >>>>>>>>> ASCII characters. But a ISO Latin 1 character >= 0x80 will >>>>>>>>> break the encoding. Just try using \u00DC in your test. >>>>>>>>> >>>>>>>>> 2. Your change makes it impossible to load the output with >>>>>>>>> properties.load(). The old output could be loaded, since it >>>>>>>>> was a valid properties file. But yours is not. For example, >>>>>>>>> consider the filename c:\test\new. Formerly it would be >>>>>>>>> encoded as: >>>>>>>>> C\:\\test\\new >>>>>>>>> And now it is: >>>>>>>>> C:\test\new >>>>>>>>> But the properties code would see "\n" as the newline >>>>>>>>> character in your encoding. In fact you cannot differentiate >>>>>>>>> between \n, \t, \f and \r originally being one or two characters. >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Ralf >>>>>>>>> >>>>>>>>> >>>>>>>>> From: >>>>>>>>> serviceability-dev >>>>>>>>> On Behalf Of Chihiro Ito >>>>>>>>> Sent: Dienstag, 25. Februar 2020 04:45 >>>>>>>>> To:serguei.spitsyn at oracle.com >>>>>>>>> Cc:serviceability-dev at openjdk.java.net >>>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives >>>>>>>>> unusable paths on Windows >>>>>>>>> >>>>>>>>> Hi Serguei, >>>>>>>>> >>>>>>>>> Thanks for your review and advice. >>>>>>>>> >>>>>>>>> I modified these. >>>>>>>>> Could you review this again, please? 
>>>>>>>>> >>>>>>>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/ >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Chihiro >>>>>>>>> >> > From serguei.spitsyn at oracle.com Wed Mar 11 19:58:34 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 11 Mar 2020 12:58:34 -0700 Subject: RFR: 8240881: several tests are failing due to encoding failures In-Reply-To: References: Message-ID: Hi Dan, Thank you for filing the bug and for the review! My Mach5 job was submitted later, so your job comes in handy - thanks! I'll push the fix. Thanks, Serguei On 3/11/20 12:49, Daniel D. Daugherty wrote: > On 3/11/20 3:35 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix of: >> https://bugs.openjdk.java.net/browse/JDK-8240881 >> >> Webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/8240881-encoding-antidelta/ > > Thumbs up! This is a trivial changeset so you may push with a > single (R)eviewer. > > Your anti-delta matches mine and I compared mine to the parent of > the push for JDK-8222489. > > My Mach5 Tier[56] job set is not quite finished, but I haven't seen > any signs of the failures yet. > > Dan > > >> >> >> Summary: >> JDK-8240881 is a regression caused by the fix of: >> https://bugs.openjdk.java.net/browse/JDK-8222489 >> Webrev: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/ >> Changeset: >> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset >> >> The suggested fix is the JDK-8240881 anti-delta. >> As a reviewer and sponsor, I apologize for this regression. >> The impact of the change turned out to be bigger than expected. >> >> Testing: >>
The mac >> >> Thanks, >> Serguei > From serguei.spitsyn at oracle.com Wed Mar 11 20:13:44 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 11 Mar 2020 13:13:44 -0700 Subject: RFR: 8240881: several tests are failing due to encoding failures In-Reply-To: <681add21-344f-fc59-4784-5b1c0f0bb851@oracle.com> References: <681add21-344f-fc59-4784-5b1c0f0bb851@oracle.com> Message-ID: <2ac3d137-a695-3e43-b44e-499f71e7ec49@oracle.com> Thanks, Ioi! Serguei On 3/11/20 12:49, Ioi Lam wrote: > Looks good to me. > > Thanks > - Ioi > > On 3/11/20 12:35 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix of: >> https://bugs.openjdk.java.net/browse/JDK-8240881 >> >> Webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/8240881-encoding-antidelta/ >> >> >> Summary: >> JDK-8240881 is a regression caused by the fix of: >> https://bugs.openjdk.java.net/browse/JDK-8222489 >> Webrev: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/ >> Changeset: >> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset >> >> The suggested fix is the JDK-8240881 anti-delta. >> As a reviewer and sponsor, I apologize for this regression. >> The impact of the change turned out to be bigger than expected. >> >> Testing: >> The mac >> >> Thanks, >> Serguei > From alexey.menkov at oracle.com Wed Mar 11 22:29:16 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 11 Mar 2020 15:29:16 -0700 Subject: RFR(XS): JDK-8217441: Failure of ::realloc() should be handled correctly in sawindbg.cpp Message-ID: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com> Hi all, please review small (and I'd say trivial) fix for https://bugs.openjdk.java.net/browse/JDK-8217441 webrev: http://cr.openjdk.java.net/~amenkov/jdk15/WinDbg_realloc/webrev/ from realloc() spec: On failure, returns a null pointer. The original pointer ptr remains valid and may need to be deallocated with free() or realloc().
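A minimal sketch of the pattern that spec text calls for, assuming a hypothetical growable text buffer (TextBuffer, append_text and release are illustrative names, not code from sawindbg.cpp or the webrev): the result of realloc() goes into a temporary first, so the original block is neither lost nor leaked when the call fails.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical growable text buffer; the names are illustrative only.
struct TextBuffer {
  char*       data = nullptr;
  std::size_t len  = 0;  // bytes used, excluding the terminating NUL
};

// Appends `text` to `buf`. Returns false and leaves `buf` untouched
// if the reallocation fails.
bool append_text(TextBuffer& buf, const char* text) {
  assert(text != nullptr);
  const std::size_t add = std::strlen(text);
  // Keep the result in a temporary: on failure realloc() returns a null
  // pointer while the old block stays valid and is still owned by `buf`.
  char* grown = static_cast<char*>(std::realloc(buf.data, buf.len + add + 1));
  if (grown == nullptr) {
    return false;  // caller can still use or free buf.data
  }
  std::memcpy(grown + buf.len, text, add + 1);  // copies the NUL as well
  buf.data = grown;
  buf.len += add;
  return true;
}

// Frees the buffer and resets it to the empty state.
void release(TextBuffer& buf) {
  std::free(buf.data);
  buf.data = nullptr;
  buf.len = 0;
}
```

Writing `buf.data = realloc(buf.data, ...)` directly would overwrite the only pointer to the old block with null on failure, which is the kind of leak the spec wording warns about.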
--alex From chris.plummer at oracle.com Wed Mar 11 23:04:14 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 11 Mar 2020 16:04:14 -0700 Subject: RFR(XS): JDK-8217441: Failure of ::realloc() should be handled correctly in sawindbg.cpp In-Reply-To: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com> References: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com> Message-ID: <2a1fab61-6075-6391-29ba-4e7de45b0930@oracle.com> Looks good. Chris On 3/11/20 3:29 PM, Alex Menkov wrote: > Hi all, > > please review small (and I'd say trivial) fix for > https://bugs.openjdk.java.net/browse/JDK-8217441 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk15/WinDbg_realloc/webrev/ > > from realloc() spec: > On failure, returns a null pointer. The original pointer ptr remains > valid and may need to be deallocated with free() or realloc(). > > --alex From serguei.spitsyn at oracle.com Wed Mar 11 23:04:50 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 11 Mar 2020 23:04:50 +0000 (UTC) Subject: RFR(XS): JDK-8217441: Failure of ::realloc() should be handled correctly in sawindbg.cpp In-Reply-To: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com> References: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com> Message-ID: Hi Alex, It looks good to me. It seems, returning S_FALSE from SAOutputCallbacks::Output should be okay as the same is done when nullptr is returned from malloc. Thanks, Serguei On 3/11/20 15:29, Alex Menkov wrote: > Hi all, > > please review small (and I'd say trivial) fix for > https://bugs.openjdk.java.net/browse/JDK-8217441 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk15/WinDbg_realloc/webrev/ > > from realloc() spec: > On failure, returns a null pointer. The original pointer ptr remains > valid and may need to be deallocated with free() or realloc(). 
> > --alex From alexey.menkov at oracle.com Wed Mar 11 23:32:19 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 11 Mar 2020 16:32:19 -0700 Subject: RFR(XS): JDK-8217441: Failure of ::realloc() should be handled correctly in sawindbg.cpp In-Reply-To: References: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com> Message-ID: <146669f5-49d7-99a9-4e9f-5ba47f91c69d@oracle.com> On 03/11/2020 16:04, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > It looks good to me. > It seems returning S_FALSE from SAOutputCallbacks::Output should be > okay as the same is done when nullptr is returned from malloc. According to MSDN, the returned value is ignored by the debugger engine, but I kept it as it was. --alex > > Thanks, > Serguei > > > On 3/11/20 15:29, Alex Menkov wrote: >> Hi all, >> >> please review small (and I'd say trivial) fix for >> https://bugs.openjdk.java.net/browse/JDK-8217441 >> webrev: >> http://cr.openjdk.java.net/~amenkov/jdk15/WinDbg_realloc/webrev/ >> >> from realloc() spec: >> On failure, returns a null pointer. The original pointer ptr remains >> valid and may need to be deallocated with free() or realloc(). >> >> --alex > From suenaga at oss.nttdata.com Thu Mar 12 00:10:10 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 12 Mar 2020 09:10:10 +0900 Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF In-Reply-To: References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com> <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com> <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com> <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com> Message-ID: On 2020/03/12 0:31, Kevin Walls wrote: > Hi - > > OK great, it checks for a zero-length dwarf entry. > > I did a rebuild and tested it locally; it works here, so all looks good to me. Thanks Kevin!
> We may in future want to work on src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/amd64/BsdAMD64CFrame.java to use Dwarf similarly, that's why I mentioned the platform-neutral directory name, but I have no issue with that happening in the future. I do not have a Mac, so I cannot work on it... Of course, if you (or other serviceability folks) work on it, I will help. Yasumasa > Thanks > Kevin > > > On 11/03/2020 09:49, Yasumasa Suenaga wrote: >> Hi, >> >> Thanks David and Ioi for sharing the status. >> I've fixed the problem in the new webrev (mach5-one-ysuenaga-JDK-8234624-6-20200311-0827-9367344): >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.06/ >> >> Diff from webrev.05 is here: >> >> http://hg.openjdk.java.net/jdk/submit/rev/e3d12785f087 >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/03/11 14:59, David Holmes wrote: >>> Hi Yasumasa, >>> >>> Partial hs_err info below. >>> >>> David >>> ----- >>> >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> #  SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800 >>> # >>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-11-0447267.suenaga.source) >>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>> # Problematic frame: >>> # C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c >>> # >>> # Core dump will be written.
Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798) >>> # >>> # If you would like to submit a bug report, please visit: >>> # https://urldefense.com/v3/__https://bugreport.java.com/bugreport/crash.jsp__;!!GqivPVa7Brio!OghfqRRRHbAZloG3aVJ244OPPTcCQOwYIl_vm6vU_toLb9qFzTUirVBEHn2tfDp26A$ # The crash happened outside the Java Virtual Machine in native code. >>> # See problematic frame for where to report the bug. >>> # >>> >>> ---------------? S U M M A R Y ------------ >>> >>> Command Line: >>> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar >>> 
-Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug -Xms8m -Djdk.module.main=jdk.hotspot.agent jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770 >>> >>> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 0m 3s) >>> >>> ---------------? T H R E A D? --------------- >>> >>> Current thread (0x00007fdf5c032000):? JavaThread "main" [_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)] >>> >>> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000], sp=0x00007fdf63b9d190, free space=1020k >>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >>> C? [libsaproc.so+0x487c]? DwarfParser::process_dwarf(unsigned long)+0x2c >>> j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal >>> j? 
sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal >>> v? ~StubRoutines::call_stub >>> V? [libjvm.so+0xc2291c]? JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac >>> V? [libjvm.so+0xd31970]? jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370 >>> V? [libjvm.so+0xd36202]? jni_CallStaticVoidMethod+0x222 >>> C? [libjli.so+0x4bed]? JavaMain+0xbcd >>> C? [libjli.so+0x80a9]? 
ThreadJavaMain+0x9 >>> >>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) >>> j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent at 15-internal >>> j sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent at 15-internal >>> j? sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent at 15-internal >>> v? 
~StubRoutines::call_stub >>> >>> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79 >>> >>> Register to memory mapping: >>> >>> RAX=0x00007f7e4dfe3229 is an unknown value >>> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 >>> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62 >>> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00 >>> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000 >>> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000 >>> RSI=0x0000000000000004 is an unknown value >>> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00 >>> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00 >>> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01 >>> R10=0x00000000ffffffff is an unknown value >>> R11=0x000000000100527a is an unknown value >>> R12=0x00007fded5076b79 is an unknown value >>> R13=0x00007f7da2f8e68a is an unknown value >>> R14=0x00007f7dbdf62b1d is an unknown value >>> R15=0x00007fdf5c032000 is a thread >>> >>> >>> Registers: >>> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85 >>> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080 >>> R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a >>> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000 >>> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004 >>> ?? TRAPNO=0x000000000000000e >>> >>> Top of Stack: (sp=0x00007fdf63b9d190) >>> 0x00007fdf63b9d190:?? 00007fdf209d0980 0000000000000000 >>> 0x00007fdf63b9d1a0:?? 00007fdf209d0980 00007fdf63b9d258 >>> 0x00007fdf63b9d1b0:?? 
00007fdf63b9d228 00007fdf44778dbe >>> 0x00007fdf63b9d1c0:?? 000000000146c380 00007fdf5c032000 >>> >>> Instructions: (pc=0x00007fdf2000e87c) >>> 0x00007fdf2000e77c:?? 89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00 >>> 0x00007fdf2000e78c:?? 00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80 >>> 0x00007fdf2000e79c:?? 78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42 >>> 0x00007fdf2000e7ac:?? 01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48 >>> 0x00007fdf2000e7bc:?? 89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43 >>> 0x00007fdf2000e7cc:?? 3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43 >>> 0x00007fdf2000e7dc:?? 30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff >>> 0x00007fdf2000e7ec:?? ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e >>> 0x00007fdf2000e7fc:?? 41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2 >>> 0x00007fdf2000e80c:?? ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90 >>> 0x00007fdf2000e81c:?? 0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76 >>> 0x00007fdf2000e82c:?? 1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48 >>> 0x00007fdf2000e83c:?? 83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9 >>> 0x00007fdf2000e84c:?? 31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56 >>> 0x00007fdf2000e85c:?? 41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b >>> 0x00007fdf2000e86c:?? a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08 >>> 0x00007fdf2000e87c:?? 41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0 >>> 0x00007fdf2000e88c:?? 75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d >>> 0x00007fdf2000e89c:?? 8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc >>> 0x00007fdf2000e8ac:?? 48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03 >>> 0x00007fdf2000e8bc:?? b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c >>> 0x00007fdf2000e8cc:?? 39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8 >>> 0x00007fdf2000e8dc:?? 90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01 >>> 0x00007fdf2000e8ec:?? 0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48 >>> 0x00007fdf2000e8fc:?? 
83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48 >>> 0x00007fdf2000e90c:?? 09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea >>> 0x00007fdf2000e91c:?? 4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b >>> 0x00007fdf2000e92c:?? 31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00 >>> 0x00007fdf2000e93c:?? 00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00 >>> 0x00007fdf2000e94c:?? 00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff >>> 0x00007fdf2000e95c:?? 48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89 >>> 0x00007fdf2000e96c:?? 8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff >>> >>> >>> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote: >>>> Hi Kevin, >>>> >>>> I saw 2 errors on submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475). >>>> So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448). >>>> >>>> ?? Last change on submit repo is here: >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/ >>>> >>>> Can you share details on submit repo? >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/03/11 11:07, Yasumasa Suenaga wrote: >>>>> Hi Kevin, >>>>> >>>>> I guess first program header in the libraries which are on your machine has exec flag (you can check it with `readelf -l`). >>>>> So I tweaked my patch (initial value of exec_start and exec_end set to -1) in new webrev. >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/ >>>>> >>>>> This webrev contains the fix for your comment (typo and DW_CFA_advance_loc4). >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/03/11 8:53, Kevin Walls wrote: >>>>>> Hi - >>>>>> >>>>>> In testing I wasn't seeing any of the Dwarf code triggered. >>>>>> >>>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries... >>>>>> >>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c >>>>>> >>>>>> ??? if (fill_instr_info(newlib)) { >>>>>> ????? 
if (!read_eh_frame(ph, newlib)) { >>>>>> >>>>>> fill_instr_info is failing, and we never get to read_eh_frame(). >>>>>> >>>>>> output like: >>>>>> >>>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4 >>>>>> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so >>>>>> >>>>>> (similar for all libraries). >>>>>> >>>>>> fill_instr fails if: >>>>>> >>>>>> ??if ((lib->exec_start == 0L) || (lib->exec_end == 0L)) >>>>>> >>>>>> ...but isn't exec_start relative to the library address? It's the value of ph->vaddr and it is often zero. >>>>>> >>>>>> I added some booleans and did: >>>>>> >>>>>> 185?????? if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) { >>>>>> 186???????? lib->exec_start = ph->p_vaddr; >>>>>> 187???????? found_start =true; >>>>>> 188?????? } >>>>>> >>>>>> (similarly for end) and only failed if: >>>>>> >>>>>> 201?? if (!found_start || !found_end) { >>>>>> 202???? return false; >>>>>> >>>>>> ...and now it's better. ? I go from: >>>>>> >>>>>> ----------------- 3306 ----------------- >>>>>> 0x00007f75824acd2d????? __GI___pthread_timedjoin_ex + 0x17d >>>>>> >>>>>> to: >>>>>> >>>>>> ----------------- 31127 ----------------- >>>>>> 0x00007fa284d78d2d????? __GI___pthread_timedjoin_ex + 0x17d >>>>>> 0x00007fa2857aaf2d????? CallJavaMainInNewThread + 0xad >>>>>> 0x00007fa2857a74ed????? ContinueInNewThread + 0x4d >>>>>> 0x00007fa2857a8c49????? JLI_Launch + 0x1529 >>>>>> 0x000055af1b78db1c????? main + 0x11c >>>>>> >>>>>> >>>>>> Thanks >>>>>> Kevin >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote: >>>>>> >>>>>>> Hi Kevin, >>>>>>> >>>>>>> Thanks for your comment! >>>>>>> >>>>>>> On 2020/03/10 18:58, Kevin Walls wrote: >>>>>>>> Hi Yasumasa , >>>>>>>> >>>>>>>> The changes build OK for me in the latest jdk, and things still work. >>>>>>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. 
jstack has not failed on native frames. >>>>>>>> >>>>>>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes. >>>>>>> >>>>>>> You can see the problem with JShell. >>>>>>> Some Java frames would not be seen in mixed jstack. >>>>>>> >>>>>>> >>>>>>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future: >>>>>>>> >>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp: >>>>>>>> >>>>>>>> DW_CFA_nop - shouldn't this continue instead of return? >>>>>>>> (It may "never" happen, but a nop could appear within some other instructions?) >>>>>>> >>>>>>> DW_CFA_nop is used for padding, so we can ignore (return immediately) it. >>>>>>> >>>>>>> >>>>>>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state". >>>>>>> >>>>>>> I will fix it. >>>>>>> >>>>>>> >>>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4.? I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, would this mean a lot of code on one line. 8-) >>>>>>>> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness. >>>>>>> >>>>>>> I will add DW_CFA_advance_loc4. >>>>>>> >>>>>>> >>>>>>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser. >>>>>>>> >>>>>>>> DwarfParser::process_dwarf() moves _buf. >>>>>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is. >>>>>>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout. 
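To make the point about _buf concrete, here is a sketch of LEB128 decoding in which the caller passes the cursor explicitly instead of relying on a member pointer. This is not the code in dwarf.cpp: read_uleb128 and read_sleb128 below are hypothetical free functions written only for illustration, following the LEB128 encoding described in the DWARF specification (7 payload bits per byte, high bit set on every byte except the last).

```cpp
#include <cassert>
#include <cstdint>

// Decodes an unsigned LEB128 value at *cursor and advances *cursor past
// the bytes consumed. The caller owns the cursor, so no parser state moves.
std::uint64_t read_uleb128(const unsigned char** cursor) {
  assert(cursor != nullptr && *cursor != nullptr);
  std::uint64_t result = 0;
  unsigned shift = 0;
  unsigned char byte;
  do {
    byte = *(*cursor)++;
    if (shift < 64) {  // ignore payload bits that would not fit in 64 bits
      result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    }
    shift += 7;
  } while (byte & 0x80);  // high bit set means another byte follows
  return result;
}

// Decodes a signed LEB128 value; same cursor contract as read_uleb128.
std::int64_t read_sleb128(const unsigned char** cursor) {
  assert(cursor != nullptr && *cursor != nullptr);
  std::uint64_t result = 0;
  unsigned shift = 0;
  unsigned char byte;
  do {
    byte = *(*cursor)++;
    if (shift < 64) {
      result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    }
    shift += 7;
  } while (byte & 0x80);
  // Bit 0x40 of the final byte is the sign bit: extend it upwards.
  if (shift < 64 && (byte & 0x40)) {
    result |= ~static_cast<std::uint64_t>(0) << shift;
  }
  // Assumes two's-complement conversion, as on the platforms discussed here.
  return static_cast<std::int64_t>(result);
}
```

For example, the bytes E5 8E 26 decode to 624485 with read_uleb128 and leave the cursor three bytes further on; a caller that needs to re-read from the original position simply keeps its own copy of the pointer.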
>>>>>>>> >>>>>>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary. >>>>>>> >>>>>>> I referred to the GDB and binutils sources when creating this patch. >>>>>>> They seem to use similar code, because we need to evaluate DWARF instructions one by one to get the value that relates to the specified PC. >>>>>>> >>>>>>> >>>>>>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an >>>>>>>> OS-neutral directory? It's odd to label it as Linux-specific. >>>>>>> >>>>>>> Windows, at least, does not use DWARF; it uses another mechanism. >>>>>>> >>>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019 >>>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF. >>>>>>> If DWARF is used in them, I can move the DWARF-related code to a posix directory. >>>>>>> >>>>>>> >>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp: >>>>>>>> Thanks for changing "can_parsable" which was in the earlier version. 8-) >>>>>>>> >>>>>>>> >>>>>>>> These are just comments to mainly say it looks good, and somebody else out there has read it. >>>>>>>> I will look for a system that shows the problem, and get back to you again! >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>>> Many thanks >>>>>>>> Kevin >>>>>>>> >>>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year). >>>>>>>>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk. >>>>>>>>> Could you review it?
>>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/ >>>>>>>>> >>>>>>>>> I need one more reviewer to push. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote: >>>>>>>>>> PING: Could you review it? >>>>>>>>>> >>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>>>>> >>>>>>>>>> This change has already been reviewed by Serguei. >>>>>>>>>> I need one more reviewer to push. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote: >>>>>>>>>>> PING: Could you review this change? >>>>>>>>>>> >>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>>>>>> >>>>>>>>>>> I believe this change helps troubleshooters with postmortem analysis. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote: >>>>>>>>>>>> PING: Could you review it? >>>>>>>>>>>> >>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>>>>>>>>>>> >>>>>>>>>>>> I updated the webrev. I discussed it with Serguei off-list and refactored webrev.02. >>>>>>>>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for your comment! >>>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new webrev.
>>>>>>>>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said. >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/ >>>>>>>>>>>>> >>>>>>>>>>>>> This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487). >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>> >>>>>>>>>>>>>> This is nice move in general. >>>>>>>>>>>>>> Thank you for working on this! >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'd suggest to simplify the logic by refactoring to something like below: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?????????? 
long libptr = dbg.findLibPtrByAddress(pc); >>>>>>>>>>>>>> ?????????? Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame >>>>>>>>>>>>>> ?????????? DwarfParser dwarf = null; >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?????????? if (libptr != 0L) { // Native frame >>>>>>>>>>>>>> ???????????? try { >>>>>>>>>>>>>> ?????????????? dwarf = new DwarfParser(libptr); >>>>>>>>>>>>>> ?????????????? dwarf.processDwarf(pc); >>>>>>>>>>>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && >>>>>>>>>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>>>>>>>>> ???????????????????????????????? ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>>>>>>>>> ???????????????????????????????? : context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // bail out to Java frame case >>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ????????? if (cfa == null) { >>>>>>>>>>>>>> ??????????? return null; >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA() >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? Better to rename 'ofs' => 'offs'. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA()); >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? Extra space after '-' sign. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) { >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? It feels like the logic has to be somehow refactored/simplified as >>>>>>>>>>>>>> ?? several typical fragments appears in slightly different contexts. 
>>>>>>>>>>>>>> ?? But it is not easy to understand what it is. >>>>>>>>>>>>>> ?? Could you, please, add some comments to key places explaining this logic. >>>>>>>>>>>>>> ?? Then I'll check if it is possible to make it a little bit simpler. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 } >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC): >>>>>>>>>>>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>>>> ????????? return null; >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??????? if (libptr != 0L) { // Native frame >>>>>>>>>>>>>> ????????? try { >>>>>>>>>>>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>>> nextDwarf.processDwarf(nextPC); >>>>>>>>>>>>>> ????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>>>> ??????? return (nextCFA == null) ? 
null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>>>> ????? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? null : new >>>>>>>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 167 } >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??This one can be also simplified a little: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??????? if (dwarf == null) { // Java frame >>>>>>>>>>>>>> ????????? return javaSender(context); >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? Address nextPC = getNextPC(true); >>>>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>>>> ????????? return null; >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>>>> ??????? if (!dwarf.isIn(nextPC)) { >>>>>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>>>>> ??????????? try { >>>>>>>>>>>>>> ????????????? 
nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>>> nextDwarf.processDwarf(nextPC); >>>>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>>>> ????? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> Finally, it looks like just one method could replace both >>>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context): >>>>>>>>>>>>>> >>>>>>>>>>>>>> ????? private CFrame commonSender(ThreadProxy thread) { >>>>>>>>>>>>>> ??????? ThreadContext context = thread.getContext(); >>>>>>>>>>>>>> ??????? Address nextPC = getNextPC(false); >>>>>>>>>>>>>> ??????? if (nextPC == null) { >>>>>>>>>>>>>> ????????? return null; >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? DwarfParser nextDwarf = null; >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) { >>>>>>>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>>>>> ????????? if (libptr != 0L) { >>>>>>>>>>>>>> ??????????? try { >>>>>>>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>>>>>>>>>>> nextDwarf.processDwarf(nextPC); >>>>>>>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>>>>>>>>>>> ??????????? } >>>>>>>>>>>>>> ????????? } >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>>>>>>>>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>>>>>>>>>>> ????? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm still reviewing the dwarf parser files. 
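The "try DWARF, otherwise bail out to the Java frame case" pattern suggested above can be sketched in isolation. This is a minimal sketch, not the SA code: `CfaFallbackSketch`, `UnwindRule`, `UnwindException`, `Frame` and the register map are hypothetical stand-ins for `DwarfParser`, `DebuggerException`, `LinuxAMD64CFrame` and `ThreadContext`. It assumes that a failed DWARF lookup should leave both the CFA and the parser reference in the RBP-based default state.

```java
import java.util.Map;
import java.util.function.LongFunction;

/** Illustration only: none of these types exist in the Serviceability Agent. */
final class CfaFallbackSketch {

    /** Hypothetical stand-in for the SA DebuggerException. */
    static final class UnwindException extends RuntimeException {
        UnwindException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for what a DWARF parser reports for one pc. */
    static final class UnwindRule {
        final String cfaRegister;
        final long cfaOffset;
        final boolean bpOffsetAvailable;

        UnwindRule(String cfaRegister, long cfaOffset, boolean bpOffsetAvailable) {
            this.cfaRegister = cfaRegister;
            this.cfaOffset = cfaOffset;
            this.bpOffsetAvailable = bpOffsetAvailable;
        }
    }

    /** The chosen CFA plus the rule that produced it (null means RBP fallback). */
    static final class Frame {
        final long cfa;
        final UnwindRule rule;

        Frame(long cfa, UnwindRule rule) {
            this.cfa = cfa;
            this.rule = rule;
        }
    }

    /** Returns the top frame, or null if no CFA can be determined. */
    static Frame topFrame(long libptr, long pc, Map<String, Long> registers,
                          LongFunction<UnwindRule> dwarfLookup) {
        Long cfa = registers.get("rbp"); // default: treat it as a Java frame
        UnwindRule rule = null;

        if (libptr != 0L) { // pc belongs to a native library
            try {
                UnwindRule parsed = dwarfLookup.apply(pc);
                boolean useRbpDirectly =
                        "rbp".equals(parsed.cfaRegister) && !parsed.bpOffsetAvailable;
                if (!useRbpDirectly) {
                    Long base = registers.get(parsed.cfaRegister);
                    cfa = (base == null) ? null : Long.valueOf(base + parsed.cfaOffset);
                }
                rule = parsed;
            } catch (UnwindException e) {
                // Bail out to the Java frame case: cfa stays RBP, rule stays null.
            }
        }
        return (cfa == null) ? null : new Frame(cfa, rule);
    }

    public static void main(String[] args) {
        Map<String, Long> regs = Map.of("rbp", 0x1000L, "rsp", 0x2000L);
        Frame f = topFrame(1L, 0x40L, regs, pc -> new UnwindRule("rsp", 16L, false));
        System.out.println("cfa = 0x" + Long.toHexString(f.cfa));
    }
}
```

Note that the sketch assigns to the already declared `cfa` instead of declaring it again inside the `try` block, and it only publishes the parser result once the lookup has fully succeeded.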
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java. It works fine in serviceability/sa tests and
>>>>>>>>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>>>> Could you review the new webrev?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The diff from the previous webrev is here:
>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>>>>>>>>>>>>> libraries (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not do so; it uses the base pointer register (RBP).
>>>>>>>>>>>>>>>> So stack frames might be missing from its output.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> [1] https://urldefense.com/v3/__https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaQusmjOqA$ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> > From suenaga at oss.nttdata.com Thu Mar 12 01:32:58 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 12 Mar 2020 10:32:58 +0900 Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com> References: <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com> <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com> <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com> Message-ID: Hi, AFAICS failure tests which are listed in JBS (JDK-8240881) seems to be caused by HotSpotVirtualMachine::getSystemProperties. It would load the result of PrintSystemPropertiesDCmd via Properties::load. So we have to compliant the spec of Properties. OTOH it is not described in spec (JDK-7120511: introducing VM.system_properties, and help message). Thus I think we need to add new option for VM.system_properties for showing raw values (e.g. -raw). If do so, we need CSR. What do you think? Yasumasa On 2020/03/12 4:52, serguei.spitsyn at oracle.com wrote: > Hi Chihiro, > > I've tested and pushed your fix but the impact of fix was underestimated. > The fix caused several regressions and the following bug was filed: > ? https://bugs.openjdk.java.net/browse/JDK-8240881 > > Now, I'm working on removing the fix of JDK-8222489 with the anti-delta. > You can find and review my RFR posted on the serviceability-dev mailing list: > ? 
RFR: 8240881: several tests are failing due to encoding failures > > You can file another bug as a replacement of JDK-8222489. > I will help you with the information about test regressions caused by it. > > Thanks, > Serguei > > > On 3/10/20 02:54, serguei.spitsyn at oracle.com wrote: >> Hi Chihiro, >> >> Yes, I'll sponsor it. >> Thank you for the update. >> >> Thanks, >> Serguei >> >> >> On 3/8/20 06:05, Chihiro Ito wrote: >>> Hi, >>> >>> I'm sorry. I included "JDK-" in the changeset title. I removed it and >>> updated it. >>> >>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset >>> >>> Regards, >>> Chihiro >>> >>> 2020?3?7?(?) 23:13 Chihiro Ito : >>>> Hi Serguei and Yasumasa, >>>> >>>> I update the copyright year and created the change set. >>>> >>>> Could you sponsor this, please? >>>> >>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/ >>>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset >>>> >>>> Regards, >>>> Chihiro >>>> >>>> >>>> 2020?3?7?(?) 16:03 Yasumasa Suenaga : >>>> >>>> >>>>> Hi Chihiro, >>>>> >>>>> I'm also ok with webrev.05 after updating copyright year. >>>>> >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Chichiro, >>>>>> >>>>>> I'm okay with the fix. >>>>>> Could you, please, update the copyright date in || src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push? >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 3/6/20 07:24, Chihiro Ito wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> Could you review this again, please? >>>>>>> >>>>>>> Regards, >>>>>>> Chihiro >>>>>>> >>>>>>> >>>>>>> 2020?2?27?(?) 22:11 Chihiro Ito: >>>>>>>> Hi Ralf, >>>>>>>> >>>>>>>> Thank you for your advice. >>>>>>>> >>>>>>>> 1. >>>>>>>> The comment of serializePropertiesToByteArray in VMSupport is "The stream written to the byte array is ISO 8859-1 encoded.". >>>>>>>> But the previous implementation does not keep this. 
I think we need to implement encode by ISO 8859-1. >>>>>>>> >>>>>>>> 2. >>>>>>>> According to help, the feature of VM.system_properties is just "Print system properties". The users should not use this output for loading. The users use it when they want to see System Properties soon. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Chihiro >>>>>>>> >>>>>>>> >>>>>>>> 2020?2?26?(?) 18:53 Schmelter, Ralf: >>>>>>>>> Hi Chihiro, >>>>>>>>> >>>>>>>>> I have two remarks: >>>>>>>>> >>>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create ISO Latin 1 String, it really only will create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then send to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters. But a ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test. >>>>>>>>> >>>>>>>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as: >>>>>>>>> C\:\\test\\new >>>>>>>>> And now it is: >>>>>>>>> C:\test\new >>>>>>>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters. >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Ralf >>>>>>>>> >>>>>>>>> >>>>>>>>> From: serviceability-dev On Behalf Of Chihiro Ito >>>>>>>>> Sent: Dienstag, 25. 
Februar 2020 04:45 >>>>>>>>> To:serguei.spitsyn at oracle.com >>>>>>>>> Cc:serviceability-dev at openjdk.java.net >>>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows >>>>>>>>> >>>>>>>>> Hi Serguei, >>>>>>>>> >>>>>>>>> Thanks for your review and advice. >>>>>>>>> >>>>>>>>> I modified these. >>>>>>>>> Could you review this again, please? >>>>>>>>> >>>>>>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/ >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Chihiro >>>>>>>>> >> > From chiroito107 at gmail.com Thu Mar 12 03:15:40 2020 From: chiroito107 at gmail.com (Chihiro Ito) Date: Thu, 12 Mar 2020 12:15:40 +0900 Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: References: <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com> <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com> <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com> Message-ID: Hi Serguei, I could not fail these tests on my environment. I would like to see more detail bug information especially the environment variable. Could you share with me the test-result, please? Regards, Chihiro 2020?3?12?(?) 10:32 Yasumasa Suenaga : > > Hi, > > AFAICS failure tests which are listed in JBS (JDK-8240881) seems to be caused by HotSpotVirtualMachine::getSystemProperties. > It would load the result of PrintSystemPropertiesDCmd via Properties::load. So we have to compliant the spec of Properties. > > OTOH it is not described in spec (JDK-7120511: introducing VM.system_properties, and help message). > > Thus I think we need to add new option for VM.system_properties for showing raw values (e.g. -raw). > If do so, we need CSR. > > What do you think? 
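The round-trip constraint discussed in this thread can be reproduced with `java.util.Properties` alone. The following is a minimal sketch, assuming only the documented `store`/`load` escaping rules; the class name `PropertiesEscapingDemo` and its helper methods are made up for illustration and are not part of the fix under review.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustration only: shows why unescaped backslashes cannot be loaded back. */
final class PropertiesEscapingDemo {

    /** Serializes a single key/value pair the way Properties.store() does. */
    static String store(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // writes a date comment plus the escaped pair
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Parses properties text and returns the value stored under the key. */
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String original = "C:\\test\\new"; // the Windows path C:\test\new

        // Escaped form: the value survives the round trip unchanged.
        String escaped = store("path", original);
        System.out.println(original.equals(load(escaped, "path")));

        // Raw form: "\t" and "\n" are read back as tab and newline.
        String raw = "path=" + original;
        System.out.println(original.equals(load(raw, "path")));
    }
}
```

This is why a human-readable (raw) output and a loadable output cannot be the same text, and why a separate option for raw values was proposed.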
> > > Yasumasa > > > On 2020/03/12 4:52, serguei.spitsyn at oracle.com wrote: > > Hi Chihiro, > > > > I've tested and pushed your fix but the impact of fix was underestimated. > > The fix caused several regressions and the following bug was filed: > > https://bugs.openjdk.java.net/browse/JDK-8240881 > > > > Now, I'm working on removing the fix of JDK-8222489 with the anti-delta. > > You can find and review my RFR posted on the serviceability-dev mailing list: > > RFR: 8240881: several tests are failing due to encoding failures > > > > You can file another bug as a replacement of JDK-8222489. > > I will help you with the information about test regressions caused by it. > > > > Thanks, > > Serguei > > > > > > On 3/10/20 02:54, serguei.spitsyn at oracle.com wrote: > >> Hi Chihiro, > >> > >> Yes, I'll sponsor it. > >> Thank you for the update. > >> > >> Thanks, > >> Serguei > >> > >> > >> On 3/8/20 06:05, Chihiro Ito wrote: > >>> Hi, > >>> > >>> I'm sorry. I included "JDK-" in the changeset title. I removed it and > >>> updated it. > >>> > >>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset > >>> > >>> Regards, > >>> Chihiro > >>> > >>> 2020?3?7?(?) 23:13 Chihiro Ito : > >>>> Hi Serguei and Yasumasa, > >>>> > >>>> I update the copyright year and created the change set. > >>>> > >>>> Could you sponsor this, please? > >>>> > >>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/ > >>>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset > >>>> > >>>> Regards, > >>>> Chihiro > >>>> > >>>> > >>>> 2020?3?7?(?) 16:03 Yasumasa Suenaga : > >>>> > >>>> > >>>>> Hi Chihiro, > >>>>> > >>>>> I'm also ok with webrev.05 after updating copyright year. > >>>>> > >>>>> > >>>>> Yasumasa > >>>>> > >>>>> > >>>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote: > >>>>>> Hi Chichiro, > >>>>>> > >>>>>> I'm okay with the fix. 
> >>>>>> Could you, please, update the copyright date in || src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push? > >>>>>> > >>>>>> Thanks, > >>>>>> Serguei > >>>>>> > >>>>>> > >>>>>> On 3/6/20 07:24, Chihiro Ito wrote: > >>>>>>> Hi Serguei, > >>>>>>> > >>>>>>> Could you review this again, please? > >>>>>>> > >>>>>>> Regards, > >>>>>>> Chihiro > >>>>>>> > >>>>>>> > >>>>>>> 2020?2?27?(?) 22:11 Chihiro Ito: > >>>>>>>> Hi Ralf, > >>>>>>>> > >>>>>>>> Thank you for your advice. > >>>>>>>> > >>>>>>>> 1. > >>>>>>>> The comment of serializePropertiesToByteArray in VMSupport is "The stream written to the byte array is ISO 8859-1 encoded.". > >>>>>>>> But the previous implementation does not keep this. I think we need to implement encode by ISO 8859-1. > >>>>>>>> > >>>>>>>> 2. > >>>>>>>> According to help, the feature of VM.system_properties is just "Print system properties". The users should not use this output for loading. The users use it when they want to see System Properties soon. > >>>>>>>> > >>>>>>>> Regards, > >>>>>>>> Chihiro > >>>>>>>> > >>>>>>>> > >>>>>>>> 2020?2?26?(?) 18:53 Schmelter, Ralf: > >>>>>>>>> Hi Chihiro, > >>>>>>>>> > >>>>>>>>> I have two remarks: > >>>>>>>>> > >>>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create ISO Latin 1 String, it really only will create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then send to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters. But a ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test. > >>>>>>>>> > >>>>>>>>> 2. Your change makes it impossible to load the output with properties.load(). 
The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as: > >>>>>>>>> C\:\\test\\new > >>>>>>>>> And now it is: > >>>>>>>>> C:\test\new > >>>>>>>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters. > >>>>>>>>> > >>>>>>>>> Best regards, > >>>>>>>>> Ralf > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> From: serviceability-dev On Behalf Of Chihiro Ito > >>>>>>>>> Sent: Dienstag, 25. Februar 2020 04:45 > >>>>>>>>> To:serguei.spitsyn at oracle.com > >>>>>>>>> Cc:serviceability-dev at openjdk.java.net > >>>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows > >>>>>>>>> > >>>>>>>>> Hi Serguei, > >>>>>>>>> > >>>>>>>>> Thanks for your review and advice. > >>>>>>>>> > >>>>>>>>> I modified these. > >>>>>>>>> Could you review this again, please? > >>>>>>>>> > >>>>>>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/ > >>>>>>>>> > >>>>>>>>> Regards, > >>>>>>>>> Chihiro > >>>>>>>>> > >> > > From serguei.spitsyn at oracle.com Thu Mar 12 06:20:08 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 11 Mar 2020 23:20:08 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> Message-ID: <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com> An HTML attachment was scrubbed... 
URL:

From chris.plummer at oracle.com Thu Mar 12 07:03:57 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Mar 2020 00:03:57 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available
In-Reply-To: <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
Message-ID: <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>

An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Thu Mar 12 07:06:39 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 12 Mar 2020 00:06:39 -0700 (PDT)
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available
In-Reply-To: <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com> <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
Message-ID: <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>

An HTML attachment was scrubbed...
URL:

From egor.ushakov at jetbrains.com Thu Mar 12 10:12:49 2020
From: egor.ushakov at jetbrains.com (Egor Ushakov)
Date: Thu, 12 Mar 2020 13:12:49 +0300
Subject: invokeMethod's result gced immediately
Message-ID: <9a3aeb36-e859-5ca5-27b2-4781b0cbf137@jetbrains.com>

Hi all,

it seems that the result of invokeMethod can be garbage collected immediately, which is quite strange.
Currently we have to do:

  invoke + disableCollection
  new(Array)Instance + disableCollection
  (String)mirrorOf + disableCollection

in a loop until it succeeds, to allow something like foo().boo().zoo() to evaluate successfully.
Is there a way to automatically disable collection for newly created objects from JDI?
Maybe there's a bug filed about this?

Thanks!
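For readers unfamiliar with the workaround described above, the "create, then disableCollection, in a loop until it succeeds" idiom can be sketched as follows. This is a minimal sketch with hypothetical stand-ins: `Mirror` models only the `disableCollection()` method of `com.sun.jdi.ObjectReference`, and `CollectedException` stands in for `com.sun.jdi.ObjectCollectedException`. The helper name `createPinned` and the attempt limit are assumptions made for illustration, not anything JDI provides.

```java
import java.util.function.Supplier;

/** Illustration only: a retry helper written against stand-in types, not JDI. */
final class PinnedMirrorSketch {

    /** Hypothetical stand-in for com.sun.jdi.ObjectCollectedException. */
    static final class CollectedException extends RuntimeException {
    }

    /** Hypothetical stand-in for the part of ObjectReference used here. */
    interface Mirror {
        void disableCollection();
    }

    /**
     * Creates a mirror and pins it. If the target object was collected between
     * creation and pinning, the creation step is repeated.
     */
    static <T extends Mirror> T createPinned(Supplier<T> factory, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T mirror = factory.get(); // e.g. invoke, new instance, or mirrorOf
            try {
                mirror.disableCollection();
                return mirror; // pinned: safe to use in the next evaluation step
            } catch (CollectedException e) {
                // The object was collected before it could be pinned; try again.
            }
        }
        throw new IllegalStateException(
                "object was collected on all " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        int[] created = {0};
        Mirror pinned = PinnedMirrorSketch.<Mirror>createPinned(() -> {
            int n = ++created[0];
            // Simulate the first mirror being collected before it is pinned.
            return () -> {
                if (n < 2) {
                    throw new CollectedException();
                }
            };
        }, 3);
        System.out.println("pinned after " + created[0] + " attempts: " + (pinned != null));
    }
}
```

With real JDI types, each pinned mirror would also need a matching `enableCollection()` call once the evaluation is finished, otherwise the objects stay alive in the target VM.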
-- Egor Ushakov Software Developer JetBrains http://www.jetbrains.com The Drive to Develop From chiroito107 at gmail.com Thu Mar 12 14:32:17 2020 From: chiroito107 at gmail.com (Chihiro Ito) Date: Thu, 12 Mar 2020 23:32:17 +0900 Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: References: <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com> <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com> <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com> Message-ID: Hi, I agree with Yasunaga's idea. The current implementation seems to be required for the JVM to read and write. According to the test results, it is difficult to improve this. However, as in JBS, we also need human-readable output. Regards, Chihiro 2020?3?12?(?) 12:15 Chihiro Ito : > > Hi Serguei, > > I could not fail these tests on my environment. > I would like to see more detail bug information especially the > environment variable. > Could you share with me the test-result, please? > > Regards, > Chihiro > > 2020?3?12?(?) 10:32 Yasumasa Suenaga : > > > > Hi, > > > > AFAICS failure tests which are listed in JBS (JDK-8240881) seems to be caused by HotSpotVirtualMachine::getSystemProperties. > > It would load the result of PrintSystemPropertiesDCmd via Properties::load. So we have to compliant the spec of Properties. > > > > OTOH it is not described in spec (JDK-7120511: introducing VM.system_properties, and help message). > > > > Thus I think we need to add new option for VM.system_properties for showing raw values (e.g. -raw). > > If do so, we need CSR. > > > > What do you think? > > > > > > Yasumasa > > > > > > On 2020/03/12 4:52, serguei.spitsyn at oracle.com wrote: > > > Hi Chihiro, > > > > > > I've tested and pushed your fix but the impact of fix was underestimated. 
> > > The fix caused several regressions and the following bug was filed: > > > https://bugs.openjdk.java.net/browse/JDK-8240881 > > > > > > Now, I'm working on removing the fix of JDK-8222489 with the anti-delta. > > > You can find and review my RFR posted on the serviceability-dev mailing list: > > > RFR: 8240881: several tests are failing due to encoding failures > > > > > > You can file another bug as a replacement of JDK-8222489. > > > I will help you with the information about test regressions caused by it. > > > > > > Thanks, > > > Serguei > > > > > > > > > On 3/10/20 02:54, serguei.spitsyn at oracle.com wrote: > > >> Hi Chihiro, > > >> > > >> Yes, I'll sponsor it. > > >> Thank you for the update. > > >> > > >> Thanks, > > >> Serguei > > >> > > >> > > >> On 3/8/20 06:05, Chihiro Ito wrote: > > >>> Hi, > > >>> > > >>> I'm sorry. I included "JDK-" in the changeset title. I removed it and > > >>> updated it. > > >>> > > >>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset > > >>> > > >>> Regards, > > >>> Chihiro > > >>> > > >>> 2020?3?7?(?) 23:13 Chihiro Ito : > > >>>> Hi Serguei and Yasumasa, > > >>>> > > >>>> I update the copyright year and created the change set. > > >>>> > > >>>> Could you sponsor this, please? > > >>>> > > >>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/ > > >>>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset > > >>>> > > >>>> Regards, > > >>>> Chihiro > > >>>> > > >>>> > > >>>> 2020?3?7?(?) 16:03 Yasumasa Suenaga : > > >>>> > > >>>> > > >>>>> Hi Chihiro, > > >>>>> > > >>>>> I'm also ok with webrev.05 after updating copyright year. > > >>>>> > > >>>>> > > >>>>> Yasumasa > > >>>>> > > >>>>> > > >>>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote: > > >>>>>> Hi Chichiro, > > >>>>>> > > >>>>>> I'm okay with the fix. > > >>>>>> Could you, please, update the copyright date in || src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push? 
> > >>>>>> > > >>>>>> Thanks, > > >>>>>> Serguei > > >>>>>> > > >>>>>> > > >>>>>> On 3/6/20 07:24, Chihiro Ito wrote: > > >>>>>>> Hi Serguei, > > >>>>>>> > > >>>>>>> Could you review this again, please? > > >>>>>>> > > >>>>>>> Regards, > > >>>>>>> Chihiro > > >>>>>>> > > >>>>>>> > > >>>>>>> 2020?2?27?(?) 22:11 Chihiro Ito: > > >>>>>>>> Hi Ralf, > > >>>>>>>> > > >>>>>>>> Thank you for your advice. > > >>>>>>>> > > >>>>>>>> 1. > > >>>>>>>> The comment of serializePropertiesToByteArray in VMSupport is "The stream written to the byte array is ISO 8859-1 encoded.". > > >>>>>>>> But the previous implementation does not keep this. I think we need to implement encode by ISO 8859-1. > > >>>>>>>> > > >>>>>>>> 2. > > >>>>>>>> According to help, the feature of VM.system_properties is just "Print system properties". The users should not use this output for loading. The users use it when they want to see System Properties soon. > > >>>>>>>> > > >>>>>>>> Regards, > > >>>>>>>> Chihiro > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> 2020?2?26?(?) 18:53 Schmelter, Ralf: > > >>>>>>>>> Hi Chihiro, > > >>>>>>>>> > > >>>>>>>>> I have two remarks: > > >>>>>>>>> > > >>>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create ISO Latin 1 String, it really only will create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then send to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters. But a ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test. > > >>>>>>>>> > > >>>>>>>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. 
For example, consider the filename c:\test\new. Formerly it would be encoded as: > > >>>>>>>>> C\:\\test\\new > > >>>>>>>>> And now it is: > > >>>>>>>>> C:\test\new > > >>>>>>>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters. > > >>>>>>>>> > > >>>>>>>>> Best regards, > > >>>>>>>>> Ralf > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> From: serviceability-dev On Behalf Of Chihiro Ito > > >>>>>>>>> Sent: Dienstag, 25. Februar 2020 04:45 > > >>>>>>>>> To:serguei.spitsyn at oracle.com > > >>>>>>>>> Cc:serviceability-dev at openjdk.java.net > > >>>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows > > >>>>>>>>> > > >>>>>>>>> Hi Serguei, > > >>>>>>>>> > > >>>>>>>>> Thanks for your review and advice. > > >>>>>>>>> > > >>>>>>>>> I modified these. > > >>>>>>>>> Could you review this again, please? > > >>>>>>>>> > > >>>>>>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/ > > >>>>>>>>> > > >>>>>>>>> Regards, > > >>>>>>>>> Chihiro > > >>>>>>>>> > > >> > > > From martin.doerr at sap.com Thu Mar 12 16:28:29 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 12 Mar 2020 16:28:29 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Hi Richard, I managed to find time for a (almost) complete review of webrev.4. (I'll review the tests separately.) First of all, the change seems to be in pretty good quality for its significant complexity. I couldn't find any real bugs. But I'd like to propose minor improvements. I'm convinced that it's mature because we did substantial testing. I like the new functionality for object deoptimization. 
It can possibly be reused for future escape analysis based optimizations.
So I appreciate having it available in the code base.
In addition to that, your change makes the JVMTI implementation better integrated into the VM.

Now to the details:

src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters. Ok.

src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability. Ok.

src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!

src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.

src/hotspot/share/code/nmethod.cpp
Nice cleanup!

src/hotspot/share/code/pcDesc.hpp
Additional parameters. Ok.

src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters. Ok.

src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow-up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.

src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters. Ok.

src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMCI capabilities. Nice!

src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes. Ok.

src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes. Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest factoring out something like has_not_global_escape and has_arg_escape.
So the code could look like this:

  SafePointNode* sfn = sfn_worklist.at(next);
  sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
  if (sfn->is_CallJava()) {
    CallJavaNode* call = sfn->as_CallJava();
    call->set_arg_escape(has_arg_escape(call));
  }

This would also allow us to get rid of the found_..._escape_in_args variables, making the loops more readable.

It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.

src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes. Ok.

src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations. Ok.

src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters. Ok.

src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!

src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread. Good.

src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.

VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.

src/hotspot/share/prims/jvmtiTagMap.cpp
Escape barriers + deoptimize objects for all threads. Ok.

src/hotspot/share/prims/whitebox.cpp
Added WB_IsFrameDeoptimized to API. Ok.

src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals, here.
First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
Comments are sufficient to understand why things are done as they are implemented.

BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
Anyway, looks correct, too.

Typo in comment: "regularily" => "regularly"

Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or similar issues.

EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().

You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.

I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.

Typo in comment: "we must only deoptimize" => "we only have to deoptimize"

"bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.

I'll get back to suspend flags, later.

There are weird cases regarding _self_deoptimization_in_progress.
Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.

I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.
Change in thread_added:
I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
Your version makes new threads come up with suspend flag set. That looks correct, too. Advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
For now, I'm ok with your version.

I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).

Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
Maybe adding suffixes would help a little bit, but I can also live with what you have.
Implementation looks correct to me.

src/hotspot/share/runtime/deoptimization.hpp
Escape barriers and object deoptimization functions.
Typo in comment: "helt" => "held"

src/hotspot/share/runtime/globals.hpp
Addition of develop flag DeoptimizeObjectsALotInterval. Ok.

src/hotspot/share/runtime/interfaceSupport.cpp
InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.

src/hotspot/share/runtime/interfaceSupport.inline.hpp
Addition of deoptimizeAllObjects. Ok.

src/hotspot/share/runtime/mutexLocker.cpp
src/hotspot/share/runtime/mutexLocker.hpp
Addition of EscapeBarrier_lock. Ok.

src/hotspot/share/runtime/objectMonitor.cpp
Make recursion count relock aware. Ok.

src/hotspot/share/runtime/stackValue.hpp
Better reinitialization in StackValue. Good.

src/hotspot/share/runtime/thread.cpp
src/hotspot/share/runtime/thread.hpp
src/hotspot/share/runtime/thread.inline.hpp
wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.
In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.

You can use MutexLocker with Thread*.

JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.

src/hotspot/share/runtime/vframe.cpp
Added support for entry frame to new_vframe. Ok.

src/hotspot/share/runtime/vframe_hp.cpp
src/hotspot/share/runtime/vframe_hp.hpp
I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).

jvmtiDeferredLocalVariableSet::update_monitors:
Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.

src/hotspot/share/utilities/macros.hpp
Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.

test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
New test. Will review separately.

test/jdk/TEST.ROOT
Addition of vm.jvmci as required property. Ok.

test/jdk/com/sun/jdi/EATests.java
test/jdk/com/sun/jdi/EATestsJVMCI.java
New test. Will review separately.

test/lib/sun/hotspot/WhiteBox.java
Added isFrameDeoptimized to API. Ok.

That was it.

Best regards,
Martin

> -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Reingruber, Richard > Sent: Dienstag, 3.
März 2020 21:23 > To: 'Robbin Ehn' ; Lindenmaier, Goetz > ; David Holmes ; > Vladimir Kozlov (vladimir.kozlov at oracle.com) > ; serviceability-dev at openjdk.java.net; > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > dev at openjdk.java.net > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better > Performance in the Presence of JVMTI Agents > > Hi Robbin, > > > > I understand that Robbin proposed to replace the usage of > > > _suspend_flag with handshakes. Apparently, async handshakes > > > are needed to do so. We have been waiting a while for removal > > > of the _suspend_flag / introduction of async handshakes [2]. > > > What is the status here? > > > I have an old prototype which I would like to continue to work on. > > So do not assume asynch handshakes will make 15. > > Even if it would, I think there are a lot more investigate work to remove > > _suspend_flag. > > Let us know, if we can be of any help to you and be it only testing. > > > >> Full: > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > > You can move both declaration and definition to that file, no need to > clobber > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > Will do. > > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's > own > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > You are right. It shouldn't be declared in thread.hpp. I will look into that. > > > Note that we also think we may have a bug in deopt: > > https://bugs.openjdk.java.net/browse/JDK-8238237 > > > I think it would be best, if possible, to push after that is resolved. > > Sure. > > > Not even nearly a full review :) > > I know :) > > Anyways, thanks a lot, > Richard.
> > > -----Original Message----- > From: Robbin Ehn > Sent: Monday, March 2, 2020 11:17 AM > To: Lindenmaier, Goetz ; Reingruber, Richard > ; David Holmes ; > Vladimir Kozlov (vladimir.kozlov at oracle.com) > ; serviceability-dev at openjdk.java.net; > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > dev at openjdk.java.net > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi, > > On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote: > > Hi, > > > > I had a look at the progress of this change. Nothing > > happened since Richard posted his update using more > > handshakes [1]. > > But we (SAP) would appreciate a lot if this change could > > be successfully reviewed and pushed. > > > > I think there is basic understanding that this > > change is helpful. It fixes a number of issues with JVMTI, > > and will deliver the same performance benefits as EA > > does in current production mode for debugging scenarios. > > > > This is important for us as we run our VMs prepared > > for debugging in production mode. > > > > I understand that Robbin proposed to replace the usage of > > _suspend_flag with handshakes. Apparently, async handshakes > > are needed to do so. We have been waiting a while for removal > > of the _suspend_flag / introduction of async handshakes [2]. > > What is the status here? > > I have an old prototype which I would like to continue to work on. > So do not assume asynch handshakes will make 15. > Even if it would, I think there are a lot more investigate work to remove > _suspend_flag. > > > > > I think we should no longer wait, but proceed with > > this change. We will look into removing the usage of > > suspend_flag introduced here once it is possible to implement > > it with handshakes. > > Yes, sure. > > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. 
> You can move both declaration and definition to that file, no need to clobber > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's > own > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > Note that we also think we may have a bug in deopt: > https://bugs.openjdk.java.net/browse/JDK-8238237 > > I think it would be best, if possible, to push after that is resolved. > > Not even nearly a full review :) > > Thanks, Robbin > > > >> Incremental: > >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/ > >> > >> I was not able to eliminate the additional suspend flag now. I'll take care > of this > >> as soon as the > >> existing suspend-resume-mechanism is reworked. > >> > >> Testing: > >> > >> Nightly tests @SAP: > >> > >> JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, > Renaissance > >> Suite, SAP specific tests > >> with fastdebug and release builds on all platforms > >> > >> Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x > parallel > >> for 24h > >> > >> Thanks, Richard. > >> > >> > >> More details on the changes: > >> > >> * Hide DeoptimizeObjectsALotThread from external view. > >> > >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock. > >> It used to be _safepoint_check_sometimes, which will be eliminated > sooner or > >> later. > >> I added explicit thread state changes with ThreadBlockInVM to code > paths > >> where we can wait() > >> on EscapeBarrier_lock to become safepoint safe. > >> > >> * Use handshake EscapeBarrierSuspendHandshake to suspend target > threads > >> instead of vm operation > >> VM_ThreadSuspendAllForObjDeopt. > >> > >> * Removed uses of Threads_lock. When adding a new thread we suspend > it iff > >> EA optimizations are > >> being reverted. In the previous version we were waiting on > Threads_lock > >> while EA optimizations > >> were reverted. 
See EscapeBarrier::thread_added(). > >> > >> * Made tests require Xmixed compilation mode. > >> > >> * Made tests agnostic regarding tiered compilation. > >> I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or > >> disabled. > >> > >> * Exercising EATests.java as well with stress test options > >> DeoptimizeObjectsALot* > >> Due to the non-deterministic deoptimizations some tests need to be > skipped. > >> We do this to prevent bit-rot of the stress test code. > >> > >> * Executing EATests.java as well with graal if available. Driver for this is > >> EATestsJVMCI.java. Graal cannot pass all tests, because it does not > provide all > >> the new debug info > >> (namely not_global_escape_in_scope and arg_escape in > scopeDesc.hpp). > >> And graal does not yet support the JVMTI operations force early return > and > >> pop frame. > >> > >> * Removed tracing from new jdi tests in EATests.java. Too much trace > output > >> before the debugging > >> connection is established can cause deadlock because output buffers fill > up. > >> (See https://bugs.openjdk.java.net/browse/JDK-8173304) > >> > >> * Many copyright year changes and smaller clean-up changes of testing > code > >> (trailing white-space and > >> the like). > >> > >> > >> -----Original Message----- > >> From: David Holmes > >> Sent: Donnerstag, 19. Dezember 2019 03:12 > >> To: Reingruber, Richard ; serviceability- > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; > hotspot- > >> runtime-dev at openjdk.java.net; Vladimir Kozlov > (vladimir.kozlov at oracle.com) > >> > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > Performance in > >> the Presence of JVMTI Agents > >> > >> Hi Richard, > >> > >> I think my issue is with the way EliminateNestedLocks works so I'm going > >> to look into that more deeply. > >> > >> Thanks for the explanations. 
> >> > >> David > >> > >> On 18/12/2019 12:47 am, Reingruber, Richard wrote: > >>> Hi David, > >>> > >>> > > > Some further queries/concerns: > >>> > > > > >>> > > > src/hotspot/share/runtime/objectMonitor.cpp > >>> > > > > >>> > > > Can you please explain the changes to ObjectMonitor::wait: > >>> > > > > >>> > > > ! _recursions = save // restore the old recursion count > >>> > > > ! + jt->get_and_reset_relock_count_after_wait(); // > >>> > > > increased by the deferred relock count > >>> > > > > >>> > > > what is the "deferred relock count"? I gather it relates to > >>> > > > > >>> > > > "The code was extended to be able to deoptimize objects of a > >>> > > frame that > >>> > > > is not the top frame and to let another thread than the owning > >>> > > thread do > >>> > > > it." > >>> > > > >>> > > Yes, these relate. Currently EA based optimizations are reverted, > when a > >> compiled frame is > >>> > > replaced with corresponding interpreter frames. Part of this is > relocking > >> objects with eliminated > >>> > > locking. New with the enhancement is that we do this also just > before > >> object references are > >>> > > acquired through JVMTI. In this case we deoptimize also the > owning > >> compiled frame C and we > >>> > > register deoptimized objects as deferred updates. When control > returns > >> to C it gets deoptimized, > >>> > > we notice that objects are already deoptimized (reallocated and > >> relocked), so we don't do it again > >>> > > (relocking twice would be incorrect of course). Deferred updates > are > >> copied into the new > >>> > > interpreter frames. > >>> > > > >>> > > Problem: relocking is not possible if the target thread T is waiting > on the > >> monitor that needs to > >>> > > be relocked. This happens only with non-local objects with > >> EliminateNestedLocks. Instead relocking > >>> > > is deferred until T owns the monitor again. This is what the piece of > >> code above does. > >>> > > >>> > Sorry I need some more detail here. 
How can you wait() on an > object > >>> > monitor if the object allocation and/or locking was optimised away? > And > >>> > what is a "non-local object" in this context? Isn't EA restricted to > >>> > thread-confined objects? > >>> > >>> "Non-local object" is an object that escapes its thread. The issue I'm > >> addressing with the changes > >>> in ObjectMonitor::wait is almost unrelated to EA. It is caused by > >> EliminateNestedLocks, where C2 > >>> eliminates recursive locking of an already owned lock. The lock owning > object > >> exists on the heap, it > >>> is locked and you can call wait() on it. > >>> > >>> EliminateLocks is the C2 option that controls lock elimination based on > EA. > >> Both optimizations have > >>> in common that objects with eliminated locking need to be relocked > when > >> deoptimizing a frame, > >>> i.e. when replacing a compiled frame with equivalent interpreter > >>> frames. Deoptimization::relock_objects does that job for /all/ eliminated > >> locks in scope. /All/ can > >>> be a mix of eliminated nested locks and locks of not-escaping objects. > >>> > >>> New with the enhancement: I call relock_objects earlier, just before > objects > >> potentially > >>> escape. But then later when the owning compiled frame gets > deoptimized, I > >> must not do it again: > >>> > >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp: > >>> > >>> if ((jvmci_enabled || ((DoEscapeAnalysis || > EliminateNestedLocks) && > >> EliminateLocks)) > >>> && !EscapeBarrier::objs_are_deoptimized(thread, > deoptee.id())) { > >>> bool unused; > >>> eliminate_locks(thread, chunk, realloc_failures, deoptee, > exec_mode, > >> unused); > >>> } > >>> > >>> Now when calling relock_objects early it is quite possible that I have to > relock > >> an object the > >>> target thread currently waits for. Obviously I cannot relock in this case, > >> instead I chose to > >>> introduce relock_count_after_wait to JavaThread.
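
For reference, the Java-level behaviour that the _recursions adjustment has to preserve can be sketched as below. This is only an illustration and not code from the webrev: the class NestedWaitDemo and its variable names are made up here, and it says nothing about whether C2 actually eliminates the nested locks in a given run. It only shows that a thread which calls wait() inside three nested synchronized blocks must own the monitor three times again once wait() returns, which is why a relock deferred during the wait has to be added back to the recursion count.

```java
import java.util.concurrent.CountDownLatch;

public class NestedWaitDemo {

    /**
     * Returns three Thread.holdsLock() observations made by the waiting thread:
     * [0] right after wait() returns, [1] after leaving the two inner blocks,
     * [2] after leaving the outermost block.
     */
    public static boolean[] run() throws InterruptedException {
        final Object lock = new Object();
        final boolean[] notified = new boolean[1];   // guarded by lock
        final boolean[] observed = new boolean[3];
        final CountDownLatch aboutToWait = new CountDownLatch(1);

        Thread waiter = new Thread(() -> {
            try {
                synchronized (lock) {
                    synchronized (lock) {
                        synchronized (lock) {
                            aboutToWait.countDown();
                            while (!notified[0]) {
                                lock.wait();   // gives up all three acquisitions
                            }
                            observed[0] = Thread.holdsLock(lock);
                        }
                    }
                    // Two blocks have been left; the monitor is still owned
                    // only if wait() restored the full recursion count.
                    observed[1] = Thread.holdsLock(lock);
                }
                observed[2] = Thread.holdsLock(lock);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();

        aboutToWait.await();
        // The waiter owns the monitor when it counts down, so this block can
        // only be entered after the waiter has released it inside wait().
        synchronized (lock) {
            notified[0] = true;
            lock.notifyAll();
        }
        waiter.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

The second observation is the interesting one: if the count after wait() were 1 instead of 3, leaving the innermost block would already give up the monitor.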
> >>> > >>> > Is it just that some of the locking gets optimized away e.g. > >>> > > >>> > synchronised(obj) { > >>> > synchronised(obj) { > >>> > synchronised(obj) { > >>> > obj.wait(); > >>> > } > >>> > } > >>> > } > >>> > > >>> > If this is reduced to a form as-if it were a single lock of the monitor > >>> > (due to EA) and the wait() triggers a JVM TI event which leads to the > >>> > escape of "obj" then we need to reconstruct the true lock state, and > so > >>> > when the wait() internally unblocks and reacquires the monitor it > has to > >>> > set the true recursion count to 3, not the 1 that it appeared to be > when > >>> > wait() was initially called. Is that the scenario? > >>> > >>> Kind of... except that the locking is not eliminated due to EA and there is > no > >> JVM TI event > >>> triggered by wait. > >>> > >>> Add > >>> > >>> LocalObject l1 = new LocalObject(); > >>> > >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. > This > >> triggers the code in > >>> question. > >>> > >>> See that relocking/reallocating is transactional. If it is done then for /all/ > >> objects in scope and it is > >>> done at most once. It wouldn't be quite so easy to split this in relocking > of > >> nested/EA-based > >>> eliminated locks. > >>> > >>> > If so I find this truly awful. Anyone using wait() in a realistic form > >>> > requires a notification and so the object cannot be thread confined. > In > >>> > >>> It is not thread confined. > >>> > >>> > which case I would strongly argue that upon hitting the wait() the > deopt > >>> > should occur unconditionally and so the lock state is correct before > we > >>> > wait and so we don't need to mess with the recursion count > internally > >>> > when we reacquire the monitor. > >>> > > >>> > > > >>> > > > which I don't like the sound of at all when it comes to > ObjectMonitor > >>> > > > state. So I'd like to understand in detail exactly what is going on > here > >>> > > > and why.
This is a very intrusive change that seems to badly > break > >>> > > > encapsulation and impacts future changes to ObjectMonitor > that are > >> under > >>> > > > investigation. > >>> > > > >>> > > I would not regard this as breaking encapsulation. Certainly not > badly. > >>> > > > >>> > > I've added a property relock_count_after_wait to JavaThread. The > >> property is well > >>> > > encapsulated. Future ObjectMonitor implementations have to deal > with > >> recursion too. They are free > >>> > > in choosing a way to do that as long as that property is taken into > >> account. This is hardly a > >>> > > limitation. > >>> > > >>> > I do think this badly breaks encapsulation as you have to add a > callout > >>> > from the guts of the ObjectMonitor code to reach into the thread to > get > >>> > this lock count adjustment. I understand why you have had to do > this but > >>> > I would much rather see a change to the EA optimisation strategy so > that > >>> > this is not needed. > >>> > > >>> > > Note also that the property is a straight forward extension of the > >> existing concept of deferred > >>> > > local updates. It is embedded into the structure holding them. So > not > >> even the footprint of a > >>> > > JavaThread is enlarged if no deferred updates are generated. > >>> > > >>> > [...] > >>> > > >>> > > > >>> > > I'm actually duplicating the existing external suspend mechanism, > >> because a thread can be > >>> > > suspended at most once. And hey, and don't like that either! But it > >> seems not unlikely that the > >>> > > duplicate can be removed together with the original and the new > type > >> of handshakes that will be > >>> > > used for thread suspend can be used for object deoptimization > too. See > >> today's discussion in > >>> > > JDK-8227745 [2]. > >>> > > >>> > I hope that discussion bears some fruit, at the moment it seems not > to > >>> > be possible to use handshakes here. 
:( > >>> > > >>> > The external suspend mechanism is a royal pain in the proverbial > that we > >>> > have to carefully live with. The idea that we're duplicating that for > >>> > use in another fringe area of functionality does not thrill me at all. > >>> > > >>> > To be clear, I understand the problem that exists and that you wish > to > >>> > solve, but for the runtime parts I balk at the complexity cost of > >>> > solving it. > >>> > >>> I know it's complex, but by far no rocket science. > >>> > >>> Also I find it hard to imagine another fix for JDK-8233915 besides > changing > >> the JVM TI specification. > >>> > >>> Thanks, Richard. > >>> > >>> -----Original Message----- > >>> From: David Holmes > >>> Sent: Dienstag, 17. Dezember 2019 08:03 > >>> To: Reingruber, Richard ; serviceability- > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; > hotspot- > >> runtime-dev at openjdk.java.net; Vladimir Kozlov > (vladimir.kozlov at oracle.com) > >> > >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > Performance > >> in the Presence of JVMTI Agents > >>> > >>> > >>> > >>> David > >>> > >>> On 17/12/2019 4:57 pm, David Holmes wrote: > >>>> Hi Richard, > >>>> > >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote: > >>>>> Hi David, > >>>>> > >>>>> > Some further queries/concerns: > >>>>> > > >>>>> > src/hotspot/share/runtime/objectMonitor.cpp > >>>>> > > >>>>> > Can you please explain the changes to ObjectMonitor::wait: > >>>>> > > >>>>> > ! _recursions = save // restore the old recursion count > >>>>> > ! + jt->get_and_reset_relock_count_after_wait(); // > >>>>> > increased by the deferred relock count > >>>>> > > >>>>> > what is the "deferred relock count"? I gather it relates to > >>>>> > > >>>>> > "The code was extended to be able to deoptimize objects of a > >>>>> frame that > >>>>>
> is not the top frame and to let another thread than the owning > >>>>> thread do > >>>>> > it." > >>>>> > >>>>> Yes, these relate. Currently EA based optimizations are reverted, > when > >>>>> a compiled frame is replaced > >>>>> with corresponding interpreter frames. Part of this is relocking > >>>>> objects with eliminated > >>>>> locking. New with the enhancement is that we do this also just before > >>>>> object references are acquired > >>>>> through JVMTI. In this case we deoptimize also the owning compiled > >>>>> frame C and we register > >>>>> deoptimized objects as deferred updates. When control returns to C > it > >>>>> gets deoptimized, we notice > >>>>> that objects are already deoptimized (reallocated and relocked), so > we > >>>>> don't do it again (relocking > >>>>> twice would be incorrect of course). Deferred updates are copied into > >>>>> the new interpreter frames. > >>>>> > >>>>> Problem: relocking is not possible if the target thread T is waiting > >>>>> on the monitor that needs to be > >>>>> relocked. This happens only with non-local objects with > >>>>> EliminateNestedLocks. Instead relocking is > >>>>> deferred until T owns the monitor again. This is what the piece of > >>>>> code above does. > >>>> > >>>> Sorry I need some more detail here. How can you wait() on an object > >>>> monitor if the object allocation and/or locking was optimised away? > And > >>>> what is a "non-local object" in this context? Isn't EA restricted to > >>>> thread-confined objects? > >>>> > >>>> Is it just that some of the locking gets optimized away e.g. > >>>> > >>>> synchronised(obj) { > >>>> synchronised(obj) { > >>>> synchronised(obj) { > >>>> obj.wait(); > >>>> } > >>>>
} > >>>> } > >>>> > >>>> If this is reduced to a form as-if it were a single lock of the monitor > >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the > >>>> escape of "obj" then we need to reconstruct the true lock state, and so > >>>> when the wait() internally unblocks and reacquires the monitor it has to > >>>> set the true recursion count to 3, not the 1 that it appeared to be when > >>>> wait() was initially called. Is that the scenario? > >>>> > >>>> If so I find this truly awful. Anyone using wait() in a realistic form > >>>> requires a notification and so the object cannot be thread confined. In > >>>> which case I would strongly argue that upon hitting the wait() the > deopt > >>>> should occur unconditionally and so the lock state is correct before we > >>>> wait and so we don't need to mess with the recursion count internally > >>>> when we reacquire the monitor. > >>>> > >>>>> > >>>>> > which I don't like the sound of at all when it comes to > >>>>> ObjectMonitor > >>>>> > state. So I'd like to understand in detail exactly what is going > >>>>> on here > >>>>> > and why. This is a very intrusive change that seems to badly > break > >>>>> > encapsulation and impacts future changes to ObjectMonitor that > >>>>> are under > >>>>> > investigation. > >>>>> > >>>>> I would not regard this as breaking encapsulation. Certainly not badly. > >>>>> > >>>>> I've added a property relock_count_after_wait to JavaThread. The > >>>>> property is well > >>>>> encapsulated. Future ObjectMonitor implementations have to deal > with > >>>>> recursion too. They are free in > >>>>> choosing a way to do that as long as that property is taken into > >>>>> account. This is hardly a > >>>>> limitation. > >>>> > >>>> I do think this badly breaks encapsulation as you have to add a callout > >>>> from the guts of the ObjectMonitor code to reach into the thread to > get > >>>> this lock count adjustment.
I understand why you have had to do this > but > >>>> I would much rather see a change to the EA optimisation strategy so > that > >>>> this is not needed. > >>>> > >>>>> Note also that the property is a straight forward extension of the > >>>>> existing concept of deferred > >>>>> local updates. It is embedded into the structure holding them. So not > >>>>> even the footprint of a > >>>>> JavaThread is enlarged if no deferred updates are generated. > >>>>> > >>>>> > --- > >>>>> > > >>>>> > src/hotspot/share/runtime/thread.cpp > >>>>> > > >>>>> > Can you please explain why > >>>>> JavaThread::wait_for_object_deoptimization > >>>>> > has to be handcrafted in this way rather than using proper > >>>>> transitions. > >>>>> > > >>>>> > >>>>> I wrote wait_for_object_deoptimization taking > >>>>> JavaThread::java_suspend_self_with_safepoint_check > >>>>> as template. So in short: for the same reasons :) > >>>>> > >>>>> Threads reach both methods as part of thread state transitions, > >>>>> therefore special handling is > >>>>> required to change thread state on top of ongoing transitions. > >>>>> > >>>>> > We got rid of "deopt suspend" some time ago and it is disturbing > >>>>> to see > >>>>> > it being added back (effectively). This seems like it may be > >>>>> something > >>>>> > that handshakes could be used for. > >>>>> > >>>>> Deopt suspend used to be something rather different with a similar > >>>>> name[1]. It is not being added back. > >>>> > >>>> I stand corrected. Despite comments in the code to the contrary > >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of > >>>> cleanup in this area 13 years ago :) > >>>> > >>>>> > >>>>> I'm actually duplicating the existing external suspend mechanism, > >>>>> because a thread can be suspended > >>>>> at most once. And hey, I don't like that either!
But it seems not > >>>>> unlikely that the duplicate can > >>>>> be removed together with the original and the new type of > handshakes > >>>>> that will be used for > >>>>> thread suspend can be used for object deoptimization too. See > today's > >>>>> discussion in JDK-8227745 [2]. > >>>> > >>>> I hope that discussion bears some fruit, at the moment it seems not to > >>>> be possible to use handshakes here. :( > >>>> > >>>> The external suspend mechanism is a royal pain in the proverbial that > we > >>>> have to carefully live with. The idea that we're duplicating that for > >>>> use in another fringe area of functionality does not thrill me at all. > >>>> > >>>> To be clear, I understand the problem that exists and that you wish to > >>>> solve, but for the runtime parts I balk at the complexity cost of > >>>> solving it. > >>>> > >>>> Thanks, > >>>> David > >>>> ----- > >>>> > >>>>> Thanks, Richard. > >>>>> > >>>>> [1] Deopt suspend was something like an async. handshake for > >>>>> architectures with register windows, > >>>>> where patching the return pc for deoptimization of a compiled > >>>>> frame was racy if the owner thread > >>>>> was in native code. Instead a "deopt" suspend flag was set on > >>>>> which the thread patched its own > >>>>> frame upon return from native. So no thread was suspended. It > got > >>>>> its name only from the name of > >>>>> the flags. > >>>>> > >>>>> [2] Discussion about using handshakes to sync. with the target thread: > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727 > >>>>> > >>>>> > >>>>> -----Original Message----- > >>>>> From: David Holmes > >>>>> Sent: Freitag, 13.
Dezember 2019 00:56
> >>>>> To: Reingruber, Richard ;
> >>>>> serviceability-dev at openjdk.java.net;
> >>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>> Performance in the Presence of JVMTI Agents
> >>>>>
> >>>>> Hi Richard,
> >>>>>
> >>>>> Some further queries/concerns:
> >>>>>
> >>>>> src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>
> >>>>> Can you please explain the changes to ObjectMonitor::wait:
> >>>>>
> >>>>> !   _recursions = save      // restore the old recursion count
> >>>>> !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> >>>>>
> >>>>> what is the "deferred relock count"? I gather it relates to
> >>>>>
> >>>>> "The code was extended to be able to deoptimize objects of a frame that
> >>>>> is not the top frame and to let another thread than the owning thread do
> >>>>> it."
> >>>>>
> >>>>> which I don't like the sound of at all when it comes to ObjectMonitor
> >>>>> state. So I'd like to understand in detail exactly what is going on here
> >>>>> and why. This is a very intrusive change that seems to badly break
> >>>>> encapsulation and impacts future changes to ObjectMonitor that are under
> >>>>> investigation.
> >>>>>
> >>>>> ---
> >>>>>
> >>>>> src/hotspot/share/runtime/thread.cpp
> >>>>>
> >>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
> >>>>> has to be handcrafted in this way rather than using proper transitions.
> >>>>>
> >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
> >>>>> it being added back (effectively). This seems like it may be something
> >>>>> that handshakes could be used for.
> >>>>>
> >>>>> Thanks,
> >>>>> David
> >>>>> -----
> >>>>>
> >>>>> On 12/12/2019 7:02 am, David Holmes wrote:
> >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> >>>>>>> Hi David,
> >>>>>>>
> >>>>>>>     > Most of the details here are in areas I can comment on in detail, but I
> >>>>>>>     > did take an initial general look at things.
> >>>>>>>
> >>>>>>> Thanks for taking the time!
> >>>>>>
> >>>>>> Apologies the above should read:
> >>>>>>
> >>>>>> "Most of the details here are in areas I *can't* comment on in detail
> >>>>>> ..."
> >>>>>>
> >>>>>> David
> >>>>>>
> >>>>>>>     > The only thing that jumped out at me is that I think the
> >>>>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>     >
> >>>>>>>     > +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Yes, it should. Will add the method like above.
> >>>>>>>
> >>>>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
> >>>>>>>     > active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
> >>>>>>> workload. I will add a minimal test
> >>>>>>> to keep it fresh.
> >>>>>>>
> >>>>>>>     > Also on the tests I don't understand your @requires clause:
> >>>>>>>     >
> >>>>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>>     > (vm.opt.TieredCompilation != true))
> >>>>>>>     >
> >>>>>>>     > This seems to require that TieredCompilation is disabled, but tiered is
> >>>>>>>     > our normal mode of operation. ??
> >>>>>>>     >
> >>>>>>>
> >>>>>>> I removed the clause. I guess I wanted to target the tests towards the
> >>>>>>> code they are supposed to
> >>>>>>> test, and it's easier to analyze failures w/o tiered compilation and
> >>>>>>> with just one compiler thread.
> >>>>>>>
> >>>>>>> Additionally I will make use of
> >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Richard.
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: David Holmes
> >>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
> >>>>>>> To: Reingruber, Richard ;
> >>>>>>> serviceability-dev at openjdk.java.net;
> >>>>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>>>> Performance in the Presence of JVMTI Agents
> >>>>>>>
> >>>>>>> Hi Richard,
> >>>>>>>
> >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I would like to get reviews please for
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> >>>>>>>>
> >>>>>>>> Corresponding RFE:
> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
> >>>>>>>>
> >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> >>>>>>>>
> >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
> >>>>>>>> issues (thanks!). In addition the
> >>>>>>>> change is being tested at SAP since I posted the first RFR some
> >>>>>>>> months ago.
> >>>>>>>>
> >>>>>>>> The intention of this enhancement is to benefit performance wise from
> >>>>>>>> escape analysis even if JVMTI
> >>>>>>>> agents request capabilities that allow them to access local variable
> >>>>>>>> values. E.g. if you start-up
> >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
> >>>>>>>> escape analysis is disabled right
> >>>>>>>> from the beginning, well before a debugger attaches -- if ever one
> >>>>>>>> should do so. With the
> >>>>>>>> enhancement, escape analysis will remain enabled until and after a
> >>>>>>>> debugger attaches.
EA based
> >>>>>>>> optimizations are reverted just before an agent acquires the
> >>>>>>>> reference to an object. In the JBS item
> >>>>>>>> you'll find more details.
> >>>>>>>
> >>>>>>> Most of the details here are in areas I can comment on in detail, but I
> >>>>>>> did take an initial general look at things.
> >>>>>>>
> >>>>>>> The only thing that jumped out at me is that I think the
> >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>
> >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
> >>>>>>> Without
> >>>>>>> active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> Also on the tests I don't understand your @requires clause:
> >>>>>>>
> >>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>> (vm.opt.TieredCompilation != true))
> >>>>>>>
> >>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
> >>>>>>> our normal mode of operation. ??
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Richard.
> >>>>>>>>
> >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>

From chris.plummer at oracle.com Thu Mar 12 16:53:49 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Mar 2020 09:53:49 -0700
Subject: invokeMethod's result gced immediately
In-Reply-To: <9a3aeb36-e859-5ca5-27b2-4781b0cbf137@jetbrains.com>
References: <9a3aeb36-e859-5ca5-27b2-4781b0cbf137@jetbrains.com>
Message-ID: <7ee41f75-106e-2f35-0369-3256193aab5d@oracle.com>

Hi Egor,

This stems from the JDWP spec.
If you look at the ObjectReference.DisableCollection command:

https://docs.oracle.com/javase/10/docs/specs/jdwp/jdwp-protocol.html#JDWP_ObjectReference_DisableCollection

"By default all objects in back-end replies may be collected at any time the target VM is running. A call to this command guarantees that the object will not be collected."

Yes, it can be annoying that by default collection is not already disabled on any returned ObjectReference. We've had some tests with intermittent failures because they were buggy in this regard. I think there are two reasons it is done this way, both mentioned in the above JDWP section. The first is:

"Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended. The typical examination of variables, fields, and arrays during the suspension is safe without explicitly disabling garbage collection."

So it's quite common not to need to DisableCollection on the object. Second is:

"This method should be used sparingly, as it alters the pattern of garbage collection in the target VM and, consequently, may result in application behavior under the debugger that differs from its non-debugged behavior."

So it looks like there's good reason to limit using DisableCollection.

It's a bit unclear to me how your foo().boo().zoo() example is implemented. Are you under a SUSPEND_ALL when you do the invoke? If so, all threads should still be suspended after the invoke completes, so you should be able to call DisableCollection without having to worry about it failing due to the object already being collected. This should set you up to use the objectref in the next invoke in the chain, once again without worry about having to retry due to collection.

cheers,

Chris

On 3/12/20 3:12 AM, Egor Ushakov wrote:
> Hi all,
>
> it seems that the result of the invokeMethod could be gced
> immediately, which is quite strange.
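For illustration, the pattern discussed in the reply above might look roughly like this on the JDI side. This is only a sketch under stated assumptions: PinnedInvoke, pin and invokePinned are hypothetical names and not part of JDI; only ObjectReference.invokeMethod and ObjectReference.disableCollection are existing JDI calls. It assumes the invoke is issued while the target VM is suspended (SUSPEND_ALL) and with default options, so that all threads are suspended again once the invoke has completed.

```java
import com.sun.jdi.ClassNotLoadedException;
import com.sun.jdi.IncompatibleThreadStateException;
import com.sun.jdi.InvalidTypeException;
import com.sun.jdi.InvocationException;
import com.sun.jdi.Method;
import com.sun.jdi.ObjectReference;
import com.sun.jdi.ThreadReference;
import com.sun.jdi.Value;
import java.util.List;

/** Hypothetical helper (not part of JDI): pins the result of an invoke. */
final class PinnedInvoke {
    private PinnedInvoke() {}

    /** Disables collection if the value mirrors an object; primitives and null pass through. */
    static <T extends Value> T pin(T value) {
        if (value instanceof ObjectReference) {
            // Assumption: the target VM is still suspended here, so the object
            // cannot have been collected between the invoke and this call.
            ((ObjectReference) value).disableCollection();
        }
        return value;
    }

    /** Invokes the method with default options (0) and pins an object result. */
    static Value invokePinned(ObjectReference receiver, ThreadReference thread,
                              Method method, List<? extends Value> args)
            throws InvalidTypeException, ClassNotLoadedException,
                   IncompatibleThreadStateException, InvocationException {
        Value result = receiver.invokeMethod(thread, method, args, 0);
        return pin(result);
    }
}
```

Since DisableCollection is meant to be used sparingly, a caller of such a helper would presumably call enableCollection() on each intermediate result of a chain like foo().boo().zoo() once the next invoke has consumed it.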
> Currently we have to do:
> invoke + disableCollection
> new(Array)Instance + disableCollection
> (String)mirrorOf + disableCollection
> in a loop until succeeded, to allow something like foo().boo().zoo()
> to evaluate successfully.
> Is there a way to automatically disable collection for newly created
> objects from jdi?
> Maybe there's a bug about this?
>
> Thanks!
>

From jonathan.gibbons at oracle.com Thu Mar 12 20:50:01 2020
From: jonathan.gibbons at oracle.com (Jonathan Gibbons)
Date: Thu, 12 Mar 2020 13:50:01 -0700
Subject: RFR: [small,docs] JDK-8240971 Fix CSS styles in some doc comments
Message-ID:

Please review a simple fix regarding the non-standard use of some CSS in some doc comments.

From the JBS Description:

Recently, for the display of javadoc block tags, javadoc changed from using an inconsistent set of CSS class names on the generated 'dt' elements to using a single new name ("notes") on the enclosing 'dl' element.

There are a few (4) places in the main JDK code where the old-style names were used explicitly in doc comments, in order to emulate the appearance of a list of block tags. These use-sites should be fixed up. They are in the following files:

open/src/java.base/share/classes/module-info.java
open/src/java.se/share/classes/module-info.java
open/src/java.management.rmi/share/classes/module-info.java
open/src/jdk.jconsole/share/classes/module-info.java

In addition, these four files used the style attribute to force the font to be used. The font is now set in the standard CSS for "notes", and so the local use of a "style" attribute is no longer necessary.
-- Jon JBS: https://bugs.openjdk.java.net/browse/JDK-8240971 Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html From mandy.chung at oracle.com Thu Mar 12 20:53:31 2020 From: mandy.chung at oracle.com (Mandy Chung) Date: Thu, 12 Mar 2020 13:53:31 -0700 Subject: RFR: [small, docs] JDK-8240971 Fix CSS styles in some doc comments In-Reply-To: References: Message-ID: <3b7e1339-a173-9739-fa8c-f603304a5886@oracle.com> This change looks okay. Mandy On 3/12/20 1:50 PM, Jonathan Gibbons wrote: > Please review a simple fix regarding the non-standard use of some CSS > in some doc comments. > > From the JBS Description: > > Recently, for the display of javadoc block tags, javadoc changed from > using an inconsistent set of CSS class names on the generated 'dt' > elements to using a single new name ("notes") on the enclosing 'dl' > element. > > There are a few (4) places in the main JDK code where the old-style > names were used explicitly in doc comments, in order to emulate the > appearance of a list of block tags. These use-sites should be fixed > up. They are in the following files: > > open/src/java.base/share/classes/module-info.java > open/src/java.se/share/classes/module-info.java > open/src/java.management.rmi/share/classes/module-info.java > open/src/jdk.jconsole/share/classes/module-info.java > > In addition, these four files used the style attribute to force the > font to be used. The font is now set in the standard CSS for "notes", > and so the local use of a "style" attribute is no longer necessary. > > -- Jon > > JBS: https://bugs.openjdk.java.net/browse/JDK-8240971 > Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html > API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexey.menkov at oracle.com Fri Mar 13 00:33:21 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 12 Mar 2020 17:33:21 -0700 Subject: RFR: [small, docs] JDK-8240971 Fix CSS styles in some doc comments In-Reply-To: <3b7e1339-a173-9739-fa8c-f603304a5886@oracle.com> References: <3b7e1339-a173-9739-fa8c-f603304a5886@oracle.com> Message-ID: +1 --alex On 03/12/2020 13:53, Mandy Chung wrote: > This change looks okay. > > Mandy > > On 3/12/20 1:50 PM, Jonathan Gibbons wrote: >> Please review a simple fix regarding the non-standard use of some CSS >> in some doc comments. >> >> From the JBS Description: >> >> Recently, for the display of javadoc block tags, javadoc changed from >> using an inconsistent set of CSS class names on the generated 'dt' >> elements to using a single new name ("notes") on the enclosing 'dl' >> element. >> >> There are a few (4) places in the main JDK code where the old-style >> names were used explicitly in doc comments, in order to emulate the >> appearance of a list of block tags. These use-sites should be fixed >> up. They are in the following files: >> >> open/src/java.base/share/classes/module-info.java >> open/src/java.se/share/classes/module-info.java >> open/src/java.management.rmi/share/classes/module-info.java >> open/src/jdk.jconsole/share/classes/module-info.java >> >> In addition, these four files used the style attribute to force the >> font to be used. The font is now set in the standard CSS for "notes", >> and so the local use of a "style" attribute is no longer necessary. 
>> >> -- Jon >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8240971 >> Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html >> API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html >> > From chris.plummer at oracle.com Fri Mar 13 06:06:03 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 12 Mar 2020 23:06:03 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com> References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com> <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com> <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com> Message-ID: <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com> An HTML attachment was scrubbed... URL: From richard.reingruber at sap.com Fri Mar 13 09:08:51 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 13 Mar 2020 09:08:51 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Hi Martin, thanks a lot for reviewing and the feedback. I'll dig into the details as soon as possible. Looking forward to it :) Thanks, Richard. -----Original Message----- From: Doerr, Martin Sent: Donnerstag, 12. 
März 2020 17:28
To: Reingruber, Richard ; 'Robbin Ehn' ; Lindenmaier, Goetz ; David Holmes ; Vladimir Kozlov (vladimir.kozlov at oracle.com) ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

I managed to find time for an (almost) complete review of webrev.4. (I'll review the tests separately.)

First of all, the change seems to be of pretty good quality for its significant complexity. I couldn't find any real bugs. But I'd like to propose minor improvements.
I'm convinced that it's mature because we did substantial testing.

I like the new functionality for object deoptimization. It can possibly be reused for future escape analysis based optimizations. So I appreciate having it available in the code base.
In addition to that, your change makes the JVMTI implementation better integrated into the VM.

Now to the details:

src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters.
Ok.

src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability.
Ok.

src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!

src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.)
Ok.

src/hotspot/share/code/nmethod.cpp
Nice cleanup!

src/hotspot/share/code/pcDesc.hpp
Additional parameters.
Ok.

src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters.
Ok.

src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot.
(Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.)
Ok.

src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters.
Ok.

src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMCI capabilities.
Nice!

src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes.
Ok.

src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes.
Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest factoring out something like has_not_global_escape and has_arg_escape. So the code could look like this:

    SafePointNode* sfn = sfn_worklist.at(next);
    sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
    if (sfn->is_CallJava()) {
      CallJavaNode* call = sfn->as_CallJava();
      call->set_arg_escape(has_arg_escape(call));
    }

This would also allow us to get rid of the found_..._escape_in_args variables, making the loops more readable.

It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.

src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes.
Ok.

src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations.
Ok.

src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters.
Ok.

src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!

src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread.
Good.

src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a concern here. So it's ok.

VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones.
Good.

src/hotspot/share/prims/jvmtiTagMap.cpp
Escape barriers + deoptimize objects for all threads.
Ok.

src/hotspot/share/prims/whitebox.cpp
Added WB_IsFrameDeoptimized to API.
Ok.

src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals here.
First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
Comments are sufficient to understand why things are done as they are implemented.

BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
Anyway, looks correct, too.

Typo in comment: "regularily" => "regularly"

Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or similar issues.

EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().

You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.

I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.

Typo in comment: "we must only deoptimize" => "we only have to deoptimize"

"bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.

I'll get back to suspend flags, later.

There are weird cases regarding _self_deoptimization_in_progress.
Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.

I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.

Change in thread_added:
I think the sequence would be more comprehensive if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
Your version makes new threads come up with suspend flag set. That looks correct, too. Advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
For now, I'm ok with your version.

I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).

Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
Maybe adding suffixes would help a little bit, but I can also live with what you have.
Implementation looks correct to me.

src/hotspot/share/runtime/deoptimization.hpp
Escape barriers and object deoptimization functions.
Typo in comment: "helt" => "held"

src/hotspot/share/runtime/globals.hpp
Addition of develop flag DeoptimizeObjectsALotInterval.
Ok.

src/hotspot/share/runtime/interfaceSupport.cpp
InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition.
Ok.

src/hotspot/share/runtime/interfaceSupport.inline.hpp
Addition of deoptimizeAllObjects.
Ok.

src/hotspot/share/runtime/mutexLocker.cpp
src/hotspot/share/runtime/mutexLocker.hpp
Addition of EscapeBarrier_lock.
Ok.

src/hotspot/share/runtime/objectMonitor.cpp
Make recursion count relock aware.
Ok.

src/hotspot/share/runtime/stackValue.hpp
Better reinitialization in StackValue.
Good.

src/hotspot/share/runtime/thread.cpp
src/hotspot/share/runtime/thread.hpp
src/hotspot/share/runtime/thread.inline.hpp
wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.

In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as a temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.

You can use MutexLocker with Thread*.

JvmtiDeferredUpdates:
I agree with Robbin. It'd be nice to move the class out of thread.hpp.

src/hotspot/share/runtime/vframe.cpp
Added support for entry frame to new_vframe.
Ok.

src/hotspot/share/runtime/vframe_hp.cpp
src/hotspot/share/runtime/vframe_hp.hpp

I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).

jvmtiDeferredLocalVariableSet::update_monitors:
Please add a comment explaining that the owner referenced by the original info may be scalar replaced, but it is deoptimized in the vframe.

src/hotspot/share/utilities/macros.hpp
Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros.
Ok.

test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
New test. Will review separately.

test/jdk/TEST.ROOT
Addition of vm.jvmci as required property.
Ok.

test/jdk/com/sun/jdi/EATests.java
test/jdk/com/sun/jdi/EATestsJVMCI.java
New test. Will review separately.

test/lib/sun/hotspot/WhiteBox.java
Added isFrameDeoptimized to API.
Ok.

That was it.

Best regards,
Martin

> -----Original Message-----
> From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Reingruber, Richard
> Sent: Dienstag, 3. März 2020 21:23
> To: 'Robbin Ehn' ; Lindenmaier, Goetz
> ; David Holmes ;
> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> ; serviceability-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better
> Performance in the Presence of JVMTI Agents
>
> Hi Robbin,
>
> > > I understand that Robbin proposed to replace the usage of
> > > _suspend_flag with handshakes. Apparently, async handshakes
> > > are needed to do so. We have been waiting a while for removal
> > > of the _suspend_flag / introduction of async handshakes [2].
> > > What is the status here?
>
> > I have an old prototype which I would like to continue to work on.
> > So do not assume asynch handshakes will make 15.
> > Even if it would, I think there are a lot more investigate work to remove
> > _suspend_flag.
>
> Let us know, if we can be of any help to you and be it only testing.
>
> > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
>
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
>
> Will do.
>
> > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's own
> > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
>
> You are right. It shouldn't be declared in thread.hpp. I will look into that.
> > > Note that we also think we may have a bug in deopt: > > https://bugs.openjdk.java.net/browse/JDK-8238237 > > > I think it would be best, if possible, to push after that is resolved. > > Sure. > > > Not even nearly a full review :) > > I know :) > > Anyways, thanks a lot, > Richard. > > > -----Original Message----- > From: Robbin Ehn > Sent: Monday, March 2, 2020 11:17 AM > To: Lindenmaier, Goetz ; Reingruber, Richard > ; David Holmes ; > Vladimir Kozlov (vladimir.kozlov at oracle.com) > ; serviceability-dev at openjdk.java.net; > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > dev at openjdk.java.net > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi, > > On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote: > > Hi, > > > > I had a look at the progress of this change. Nothing > > happened since Richard posted his update using more > > handshakes [1]. > > But we (SAP) would appreciate a lot if this change could > > be successfully reviewed and pushed. > > > > I think there is basic understanding that this > > change is helpful. It fixes a number of issues with JVMTI, > > and will deliver the same performance benefits as EA > > does in current production mode for debugging scenarios. > > > > This is important for us as we run our VMs prepared > > for debugging in production mode. > > > > I understand that Robbin proposed to replace the usage of > > _suspend_flag with handshakes. Apparently, async handshakes > > are needed to do so. We have been waiting a while for removal > > of the _suspend_flag / introduction of async handshakes [2]. > > What is the status here? > > I have an old prototype which I would like to continue to work on. > So do not assume asynch handshakes will make 15. > Even if it would, I think there are a lot more investigate work to remove > _suspend_flag. > > > > > I think we should no longer wait, but proceed with > > this change. 
We will look into removing the usage of > > suspend_flag introduced here once it is possible to implement > > it with handshakes. > > Yes, sure. > > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > You can move both declaration and definition to that file, no need to clobber > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's > own > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > Note that we also think we may have a bug in deopt: > https://bugs.openjdk.java.net/browse/JDK-8238237 > > I think it would be best, if possible, to push after that is resolved. > > Not even nearly a full review :) > > Thanks, Robbin > > > >> Incremental: > >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/ > >> > >> I was not able to eliminate the additional suspend flag now. I'll take care > of this > >> as soon as the > >> existing suspend-resume-mechanism is reworked. > >> > >> Testing: > >> > >> Nightly tests @SAP: > >> > >> JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, > Renaissance > >> Suite, SAP specific tests > >> with fastdebug and release builds on all platforms > >> > >> Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x > parallel > >> for 24h > >> > >> Thanks, Richard. > >> > >> > >> More details on the changes: > >> > >> * Hide DeoptimizeObjectsALotThread from external view. > >> > >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock. > >> It used to be _safepoint_check_sometimes, which will be eliminated > sooner or > >> later. > >> I added explicit thread state changes with ThreadBlockInVM to code > paths > >> where we can wait() > >> on EscapeBarrier_lock to become safepoint safe. 
> >> > >> * Use handshake EscapeBarrierSuspendHandshake to suspend target > threads > >> instead of vm operation > >> VM_ThreadSuspendAllForObjDeopt. > >> > >> * Removed uses of Threads_lock. When adding a new thread we suspend > it iff > >> EA optimizations are > >> being reverted. In the previous version we were waiting on > Threads_lock > >> while EA optimizations > >> were reverted. See EscapeBarrier::thread_added(). > >> > >> * Made tests require Xmixed compilation mode. > >> > >> * Made tests agnostic regarding tiered compilation. > >> I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or > >> disabled. > >> > >> * Exercising EATests.java as well with stress test options > >> DeoptimizeObjectsALot* > >> Due to the non-deterministic deoptimizations some tests need to be > skipped. > >> We do this to prevent bit-rot of the stress test code. > >> > >> * Executing EATests.java as well with graal if available. Driver for this is > >> EATestsJVMCI.java. Graal cannot pass all tests, because it does not > provide all > >> the new debug info > >> (namely not_global_escape_in_scope and arg_escape in > scopeDesc.hpp). > >> And graal does not yet support the JVMTI operations force early return > and > >> pop frame. > >> > >> * Removed tracing from new jdi tests in EATests.java. Too much trace > output > >> before the debugging > >> connection is established can cause deadlock because output buffers fill > up. > >> (See https://bugs.openjdk.java.net/browse/JDK-8173304) > >> > >> * Many copyright year changes and smaller clean-up changes of testing > code > >> (trailing white-space and > >> the like). > >> > >> > >> -----Original Message----- > >> From: David Holmes > >> Sent: Donnerstag, 19. 
Dezember 2019 03:12 > >> To: Reingruber, Richard ; serviceability- > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; > hotspot- > >> runtime-dev at openjdk.java.net; Vladimir Kozlov > (vladimir.kozlov at oracle.com) > >> > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > Performance in > >> the Presence of JVMTI Agents > >> > >> Hi Richard, > >> > >> I think my issue is with the way EliminateNestedLocks works so I'm going > >> to look into that more deeply. > >> > >> Thanks for the explanations. > >> > >> David > >> > >> On 18/12/2019 12:47 am, Reingruber, Richard wrote: > >>> Hi David, > >>> > >>> > > > Some further queries/concerns: > >>> > > > > >>> > > > src/hotspot/share/runtime/objectMonitor.cpp > >>> > > > > >>> > > > Can you please explain the changes to ObjectMonitor::wait: > >>> > > > > >>> > > > ! _recursions = save // restore the old recursion count > >>> > > > ! + jt->get_and_reset_relock_count_after_wait(); // > >>> > > > increased by the deferred relock count > >>> > > > > >>> > > > what is the "deferred relock count"? I gather it relates to > >>> > > > > >>> > > > "The code was extended to be able to deoptimize objects of a > >>> > > frame that > >>> > > > is not the top frame and to let another thread than the owning > >>> > > thread do > >>> > > > it." > >>> > > > >>> > > Yes, these relate. Currently EA based optimizations are reverted, > when a > >> compiled frame is > >>> > > replaced with corresponding interpreter frames. Part of this is > relocking > >> objects with eliminated > >>> > > locking. New with the enhancement is that we do this also just > before > >> object references are > >>> > > acquired through JVMTI. In this case we deoptimize also the > owning > >> compiled frame C and we > >>> > > register deoptimized objects as deferred updates. 
When control > returns > >> to C it gets deoptimized, > >>> > > we notice that objects are already deoptimized (reallocated and > >> relocked), so we don't do it again > >>> > > (relocking twice would be incorrect of course). Deferred updates > are > >> copied into the new > >>> > > interpreter frames. > >>> > > > >>> > > Problem: relocking is not possible if the target thread T is waiting > on the > >> monitor that needs to > >>> > > be relocked. This happens only with non-local objects with > >> EliminateNestedLocks. Instead relocking > >>> > > is deferred until T owns the monitor again. This is what the piece of > >> code above does. > >>> > > >>> > Sorry I need some more detail here. How can you wait() on an > object > >>> > monitor if the object allocation and/or locking was optimised away? > And > >>> > what is a "non-local object" in this context? Isn't EA restricted to > >>> > thread-confined objects? > >>> > >>> "Non-local object" is an object that escapes its thread. The issue I'm > >> addressing with the changes > >>> in ObjectMonitor::wait are almost unrelated to EA. They are caused by > >> EliminateNestedLocks, where C2 > >>> eliminates recursive locking of an already owned lock. The lock owning > object > >> exists on the heap, it > >>> is locked and you can call wait() on it. > >>> > >>> EliminateLocks is the C2 option that controls lock elimination based on > EA. > >> Both optimizations have > >>> in common that objects with eliminated locking need to be relocked > when > >> deoptimizing a frame, > >>> i.e. when replacing a compiled frame with equivalent interpreter > >>> frames. Deoptimization::relock_objects does that job for /all/ eliminated > >> locks in scope. /All/ can > >>> be a mix of eliminated nested locks and locks of not-escaping objects. > >>> > >>> New with the enhancement: I call relock_objects earlier, just before > objects > >> pontentially > >>> escape. 
But then later when the owning compiled frame gets > deoptimized, I > >> must not do it again: > >>> > >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp: > >>> > >>> 373 if ((jvmci_enabled || ((DoEscapeAnalysis || > EliminateNestedLocks) && > >> EliminateLocks)) > >>> 374 && !EscapeBarrier::objs_are_deoptimized(thread, > deoptee.id())) { > >>> 375 bool unused; > >>> 376 eliminate_locks(thread, chunk, realloc_failures, deoptee, > exec_mode, > >> unused); > >>> 377 } > >>> > >>> Now when calling relock_objects early it is quiet possible that I have to > relock > >> an object the > >>> target thread currently waits for. Obviously I cannot relock in this case, > >> instead I chose to > >>> introduce relock_count_after_wait to JavaThread. > >>> > >>> > Is it just that some of the locking gets optimized away e.g. > >>> > > >>> > synchronised(obj) { > >>> > synchronised(obj) { > >>> > synchronised(obj) { > >>> > obj.wait(); > >>> > } > >>> > } > >>> > } > >>> > > >>> > If this is reduced to a form as-if it were a single lock of the monitor > >>> > (due to EA) and the wait() triggers a JVM TI event which leads to the > >>> > escape of "obj" then we need to reconstruct the true lock state, and > so > >>> > when the wait() internally unblocks and reacquires the monitor it > has to > >>> > set the true recursion count to 3, not the 1 that it appeared to be > when > >>> > wait() was initially called. Is that the scenario? > >>> > >>> Kind of... except that the locking is not eliminated due to EA and there is > no > >> JVM TI event > >>> triggered by wait. > >>> > >>> Add > >>> > >>> LocalObject l1 = new LocalObject(); > >>> > >>> in front of the synchrnized blocks and assume a JVM TI agent acquires l1. > This > >> triggers the code in > >>> question. > >>> > >>> See that relocking/reallocating is transactional. If it is done then for /all/ > >> objects in scope and it is > >>> done at most once. 
It wouldn't be quite so easy to split this in relocking > of > >> nested/EA-based > >>> eliminated locks. > >>> > >>> > If so I find this truly awful. Anyone using wait() in a realistic form > >>> > requires a notification and so the object cannot be thread confined. > In > >>> > >>> It is not thread confined. > >>> > >>> > which case I would strongly argue that upon hitting the wait() the > deopt > >>> > should occur unconditionally and so the lock state is correct before > we > >>> > wait and so we don't need to mess with the recursion count > internally > >>> > when we reacquire the monitor. > >>> > > >>> > > > >>> > > > which I don't like the sound of at all when it comes to > ObjectMonitor > >>> > > > state. So I'd like to understand in detail exactly what is going on > here > >>> > > > and why. This is a very intrusive change that seems to badly > break > >>> > > > encapsulation and impacts future changes to ObjectMonitor > that are > >> under > >>> > > > investigation. > >>> > > > >>> > > I would not regard this as breaking encapsulation. Certainly not > badly. > >>> > > > >>> > > I've added a property relock_count_after_wait to JavaThread. The > >> property is well > >>> > > encapsulated. Future ObjectMonitor implementations have to deal > with > >> recursion too. They are free > >>> > > in choosing a way to do that as long as that property is taken into > >> account. This is hardly a > >>> > > limitation. > >>> > > >>> > I do think this badly breaks encapsulation as you have to add a > callout > >>> > from the guts of the ObjectMonitor code to reach into the thread to > get > >>> > this lock count adjustment. I understand why you have had to do > this but > >>> > I would much rather see a change to the EA optimisation strategy so > that > >>> > this is not needed. > >>> > > >>> > > Note also that the property is a straight forward extension of the > >> existing concept of deferred > >>> > > local updates. It is embedded into the structure holding them. 
So > not > >> even the footprint of a > >>> > > JavaThread is enlarged if no deferred updates are generated. > >>> > > >>> > [...] > >>> > > >>> > > > >>> > > I'm actually duplicating the existing external suspend mechanism, > >> because a thread can be > >>> > > suspended at most once. And hey, and don't like that either! But it > >> seems not unlikely that the > >>> > > duplicate can be removed together with the original and the new > type > >> of handshakes that will be > >>> > > used for thread suspend can be used for object deoptimization > too. See > >> today's discussion in > >>> > > JDK-8227745 [2]. > >>> > > >>> > I hope that discussion bears some fruit, at the moment it seems not > to > >>> > be possible to use handshakes here. :( > >>> > > >>> > The external suspend mechanism is a royal pain in the proverbial > that we > >>> > have to carefully live with. The idea that we're duplicating that for > >>> > use in another fringe area of functionality does not thrill me at all. > >>> > > >>> > To be clear, I understand the problem that exists and that you wish > to > >>> > solve, but for the runtime parts I balk at the complexity cost of > >>> > solving it. > >>> > >>> I know it's complex, but by far no rocket science. > >>> > >>> Also I find it hard to imagine another fix for JDK-8233915 besides > changing > >> the JVM TI specification. > >>> > >>> Thanks, Richard. > >>> > >>> -----Original Message----- > >>> From: David Holmes > >>> Sent: Dienstag, 17. 
Dezember 2019 08:03 > >>> To: Reingruber, Richard ; serviceability- > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; > hotspot- > >> runtime-dev at openjdk.java.net; Vladimir Kozlov > (vladimir.kozlov at oracle.com) > >> > >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > Performance > >> in the Presence of JVMTI Agents > >>> > >>> > >>> > >>> David > >>> > >>> On 17/12/2019 4:57 pm, David Holmes wrote: > >>>> Hi Richard, > >>>> > >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote: > >>>>> Hi David, > >>>>> > >>>>> ?? > Some further queries/concerns: > >>>>> ?? > > >>>>> ?? > src/hotspot/share/runtime/objectMonitor.cpp > >>>>> ?? > > >>>>> ?? > Can you please explain the changes to ObjectMonitor::wait: > >>>>> ?? > > >>>>> ?? > !?? _recursions = save????? // restore the old recursion count > >>>>> ?? > !???????????????? + jt->get_and_reset_relock_count_after_wait(); // > >>>>> ?? > increased by the deferred relock count > >>>>> ?? > > >>>>> ?? > what is the "deferred relock count"? I gather it relates to > >>>>> ?? > > >>>>> ?? > "The code was extended to be able to deoptimize objects of a > >>>>> frame that > >>>>> ?? > is not the top frame and to let another thread than the owning > >>>>> thread do > >>>>> ?? > it." > >>>>> > >>>>> Yes, these relate. Currently EA based optimizations are reverted, > when > >>>>> a compiled frame is replaced > >>>>> with corresponding interpreter frames. Part of this is relocking > >>>>> objects with eliminated > >>>>> locking. New with the enhancement is that we do this also just before > >>>>> object references are acquired > >>>>> through JVMTI. In this case we deoptimize also the owning compiled > >>>>> frame C and we register > >>>>> deoptimized objects as deferred updates. 
When control returns to C > it > >>>>> gets deoptimized, we notice > >>>>> that objects are already deoptimized (reallocated and relocked), so > we > >>>>> don't do it again (relocking > >>>>> twice would be incorrect of course). Deferred updates are copied into > >>>>> the new interpreter frames. > >>>>> > >>>>> Problem: relocking is not possible if the target thread T is waiting > >>>>> on the monitor that needs to be > >>>>> relocked. This happens only with non-local objects with > >>>>> EliminateNestedLocks. Instead relocking is > >>>>> deferred until T owns the monitor again. This is what the piece of > >>>>> code above does. > >>>> > >>>> Sorry I need some more detail here. How can you wait() on an object > >>>> monitor if the object allocation and/or locking was optimised away? > And > >>>> what is a "non-local object" in this context? Isn't EA restricted to > >>>> thread-confined objects? > >>>> > >>>> Is it just that some of the locking gets optimized away e.g. > >>>> > >>>> synchronised(obj) { > >>>> ? synchronised(obj) { > >>>> ??? synchronised(obj) { > >>>> ????? obj.wait(); > >>>> ??? } > >>>> ? } > >>>> } > >>>> > >>>> If this is reduced to a form as-if it were a single lock of the monitor > >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the > >>>> escape of "obj" then we need to reconstruct the true lock state, and so > >>>> when the wait() internally unblocks and reacquires the monitor it has to > >>>> set the true recursion count to 3, not the 1 that it appeared to be when > >>>> wait() was initially called. Is that the scenario? > >>>> > >>>> If so I find this truly awful. Anyone using wait() in a realistic form > >>>> requires a notification and so the object cannot be thread confined. 
In > >>>> which case I would strongly argue that upon hitting the wait() the > deopt > >>>> should occur unconditionally and so the lock state is correct before we > >>>> wait and so we don't need to mess with the recursion count internally > >>>> when we reacquire the monitor. > >>>> > >>>>> > >>>>> ?? > which I don't like the sound of at all when it comes to > >>>>> ObjectMonitor > >>>>> ?? > state. So I'd like to understand in detail exactly what is going > >>>>> on here > >>>>> ?? > and why.? This is a very intrusive change that seems to badly > break > >>>>> ?? > encapsulation and impacts future changes to ObjectMonitor that > >>>>> are under > >>>>> ?? > investigation. > >>>>> > >>>>> I would not regard this as breaking encapsulation. Certainly not badly. > >>>>> > >>>>> I've added a property relock_count_after_wait to JavaThread. The > >>>>> property is well > >>>>> encapsulated. Future ObjectMonitor implementations have to deal > with > >>>>> recursion too. They are free in > >>>>> choosing a way to do that as long as that property is taken into > >>>>> account. This is hardly a > >>>>> limitation. > >>>> > >>>> I do think this badly breaks encapsulation as you have to add a callout > >>>> from the guts of the ObjectMonitor code to reach into the thread to > get > >>>> this lock count adjustment. I understand why you have had to do this > but > >>>> I would much rather see a change to the EA optimisation strategy so > that > >>>> this is not needed. > >>>> > >>>>> Note also that the property is a straight forward extension of the > >>>>> existing concept of deferred > >>>>> local updates. It is embedded into the structure holding them. So not > >>>>> even the footprint of a > >>>>> JavaThread is enlarged if no deferred updates are generated. > >>>>> > >>>>> ?? > --- > >>>>> ?? > > >>>>> ?? > src/hotspot/share/runtime/thread.cpp > >>>>> ?? > > >>>>> ?? > Can you please explain why > >>>>> JavaThread::wait_for_object_deoptimization > >>>>> ?? 
> has to be handcrafted in this way rather than using proper > >>>>> transitions. > >>>>> ?? > > >>>>> > >>>>> I wrote wait_for_object_deoptimization taking > >>>>> JavaThread::java_suspend_self_with_safepoint_check > >>>>> as template. So in short: for the same reasons :) > >>>>> > >>>>> Threads reach both methods as part of thread state transitions, > >>>>> therefore special handling is > >>>>> required to change thread state on top of ongoing transitions. > >>>>> > >>>>> ?? > We got rid of "deopt suspend" some time ago and it is disturbing > >>>>> to see > >>>>> ?? > it being added back (effectively). This seems like it may be > >>>>> something > >>>>> ?? > that handshakes could be used for. > >>>>> > >>>>> Deopt suspend used to be something rather different with a similar > >>>>> name[1]. It is not being added back. > >>>> > >>>> I stand corrected. Despite comments in the code to the contrary > >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of > >>>> cleanup in this area 13 years ago :) > >>>> > >>>>> > >>>>> I'm actually duplicating the existing external suspend mechanism, > >>>>> because a thread can be suspended > >>>>> at most once. And hey, and don't like that either! But it seems not > >>>>> unlikely that the duplicate can > >>>>> be removed together with the original and the new type of > handshakes > >>>>> that will be used for > >>>>> thread suspend can be used for object deoptimization too. See > today's > >>>>> discussion in JDK-8227745 [2]. > >>>> > >>>> I hope that discussion bears some fruit, at the moment it seems not to > >>>> be possible to use handshakes here. :( > >>>> > >>>> The external suspend mechanism is a royal pain in the proverbial that > we > >>>> have to carefully live with. The idea that we're duplicating that for > >>>> use in another fringe area of functionality does not thrill me at all. 
> >>>> > >>>> To be clear, I understand the problem that exists and that you wish to > >>>> solve, but for the runtime parts I balk at the complexity cost of > >>>> solving it. > >>>> > >>>> Thanks, > >>>> David > >>>> ----- > >>>> > >>>>> Thanks, Richard. > >>>>> > >>>>> [1] Deopt suspend was something like an async. handshake for > >>>>> architectures with register windows, > >>>>> ???? where patching the return pc for deoptimization of a compiled > >>>>> frame was racy if the owner thread > >>>>> ???? was in native code. Instead a "deopt" suspend flag was set on > >>>>> which the thread patched its own > >>>>> ???? frame upon return from native. So no thread was suspended. It > got > >>>>> its name only from the name of > >>>>> ???? the flags. > >>>>> > >>>>> [2] Discussion about using handshakes to sync. with the target thread: > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK- > >> > 8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.syst > e > >> m.issuetabpanels:comment-tabpanel#comment-14306727 > >>>>> > >>>>> > >>>>> -----Original Message----- > >>>>> From: David Holmes > >>>>> Sent: Freitag, 13. Dezember 2019 00:56 > >>>>> To: Reingruber, Richard ; > >>>>> serviceability-dev at openjdk.java.net; > >>>>> hotspot-compiler-dev at openjdk.java.net; > >>>>> hotspot-runtime-dev at openjdk.java.net > >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >>>>> Performance in the Presence of JVMTI Agents > >>>>> > >>>>> Hi Richard, > >>>>> > >>>>> Some further queries/concerns: > >>>>> > >>>>> src/hotspot/share/runtime/objectMonitor.cpp > >>>>> > >>>>> Can you please explain the changes to ObjectMonitor::wait: > >>>>> > >>>>> !?? _recursions = save????? // restore the old recursion count > >>>>> !???????????????? + jt->get_and_reset_relock_count_after_wait(); // > >>>>> increased by the deferred relock count > >>>>> > >>>>> what is the "deferred relock count"? 
I gather it relates to > >>>>> > >>>>> "The code was extended to be able to deoptimize objects of a frame > that > >>>>> is not the top frame and to let another thread than the owning thread > do > >>>>> it." > >>>>> > >>>>> which I don't like the sound of at all when it comes to ObjectMonitor > >>>>> state. So I'd like to understand in detail exactly what is going on here > >>>>> and why.? This is a very intrusive change that seems to badly break > >>>>> encapsulation and impacts future changes to ObjectMonitor that are > under > >>>>> investigation. > >>>>> > >>>>> --- > >>>>> > >>>>> src/hotspot/share/runtime/thread.cpp > >>>>> > >>>>> Can you please explain why > JavaThread::wait_for_object_deoptimization > >>>>> has to be handcrafted in this way rather than using proper transitions. > >>>>> > >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to > see > >>>>> it being added back (effectively). This seems like it may be something > >>>>> that handshakes could be used for. > >>>>> > >>>>> Thanks, > >>>>> David > >>>>> ----- > >>>>> > >>>>> On 12/12/2019 7:02 am, David Holmes wrote: > >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: > >>>>>>> Hi David, > >>>>>>> > >>>>>>> ??? > Most of the details here are in areas I can comment on in > detail, > >>>>>>> but I > >>>>>>> ??? > did take an initial general look at things. > >>>>>>> > >>>>>>> Thanks for taking the time! > >>>>>> > >>>>>> Apologies the above should read: > >>>>>> > >>>>>> "Most of the details here are in areas I *can't* comment on in detail > >>>>>> ..." > >>>>>> > >>>>>> David > >>>>>> > >>>>>>> ??? > The only thing that jumped out at me is that I think the > >>>>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. > >>>>>>> ??? > > >>>>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } > >>>>>>> > >>>>>>> Yes, it should. Will add the method like above. > >>>>>>> > >>>>>>> ??? 
> Also I don't see any testing of the > DeoptimizeObjectsALotThread. > >>>>>>> Without > >>>>>>> ??? > active testing this will just bit-rot. > >>>>>>> > >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger > >>>>>>> workload. I will add a minimal test > >>>>>>> to keep it fresh. > >>>>>>> > >>>>>>> ??? > Also on the tests I don't understand your @requires clause: > >>>>>>> ??? > > >>>>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & > vm.compiler2.enabled > >> & > >>>>>>> ??? > (vm.opt.TieredCompilation != true)) > >>>>>>> ??? > > >>>>>>> ??? > This seems to require that TieredCompilation is disabled, but > >>>>>>> tiered is > >>>>>>> ??? > our normal mode of operation. ?? > >>>>>>> ??? > > >>>>>>> > >>>>>>> I removed the clause. I guess I wanted to target the tests towards > the > >>>>>>> code they are supposed to > >>>>>>> test, and it's easier to analyze failures w/o tiered compilation and > >>>>>>> with just one compiler thread. > >>>>>>> > >>>>>>> Additionally I will make use of > >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Richard. > >>>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: David Holmes > >>>>>>> Sent: Mittwoch, 11. 
Dezember 2019 08:03 > >>>>>>> To: Reingruber, Richard ; > >>>>>>> serviceability-dev at openjdk.java.net; > >>>>>>> hotspot-compiler-dev at openjdk.java.net; > >>>>>>> hotspot-runtime-dev at openjdk.java.net > >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >>>>>>> Performance in the Presence of JVMTI Agents > >>>>>>> > >>>>>>> Hi Richard, > >>>>>>> > >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> I would like to get reviews please for > >>>>>>>> > >>>>>>>> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ > >>>>>>>> > >>>>>>>> Corresponding RFE: > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 > >>>>>>>> > >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 > >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK- > 8214584 [1] > >>>>>>>> > >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing > without > >>>>>>>> issues (thanks!). In addition the > >>>>>>>> change is being tested at SAP since I posted the first RFR some > >>>>>>>> months ago. > >>>>>>>> > >>>>>>>> The intention of this enhancement is to benefit performance wise > from > >>>>>>>> escape analysis even if JVMTI > >>>>>>>> agents request capabilities that allow them to access local variable > >>>>>>>> values. E.g. if you start-up > >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, > then > >>>>>>>> escape analysis is disabled right > >>>>>>>> from the beginning, well before a debugger attaches -- if ever one > >>>>>>>> should do so. With the > >>>>>>>> enhancement, escape analysis will remain enabled until and after > a > >>>>>>>> debugger attaches. EA based > >>>>>>>> optimizations are reverted just before an agent acquires the > >>>>>>>> reference to an object. In the JBS item > >>>>>>>> you'll find more details. 
> >>>>>>>
> >>>>>>> Most of the details here are in areas I can comment on in detail, but I
> >>>>>>> did take an initial general look at things.
> >>>>>>>
> >>>>>>> The only thing that jumped out at me is that I think the
> >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>
> >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
> >>>>>>> active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> Also on the tests I don't understand your @requires clause:
> >>>>>>>
> >>>>>>>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>> (vm.opt.TieredCompilation != true))
> >>>>>>>
> >>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
> >>>>>>> our normal mode of operation. ??
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Richard.
> >>>>>>>>
> >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>

From ralf.schmelter at sap.com Fri Mar 13 11:43:08 2020
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 13 Mar 2020 11:43:08 +0000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To:
References: <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com>
Message-ID:

Hi,

I have updated the webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.1/

It has the following significant changes:

- The jcmd now uses two separate flags. The -gz flag is now a boolean flag which toggles the compression on/off. And the new -gz-level flag can be used to change the compression level.
I tried to change the jlong flag coding to allow the old behavior (only one flag, which acts both as a boolean flag and a jlong flag), but decided against it, since it changes the semantics of a jlong flag. And I don't expect the -gz-level flag to be used all that much.

- I no longer use my own threads. Instead I use the WorkGang returned from CollectedHeap::get_safepoint_workers(). This works fine, apart from Shenandoah GC, which runs into assertions when calling the CollectedHeap::object_iterate() method from a worker thread. I'm not sure if the assertion is too strong, but since the GC is currently experimental, I switch back to single threading in this case (as would be the case for serial GC or epsilon GC). Using the worker threads removes the problems the original code had regarding destruction of the monitor used.

- The reported number of bytes is now the number written to disk.

Best regards,
Ralf

-----Original Message-----
From: Ioi Lam
Sent: Tuesday, 25 February 2020 18:03
To: Langer, Christoph ; Schmelter, Ralf ; Yasumasa Suenaga ; serguei.spitsyn at oracle.com; hotspot-runtime-dev at openjdk.java.net runtime
Cc: serviceability-dev at openjdk.java.net
Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump

Hi Christoph,

This sounds fair.
I will remove my objection :-)

Thanks
- Ioi

From igor.ignatyev at oracle.com Fri Mar 13 16:26:07 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 13 Mar 2020 09:26:07 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available
In-Reply-To: <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com> <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com> <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com> <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
Message-ID:

Hi Chris,

overall looks good to me, a few comments though:

1. since you removed vm.hasSAandCanAttach from VMProps, you also need to remove it from all TEST.ROOT files which mention it (test/jdk/TEST.ROOT and test/hotspot/jtreg/TEST.ROOT), so that people won't be confused by an undefined property and jtreg will be able to properly report invalid usages of it, if any.

2. in SATestUtils::canAddPrivileges, could you please add a meaningful message to the RuntimeException at L#102?

3. the method name SATestUtils::checkAttachOk is somewhat misleading (hence you had to put a comment every time you used it); I'd recommend renaming it to something like skipIfCannotAttach().

4. in SATestUtils::checkAttachOk's javadoc, it would be better to use the @throws tag, like:
> + /**
> +  * Checks if SA Attach is expected to work.
> +  * @throws SkippedException if SA Attach is not expected to work.
> +  */

5. it also might make sense to catch IOException within SATestUtils::checkAttachOk and rethrow it as an Error or RuntimeException.

I've briefly looked at all the changed tests and they look good.

Thanks,
-- Igor

> On Mar 12, 2020, at 11:06 PM, Chris Plummer wrote:
>
> Hi Serguei,
>
> Thanks for the review!
>
> Can I get one more reviewer please?
>
> thanks,
>
> Chris
>
> On 3/12/20 12:06 AM, serguei.spitsyn at oracle.com wrote:
>> Hi Chris,
>>
>>
>> On 3/12/20 00:03, Chris Plummer wrote:
>>> Hi Serguei,
>>>
>>> That check used to be in Platform.shouldSAAttach(), which essentially was moved to SATestUtils.checkAttachOk() and reworked some. It was necessary in Platform.shouldSAAttach() since that was used to evaluate vm.hasSAandCanAttach (which is now gone). When I moved everything to SATestUtils.checkAttachOk(), I recall thinking it wasn't really necessary since all tests that call it should have @requires vm.hasSA, but left it in anyway just to be extra safe. I'm still inclined to just leave it in, but would not be opposed to removing it.
>>
>> I agree, it is safer to keep it, at least for now.
>>
>>
>> Thanks,
>> Serguei
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/11/20 11:20 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Chris,
>>>>
>>>> I've made another pass today.
>>>> It looks good to me.
>>>>
>>>> I have just one minor question.
>>>>
>>>> There is some overlap between the requires vm.hasSA check and checkAttachOk:
>>>> +    public static void checkAttachOk() throws IOException {
>>>> +        if (!Platform.hasSA()) {
>>>> +            throw new SkippedException("SA not supported.");
>>>> +        }
>>>> In the former case, the test is not run but in the latter the SkippedException is thrown.
>>>> As far as I can see, all tests with the checkAttachOk call use requires vm.hasSA as well.
>>>> It may be that the first check "if (!Platform.hasSA())" in checkAttachOk is redundant.
>>>> It is okay and safer in general but generates a little confusion.
>>>> I'm okay if you don't do anything with this but wanted to know your view.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 3/10/20 18:57, Chris Plummer wrote:
>>>>> On 3/10/20 6:07 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Overall, this looks like the right direction to me, though it is not easy to verify all the details.
>>>>> Yes, there are a lot of tests with quite a few different types of changes. I did a lot of testing and verified that when the tests pass, they pass for the right reasons (really ran the test, skipped due to lack of privileges, or skipped due to running signed on OSX 10.14 or later). I also verified locally running as root, running with a cached sudo, and running without sudo.
>>>>>> I'll make another pass tomorrow.
>>>>> Thanks!
>>>>>>
>>>>>> A couple of quick nits so far:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html
>>>>>>  import jdk.test.lib.Utils;
>>>>>> -import jdk.test.lib.Asserts;
>>>>>> +import jdk.test.lib.SA.SATestUtils;
>>>>>> Need to swap these imports.
>>>>>>
>>>>>>
>>>>> Ok
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html
>>>>>> 48 if (SATestUtils.needsPrivileges()) {
>>>>>> 49     cmdStringList = SATestUtils.addPrivileges(cmdStringList);
>>>>>> The method calls are local, so the class name can be omitted in the method names:
>>>>>> SATestUtils.needsPrivileges and SATestUtils.addPrivileges.
>>>>> Ok >>>>>> >>>>>> >>>>>> 94 try { >>>>>> 95 if (echoProcess.waitFor(60, TimeUnit.SECONDS) == false) { >>>>>> 96 // Due to using the "-n" option, sudo should complete almost immediately. 60 seconds >>>>>> 97 // is more than generous. If it didn't complete in that time, something went very wrong. >>>>>> 98 echoProcess.destroyForcibly(); >>>>>> 99 throw new RuntimeException("Timed out waiting for sudo to execute."); >>>>>> 100 } >>>>>> 101 } catch (InterruptedException e) { >>>>>> 102 throw new RuntimeException(e); >>>>>> 103 } >>>>>> The lines 101/103 are misaligned. >>>>> Ok. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>> Thanks, >>>>> >>>>> Chris >>>>>> >>>>>> >>>>>> On 3/9/20 19:29, Chris Plummer wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Please help review the following: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238268 >>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/ >>>>>>> >>>>>>> I'll try to give enough background first to make it easier to understand the changes. On OSX you must run SA tests that attach to a live process as root or using sudo. For example: >>>>>>> >>>>>>> sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java >>>>>>> >>>>>>> Whether running as root or under sudo, the check to allow the test to run is done with: >>>>>>> >>>>>>> private static boolean canAttachOSX() { >>>>>>> return userName.equals("root"); >>>>>>> } >>>>>>> >>>>>>> Any test using "@requires vm.hasSAandCanAttach" must pass this check via Platform.shouldSAAttach(), which for OSX returns: >>>>>>> >>>>>>> return canAttachOSX() && !isSignedOSX(); >>>>>>> >>>>>>> So if running as root the "@requires vm.hasSAandCanAttach" passes, otherwise it does not. However, using a root login to run tests is not very desirable, nor is issuing a "sudo make run-test" (any created file ends up with root ownership). Because of this, support was previously added for just running the attaching process using sudo, not the entire test.
This was only done for the 20 or so tests that use ClhsdbLauncher. These tests use "@requires vm.hasSA", and then while running the test will do a "sudo" check if canAttachOSX() returns false: >>>>>>> >>>>>>> if (!Platform.shouldSAAttach()) { >>>>>>> if (Platform.isOSX()) { >>>>>>> if (Platform.isSignedOSX()) { >>>>>>> throw new SkippedException("SA attach not expected to work. JDK is signed."); >>>>>>> } else if (SATestUtils.canAddPrivileges()) { >>>>>>> needPrivileges = true; >>>>>>> } >>>>>>> } >>>>>>> if (!needPrivileges) { >>>>>>> // Skip the test if we don't have enough permissions to attach >>>>>>> // and cannot add privileges. >>>>>>> throw new SkippedException( >>>>>>> "SA attach not expected to work. Insufficient privileges."); >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> So basically it does a runtime check of vm.hasSAandCanAttach, and if it fails then checks if running with sudo will work. This allows for either a passwordless sudo to be used when running clhsdb, or for the user to be prompted for the sudo password (note I've removed support for the latter with my changes). >>>>>>> >>>>>>> That brings us to the CR that is being fixed. ClhsdbLauncher tests support sudo and will therefore run with our CI testing on OSX, but the 25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and therefore are never run with our CI OSX testing. The changes in this webrev fix that. >>>>>>> >>>>>>> There are two possible approaches to the fix. One is having the check for sudo be done as part of the vm.hasSAandCanAttach evaluation. The other approach is to do the check in the test at runtime, similar to what ClhsdbLauncher currently does. This would mean just using "@requires vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach". I chose the latter because there is an advantage to throwing SkippedException rather than just silently skipping the test using @requires.
The advantage is that mdash tells you how many tests were skipped, and when you hover over the reason you can see the SkippedException message, which will differentiate between reasons like the JDK was signed or there are insufficient privileges. If all the checking was done by the vm.hasSAandCanAttach evaluation, you would not know why the test wasn't run. >>>>>>> >>>>>>> The "support" related changes made are all in the following 3 files. The rest of the changes are in the tests: >>>>>>> >>>>>>> test/jtreg-ext/requires/VMProps.java >>>>>>> test/lib/jdk/test/lib/Platform.java >>>>>>> test/lib/jdk/test/lib/SA/SATestUtils.java >>>>>>> >>>>>>> You'll notice that one change I made to the sudo support in SATestUtils.canAddPrivileges() is to make sudo non-interactive, which means no password prompt. So that means either the user does not require a password, or the credentials have been cached. Otherwise the sudo check will fail. On most platforms if you execute a sudo command, the credentials are cached for 5 minutes. So if your user is not set up for passwordless sudo, then a sudo command can be issued before running the tests, and will likely remain cached until the test is run. The reason for making sudo non-interactive is that prompting in the middle of running tests can be confusing (you usually walk away after launching the tests and miss the prompt anyway), and it avoids unnecessary delays in automated testing due to waiting for the password prompt to time out (it used to wait 1 minute). >>>>>>> >>>>>>> There are essentially 3 types of tests that SA Attach to a process, each needing a slightly different fix: >>>>>>> >>>>>>> 1. Tests that directly launch a jdk.hotspot.agent class, such as TestClassDump.java. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if needed. They also need to switch from using hasSAandCanAttach to using hasSA. >>>>>>> >>>>>>> 2.
Tests that launch command line tools such as jhsdb. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.createProcessBuilder() to create a process that will be launched using sudo if necessary. They also need to switch from using hasSAandCanAttach to using hasSA. >>>>>>> >>>>>>> 3. Tests that use ClhsdbLauncher. They already use hasSA instead of hasSAandCanAttach, and rely on ClhsdbLauncher to check at runtime whether attaching will work, so for the most part these tests are unchanged. ClhsdbLauncher was modified to take advantage of the new SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk() APIs. >>>>>>> >>>>>>> Some tests required special handling: >>>>>>> >>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java >>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java >>>>>>> >>>>>>> - These two tests SA Attach to a core file, not to a process, so only need hasSA, >>>>>>> not hasSAandCanAttach. No other changes were needed. >>>>>>> >>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java >>>>>>> >>>>>>> - The output should never be null. If the test was skipped due to lack of privileges, you >>>>>>> would never get to this section of the test. >>>>>>> >>>>>>> test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java >>>>>>> test/hotspot/jtreg/serviceability/sa/TestIntConstant.java >>>>>>> test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java >>>>>>> test/hotspot/jtreg/serviceability/sa/TestType.java >>>>>>> test/hotspot/jtreg/serviceability/sa/TestUniverse.java >>>>>>> >>>>>>> - These are ClhsdbLauncher tests, so they should have been using hasSA instead of >>>>>>> hasSAandCanAttach in the first place. No other changes were needed.
>>>>>>> >>>>>>> test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java >>>>>>> test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java >>>>>>> test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java >>>>>>> >>>>>>> - These tests used to "@require mac" but seem to run fine on OSX, so I removed this requirement. >>>>>>> >>>>>>> test/jdk/sun/tools/jhsdb/BasicLauncherTest.java >>>>>>> >>>>>>> - This test had a runtime check to not run on OSX due to not having core file stack >>>>>>> walking support. However, this test always attaches to a process, not a core file, >>>>>>> and seems to run just fine on OSX. >>>>>>> >>>>>>> test/jdk/sun/tools/jstack/DeadlockDetectionTest.java >>>>>>> >>>>>>> - I changed the test to throw a SkippedException if it gets the unexpected error code >>>>>>> rather than just println. >>>>>>> >>>>>>> And a few other miscellaneous changes not already covered: >>>>>>> >>>>>>> test/lib/jdk/test/lib/Platform.java >>>>>>> - Made canPtraceAttachLinux() public so it can be called from SATestUtils. >>>>>>> - vm.hasSAandCanAttach is now gone. >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>> >>>>> >>>> >>> >> > From pavel.rappo at oracle.com Fri Mar 13 16:26:54 2020 From: pavel.rappo at oracle.com (Pavel Rappo) Date: Fri, 13 Mar 2020 16:26:54 +0000 Subject: RFR: [small, docs] JDK-8240971 Fix CSS styles in some doc comments In-Reply-To: References: Message-ID: <2216FA2C-C369-4FF1-B0D2-800D3AD59B1B@oracle.com> This is really nice. Incidentally, it also makes https://bugs.openjdk.java.net/browse/JDK-8234395 less relevant. -Pavel > On 12 Mar 2020, at 20:50, Jonathan Gibbons wrote: > > Please review a simple fix regarding the non-standard use of some CSS in some doc comments.
> > From the JBS Description: > > Recently, for the display of javadoc block tags, javadoc changed from using an inconsistent set of CSS class names on the generated 'dt' elements to using a single new name ("notes") on the enclosing 'dl' element. > > There are a few (4) places in the main JDK code where the old-style names were used explicitly in doc comments, in order to emulate the appearance of a list of block tags. These use-sites should be fixed up. They are in the following files: > > open/src/java.base/share/classes/module-info.java > open/src/java.se/share/classes/module-info.java > open/src/java.management.rmi/share/classes/module-info.java > open/src/jdk.jconsole/share/classes/module-info.java > > In addition, these four files used the style attribute to force the font to be used. The font is now set in the standard CSS for "notes", and so the local use of a "style" attribute is no longer necessary. > > -- Jon > > JBS: https://bugs.openjdk.java.net/browse/JDK-8240971 > Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html > API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html > From jonathan.gibbons at oracle.com Fri Mar 13 16:34:04 2020 From: jonathan.gibbons at oracle.com (Jonathan Gibbons) Date: Fri, 13 Mar 2020 09:34:04 -0700 Subject: RFR: [small, docs] JDK-8240971 Fix CSS styles in some doc comments In-Reply-To: <2216FA2C-C369-4FF1-B0D2-800D3AD59B1B@oracle.com> References: <2216FA2C-C369-4FF1-B0D2-800D3AD59B1B@oracle.com> Message-ID: <158d778e-c70d-b5df-b8ab-2296fea609ac@oracle.com> At some point, we should separate JDK-specific definitions from javadoc-general definitions, using a separate stylesheet. -- Jon On 3/13/20 9:26 AM, Pavel Rappo wrote: > This is really nice. Incidentally, it also makes > > https://bugs.openjdk.java.net/browse/JDK-8234395 > > less relevant. 
> > -Pavel > >> On 12 Mar 2020, at 20:50, Jonathan Gibbons wrote: >> >> Please review a simple fix regarding the non-standard use of some CSS in some doc comments. >> >> From the JBS Description: >> >> Recently, for the display of javadoc block tags, javadoc changed from using an inconsistent set of CSS class names on the generated 'dt' elements to using a single new name ("notes") on the enclosing 'dl' element. >> >> There are a few (4) places in the main JDK code where the old-style names were used explicitly in doc comments, in order to emulate the appearance of a list of block tags. These use-sites should be fixed up. They are in the following files: >> >> open/src/java.base/share/classes/module-info.java >> open/src/java.se/share/classes/module-info.java >> open/src/java.management.rmi/share/classes/module-info.java >> open/src/jdk.jconsole/share/classes/module-info.java >> >> In addition, these four files used the style attribute to force the font to be used. The font is now set in the standard CSS for "notes", and so the local use of a "style" attribute is no longer necessary. >> >> -- Jon >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8240971 >> Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html >> API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html >> From daniil.x.titov at oracle.com Fri Mar 13 22:05:11 2020 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Fri, 13 Mar 2020 15:05:11 -0700 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> Message-ID: <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com> Hi Yasumasa, Serguei and Alex, Please review a new version of the webrev that includes the changes Yasumasa suggested. > Shutdown hook is already registered in c'tor of HotSpotAgent. 
> It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression. 101 public HotSpotAgent() { 102 // for non-server add shutdown hook to clean-up debugger in case 103 // of forced exit. For remote server, shutdown hook is added by 104 // DebugServer. 105 Runtime.getRuntime().addShutdownHook(new java.lang.Thread( 106 new Runnable() { 107 public void run() { 108 synchronized (HotSpotAgent.this) { 109 if (!isServer) { 110 detach(); 111 } 112 } 113 } 114 })); 115 } >> Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains >> `exclusiveAccess.dirs=.` to avoid concurrent execution As I understand it, exclusiveAccess.dirs only prevents the tests located in this directory from being run simultaneously with each other; other tests could still run in parallel with one of these tests. Thus I would prefer to have the retry mechanism in place. I simplified the code by using class variables instead of local arrays. Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/ [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 Thank you, Daniil On 3/6/20, 6:15 PM, "Yasumasa Suenaga" wrote: Hi Daniil, On 2020/03/07 3:38, Daniil Titov wrote: > Hi Yasumasa, > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect (it throws an exception if parameters are incorrect) and ignores its return value might look confusing.
Thus, I found it more appropriate to wrap it inside a method with more relevant name . Ok, but I prefer to leave comment it. > > SADebugDTest > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. If you do not think this error check, test code is more simply. > I will include your other suggestion in the new version of the webrev. Sorry, I have one more comment: > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. Shutdown hook is already registered in c'tor of HotSpotAgent. It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. Thanks, Yasumasa > Thanks! > Daniil > > ?On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > > - SALauncher.java > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > - SADebugDTest.java > - Please add bug ID to @bug. > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > Thanks, > > Yasumasa > > > On 2020/03/06 10:15, Daniil Titov wrote: > > Hi Yasumasa, Serguei and Alex, > > > > Please review a new version of the fix [1] that addresses your comments. 
The new version, in addition to the RMI connector > > port option, introduces two more options to specify the RMI registry port and the RMI connector host name. Currently, these > > last two settings can be specified using system properties, but system properties have the following disadvantages > > compared to command line options: > > - It's hard to know about them: they are not listed in the tool's help. > > - They have long names that are hard to remember. > > - It is easy to mistype them in the command line and you will not get any warning about it. > > > > The CSR [2] was also updated and needs to be reviewed. > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > Thank you, > > Daniil > > > > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > > > Hi Daniil, > > > > - SALauncher::buildAttachArgs not only builds the arguments but also checks their consistency. > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler. > > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > > But you can use the same port number as the RMI registry (1099). > > It is the same as the relation between jmxremote.port and jmxremote.rmi.port. > > > > > > Thanks, > > > > Yasumasa > > > > > > On 2020/02/24 13:21, Daniil Titov wrote: > > > Please review a change that adds a new command line option to the jhsdb tool for the debugd mode to specify an RMI connector port. > > > Currently a random port is used, which prevents the debug server from being used behind a firewall or in a container.
> > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > // delegate to the actual SA debug server. > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > but I would prefer to address it in a separate issue. > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > container and connecting to it with the GUI debugger. > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > > Thank you, > > > Daniil > > > > > > > > > > > > > > > From suenaga at oss.nttdata.com Sat Mar 14 01:35:35 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Sat, 14 Mar 2020 10:35:35 +0900 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 Message-ID: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> Hi all, Please review this change: JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode. However, some errors have been seen intermittently since then. I investigated the cause and found two concerns: A: there is no range check on the buffer (.eh_frame section data) B: the language personality routine and the Language Specific Data Area (LSDA) are not considered In this webrev I added a range check for .eh_frame processing, and the personality routine and LSDA are now ignored. I also added bailout code for the case where DWARF processing fails due to these concerns. This change has passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671). I also tested it on my Fedora 31 box and in an Oracle Linux 7.7 container.
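To illustrate concern A, here is a minimal sketch of a range-checked read with a bailout. It is not the code in the webrev: the real parser is C++ in the SA native library, while this sketch is Java, and the class name BoundedReader and its methods are hypothetical. It decodes an unsigned LEB128 value (an encoding DWARF uses) and reports failure instead of reading past the end of the section data.

```java
import java.util.OptionalLong;

// Hypothetical illustration only; not the SA DwarfParser.
final class BoundedReader {
    private final byte[] buf;  // stands in for the .eh_frame section data
    private final int end;     // one past the last readable byte
    private int pos;           // current read offset

    BoundedReader(byte[] buf, int start, int end) {
        if (start < 0 || end > buf.length || start > end) {
            throw new IllegalArgumentException("invalid range");
        }
        this.buf = buf;
        this.pos = start;
        this.end = end;
    }

    // Reads one unsigned LEB128 value. Returns empty (the "bailout")
    // if the encoding would run past the end of the range.
    OptionalLong readULEB128() {
        long result = 0;
        int shift = 0;
        int p = pos;
        while (true) {
            if (p >= end || shift >= 64) {
                return OptionalLong.empty();  // range check failed
            }
            int b = buf[p++] & 0xff;
            result |= ((long) (b & 0x7f)) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                break;  // high bit clear: last byte of the value
            }
        }
        pos = p;  // commit the new position only on success
        return OptionalLong.of(result);
    }
}
```

With the bytes 0xE5 0x8E 0x26 0x80, the first call yields 624485 and the second yields empty, because the final byte announces a continuation that lies outside the range. In C++ the same check is what prevents the out-of-bounds read; Java would throw an exception there anyway, so the sketch only shows the control flow.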
Thanks, Yasumasa From suenaga at oss.nttdata.com Sat Mar 14 02:23:37 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Sat, 14 Mar 2020 11:23:37 +0900 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com> References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com> Message-ID: Hi Daniil, On 2020/03/14 7:05, Daniil Titov wrote: > Hi Yasumasa, Serguei and Alex, > > Please review a new version of the webrev that includes the changes Yasumasa suggested. > >> Shutdown hook is already registered in c'tor of HotSpotAgent. >> It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need a > the shutdown hook for remote server being added in SALauncher. I changed it to use the lambda expression. > > 101 public HotSpotAgent() { > 102 // for non-server add shutdown hook to clean-up debugger in case > 103 // of forced exit. For remote server, shutdown hook is added by > 104 // DebugServer. > 105 Runtime.getRuntime().addShutdownHook(new java.lang.Thread( > 106 new Runnable() { > 107 public void run() { > 108 synchronized (HotSpotAgent.this) { > 109 if (!isServer) { > 110 detach(); > 111 } > 112 } > 113 } > 114 })); > 115 } I missed it, thanks! >>> Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains >>> `exclusiveAccess.dirs=.` to avoid concurrent execution > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests. Thus I would prefer to have the retry mechanism in place. 
I simplified the code using the class variables instead of local arrays. Ok, but I think it might be more simply with TestLibrary. For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java . Thanks, Yasumasa > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/ > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 > > Thank you, > Daniil > > ?On 3/6/20, 6:15 PM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > On 2020/03/07 3:38, Daniil Titov wrote: > > Hi Yasumasa, > > > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect ( it throws an exception if parameters are incorrect) and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name . > > Ok, but I prefer to leave comment it. > > > > > SADebugDTest > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. > > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. > > Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. > If you do not think this error check, test code is more simply. > > > > I will include your other suggestion in the new version of the webrev. > > Sorry, I have one more comment: > > > - Shutdown hook is very good idea. 
You can implement more simply if you use lambda expression. > > Shutdown hook is already registered in c'tor of HotSpotAgent. > It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > Thanks, > > Yasumasa > > > > Thanks! > > Daniil > > > > ?On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > > > Hi Daniil, > > > > > > - SALauncher.java > > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > - SADebugDTest.java > > - Please add bug ID to @bug. > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > > > > Thanks, > > > > Yasumasa > > > > > > On 2020/03/06 10:15, Daniil Titov wrote: > > > Hi Yasumasa, Serguei and Alex, > > > > > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > > > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these > > > last two settings could be specified using the system properties but the system properties have the following disadvantages > > > comparing to the command line options: > > > - It?s hard to know about them: they are not listed in tool?s help. > > > - They have long names that hard to remember > > > - It is easy to mistype them in the command line and you will not get any warning about it. > > > > > > The CSR [2] was also updated and needs to be reviewed. > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > Thank you, > > > Daniil > > > > > > ?On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > > > > > Hi Daniil, > > > > > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. > > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply. > > > > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > > > But you can use same port number as RMI registry (1099). > > > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > On 2020/02/24 13:21, Daniil Titov wrote: > > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > > > // delegate to the actual SA debug server. > > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. 
> > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > > but I would prefer to address it in a separate issue. > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > container and connecting to it with the GUI debugger. > > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > > > > Thank you, > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > From chris.plummer at oracle.com Sun Mar 15 23:35:16 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Sun, 15 Mar 2020 16:35:16 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com> <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com> <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com> <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com> Message-ID: An HTML attachment was scrubbed... 
URL: From igor.ignatyev at oracle.com Sun Mar 15 23:49:28 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Sun, 15 Mar 2020 16:49:28 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com> <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com> <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com> <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com> Message-ID: <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com> Hi Chris, looks good, thanks! One minor nit: in SATestUtils::skipIfCannotAttach, you have two exception messages which start with 'SA attach not expected to work.', and one with 'SA Attach not expected to work.' (Attach instead of attach). It'd be nicer to have them uniform. Cheers, -- Igor > On Mar 15, 2020, at 4:35 PM, Chris Plummer wrote: > > Hi Igor, > > Thanks for the review. Here's an updated webrev with all of the suggestions from you and Serguei: > > http://cr.openjdk.java.net/~cjplummer/8238268/webrev.01/index.html > > Also some comments inline below. > > On 3/13/20 9:26 AM, Igor Ignatyev wrote: >> Hi Chris, >> >> overall looks good to me, a few comments though: >> 1. since you removed vm.hasSAandCanAttach from VMProps, you also need to remove it from all TEST.ROOT files which mention it (test/jdk/TEST.ROOT and test/hotspot/jtreg/TEST.ROOT) so people won't be confused by an undefined property and jtreg will be able to properly report invalid usages of it, if any. > Ok, but it's unclear to me what requires.properties is even for, and what is the impact of extra or missing properties. What kind of test would catch these errors?
jtreg uses 'requires.properties' as a list of extra variables for @require expressions, if a test uses a name which isn't in 'requires.properties' (or known to jtreg), jtreg won't execute such test and will set its status to Error w/ 'invalid name ...' message. >> >> 2. in SATestUtils::canAddPrivileges, could you please add some meaningful message to the RuntimeException at L#102? >> > Ok. > > throw new RuntimeException("sudo process interrupted", e); > >> 3. SATestUtils::checkAttachOk method name is somewhat misleading (hence you had to put comment every time you used it), I'd recommend you to rename to smth like skipIfCannotAttach(). > Ok, but I still left the comment in place. >> >> 4. in SATestUtils::checkAttachOk's javadoc, it would be better to use @throws tag like: >>> + /** >>> + * Checks if SA Attach is expected to work. >>> +. * @throws SkippedException ifSA Attach is not expected to work. >>> + */ >> >> > Ok. >> 5. it also might make sense to catch IOException within SATestUtils::checkAttachOk and throw it as Error or RuntimeException. >> > Ok. >> I've briefly looked at all the changed tests and they look good. > > Thanks! > > Chris >> >> Thanks, >> -- Igor >> >> >>> On Mar 12, 2020, at 11:06 PM, Chris Plummer > wrote: >>> >>> Hi Serguei, >>> >>> Thanks for the review! >>> >>> Can I get one more reviewer please? >>> >>> thanks, >>> >>> Chris >>> >>> On 3/12/20 12:06 AM, serguei.spitsyn at oracle.com wrote: >>>> Hi Chris, >>>> >>>> >>>> On 3/12/20 00:03, Chris Plummer wrote: >>>>> Hi Serguei, >>>>> >>>>> That check used to be in Platform.shouldSAAttach(), which essentially was moved to SATestUtils.checkAttachOk() and reworked some. It was necessary in Platform.shouldSAAttach() since that was used to evaluation vm.hasSAandCanAttach (which is now gone). When I moved everything to SATestUtils.checkAttachOk(), I recall thinking it wasn't really necessary since all tests that call it should have @require vm.hasSA, but left it in anyway just to be extra safe. 
I'm still inclined to just leave it in, but would not be opposed to removing it. >>>> >>>> I agree, it is safer to keep it, at least for now. >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 3/11/20 11:20 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Chris, >>>>>> >>>>>> I've made another pass today. >>>>>> It looks good to me. >>>>>> >>>>>> I have just one minor question. >>>>>> >>>>>> There is some overlap between the requires vm.hasSA check and checkAttachOk: >>>>>> + public static void checkAttachOk() throws IOException { >>>>>> + if (!Platform.hasSA()) { >>>>>> + throw new SkippedException("SA not supported."); >>>>>> + } >>>>>> In the former case, the test is not run but in the latter the SkippedException is thrown. >>>>>> As I see, all tests with the checkAttachOk call use requires vm.hasSA as well. >>>>>> It can be that the first check "if (!Platform.hasSA())" in the checkAttachOk is redundant. >>>>>> It is okay and safer in general but creates a little confusion. >>>>>> I'm okay if you don't do anything with this but wanted to know your view. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 3/10/20 18:57, Chris Plummer wrote: >>>>>>> On 3/10/20 6:07 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Overall, this looks like the right direction to me, though it is not easy to verify all the details. >>>>>>> Yes, there are a lot of tests with quite a few different types of changes. I did a lot of testing and verified that when the tests pass, they pass for the right reasons (really ran the test, skipped due to lack of privileges, or skipped due to running signed on OSX 10.14 or later). I also verified locally running as root, running with a cached sudo, and running without sudo. >>>>>>>> I'll make another pass tomorrow. >>>>>>> Thanks! 
>>>>>>>> >>>>>>>> A couple of quick nits so far: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html >>>>>>>> import jdk.test.lib.Utils; >>>>>>>> -import jdk.test.lib.Asserts; >>>>>>>> +import jdk.test.lib.SA.SATestUtils; >>>>>>>> Need to swap these exports. >>>>>>>> >>>>>>>> >>>>>>> Ok >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html >>>>>>>> 48 if (SATestUtils.needsPrivileges()) { >>>>>>>> 49 cmdStringList = SATestUtils.addPrivileges(cmdStringList); >>>>>>>> The method calls are local, so the class name can be omitted in the method names: >>>>>>>> SATestUtils.needsPrivileges and SATestUtils.addPrivileges. >>>>>>> Ok >>>>>>>> >>>>>>>> >>>>>>>> 94 try { >>>>>>>> 95 if (echoProcess.waitFor(60, TimeUnit.SECONDS) == false) { >>>>>>>> 96 // Due to using the "-n" option, sudo should complete almost immediately. 60 seconds >>>>>>>> 97 // is more than generous. If it didn't complete in that time, something went very wrong. 
>>>>>>>> 98 echoProcess.destroyForcibly(); >>>>>>>> 99 throw new RuntimeException("Timed out waiting for sudo to execute."); >>>>>>>> 100 } >>>>>>>> 101 } catch (InterruptedException e) { >>>>>>>> 102 throw new RuntimeException(e); >>>>>>>> 103 } >>>>>>>> The lines 101/103 are misaligned. >>>>>>> Ok. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Chris >>>>>>>> >>>>>>>> >>>>>>>> On 3/9/20 19:29, Chris Plummer wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Please help review the following: >>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238268 >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/ >>>>>>>>> >>>>>>>>> I'll try to give enough background first to make it easier to understand the changes. On OSX you must run SA tests that attach to a live process as root or using sudo. For example: >>>>>>>>> >>>>>>>>> sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java >>>>>>>>> >>>>>>>>> Whether running as root or under sudo, the check to allow the test to run is done with: >>>>>>>>> >>>>>>>>> private static boolean canAttachOSX() { >>>>>>>>> return userName.equals("root"); >>>>>>>>> } >>>>>>>>> >>>>>>>>> Any test using "@requires vm.hasSAandCanAttach" must pass this check via Platform.shouldSAAttach(), which for OSX returns: >>>>>>>>> >>>>>>>>> return canAttachOSX() && !isSignedOSX(); >>>>>>>>> >>>>>>>>> So if running as root the "@requires vm.hasSAandCanAttach" passes, otherwise it does not. However, using a root login to run tests is not a very desirable, nor is issuing a "sudo make run-test" (any created file ends up with root ownership). Because of this support was previously added for just running the attaching process using sudo, not the entire test. This was only done for the 20 or so tests that use ClhsdbLauncher. 
These tests use "@requires vm.hasSA", and then while running the test will do a "sudo" check if canAttachOSX() returns false: >>>>>>>>> >>>>>>>>> if (!Platform.shouldSAAttach()) { >>>>>>>>> if (Platform.isOSX()) { >>>>>>>>> if (Platform.isSignedOSX()) { >>>>>>>>> throw new SkippedException("SA attach not expected to work. JDK is signed."); >>>>>>>>> } else if (SATestUtils.canAddPrivileges()) { >>>>>>>>> needPrivileges = true; >>>>>>>>> } >>>>>>>>> } >>>>>>>>> if (!needPrivileges) { >>>>>>>>> // Skip the test if we don't have enough permissions to attach >>>>>>>>> // and cannot add privileges. >>>>>>>>> throw new SkippedException( >>>>>>>>> "SA attach not expected to work. Insufficient privileges."); >>>>>>>>> } >>>>>>>>> } >>>>>>>>> >>>>>>>>> So basically it does a runtime check of vm.hasSAandCanAttach, and if it fails then checks if running with sudo will work. This allows for either a passwordless sudo to be used when running clhsdb, or for the user to be prompted for the sudo password (note I've removed support for the latter with my changes). >>>>>>>>> >>>>>>>>> That brings us to the CR that is being fixed. ClhsdbLauncher tests support sudo and will therefore run with our CI testing on OSX, but the 25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and therefore are never run with our CI OSX testing. The changes in this webrev fix that. >>>>>>>>> >>>>>>>>> There are two possible approaches to the fix. One is having the check for sudo be done as part of the vm.hasSAandCanAttach evaluation. The other approach is to do the check in the test at runtime similar to how ClhsdbLauncher currently does. This would mean just using "@requires vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach". I chose the latter because there is an advantage to throwing SkippedException rather than just silently skipping the test using @requires. 
The advantage is that mdash tells you how many tests were skipped, and when you hover over the reason you can see the SkippedException message, which will differentiate between reasons like the JDK was signed or there are insufficient privileges. If all the checking was done by the vm.hasSAandCanAttach evaluation, you would not know why the test wasn't run. >>>>>>>>> >>>>>>>>> The "support" related changes made are all in the following 3 files. The rest of the changes are in the tests: >>>>>>>>> >>>>>>>>> test/jtreg-ext/requires/VMProps.java >>>>>>>>> test/lib/jdk/test/lib/Platform.java >>>>>>>>> test/lib/jdk/test/lib/SA/SATestUtils.java >>>>>>>>> >>>>>>>>> You'll notice that one change I made to the sudo support in SATestUtils.canAddPrivileges() is to make sudo non-interactive, which means no password prompt. So that means either the user does not require a password, or the credentials have been cached. Otherwise the sudo check will fail. On most platforms if you execute a sudo command, the credentials are cached for 5 minutes. So if your user is not set up for passwordless sudo, then a sudo command can be issued before running the tests, and will likely remain cached until the test is run. The reason for using non-interactive sudo is that prompting in the middle of running tests can be confusing (you usually walk away after launching the tests and miss the prompt anyway), and it avoids unnecessary delays in automated testing due to waiting for the password prompt to time out (it used to wait 1 minute). >>>>>>>>> >>>>>>>>> There are essentially 3 types of tests that SA Attach to a process, each needing a slightly different fix: >>>>>>>>> >>>>>>>>> 1. Tests that directly launch a jdk.hotspot.agent class, such as TestClassDump.java. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if needed. They also need to switch from using hasSAandCanAttach to using hasSA. 
>>>>>>>>> >>>>>>>>> 2. Tests that launch command line tools such as jhsdb. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.createProcessBuilder() to create a process that will be launched using sudo if necessary. They also need to switch from using hasSAandCanAttach to using hasSA. >>>>>>>>> >>>>>>>>> 3. Tests that use ClhsdbLauncher. They already use hasSA instead of hasSAandCanAttach, and rely on ClhsdbLauncher to check at runtime whether attaching will work, so for the most part all of these tests are unchanged. ClhsdbLauncher was modified to take advantage of the new SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk() APIs. >>>>>>>>> >>>>>>>>> Some tests required special handling: >>>>>>>>> >>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java >>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java >>>>>>>>> >>>>>>>>> - These two tests SA Attach to a core file, not to a process, so only need hasSA, >>>>>>>>> not hasSAandCanAttach. No other changes were needed. >>>>>>>>> >>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java >>>>>>>>> >>>>>>>>> - The output should never be null. If the test was skipped due to lack of privileges, you >>>>>>>>> would never get to this section of the test. >>>>>>>>> >>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java >>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestIntConstant.java >>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java >>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestType.java >>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestUniverse.java >>>>>>>>> >>>>>>>>> - These are ClhsdbLauncher tests, so they should have been using hasSA instead of >>>>>>>>> hasSAandCanAttach in the first place. No other changes were needed. 
>>>>>>>>> >>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java >>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java >>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java >>>>>>>>> >>>>>>>>> - These tests used to "@require mac" but seem run fine on OSX, so I removed this requirement. >>>>>>>>> >>>>>>>>> test/jdk/sun/tools/jhsdb/BasicLauncherTest.java >>>>>>>>> >>>>>>>>> - This test had a runtime check to not run on OSX due to not having core file stack >>>>>>>>> walking support. However, this tests always attaches to a process, not a core file, >>>>>>>>> and seems to run just fine on OSX. >>>>>>>>> >>>>>>>>> test/jdk/sun/tools/jstack/DeadlockDetectionTest.java >>>>>>>>> >>>>>>>>> - I changed the test to throw a SkippedException if it gets the unexpected error code >>>>>>>>> rather than just println. >>>>>>>>> >>>>>>>>> And a few other miscellaneous changes not already covered: >>>>>>>>> >>>>>>>>> test/lib/jdk/test/lib/Platform.java >>>>>>>>> - Made canPtraceAttachLinux() public so it can be called from SATestUtils. >>>>>>>>> - vm.hasSAandCanAttach is now gone. >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.plummer at oracle.com Mon Mar 16 00:47:38 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Sun, 15 Mar 2020 17:47:38 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com> References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com> <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com> <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com> <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com> <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Mon Mar 16 02:17:03 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Mar 2020 12:17:03 +1000 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> Message-ID: <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> Hi Yasumasa, I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. David On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: > Hi all, > > Please review this change: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ > > JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in > jstack mixed mode. > However some errors have been seen intermittently after that. > > I investigated the cause and found two concerns: > > A: lack of buffer (.eh_frame section data) range check > B: Language personality routine and Language Specific Data Area > (LSDA) are not considered > > I added a range check for .eh_frame processing, and ignore personality > routine and LSDA in this webrev. 
> Also I added bailout code if DWARF processing fails due to these > concerns. > > This change has passed all tests on submit repo > (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), > also I tested it on my Fedora 31 box and Oracle Linux 7.7 container. > > > Thanks, > > Yasumasa From david.holmes at oracle.com Mon Mar 16 02:53:48 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Mar 2020 12:53:48 +1000 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> Message-ID: <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> On 16/03/2020 12:17 pm, David Holmes wrote: > Hi Yasumasa, > > I can't review this as I know nothing about the code, but I'm putting > the patch through our internal testing. Sorry but the crashes still exist: # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 # # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) # Problematic frame: # C [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e in fact they seem worse as the test seems to always crash now. David > David > > On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >> Hi all, >> >> Please review this change: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >> >> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames >> in jstack mixed mode. >> However some errors have been seen intermittently after that. 
>> >> I investigated the cause of this, I found two concerns: >> >> ?? A: lack of buffer (.eh_frame section data) range check >> ?? B: Language personality routine and Language Specific Data Area >> (LSDA) are not considered >> >> I addd range check for .eh_frame processing, and ignore personality >> routine and LSDA in this webrev. >> Also I added bailout code if DWARF processing is failed due to these >> concerns. >> >> This change has passed all tests on submit repo >> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container. >> >> >> Thanks, >> >> Yasumasa From david.holmes at oracle.com Mon Mar 16 04:12:07 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 16 Mar 2020 14:12:07 +1000 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> Message-ID: Correction ... On 16/03/2020 12:53 pm, David Holmes wrote: > On 16/03/2020 12:17 pm, David Holmes wrote: >> Hi Yasumasa, >> >> I can't review this as I know nothing about the code, but I'm putting >> the patch through our internal testing. > > Sorry but the crashes still exist: > > # > # A fatal error has been detected by the Java Runtime Environment: > # > #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 > # > # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build > 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug > 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, > sharing, tiered, compressed oops, g1 gc, linux-amd64) > # Problematic frame: > # C? [libsaproc.so+0x494e]? 
DwarfParser::process_dwarf(unsigned long)+0x4e > > in fact they seem worse as the test seems to always crash now. Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes. It doesn't fail for me locally. David > David > >> David >> >> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> Please review this change: >>> >>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>> >>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames >>> in jstack mixed mode. >>> However some error has seen intermittently after that. >>> >>> I investigated the cause of this, I found two concerns: >>> >>> ?? A: lack of buffer (.eh_frame section data) range check >>> ?? B: Language personality routine and Language Specific Data Area >>> (LSDA) are not considered >>> >>> I addd range check for .eh_frame processing, and ignore personality >>> routine and LSDA in this webrev. >>> Also I added bailout code if DWARF processing is failed due to these >>> concerns. >>> >>> This change has passed all tests on submit repo >>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container. 
>>> >>> >>> Thanks, >>> >>> Yasumasa From serguei.spitsyn at oracle.com Mon Mar 16 05:22:53 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sun, 15 Mar 2020 22:22:53 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com> <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com> <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com> <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com> <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com> <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com> <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From suenaga at oss.nttdata.com Mon Mar 16 06:36:28 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Mon, 16 Mar 2020 15:36:28 +0900 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> Message-ID: <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> Hi David, Thank you for testing it. I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA. Could you try it? http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ It works well on my Fedora 31 and Oracle Linux 7.7 . I've pushed it to submit repo. Diff from webrev.00 is here: http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 Thanks, Yasumasa On 2020/03/16 13:12, David Holmes wrote: > Correction ... > > On 16/03/2020 12:53 pm, David Holmes wrote: >> On 16/03/2020 12:17 pm, David Holmes wrote: >>> Hi Yasumasa, >>> >>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. 
>> >> Sorry but the crashes still exist: >> >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >> # >> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >> # Problematic frame: >> # C? [libsaproc.so+0x494e]? DwarfParser::process_dwarf(unsigned long)+0x4e >> >> in fact they seem worse as the test seems to always crash now. > > Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes. > > It doesn't fail for me locally. > > David > >> David >> >>> David >>> >>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>> Hi all, >>>> >>>> Please review this change: >>>> >>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>> >>>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames in jstack mixed mode. >>>> However some error has seen intermittently after that. >>>> >>>> I investigated the cause of this, I found two concerns: >>>> >>>> ?? A: lack of buffer (.eh_frame section data) range check >>>> ?? B: Language personality routine and Language Specific Data Area (LSDA) are not considered >>>> >>>> I addd range check for .eh_frame processing, and ignore personality routine and LSDA in this webrev. >>>> Also I added bailout code if DWARF processing is failed due to these concerns. >>>> >>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container. 
>>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa From chris.plummer at oracle.com Mon Mar 16 06:43:39 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Sun, 15 Mar 2020 23:43:39 -0700 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> Message-ID: <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job. Chris On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: > Hi David, > > Thank you for testing it. > > I updated webrev to avoid bailout to Java frame when DWARF has > language personality routine or LSDA. > Could you try it? > > ? http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ > > It works well on my Fedora 31 and Oracle Linux 7.7 . > I've pushed it to submit repo. > > Diff from webrev.00 is here: > ? http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 > > > Thanks, > > Yasumasa > > > On 2020/03/16 13:12, David Holmes wrote: >> Correction ... >> >> On 16/03/2020 12:53 pm, David Holmes wrote: >>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>> Hi Yasumasa, >>>> >>>> I can't review this as I know nothing about the code, but I'm >>>> putting the patch through our internal testing. >>> >>> Sorry but the crashes still exist: >>> >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> #? 
SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >>> # >>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug >>> build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, >>> sharing, tiered, compressed oops, g1 gc, linux-amd64) >>> # Problematic frame: >>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned >>> long)+0x4e >>> >>> in fact they seem worse as the test seems to always crash now. >> >> Not worse - sorry. I see 6 failures out of 119 runs of the test in >> linux-x64. I don't see a pattern as to where it fails versus passes. >> >> It doesn't fail for me locally. >> >> David >> >>> David >>> >>>> David >>>> >>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>> Hi all, >>>>> >>>>> Please review this change: >>>>> >>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>> ?? webrev: >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>> >>>>> JDK-8234624 introduced DWARF parser in SA for unwinding native >>>>> frames in jstack mixed mode. >>>>> However some error has seen intermittently after that. >>>>> >>>>> I investigated the cause of this, I found two concerns: >>>>> >>>>> ?? A: lack of buffer (.eh_frame section data) range check >>>>> ?? B: Language personality routine and Language Specific Data Area >>>>> (LSDA) are not considered >>>>> >>>>> I addd range check for .eh_frame processing, and ignore >>>>> personality routine and LSDA in this webrev. >>>>> Also I added bailout code if DWARF processing is failed due to >>>>> these concerns. >>>>> >>>>> This change has passed all tests on submit repo >>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container. 
>>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa From suenaga at oss.nttdata.com Mon Mar 16 06:51:03 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Mon, 16 Mar 2020 15:51:03 +0900 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> Message-ID: On 2020/03/16 15:43, Chris Plummer wrote: > BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job. I've pushed the change to submit repo, but I've not yet received the result. I will share you when I get job ID. Yasumasa > Chris > > On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >> Hi David, >> >> Thank you for testing it. >> >> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA. >> Could you try it? >> >> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >> >> It works well on my Fedora 31 and Oracle Linux 7.7 . >> I've pushed it to submit repo. >> >> Diff from webrev.00 is here: >> ? http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/03/16 13:12, David Holmes wrote: >>> Correction ... >>> >>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>> Hi Yasumasa, >>>>> >>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. >>>> >>>> Sorry but the crashes still exist: >>>> >>>> # >>>> # A fatal error has been detected by the Java Runtime Environment: >>>> # >>>> #? 
SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >>>> # >>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>> # Problematic frame: >>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>> >>>> in fact they seem worse as the test seems to always crash now. >>> >>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes. >>> >>> It doesn't fail for me locally. >>> >>> David >>> >>>> David >>>> >>>>> David >>>>> >>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>> Hi all, >>>>>> >>>>>> Please review this change: >>>>>> >>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>> >>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames in jstack mixed mode. >>>>>> However some error has seen intermittently after that. >>>>>> >>>>>> I investigated the cause of this, I found two concerns: >>>>>> >>>>>> ?? A: lack of buffer (.eh_frame section data) range check >>>>>> ?? B: Language personality routine and Language Specific Data Area (LSDA) are not considered >>>>>> >>>>>> I addd range check for .eh_frame processing, and ignore personality routine and LSDA in this webrev. >>>>>> Also I added bailout code if DWARF processing is failed due to these concerns. >>>>>> >>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container. 
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>
>

From serguei.spitsyn at oracle.com Mon Mar 16 06:57:13 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 15 Mar 2020 23:57:13 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
Message-ID: <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>

An HTML attachment was scrubbed...
URL:

From david.holmes at oracle.com Mon Mar 16 06:57:48 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 16:57:48 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To:
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
Message-ID:

On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
> On 2020/03/16 15:43, Chris Plummer wrote:
>> BTW, if you submit it to the submit repo, we can then go and run
>> additional internal tests (and even more builds) using that job.

Thanks for that tip Chris!

> I've pushed the change to submit repo, but I've not yet received the
> result.
> I will share you when I get job ID.

We can see the id. Just need to wait for the builds to complete before
submitting the additional tests.

David

> Yasumasa
>
>> Chris

From chris.plummer at oracle.com Mon Mar 16 06:57:50 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Sun, 15 Mar 2020 23:57:50 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To:
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
Message-ID: <516600a1-a9f9-4ad1-89dc-049bb8e8d131@oracle.com>

On 3/15/20 11:51 PM, Yasumasa Suenaga wrote:
> On 2020/03/16 15:43, Chris Plummer wrote:
>> BTW, if you submit it to the submit repo, we can then go and run
>> additional internal tests (and even more builds) using that job.
>
> I've pushed the change to submit repo, but I've not yet received the
> result.
> I will share you when I get job ID.
I see it, but I'm off to bed and am not sure what David was running, so
I'll let him take a stab at it.

Chris

>
> Yasumasa
>
>> Chris

From serguei.spitsyn at oracle.com Mon Mar 16 07:05:56 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Mar 2020 00:05:56 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
Message-ID: <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>

An HTML attachment was scrubbed...
URL:

From david.holmes at oracle.com Mon Mar 16 07:17:01 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 17:17:01 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To:
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
Message-ID: <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>

Sorry it is still crashing.
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
#
# JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
#

Same as before.

David
-----

On 16/03/2020 4:57 pm, David Holmes wrote:
> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>> On 2020/03/16 15:43, Chris Plummer wrote:
>>> BTW, if you submit it to the submit repo, we can then go and run
>>> additional internal tests (and even more builds) using that job.
>
> Thanks for that tip Chris!
>
>> I've pushed the change to submit repo, but I've not yet received the
>> result.
>> I will share you when I get job ID.
>
> We can see the id. Just need to wait for the builds to complete before
> submitting the additional tests.
>
> David
>
>> Yasumasa
>>
>>> Chris

From serguei.spitsyn at oracle.com Mon Mar 16 08:10:41 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Mar 2020 01:10:41 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port
In-Reply-To: <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
Message-ID: <79af4ca3-9c8d-ada0-2a72-88b538b01077@oracle.com>

Hi Daniil,

The update looks pretty good to me so far.
I'll make another pass tomorrow.

Thanks,
Serguei


On 3/13/20 15:05, Daniil Titov wrote:
> Hi Yasumasa, Serguei and Alex,
>
> Please review a new version of the webrev that includes the changes Yasumasa suggested.
>
>> Shutdown hook is already registered in c'tor of HotSpotAgent.
>> It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
> The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need a
> the shutdown hook for remote server being added in SALauncher. I changed it to use the lambda expression.
>
> 101 public HotSpotAgent() {
> 102 // for non-server add shutdown hook to clean-up debugger in case
> 103 // of forced exit.
For remote server, shutdown hook is added by > 104 // DebugServer. > 105 Runtime.getRuntime().addShutdownHook(new java.lang.Thread( > 106 new Runnable() { > 107 public void run() { > 108 synchronized (HotSpotAgent.this) { > 109 if (!isServer) { > 110 detach(); > 111 } > 112 } > 113 } > 114 })); > 115 } > >>> Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains >>> `exclusiveAccess.dirs=.` to avoid concurrent execution > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests. Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays. > > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/ > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 > > Thank you, > Daniil > > ?On 3/6/20, 6:15 PM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > On 2020/03/07 3:38, Daniil Titov wrote: > > Hi Yasumasa, > > > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect ( it throws an exception if parameters are incorrect) and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name . > > Ok, but I prefer to leave comment it. > > > > > SADebugDTest > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. 
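The question about portInUse and testResult being one-element arrays comes down to a Java language rule: a local variable captured by a lambda must be final or effectively final, so a flag that the lambda itself sets cannot be a plain local primitive. A minimal sketch of the usual workarounds follows. The class and method names are hypothetical and are not taken from SADebugDTest; the sketch only illustrates the rule.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CaptureDemo {
    // A field may be assigned from inside a lambda without restriction.
    private static boolean flagField;

    // Workaround 1: the array reference is final, its single element is mutable.
    public static boolean viaArray() {
        final boolean[] flag = new boolean[1];
        Runnable r = () -> flag[0] = true;
        r.run();
        return flag[0];
    }

    // Workaround 2: a mutable holder object such as AtomicBoolean.
    public static boolean viaAtomic() {
        AtomicBoolean flag = new AtomicBoolean(false);
        Runnable r = () -> flag.set(true);
        r.run();
        return flag.get();
    }

    // Workaround 3: a class variable instead of a local.
    public static boolean viaField() {
        flagField = false;
        Runnable r = () -> flagField = true;
        r.run();
        return flagField;
    }
}
```

Replacing flag[0] with a plain local boolean in viaArray() would not compile, which is why the test needs one of these forms.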
> > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. > > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. > > Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. > If you do not think this error check, test code is more simply. > > > > I will include your other suggestion in the new version of the webrev. > > Sorry, I have one more comment: > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > Shutdown hook is already registered in c'tor of HotSpotAgent. > It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > Thanks, > > Yasumasa > > > > Thanks! > > Daniil > > > > ?On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > > > Hi Daniil, > > > > > > - SALauncher.java > > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > - SADebugDTest.java > > - Please add bug ID to @bug. > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > > > > Thanks, > > > > Yasumasa > > > > > > On 2020/03/06 10:15, Daniil Titov wrote: > > > Hi Yasumasa, Serguei and Alex, > > > > > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > > > port option introduces two more options to specify RMI registry port and RMI connector host name. 
Currently, these > > > last two settings could be specified using the system properties but the system properties have the following disadvantages > > > comparing to the command line options: > > > - It?s hard to know about them: they are not listed in tool?s help. > > > - They have long names that hard to remember > > > - It is easy to mistype them in the command line and you will not get any warning about it. > > > > > > The CSR [2] was also updated and needs to be reviewed. > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > Thank you, > > > Daniil > > > > > > ?On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > > > > > Hi Daniil, > > > > > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. > > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply. > > > > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > > > But you can use same port number as RMI registry (1099). > > > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > On 2020/02/24 13:21, Daniil Titov wrote: > > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. 
> > > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > > > // delegate to the actual SA debug server. > > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > > but I would prefer to address it in a separate issue. > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > container and connecting to it with the GUI debugger. > > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > > > > Thank you, > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > From linzang at tencent.com Mon Mar 16 09:18:18 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Mon, 16 Mar 2020 09:18:18 +0000 Subject: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1) Message-ID: Just update a new path, my preliminary measure show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with parallel thread number set to 4). webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/ bug: https://bugs.openjdk.java.net/browse/JDK-8215624 CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 BRs, Lin ?> On 2020/3/2, 9:56 PM, "linzang(??)" wrote: > > Dear all, > Let me try to ease the reviewing work by some explanation :P > The patch's target is to speed up jmap -histo for heap iteration, from my experience it is necessary for large heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB heap, it does take quite a while. > And if my understanding is corrent, even the jmap -histo without "live" option does heap inspection with heap lock acquired. so it is very likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is necessary. > I think the parallel heap inspection should be applied to all kind of heap. However, consider the heap layout are different for GCs, much time is required to understand all kinds of the heap layout to make the whole change. IMO, It is not wise to have a huge patch for the whole solution at once, and it is even harder to review it. 
So I plan to implement it incrementally, the first patch (this one) is going to confirm the implementation detail of how jmap accept the new option, passes it to attachListener of the jvm process and then how to make the parallel inspection closure be generic enough to make it easy to extend to different heap layout. And also how to implement the heap inspection in specific gc's heap. This patch use G1's heap as the beginning.
> This patch actually do several things:
> 1. Add an option "parallelThreadNum=N" to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR: https://bugs.openjdk.java.net/browse/JDK-8239290)
> 2. Make a change in how Jmap passing arguments, changes in http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html, originally it pass options as separate arguments to attachListener, this patch change to that all options be compose to a single string. So the arg_count_max in attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, as discussed at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html
> 3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares the data structure (KlassInfoTable) need for every parallel worker thread, and then call do_object_iterate_parallel() which is heap specific implementation. I also added some mechanism in KlassInfoTable to support parallel iteration, such as merge().
> 4. In specific heap (G1 in this patch), create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invoke g1CollectedHeap's object_iterate_parallel().
> 5. Add related test.
> 6. it may be easy to extend this patch for other kinds of heap by creating subclass of ParHeapInspectTask and implement the do_object_iterate_parallel().
>
> Hope these info could help on code review and initiate the discussion :-)
> Thanks!
>
> BRs,
> Lin
> >On 2020/2/19, 9:40 AM, "linzang(臧琳)" wrote:.
> >
> > Re-post this RFR with correct enhancement number to make it trackable.
> > please ignore the previous wrong post. sorry for troubles.
> >
> > webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/
> > Hi bug: https://bugs.openjdk.java.net/browse/JDK-8215624
> > CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
> > --------------
> > Lin
> > >Hi Lin,
> > >
> > >Could you, please, re-post your RFR with the right enhancement number in
> > >the message subject?
> > >It will be more trackable this way.
> > >
> > >Thanks,
> > >Serguei
> > >
> > >
> > >On 2/17/20 10:29 PM, linzang(臧琳) wrote:
> > >> Dear David,
> > >> Thanks a lot!
> > >> I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/.
> > >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration.
> > >> Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap.
> > >>
> > >> Thanks,
> > >> --------------
> > >> Lin
> > >>> Hi Lin,
> > >>>
> > >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC
> > >>> worker threads, and whether it needs to be extended beyond G1.
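Items 3 and 4 of Lin's summary describe a common pattern: every worker fills its own table while walking part of the heap, and the per-worker tables are merged at the end. A minimal Java sketch of that pattern is given below. It is an illustration under stated assumptions and not the HotSpot C++ code: the class ParallelHistogram, the use of strings as class names, and the slicing of a list into equal chunks are all hypothetical stand-ins for KlassInfoTable and the heap-specific parallel iteration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelHistogram {
    // Counts occurrences of each "class name" using the given number of workers (must be > 0).
    public static Map<String, Long> count(List<String> objects, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Map<String, Long>>> parts = new ArrayList<>();
            int chunk = (objects.size() + workers - 1) / workers;
            for (int w = 0; w < workers; w++) {
                int from = Math.min(objects.size(), w * chunk);
                int to = Math.min(objects.size(), from + chunk);
                List<String> slice = objects.subList(from, to);
                // Each worker owns a private table, so no locking is needed while counting.
                parts.add(pool.submit(() -> {
                    Map<String, Long> local = new HashMap<>();
                    for (String name : slice) {
                        local.merge(name, 1L, Long::sum);
                    }
                    return local;
                }));
            }
            // Merge step: fold every per-worker table into the final result.
            Map<String, Long> total = new HashMap<>();
            for (Future<Map<String, Long>> part : parts) {
                Map<String, Long> local;
                try {
                    local = part.get();
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException("worker failed", e);
                }
                local.forEach((name, n) -> total.merge(name, n, Long::sum));
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Keeping the tables private until the merge is what makes the counting phase scale with the number of workers; the merge itself is serial but only touches one entry per distinct class per worker.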
> > >>> > > >>> I happened to spot one nit when browsing: > > >>> > > >>> src/hotspot/share/gc/shared/collectedHeap.hpp > > >>> > > >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, > > >>> + BoolObjectClosure* filter, > > >>> + size_t* missed_count, > > >>> + size_t thread_num) { > > >>> + return NULL; > > >>> > > >>> s/NULL/false/ > > >>> > > >>> Cheers, > > >>> David > > >>> > > >>> On 18/02/2020 2:15 pm, linzang(??) wrote: > > >>>> Dear All, > > >>>> May I ask your help to review the follow changes: > > >>>> webrev: > > >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > > >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > > >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > > >>>> This patch enable parallel heap inspection of G1 for jmap histo. > > >>>> my simple test shown it can speed up 2x of jmap -histo with > > >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform. > > >>>> > > >>>> ------------------------------------------------------------------------ > > >>>> BRs, > > >>>> Lin > > >> > > > > From suenaga at oss.nttdata.com Mon Mar 16 09:20:28 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Mon, 16 Mar 2020 18:20:28 +0900 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> Message-ID: <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> Hi David, I missed loop condition, so I fixed it and pushed to submit repo. Could you try again? 
  http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23

webrev is here:

  http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/

Thanks a lot!

Yasumasa


On 2020/03/16 16:17, David Holmes wrote:
> Sorry it is still crashing.
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
> #
> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
> # Problematic frame:
> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
> #
>
> Same as before.
>
> David
> -----
>
> On 16/03/2020 4:57 pm, David Holmes wrote:
>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>
>> Thanks for that tip Chris!
>>
>>> I've pushed the change to submit repo, but I've not yet received the result.
>>> I will share it with you when I get the job ID.
>>
>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>
>> David
>>
>>> Yasumasa
>>>
>>>> Chris
>>>>
>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>> Thank you for testing it.
>>>>>
>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>> Could you try it?
>>>>>
>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>
>>>>> It works well on my Fedora 31 and Oracle Linux 7.7.
>>>>> I've pushed it to submit repo.
>>>>>
>>>>> Diff from webrev.00 is here:
>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>> Correction ...
>>>>>>
>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>
>>>>>>> Sorry but the crashes still exist:
>>>>>>>
>>>>>>> #
>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>> #
>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>> #
>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>> # Problematic frame:
>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>
>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>
>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>
>>>>>> It doesn't fail for me locally.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review this change:
>>>>>>>>>
>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>
>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>
>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>
>>>>>>>>>   A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>   B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>
>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>>>>>>> Also I added bailout code if DWARF processing fails due to these concerns.
>>>>>>>>>
>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>
>>>>

From david.holmes at oracle.com Mon Mar 16 11:46:41 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 21:46:41 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
Message-ID:

On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
> Hi David,
>
> I missed loop condition, so I fixed it and pushed to submit repo.
> Could you try again?
>
>   http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>
> webrev is here:
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/

Test job resubmitted. Will advise results if it completes before I go to bed :)

David

>
> Thanks a lot!
>
> Yasumasa
>
>
> On 2020/03/16 16:17, David Holmes wrote:
>> Sorry it is still crashing.

From david.holmes at oracle.com Mon Mar 16 12:01:27 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 22:01:27 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To:
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
Message-ID:

On 16/03/2020 9:46 pm, David Holmes wrote:
> Test job resubmitted. Will advise results if it completes before I go to
> bed :)

Seems to have passed okay.

David

From suenaga at oss.nttdata.com Mon Mar 16 12:03:51 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 16 Mar 2020 21:03:51 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To:
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
Message-ID:

Thank you so much, David!

Yasumasa

From suenaga at oss.nttdata.com Mon Mar 16 12:07:14 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 16 Mar 2020 21:07:14 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To:
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
Message-ID: <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>

Hi all,

This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
So please review it:

  JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
  webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/

Thanks,

Yasumasa

From chris.plummer at oracle.com Mon Mar 16 18:26:37 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Mar 2020 11:26:37 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available
In-Reply-To:
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
 <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
 <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Mon Mar 16 18:43:40 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Mar 2020 11:43:40 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available
In-Reply-To:
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
 <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
 <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com>
Message-ID: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>

An HTML attachment was scrubbed...
URL:

From daniil.x.titov at oracle.com Mon Mar 16 19:00:26 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 16 Mar 2020 12:00:26 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
Message-ID: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>

Please review the change [1] that fixes the intermittent failure of the test.

The problem here is that if the RMI port is in use, then the test keeps waiting for
"jstatd started (bound to " to appear in the process output, which in this case never happens.

 at java.util.concurrent.CountDownLatch.await(java.base@15-internal/CountDownLatch.java:232)
 at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
 at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
 at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
 at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
 at jdk.test.lib.thread.XRun.run(XRun.java:40)
 at java.lang.Thread.run(java.base@15-internal/Thread.java:832)
 at jdk.test.lib.thread.TestThread.run(TestThread.java:123)

Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress.

[1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
[2] https://bugs.openjdk.java.net/browse/JDK-8240711

Thank you,
Daniil

From igor.ignatyev at oracle.com Mon Mar 16 19:13:00 2020
From: igor.ignatyev at oracle.com (Igor Ignatev)
Date: Mon, 16 Mar 2020 12:13:00 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available
In-Reply-To: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>
References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>
Message-ID: <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com>

> On Mar 16, 2020, at 11:43 AM, "serguei.spitsyn at oracle.com" wrote:
>
>> On 3/16/20 11:26, Chris Plummer wrote:
>> I had to make another change.
TestMutuallyExclusivePlatformPredicates.java failed when I ran tier 3. I had fixed it a long while back due to Platform.shouldSAAttach() being removed, but there were more changes to Platform.java after that that I didn't account for. isRoot() was added and canPtraceAttrachLinux() was made public. So this is what the diff looks like now: >> >> --- a/test/hotspot/jtreg/testlibrary_tests/TestMutuallyExclusivePlatformPredicates.java >> +++ b/test/hotspot/jtreg/testlibrary_tests/TestMutuallyExclusivePlatformPredicates.java >> @@ -51,9 +51,9 @@ >> VM_TYPE("isClient", "isServer", "isMinimal", "isZero", "isEmbedded"), >> MODE("isInt", "isMixed", "isComp"), >> IGNORED("isEmulatedClient", "isDebugBuild", "isFastDebugBuild", >> - "isSlowDebugBuild", "hasSA", "shouldSAAttach", "isTieredSupported", >> + "isSlowDebugBuild", "hasSA", "canPtraceAttachLinux", "isTieredSupported", >> "areCustomLoadersSupportedForCDS", "isDefaultCDSArchiveSupported", >> - "isSignedOSX"); >> + "isSignedOSX", "isRoot"); >> >> However, I'm thinking maybe I should just move canPtraceAttachLinux() to SATestUtils.java since that's the only user, and it is an SA specific API. What do you think? > > The approach to localize canPtraceAttachLinux() in SATestUtils.java sounds right to me if it is an SA specific API. > +1 ? Igor > Thanks, > Serguei > >> >> Chris >> >>> On 3/15/20 10:22 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Chris, >>> >>> Looks good. >>> Thank you for update! >>> >>> Thanks, >>> Serguei >>> >>> >>>> On 3/15/20 17:47, Chris Plummer wrote: >>>> I changed them all to "SA Attach" and grepped to make sure there are no other occurrences of "SA attach". >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>>> On 3/15/20 4:49 PM, Igor Ignatyev wrote: >>>>> Hi Chris, >>>>> >>>>> looks good, thanks! >>>>> >>>>> one minor nit, in SATestUtils::skipIfCannotAttach, you have two exception messages which start with 'SA attach not expected to work.', and one w/ 'SA Attach not expected to work.' 
>>>>> (w/ Attach instead of attach), it'd be nicer to have them uniform.
>>>>>
>>>>> Cheers,
>>>>> -- Igor
>>>>>
>>>>>> On Mar 15, 2020, at 4:35 PM, Chris Plummer wrote:
>>>>>>
>>>>>> Hi Igor,
>>>>>>
>>>>>> Thanks for the review. Here's an updated webrev with all of the suggestions from you and Serguei:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.01/index.html
>>>>>>
>>>>>> Also some comments inline below.
>>>>>>
>>>>>>> On 3/13/20 9:26 AM, Igor Ignatyev wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> overall looks good to me, a few comments though:
>>>>>>> 1. since you removed vm.hasSAandCanAttach from VMProps, you also need to remove it from all TEST.ROOT files which mention it (test/jdk/TEST.ROOT and test/hotspot/jtreg/TEST.ROOT) so people won't be confused by an undefined property and jtreg will be able to properly report invalid usages of it, if any.
>>>>>> Ok, but it's unclear to me what requires.properties is even for, and what the impact of extra or missing properties is. What kind of test would catch these errors?
>>>>> jtreg uses 'requires.properties' as a list of extra variables for @requires expressions. If a test uses a name which isn't in 'requires.properties' (or known to jtreg), jtreg won't execute that test and will set its status to Error w/ an 'invalid name ...' message.
>>>>>
>>>>>>>
>>>>>>> 2. in SATestUtils::canAddPrivileges, could you please add some meaningful message to the RuntimeException at L#102?
>>>>>>>
>>>>>> Ok.
>>>>>>
>>>>>> throw new RuntimeException("sudo process interrupted", e);
>>>>>>
>>>>>>> 3. the SATestUtils::checkAttachOk method name is somewhat misleading (hence you had to put a comment every time you used it), I'd recommend renaming it to something like skipIfCannotAttach().
>>>>>> Ok, but I still left the comment in place.
>>>>>>>
>>>>>>> 4. in SATestUtils::checkAttachOk's javadoc, it would be better to use an @throws tag like:
>>>>>>>> +    /**
>>>>>>>> +     * Checks if SA Attach is expected to work.
>>>>>>>> +     * @throws SkippedException if SA Attach is not expected to work.
>>>>>>>> +     */
>>>>>>>
>>>>>> Ok.
>>>>>>> 5. it also might make sense to catch IOException within SATestUtils::checkAttachOk and throw it as Error or RuntimeException.
>>>>>>>
>>>>>> Ok.
>>>>>>> I've briefly looked at all the changed tests and they look good.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -- Igor
>>>>>>>
>>>>>>>> On Mar 12, 2020, at 11:06 PM, Chris Plummer wrote:
>>>>>>>>
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> Thanks for the review!
>>>>>>>>
>>>>>>>> Can I get one more reviewer please?
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/12/20 12:06 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> On 3/12/20 00:03, Chris Plummer wrote:
>>>>>>>>>> Hi Serguei,
>>>>>>>>>>
>>>>>>>>>> That check used to be in Platform.shouldSAAttach(), which essentially was moved to SATestUtils.checkAttachOk() and reworked some. It was necessary in Platform.shouldSAAttach() since that was used to evaluate vm.hasSAandCanAttach (which is now gone). When I moved everything to SATestUtils.checkAttachOk(), I recall thinking it wasn't really necessary since all tests that call it should have @requires vm.hasSA, but left it in anyway just to be extra safe. I'm still inclined to just leave it in, but would not be opposed to removing it.
>>>>>>>>>
>>>>>>>>> I agree, it is safer to keep it, at least for now.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 3/11/20 11:20 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>> I've made another pass today.
>>>>>>>>>>> It looks good to me.
>>>>>>>>>>>
>>>>>>>>>>> I have just one minor question.
>>>>>>>>>>>
>>>>>>>>>>> There is some overlap between the requires vm.hasSA check and checkAttachOk:
>>>>>>>>>>> +    public static void checkAttachOk() throws IOException {
>>>>>>>>>>> +        if (!Platform.hasSA()) {
>>>>>>>>>>> +            throw new SkippedException("SA not supported.");
>>>>>>>>>>> +        }
>>>>>>>>>>> In the former case, the test is not run, but in the latter the SkippedException is thrown.
>>>>>>>>>>> As far as I can see, all tests with the checkAttachOk call use requires vm.hasSA as well.
>>>>>>>>>>> It may be that the first check "if (!Platform.hasSA())" in checkAttachOk is redundant.
>>>>>>>>>>> It is okay and safer in general, but it creates a little confusion.
>>>>>>>>>>> I'm okay if you don't do anything with this but wanted to know your view.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>> On 3/10/20 18:57, Chris Plummer wrote:
>>>>>>>>>>>> On 3/10/20 6:07 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Overall, this looks like the right direction to me, although it is not easy to verify all the details.
>>>>>>>>>>>> Yes, there are a lot of tests with quite a few different types of changes. I did a lot of testing and verified that when the tests pass, they pass for the right reasons (really ran the test, skipped due to lack of privileges, or skipped due to running signed on OSX 10.14 or later). I also verified locally running as root, running with a cached sudo, and running without sudo.
>>>>>>>>>>>>> I'll make another pass tomorrow.
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>> A couple of quick nits so far:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html
>>>>>>>>>>>>>  import jdk.test.lib.Utils;
>>>>>>>>>>>>> -import jdk.test.lib.Asserts;
>>>>>>>>>>>>> +import jdk.test.lib.SA.SATestUtils;
>>>>>>>>>>>>> Need to swap these imports.
>>>>>>>>>>>>>
>>>>>>>>>>>> Ok
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html
>>>>>>>>>>>>> 48 if (SATestUtils.needsPrivileges()) {
>>>>>>>>>>>>> 49 cmdStringList = SATestUtils.addPrivileges(cmdStringList);
>>>>>>>>>>>>> The method calls are local, so the class name can be omitted in the method names:
>>>>>>>>>>>>> SATestUtils.needsPrivileges and SATestUtils.addPrivileges.
>>>>>>>>>>>> Ok
>>>>>>>>>>>>>
>>>>>>>>>>>>> 94 try {
>>>>>>>>>>>>> 95 if (echoProcess.waitFor(60, TimeUnit.SECONDS) == false) {
>>>>>>>>>>>>> 96 // Due to using the "-n" option, sudo should complete almost immediately. 60 seconds
>>>>>>>>>>>>> 97 // is more than generous. If it didn't complete in that time, something went very wrong.
>>>>>>>>>>>>> 98 echoProcess.destroyForcibly();
>>>>>>>>>>>>> 99 throw new RuntimeException("Timed out waiting for sudo to execute.");
>>>>>>>>>>>>> 100 }
>>>>>>>>>>>>> 101 } catch (InterruptedException e) {
>>>>>>>>>>>>> 102 throw new RuntimeException(e);
>>>>>>>>>>>>> 103 }
>>>>>>>>>>>>> Lines 101 and 103 are misaligned.
>>>>>>>>>>>> Ok.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/9/20 19:29, Chris Plummer wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please help review the following:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238268
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll try to give enough background first to make it easier to understand the changes. On OSX you must run SA tests that attach to a live process as root or using sudo. For example:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Whether running as root or under sudo, the check to allow the test to run is done with:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     private static boolean canAttachOSX() {
>>>>>>>>>>>>>>         return userName.equals("root");
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any test using "@requires vm.hasSAandCanAttach" must pass this check via Platform.shouldSAAttach(), which for OSX returns:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     return canAttachOSX() && !isSignedOSX();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So if running as root the "@requires vm.hasSAandCanAttach" passes, otherwise it does not. However, using a root login to run tests is not very desirable, nor is issuing a "sudo make run-test" (any created file ends up with root ownership). Because of this, support was previously added for just running the attaching process using sudo, not the entire test.
>>>>>>>>>>>>>> This was only done for the 20 or so tests that use ClhsdbLauncher. These tests use "@requires vm.hasSA", and then while running the test will do a "sudo" check if canAttachOSX() returns false:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     if (!Platform.shouldSAAttach()) {
>>>>>>>>>>>>>>         if (Platform.isOSX()) {
>>>>>>>>>>>>>>             if (Platform.isSignedOSX()) {
>>>>>>>>>>>>>>                 throw new SkippedException("SA attach not expected to work. JDK is signed.");
>>>>>>>>>>>>>>             } else if (SATestUtils.canAddPrivileges()) {
>>>>>>>>>>>>>>                 needPrivileges = true;
>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         if (!needPrivileges) {
>>>>>>>>>>>>>>             // Skip the test if we don't have enough permissions to attach
>>>>>>>>>>>>>>             // and cannot add privileges.
>>>>>>>>>>>>>>             throw new SkippedException(
>>>>>>>>>>>>>>                 "SA attach not expected to work. Insufficient privileges.");
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So basically it does a runtime check of vm.hasSAandCanAttach, and if it fails then checks if running with sudo will work. This allows for either a passwordless sudo to be used when running clhsdb, or for the user to be prompted for the sudo password (note I've removed support for the latter with my changes).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That brings us to the CR that is being fixed. ClhsdbLauncher tests support sudo and will therefore run with our CI testing on OSX, but the 25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and therefore are never run with our CI OSX testing. The changes in this webrev fix that.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There are two possible approaches to the fix. One is having the check for sudo be done as part of the vm.hasSAandCanAttach evaluation. The other approach is to do the check in the test at runtime, similar to how ClhsdbLauncher currently does. This would mean just using "@requires vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach".
>>>>>>>>>>>>>> I chose the latter because there is an advantage to throwing SkippedException rather than just silently skipping the test using @requires. The advantage is that mdash tells you how many tests were skipped, and when you hover over the reason you can see the SkippedException message, which will differentiate between reasons like the JDK was signed or there are insufficient privileges. If all the checking was done by the vm.hasSAandCanAttach evaluation, you would not know why the test wasn't run.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The "support" related changes made are all in the following 3 files. The rest of the changes are in the tests:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> test/jtreg-ext/requires/VMProps.java
>>>>>>>>>>>>>> test/lib/jdk/test/lib/Platform.java
>>>>>>>>>>>>>> test/lib/jdk/test/lib/SA/SATestUtils.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You'll notice that one change I made to the sudo support in SATestUtils.canAddPrivileges() is to make sudo non-interactive, which means no password prompt. So that means either the user does not require a password, or the credentials have been cached. Otherwise the sudo check will fail. On most platforms if you execute a sudo command, the credentials are cached for 5 minutes. So if your user is not set up for passwordless sudo, then a sudo command can be issued before running the tests, and will likely remain cached until the test is run. The reason for using passwordless sudo is that prompting in the middle of running tests can be confusing (you usually walk away once launching the tests and miss the prompt anyway), and it avoids unnecessary delays in automated testing due to waiting for the password prompt to time out (it used to wait 1 minute).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There are essentially 3 types of tests that SA Attach to a process, each needing a slightly different fix:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Tests that directly launch a jdk.hotspot.agent class, such as TestClassDump.java.
>>>>>>>>>>>>>> They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if needed. They also need to switch from using hasSAandCanAttach to using hasSA.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. Tests that launch command line tools such as jhsdb. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.createProcessBuilder() to create a process that will be launched using sudo if necessary. They also need to switch from using hasSAandCanAttach to using hasSA.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3. Tests that use ClhsdbLauncher. They already use hasSA instead of hasSAandCanAttach, and rely on ClhsdbLauncher to check at runtime whether attaching will work, so for the most part all of these tests are unchanged. ClhsdbLauncher was modified to take advantage of the new SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk() APIs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some tests required special handling:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java
>>>>>>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - These two tests SA Attach to a core file, not to a process, so only need hasSA,
>>>>>>>>>>>>>>   not hasSAandCanAttach. No other changes were needed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - The output should never be null. If the test was skipped due to lack of privileges, you
>>>>>>>>>>>>>>   would never get to this section of the test.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestIntConstant.java
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestType.java
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestUniverse.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - These are ClhsdbLauncher tests, so they should have been using hasSA instead of
>>>>>>>>>>>>>>   hasSAandCanAttach in the first place. No other changes were needed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - These tests used to "@require mac" but seem to run fine on OSX, so I removed this requirement.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> test/jdk/sun/tools/jhsdb/BasicLauncherTest.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - This test had a runtime check to not run on OSX due to not having core file stack
>>>>>>>>>>>>>>   walking support. However, this test always attaches to a process, not a core file,
>>>>>>>>>>>>>>   and seems to run just fine on OSX.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> test/jdk/sun/tools/jstack/DeadlockDetectionTest.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - I changed the test to throw a SkippedException if it gets the unexpected error code
>>>>>>>>>>>>>>   rather than just println.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And a few other miscellaneous changes not already covered:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> test/lib/jdk/test/lib/Platform.java
>>>>>>>>>>>>>> - Made canPtraceAttachLinux() public so it can be called from SATestUtils.
>>>>>>>>>>>>>> - vm.hasSAandCanAttach is now gone.
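The non-interactive sudo probe described in this thread (run a trivial command under `sudo -n` and give up after a timeout) can be sketched as follows. This is a minimal illustration, not the code from the webrev: the class name `SudoProbe`, the helper `succeedsWithin`, and the exact command line are assumptions; only the `-n` option and the 60-second wait come from the discussion.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class SudoProbe {
    /**
     * Hypothetical helper: runs the command and returns true only if it
     * exits with status 0 within the timeout. A command that cannot be
     * started at all counts as a failure.
     */
    static boolean succeedsWithin(List<String> cmd, long timeoutSeconds) {
        Process p;
        try {
            p = new ProcessBuilder(cmd)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
        } catch (IOException e) {
            return false; // command missing or not executable
        }
        try {
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                p.destroyForcibly();
                throw new RuntimeException("Timed out waiting for " + cmd);
            }
            return p.exitValue() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("process interrupted", e);
        }
    }

    /** With "-n", sudo fails instead of prompting when a password would be needed. */
    static boolean canAddPrivileges() {
        // The echo command is an assumed placeholder for "something harmless".
        return succeedsWithin(List.of("sudo", "-n", "/bin/echo", "ok"), 60);
    }

    public static void main(String[] args) {
        System.out.println("sudo usable without a prompt: " + canAddPrivileges());
    }
}
```

A test would call something like `canAddPrivileges()` once and throw a skip exception with an "insufficient privileges" message when it returns false, which is what makes the skip reason visible in the test report.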
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris

From alexey.menkov at oracle.com Mon Mar 16 23:02:08 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 16 Mar 2020 16:02:08 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
In-Reply-To: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
Message-ID: <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>

Hi Daniil,

Looks like the test is supposed to handle the "port in use" issue (see lines 103-114).
I suppose in the "port in use" case jstatd exits, but ProcessTools.startProcess() continues to wait for the "jstatd started" message.

--alex

On 03/16/2020 12:00, Daniil Titov wrote:
> Please review the change [1] that fixes the intermittent failure of the test.
>
> The problem here is that if the RMI port is in use, then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case it doesn't happen.
>
>     at java.util.concurrent.CountDownLatch.await(java.base@15-internal/CountDownLatch.java:232)
>     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
>     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
>     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
>     at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
>     at jdk.test.lib.thread.XRun.run(XRun.java:40)
>     at java.lang.Thread.run(java.base@15-internal/Thread.java:832)
>     at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
>
> Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress.
>
> [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
> [2] https://bugs.openjdk.java.net/browse/JDK-8240711
>
> Thank you,
> Daniil

From daniil.x.titov at oracle.com Mon Mar 16 23:13:18 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 16 Mar 2020 16:13:18 -0700
Subject: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
In-Reply-To: <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
Message-ID:

Hi Alex,

Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use" case, but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.

Since there are multiple tests in the sun/tools/jstatd/* folder that use this class, different ports might be subject to the "port in use" error, and such a case is hard to reproduce, I found it safer to leave the original code and just augment it with what was missing for this specific case rather than completely replacing it.

Best regards,
Daniil

On 3/16/20, 4:02 PM, "Alex Menkov" wrote:

    Hi Daniil,

    Looks like the test is supposed to handle the "port in use" issue (see lines 103-114).
    I suppose in the "port in use" case jstatd exits, but ProcessTools.startProcess() continues to wait for the "jstatd started" message.

    --alex

    On 03/16/2020 12:00, Daniil Titov wrote:
    > Please review the change [1] that fixes the intermittent failure of the test.
    >
    > The problem here is that if the RMI port is in use, then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case it doesn't happen.
    >
    >     at java.util.concurrent.CountDownLatch.await(java.base@15-internal/CountDownLatch.java:232)
    >     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
    >     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
    >     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
    >     at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
    >     at jdk.test.lib.thread.XRun.run(XRun.java:40)
    >     at java.lang.Thread.run(java.base@15-internal/Thread.java:832)
    >     at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
    >
    > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress.
    >
    > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
    > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
    >
    > Thank you,
    > Daniil

From daniil.x.titov at oracle.com Mon Mar 16 23:17:00 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 16 Mar 2020 16:17:00 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
In-Reply-To: <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
Message-ID: <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>

Resending with the corrected subject ...

Hi Alex,

Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use" case, but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.

Since there are multiple tests in the sun/tools/jstatd/* folder that use this class, different ports might be subject to the "port in use" error, and such a case is hard to reproduce, I found it safer to leave the original code and just augment it with what was missing for this specific case rather than completely replacing it.
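One way to avoid waiting forever for the startup message is to treat the "port in use" error as a second terminal condition when scanning the process output. The following is a minimal sketch of that idea, assuming the output is available as lines of text; `StartupWatcher`, `Outcome`, and `classify` are hypothetical names, and this is not the change in webrev.01. The two marker strings are the ones quoted in this thread.

```java
import java.util.List;

public class StartupWatcher {
    /** What the scanned output tells us about the launched process. */
    enum Outcome { STARTED, PORT_IN_USE, UNKNOWN }

    /**
     * Scans output lines in order and stops at the first decisive one,
     * so a failed bind ends the wait just like a successful start does.
     */
    static Outcome classify(Iterable<String> lines) {
        for (String line : lines) {
            if (line.contains("jstatd started (bound to ")) {
                return Outcome.STARTED;
            }
            if (line.contains("Port already in use")) {
                return Outcome.PORT_IN_USE;
            }
        }
        return Outcome.UNKNOWN; // keep waiting, or give up on a timeout
    }

    public static void main(String[] args) {
        // Sample lines are made up for illustration.
        List<String> failed = List.of(
                "Could not create remote object",
                "java.rmi.server.ExportException: Port already in use: 1099");
        System.out.println(classify(failed));
    }
}
```

A caller that gets `PORT_IN_USE` can pick another port and retry, while `UNKNOWN` means neither marker has been seen yet.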

Best regards,
Daniil

On 3/16/20, 4:02 PM, "Alex Menkov" wrote:

    Hi Daniil,

    Looks like the test is supposed to handle the "port in use" issue (see lines 103-114).
    I suppose in the "port in use" case jstatd exits, but ProcessTools.startProcess() continues to wait for the "jstatd started" message.

    --alex

    On 03/16/2020 12:00, Daniil Titov wrote:
    > Please review the change [1] that fixes the intermittent failure of the test.
    >
    > The problem here is that if the RMI port is in use, then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case it doesn't happen.
    >
    >     at java.util.concurrent.CountDownLatch.await(java.base@15-internal/CountDownLatch.java:232)
    >     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
    >     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
    >     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
    >     at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
    >     at jdk.test.lib.thread.XRun.run(XRun.java:40)
    >     at java.lang.Thread.run(java.base@15-internal/Thread.java:832)
    >     at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
    >
    > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress.
    >
    > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
    > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
    >
    > Thank you,
    > Daniil

From alexey.menkov at oracle.com Mon Mar 16 23:47:05 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 16 Mar 2020 16:47:05 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
In-Reply-To: <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
Message-ID: <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>

I don't agree.
The code handles exactly the same "port in use" case for the same tool. So it either works or it doesn't.
And having 2 code blocks which are supposed to do the same thing makes the code messy.
BTW did you test the change (I mean craft the test to get the "port in use" error)?

--alex

On 03/16/2020 16:17, Daniil Titov wrote:
> Resending with the corrected subject ...
>
> Hi Alex,
>
> Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use"
> case, but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.
>
> Since there are multiple tests in the sun/tools/jstatd/* folder that use this class, different ports
> might be subject to the "port in use" error, and such a case is hard to reproduce,
> I found it safer to leave the original code and just augment it with what was missing for this specific
> case rather than completely replacing it.
>
> Best regards,
> Daniil
>
> On 3/16/20, 4:02 PM, "Alex Menkov" wrote:
>
>     Hi Daniil,
>
>     Looks like the test is supposed to handle the "port in use" issue (see lines 103-114).
>     I suppose in the "port in use" case jstatd exits, but
>     ProcessTools.startProcess() continues to wait for the "jstatd started" message.
>
>     --alex
>
>     On 03/16/2020 12:00, Daniil Titov wrote:
>     > Please review the change [1] that fixes the intermittent failure of the test.
>     >
>     > The problem here is that if the RMI port is in use, then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case it doesn't happen.
>     >
>     >     at java.util.concurrent.CountDownLatch.await(java.base@15-internal/CountDownLatch.java:232)
>     >     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
>     >     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
>     >     at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
>     >     at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
>     >     at jdk.test.lib.thread.XRun.run(XRun.java:40)
>     >     at java.lang.Thread.run(java.base@15-internal/Thread.java:832)
>     >     at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
>     >
>     > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress.
>     >
>     > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
>     > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
>     >
>     > Thank you,
>     > Daniil

From chris.plummer at oracle.com Tue Mar 17 00:11:12 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Mar 2020 17:11:12 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available
In-Reply-To: <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com>
References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com> <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com>
Message-ID: <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com>

An HTML attachment was scrubbed...

From igor.ignatyev at oracle.com Tue Mar 17 00:14:34 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 16 Mar 2020 17:14:34 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available
In-Reply-To: <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com>
References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com> <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com> <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com>
Message-ID: <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com>

Hi Chris,

Does canPtraceAttachLinux have to be public? Otherwise, looks good to me.

-- Igor

> On Mar 16, 2020, at 5:11 PM, Chris Plummer wrote:
>
> Hi Serguei and Igor,
>
> New webrev:
>
> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.02/index.html
>
> The only files changed were Platform.java and SATestUtils.java.
>
> - Moved canPtraceAttachLinux() from Platform.java to SATestUtils.java
> - Changed the Platform.canPtraceAttachLinux() reference in SATestUtils.java to just be canPtraceAttachLinux().
> - Had to change the userName.equals("root") reference in canPtraceAttachLinux() to Platform.isRoot(). Probably should have been that way in the first place.
> - Made some adjustments to the imports
>
> thanks,
>
> Chris
>
> On 3/16/20 12:13 PM, Igor Ignatev wrote:
>>
>>> On Mar 16, 2020, at 11:43 AM, "serguei.spitsyn at oracle.com" wrote:
>>>
>>> On 3/16/20 11:26, Chris Plummer wrote:
>>>> I had to make another change. TestMutuallyExclusivePlatformPredicates.java failed when I ran tier 3. I had fixed it a long while back due to Platform.shouldSAAttach() being removed, but there were more changes to Platform.java after that which I didn't account for. isRoot() was added and canPtraceAttachLinux() was made public.
>>>> So this is what the diff looks like now:
>>>>
>>>> --- a/test/hotspot/jtreg/testlibrary_tests/TestMutuallyExclusivePlatformPredicates.java
>>>> +++ b/test/hotspot/jtreg/testlibrary_tests/TestMutuallyExclusivePlatformPredicates.java
>>>> @@ -51,9 +51,9 @@
>>>>      VM_TYPE("isClient", "isServer", "isMinimal", "isZero", "isEmbedded"),
>>>>      MODE("isInt", "isMixed", "isComp"),
>>>>      IGNORED("isEmulatedClient", "isDebugBuild", "isFastDebugBuild",
>>>> -            "isSlowDebugBuild", "hasSA", "shouldSAAttach", "isTieredSupported",
>>>> +            "isSlowDebugBuild", "hasSA", "canPtraceAttachLinux", "isTieredSupported",
>>>>              "areCustomLoadersSupportedForCDS", "isDefaultCDSArchiveSupported",
>>>> -            "isSignedOSX");
>>>> +            "isSignedOSX", "isRoot");
>>>>
>>>> However, I'm thinking maybe I should just move canPtraceAttachLinux() to SATestUtils.java since that's the only user, and it is an SA specific API. What do you think?
>>>
>>> The approach to localize canPtraceAttachLinux() in SATestUtils.java sounds right to me if it is an SA specific API.
>>>
>> +1
>> -- Igor
>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> Chris
>>>>
>>>> On 3/15/20 10:22 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Looks good.
>>>>> Thank you for the update!
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>> On 3/15/20 17:47, Chris Plummer wrote:
>>>>>> I changed them all to "SA Attach" and grepped to make sure there are no other occurrences of "SA attach".
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 3/15/20 4:49 PM, Igor Ignatyev wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> looks good, thanks!
>>>>>>>
>>>>>>> one minor nit, in SATestUtils::skipIfCannotAttach, you have two exception messages which start with 'SA attach not expected to work.', and one w/ 'SA Attach not expected to work.' (w/ Attach instead of attach), it'd be nicer to have them uniform.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> -- Igor
>>>>>>>
>>>>>>>> On Mar 15, 2020, at 4:35 PM, Chris Plummer wrote:
>>>>>>>>
>>>>>>>> Hi Igor,
>>>>>>>>
>>>>>>>> Thanks for the review. Here's an updated webrev with all of the suggestions from you and Serguei:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.01/index.html
>>>>>>>>
>>>>>>>> Also some comments inline below.
>>>>>>>>
>>>>>>>> On 3/13/20 9:26 AM, Igor Ignatyev wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> overall looks good to me, a few comments though:
>>>>>>>>> 1. since you removed vm.hasSAandCanAttach from VMProps, you also need to remove it from all TEST.ROOT files which mention it (test/jdk/TEST.ROOT and test/hotspot/jtreg/TEST.ROOT) so people won't be confused by an undefined property and jtreg will be able to properly report invalid usages of it, if any.
>>>>>>>> Ok, but it's unclear to me what requires.properties is even for, and what the impact of extra or missing properties is. What kind of test would catch these errors?
>>>>>>> jtreg uses 'requires.properties' as a list of extra variables for @requires expressions. If a test uses a name which isn't in 'requires.properties' (or known to jtreg), jtreg won't execute that test and will set its status to Error w/ an 'invalid name ...' message.
>>>>>>>
>>>>>>>>>
>>>>>>>>> 2. in SATestUtils::canAddPrivileges, could you please add some meaningful message to the RuntimeException at L#102?
>>>>>>>>>
>>>>>>>> Ok.
>>>>>>>>
>>>>>>>> throw new RuntimeException("sudo process interrupted", e);
>>>>>>>>
>>>>>>>>> 3. the SATestUtils::checkAttachOk method name is somewhat misleading (hence you had to put a comment every time you used it), I'd recommend renaming it to something like skipIfCannotAttach().
>>>>>>>> Ok, but I still left the comment in place.
>>>>>>>>>
>>>>>>>>> 4. in SATestUtils::checkAttachOk's javadoc, it would be better to use an @throws tag like:
>>>>>>>>>> +    /**
>>>>>>>>>> +     * Checks if SA Attach is expected to work.
>>>>>>>>>> +     * @throws SkippedException if SA Attach is not expected to work.
>>>>>>>>>> +     */
>>>>>>>>>
>>>>>>>> Ok.
>>>>>>>>> 5. it also might make sense to catch IOException within SATestUtils::checkAttachOk and throw it as Error or RuntimeException.
>>>>>>>>>
>>>>>>>> Ok.
>>>>>>>>> I've briefly looked at all the changed tests and they look good.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -- Igor
>>>>>>>>>
>>>>>>>>>> On Mar 12, 2020, at 11:06 PM, Chris Plummer wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Serguei,
>>>>>>>>>>
>>>>>>>>>> Thanks for the review!
>>>>>>>>>>
>>>>>>>>>> Can I get one more reviewer please?
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 3/12/20 12:06 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>> On 3/12/20 00:03, Chris Plummer wrote:
>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>
>>>>>>>>>>>> That check used to be in Platform.shouldSAAttach(), which essentially was moved to SATestUtils.checkAttachOk() and reworked some. It was necessary in Platform.shouldSAAttach() since that was used to evaluate vm.hasSAandCanAttach (which is now gone). When I moved everything to SATestUtils.checkAttachOk(), I recall thinking it wasn't really necessary since all tests that call it should have @requires vm.hasSA, but left it in anyway just to be extra safe. I'm still inclined to just leave it in, but would not be opposed to removing it.
>>>>>>>>>>>
>>>>>>>>>>> I agree, it is safer to keep it, at least for now.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/11/20 11:20 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've made another pass today.
>>>>>>>>>>>>> It looks good to me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have just one minor question.
>>>>>>>>>>>>> >>>>>>>>>>>>> There is some overlap between the requires vm.hasSA check and checkAttachOk: >>>>>>>>>>>>> + public static void checkAttachOk() throws IOException { >>>>>>>>>>>>> + if (!Platform.hasSA()) { >>>>>>>>>>>>> + throw new SkippedException("SA not supported."); >>>>>>>>>>>>> + } >>>>>>>>>>>>> In the former case, the test is not run but in the latter the SkippedException is thrown. >>>>>>>>>>>>> As I see, all tests with the checkAttachOk call use requires vm.hasSA as well. >>>>>>>>>>>>> It can be that the first check "if (!Platform.hasSA())" in the checkAttachOk is redundant. >>>>>>>>>>>>> It is okay and more safe in general but generates little confusion. >>>>>>>>>>>>> I'm okay if you don't do anything with this but wanted to know your view. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 3/10/20 18:57, Chris Plummer wrote: >>>>>>>>>>>>>> On 3/10/20 6:07 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Overall, this looks as a right direction to me while it is not easy to verify all the details. >>>>>>>>>>>>>> Yes, there are a lot of tests with quite a few different types of changes. I did a lot of testing and verified that when the tests pass, they pass for the right reasons (really ran the test, skipped due to lack of privileges, or skipped due to running signed on OSX 10.14 or later). I also verified locally running as root, running with a cached sudo, and running without sudo. >>>>>>>>>>>>>>> I'll make another pass tomorrow. >>>>>>>>>>>>>> Thanks! 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> A couple of quick nits so far: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html >>>>>>>>>>>>>>> import jdk.test.lib.Utils; >>>>>>>>>>>>>>> -import jdk.test.lib.Asserts; >>>>>>>>>>>>>>> +import jdk.test.lib.SA.SATestUtils; >>>>>>>>>>>>>>> Need to swap these exports. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> Ok >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html >>>>>>>>>>>>>>> 48 if (SATestUtils.needsPrivileges()) { >>>>>>>>>>>>>>> 49 cmdStringList = SATestUtils.addPrivileges(cmdStringList); >>>>>>>>>>>>>>> The method calls are local, so the class name can be omitted in the method names: >>>>>>>>>>>>>>> SATestUtils.needsPrivileges and SATestUtils.addPrivileges. >>>>>>>>>>>>>> Ok >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 94 try { >>>>>>>>>>>>>>> 95 if (echoProcess.waitFor(60, TimeUnit.SECONDS) == false) { >>>>>>>>>>>>>>> 96 // Due to using the "-n" option, sudo should complete almost immediately. 60 seconds >>>>>>>>>>>>>>> 97 // is more than generous. 
If it didn't complete in that time, something went very wrong. >>>>>>>>>>>>>>> 98 echoProcess.destroyForcibly(); >>>>>>>>>>>>>>> 99 throw new RuntimeException("Timed out waiting for sudo to execute."); >>>>>>>>>>>>>>> 100 } >>>>>>>>>>>>>>> 101 } catch (InterruptedException e) { >>>>>>>>>>>>>>> 102 throw new RuntimeException(e); >>>>>>>>>>>>>>> 103 } >>>>>>>>>>>>>>> The lines 101/103 are misaligned. >>>>>>>>>>>>>> Ok. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 3/9/20 19:29, Chris Plummer wrote: >>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please help review the following: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238268 >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I'll try to give enough background first to make it easier to understand the changes. On OSX you must run SA tests that attach to a live process as root or using sudo. For example: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Whether running as root or under sudo, the check to allow the test to run is done with: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> private static boolean canAttachOSX() { >>>>>>>>>>>>>>>> return userName.equals("root"); >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Any test using "@requires vm.hasSAandCanAttach" must pass this check via Platform.shouldSAAttach(), which for OSX returns: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> return canAttachOSX() && !isSignedOSX(); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So if running as root the "@requires vm.hasSAandCanAttach" passes, otherwise it does not. However, using a root login to run tests is not a very desirable, nor is issuing a "sudo make run-test" (any created file ends up with root ownership). 
Because of this, support was previously added for just running the attaching process using sudo, not the entire test. This was only done for the 20 or so tests that use ClhsdbLauncher. These tests use "@requires vm.hasSA", and then while running the test will do a "sudo" check if canAttachOSX() returns false: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> if (!Platform.shouldSAAttach()) { >>>>>>>>>>>>>>>> if (Platform.isOSX()) { >>>>>>>>>>>>>>>> if (Platform.isSignedOSX()) { >>>>>>>>>>>>>>>> throw new SkippedException("SA attach not expected to work. JDK is signed."); >>>>>>>>>>>>>>>> } else if (SATestUtils.canAddPrivileges()) { >>>>>>>>>>>>>>>> needPrivileges = true; >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> if (!needPrivileges) { >>>>>>>>>>>>>>>> // Skip the test if we don't have enough permissions to attach >>>>>>>>>>>>>>>> // and cannot add privileges. >>>>>>>>>>>>>>>> throw new SkippedException( >>>>>>>>>>>>>>>> "SA attach not expected to work. Insufficient privileges."); >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So basically it does a runtime check of vm.hasSAandCanAttach, and if it fails then checks if running with sudo will work. This allows for either a passwordless sudo to be used when running clhsdb, or for the user to be prompted for the sudo password (note I've removed support for the latter with my changes). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> That brings us to the CR that is being fixed. ClhsdbLauncher tests support sudo and will therefore run with our CI testing on OSX, but the 25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and therefore are never run with our CI OSX testing. The changes in this webrev fix that. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> There are two possible approaches to the fix. One is having the check for sudo be done as part of the vm.hasSAandCanAttach evaluation. The other approach is to do the check in the test at runtime similar to how ClhsdbLauncher currently does.
This would mean just using "@requires vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach". I chose the latter because there is an advantage to throwing SkippedException rather than just silently skipping the test using @requires. The advantage is that mdash tells you how many tests were skipped, and when you hover over the reason you can see the SkippedException message, which will differentiate between reasons like the JDK was signed or there are insufficient privileges. If all the checking was done by the vm.hasSAandCanAttach evaluation, you would not know why the test wasn't run. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The "support" related changes made are all in the following 3 files. The rest of the changes are in the tests: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> test/jtreg-ext/requires/VMProps.java >>>>>>>>>>>>>>>> test/lib/jdk/test/lib/Platform.java >>>>>>>>>>>>>>>> test/lib/jdk/test/lib/SA/SATestUtils.java >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> You'll notice that one change I made to the sudo support in SATestUtils.canAddPrivileges() is to make sudo non-interactive, which means no password prompt. So that means either the user does not require a password, or the credentials have been cached. Otherwise the sudo check will fail. On most platforms if you execute a sudo command, the credentials are cached for 5 minutes. So if your user is not set up for passwordless sudo, then a sudo command can be issued before running the tests, and will likely remain cached until the test is run. The reason for using passwordless sudo is that prompting in the middle of running tests can be confusing (you usually walk away once the tests are launched and miss the prompt anyway), and it avoids unnecessary delays in automated testing due to waiting for the password prompt to time out (it used to wait 1 minute).
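[Editor's illustration, not part of the original message.] The runtime decision discussed in this thread (skip when the JDK is signed, run directly as root, otherwise fall back to non-interactive sudo, otherwise skip for lack of privileges) can be sketched roughly as below. This is a minimal sketch under stated assumptions: AttachCheckSketch, Decision, decideOSX and sudoWorksNonInteractively are hypothetical names and not the SATestUtils API from the webrev, and probing with "sudo -n true" is an assumption. The webrev's own code, as quoted earlier in the thread, appears to run an echo process under sudo with "-n" and throws on timeout instead of returning false.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: names are illustrative, not the SATestUtils API.
class AttachCheckSketch {

    enum Decision { RUN, RUN_WITH_SUDO, SKIP_SIGNED, SKIP_NO_PRIVILEGES }

    // Decide how an OSX test that attaches to a live process should proceed.
    // The sudo probe is passed in so it is only consulted when it is needed.
    static Decision decideOSX(boolean signed, boolean root, BooleanSupplier sudoProbe) {
        if (signed) {
            return Decision.SKIP_SIGNED;          // attach is not expected to work
        }
        if (root) {
            return Decision.RUN;                  // already privileged, no sudo needed
        }
        return sudoProbe.getAsBoolean()
                ? Decision.RUN_WITH_SUDO          // launch the attaching process via sudo
                : Decision.SKIP_NO_PRIVILEGES;    // no passwordless or cached sudo
    }

    // Assumed probe: "-n" makes sudo fail right away instead of prompting.
    static boolean sudoWorksNonInteractively() {
        try {
            Process p = new ProcessBuilder("sudo", "-n", "true")
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            if (!p.waitFor(60, TimeUnit.SECONDS)) {
                p.destroyForcibly();              // should never take this long
                return false;
            }
            return p.exitValue() == 0;
        } catch (IOException e) {
            return false;                         // sudo missing or not runnable
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt status
            return false;
        }
    }
}
```

A test would then map the two SKIP_* results to a SkippedException with a distinct message, which is what makes the skip reason visible in the results, and would pass AttachCheckSketch::sudoWorksNonInteractively as the probe.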
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> There are essentially 3 types of tests that SA Attach to a process, each needing a slightly different fix: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. Tests that directly launch a jdk.hotspot.agent class, such as TestClassDump.java. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if needed. They also need to switch from using hasSAandCanAttach to using hasSA. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2. Tests that launch command line tools such as jhsdb. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.createProcessBuilder() to create a process that will be launched using sudo if necessary. They also need to switch from using hasSAandCanAttach to using hasSA. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 3. Tests that use ClhsdbLauncher. They already use hasSA instead of hasSAandCanAttach, and rely on ClhsdbLauncher to check at runtime whether attaching will work, so for the most part all of these tests are unchanged. ClhsdbLauncher was modified to take advantage of the new SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk() APIs. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Some tests required special handling: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java >>>>>>>>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - These two tests SA Attach to a core file, not to a process, so only need hasSA, >>>>>>>>>>>>>>>> not hasSAandCanAttach. No other changes were needed. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - The output should never be null. If the test was skipped due to lack of privileges, you >>>>>>>>>>>>>>>> would never get to this section of the test.
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java >>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestIntConstant.java >>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java >>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestType.java >>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestUniverse.java >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - These are ClhsdbLauncher tests, so they should have been using hasSA instead of >>>>>>>>>>>>>>>> hasSAandCanAttach in the first place. No other changes were needed. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java >>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java >>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - These tests used to "@require mac" but seem to run fine on OSX, so I removed this requirement. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> test/jdk/sun/tools/jhsdb/BasicLauncherTest.java >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - This test had a runtime check to not run on OSX due to not having core file stack >>>>>>>>>>>>>>>> walking support. However, this test always attaches to a process, not a core file, >>>>>>>>>>>>>>>> and seems to run just fine on OSX. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> test/jdk/sun/tools/jstack/DeadlockDetectionTest.java >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - I changed the test to throw a SkippedException if it gets the unexpected error code >>>>>>>>>>>>>>>> rather than just println. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And a few other miscellaneous changes not already covered: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> test/lib/jdk/test/lib/Platform.java >>>>>>>>>>>>>>>> - Made canPtraceAttachLinux() public so it can be called from SATestUtils. >>>>>>>>>>>>>>>> - vm.hasSAandCanAttach is now gone.
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Tue Mar 17 00:20:35 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 16 Mar 2020 17:20:35 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com> References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com> <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com> <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com> <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com> Message-ID: <40a5458c-a2e3-5e37-688c-3f82ceb689a8@oracle.com> An HTML attachment was scrubbed... URL: From daniil.x.titov at oracle.com Tue Mar 17 00:38:40 2020 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Mon, 16 Mar 2020 17:38:40 -0700 Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:" In-Reply-To: <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> Message-ID: Hi Alex, Yes, I did test the change by modifying the test to use an RMI port that is already in use (the stack trace in the original email was taken from exactly this modified test) and then ensured that with the fix such an issue is properly handled. I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case. Thanks! Best regards, Daniil On 3/16/20, 4:47 PM, "Alex Menkov" wrote: I don't agree.
The code handles exactly the same "port in use" case for the same tool. So it either works or doesn't. And having 2 code blocks which are supposed to do the same thing makes the code messy. BTW did you test the change (I mean, craft the test to get the "port in use" error)? --alex On 03/16/2020 16:17, Daniil Titov wrote: > Resending with the corrected subject ... > > Hi Alex, > > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use" > case but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work. > > Since there are multiple tests in the sun/tools/jstatd/* folder that use this class and different ports > might be subject to the "port in use" error, and taking into account that it's hard to reproduce such a case, > I found it safer to leave the original code and just augment it with what was missing for this specific > case rather than completely replacing it. > > Best regards, > Daniil > > On 3/16/20, 4:02 PM, "Alex Menkov" wrote: > > Hi Daniil, > > Looks like the test is supposed to handle the "port in use" issue (see lines > 103-114). > I suppose in the "port in use" case jstatd exits, but > ProcessTools.startProcess() continues to wait for the "jstatd started" message. > > --alex > > On 03/16/2020 12:00, Daniil Titov wrote: > > Please review the change [1] that fixes the intermittent failure of the test. > > > > The problem here is that if the RMI port is in use then the test keeps waiting for "jstatd started (bound to " to appear in the process output and in this case > > it doesn't happen.
> > > > at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232) > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205) > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133) > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254) > > at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153) > > at jdk.test.lib.thread.XRun.run(XRun.java:40) > > at java.lang.Thread.run(java.base at 15-internal/Thread.java:832) > > at jdk.test.lib.thread.TestThread.run(TestThread.java:123) > > > > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress. > > > > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/ > > [2] https://bugs.openjdk.java.net/browse/JDK-8240711 > > > > > > Thank you, > > Daniil > > > > > > > > > From chris.plummer at oracle.com Tue Mar 17 01:16:29 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 16 Mar 2020 18:16:29 -0700 Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do not attempt to use sudo when available In-Reply-To: <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com> References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com> <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com> <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com> <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com> Message-ID: <7f615584-81e2-e49a-0bfb-563adc1f5834@oracle.com> An HTML attachment was scrubbed... URL: From dms at samersoff.net Tue Mar 17 14:02:24 2020 From: dms at samersoff.net (Dmitry Samersoff) Date: Tue, 17 Mar 2020 17:02:24 +0300 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> Message-ID: <9a14fff8-d4a0-83be-bbf1-cc7558e5c6b4@samersoff.net> Hello Alexander, The fix looks good for me. 
-Dmitry On 05.03.2020 17:27, Daniel Fuchs wrote: > Hi Alexander, > > Fixes to JMX & management agent are reviewed on the > serviceability-dev (added in to:) these days. > > best regards, > > -- daniel > > On 05/03/2020 13:17, Alexander Scherbatiy wrote: >> Hello, >> >> Could you review a small enhancement where the test CustomLauncherTest >> is updated to build the binary launcher file from the launcher.c file. >> The file launcher.c is renamed to exelauncher.c to follow the naming >> convention for executable test files built by the JDK make system. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 >> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 >> >> The changes for obsolete binary files from >> sun/management/jmxremote/bootstrap/linux-* and solaris-* are not >> included in the webrev. They need to be removed manually. >> >> The test passes on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and >> Solaris x64 11.4 systems. >> >> The test is excluded from Windows and Mac OS X systems. >> >> Thanks, >> Alexander. > From igor.ignatyev at oracle.com Tue Mar 17 17:11:01 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 17 Mar 2020 10:11:01 -0700 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> Message-ID: Hi Alexander, overall looks good to me, I have a few comments though: - you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH - CustomLauncherTest::findLibjvm can be simplified by using Platform::jvmLibDir - exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh, could you please update the comment?
- you have to add the /native flag to the @run action, otherwise jtreg won't exclude this test from runs w/ test.nativepath being unset I also have a question regarding your statement that >> The changes for obsolete binary files <...> are not included in the webrev. They need to be removed manually. you are planning to remove these files as part of this patch, right? Thanks, -- Igor > On Mar 5, 2020, at 6:27 AM, Daniel Fuchs wrote: > > Hi Alexander, > > Fixes to JMX & management agent are reviewed on the > serviceability-dev (added in to:) these days. > > best regards, > > -- daniel > > On 05/03/2020 13:17, Alexander Scherbatiy wrote: >> Hello, >> Could you review a small enhancement where the test CustomLauncherTest is updated to build the binary launcher file from the launcher.c file. >> The file launcher.c is renamed to exelauncher.c to follow the naming convention for executable test files built by the JDK make system. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 >> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 >> The changes for obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* are not included in the webrev. They need to be removed manually. >> The test passes on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and Solaris x64 11.4 systems. >> The test is excluded from Windows and Mac OS X systems. >> Thanks, >> Alexander.
> From daniil.x.titov at oracle.com Tue Mar 17 18:40:32 2020 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 17 Mar 2020 11:40:32 -0700 Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:" In-Reply-To: <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> Message-ID: <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> Hi Alex, Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case. Testing: Mach5 tests for sun/tools/jstatd/ successfully passed 100 times. Tier1-tier3 tests successfully passed. [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02 [2] https://bugs.openjdk.java.net/browse/JDK-8240711 Thanks, Daniil On 3/16/20, 5:38 PM, "Daniil Titov" wrote: Hi Alex, Yes, I did test the change by modifying the test to use an RMI port that is already in use (the stack trace in the original email was taken from exactly this modified test) and then ensured that with the fix such an issue is properly handled. I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case. Thanks! Best regards, Daniil On 3/16/20, 4:47 PM, "Alex Menkov" wrote: I don't agree. The code handles exactly the same "port in use" case for the same tool. So it either works or doesn't. And having 2 code blocks which are supposed to do the same thing makes the code messy. BTW did you test the change (I mean, craft the test to get the "port in use" error)? --alex On 03/16/2020 16:17, Daniil Titov wrote: > Resending with the corrected subject ...
> > Hi Alex, > > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use" > case but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work. > > Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports > might be subject to the "port in use" error and taking into account that it's hard to reproduce such case > I found it safer to leave the original code and just augment it with what was missing for this specific > case rather than completely replacing it. > > Best regards, > Daniil > > ?On 3/16/20, 4:02 PM, "Alex Menkov" wrote: > > Hi Daniil, > > Looks like the test is supposed to handle "port in use" issue (see lines > 103-114). > I suppose in case "port in use" jstatd exits, but > ProcessTools.startProcess() continue to wait for "jstatd started" message. > > --alex > > On 03/16/2020 12:00, Daniil Titov wrote: > > Please review the change [1] that fixes the intermittent failure of the test. > > > > The problem here is that if the RMI port is in use than the test keep waiting for "jstatd started (bound to " to appear in the process output and in this case > > It doesn't happen. > > > > at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232) > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205) > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133) > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254) > > at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153) > > at jdk.test.lib.thread.XRun.run(XRun.java:40) > > at java.lang.Thread.run(java.base at 15-internal/Thread.java:832) > > at jdk.test.lib.thread.TestThread.run(TestThread.java:123) > > > > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress. 
> > > > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/ > > [2] https://bugs.openjdk.java.net/browse/JDK-8240711 > > > > > > Thank you, > > Daniil > > > > > > > > > From alexey.menkov at oracle.com Tue Mar 17 18:58:48 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 17 Mar 2020 11:58:48 -0700 Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:" In-Reply-To: <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> Message-ID: <8f18db0e-e988-ba8f-2f55-07584cb4b7e0@oracle.com> LGTM --alex On 03/17/2020 11:40, Daniil Titov wrote: > Hi Alex, > > Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case. > > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed 100 times. Tier1-tier3 tests successfully passed. > > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02 > [2] https://bugs.openjdk.java.net/browse/JDK-8240711 > > Thanks, > Daniil > > > > ?On 3/16/20, 5:38 PM, "Daniil Titov" wrote: > > Hi Alex, > > Yes, I did test the change by modifying the test to use the RMI port that is already in use > ( the stack trace in the original email was exact from this changed test) and then ensured that with the fix > the such issue is properly handled. > > I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case. > > Thanks! > > Best regards, > Daniil > > > > > ?On 3/16/20, 4:47 PM, "Alex Menkov" wrote: > > I don't agree. > The code handles exact the same "port in use" case for the same tool. > So it either works or doesn't. 
> And have 2 code blocks which suppose to do the same makes the code messy. > BTW did you tested the change (I mean craft the test to get "port in > use" error)? > > --alex > > On 03/16/2020 16:17, Daniil Titov wrote: > > Resending with the corrected subject ... > > > > Hi Alex, > > > > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use" > > case but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work. > > > > Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports > > might be subject to the "port in use" error and taking into account that it's hard to reproduce such case > > I found it safer to leave the original code and just augment it with what was missing for this specific > > case rather than completely replacing it. > > > > Best regards, > > Daniil > > > > ?On 3/16/20, 4:02 PM, "Alex Menkov" wrote: > > > > Hi Daniil, > > > > Looks like the test is supposed to handle "port in use" issue (see lines > > 103-114). > > I suppose in case "port in use" jstatd exits, but > > ProcessTools.startProcess() continue to wait for "jstatd started" message. > > > > --alex > > > > On 03/16/2020 12:00, Daniil Titov wrote: > > > Please review the change [1] that fixes the intermittent failure of the test. > > > > > > The problem here is that if the RMI port is in use than the test keep waiting for "jstatd started (bound to " to appear in the process output and in this case > > > It doesn't happen. 
> > > > > > at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232) > > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205) > > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133) > > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254) > > > at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153) > > > at jdk.test.lib.thread.XRun.run(XRun.java:40) > > > at java.lang.Thread.run(java.base at 15-internal/Thread.java:832) > > > at jdk.test.lib.thread.TestThread.run(TestThread.java:123) > > > > > > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress. > > > > > > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/ > > > [2] https://bugs.openjdk.java.net/browse/JDK-8240711 > > > > > > > > > Thank you, > > > Daniil > > > > > > > > > > > > > > > > > > > From patricio.chilano.mateo at oracle.com Tue Mar 17 20:14:14 2020 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Tue, 17 Mar 2020 17:14:14 -0300 Subject: RFR 8240902: JDI shared memory connector can use already closed Handles Message-ID: Hi all, Please review the following patch: Bug: https://bugs.openjdk.java.net/browse/JDK-8240902 Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/ Calling closeConnection() on an already created/opened connection includes calls to CloseHandle() on objects that can still be used by other threads. This can lead to either undefined behavior or, as detailed in the bug comments, changes of state of unrelated objects. This issue was found while debugging the reason behind some jshell test failures seen after pushing 8230594. Not as important, but there are also calls to closeStream() from createStream()/openStream() when failing to create/open a stream that will return after executing "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended resources. 
Then, calling closeConnection() could assert if the reason for the previous failure was that the stream's mutex failed to be created/opened. This patch aims to address these issues too.

Tested in mach5 with the current baseline, tiers1-3 and several runs of open/test/langtools/:tier1 which includes the jshell tests where this connector is used. I also applied patch http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev mentioned in the comments of the bug, on top of the baseline and ran the langtool tests with and without this fix. Without the fix, running around 30 repetitions already shows failures in tests jdk/jshell/FailOverExecutionControlTest.java and jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the fix I ran several hundred runs and saw no failures. Let me know if there is any additional testing I should do.

As a side note, I see there are a couple of open issues related to jshell failures (8209848) which could be related to this bug and therefore might be fixed by this patch.

Thanks,
Patricio

From chris.plummer at oracle.com Wed Mar 18 04:52:59 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Mar 2020 21:52:59 -0700
Subject: RFR(XS) 8240906: Update ZGC ProblemList for serviceability/sa/TestJmapCoreMetaspace.java
Message-ID: <2545b3cf-3136-c990-dcfc-b6b6b7cc7d9f@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8240906

diff --git a/test/hotspot/jtreg/ProblemList-zgc.txt b/test/hotspot/jtreg/ProblemList-zgc.txt
--- a/test/hotspot/jtreg/ProblemList-zgc.txt
+++ b/test/hotspot/jtreg/ProblemList-zgc.txt
@@ -47,5 +47,5 @@
 serviceability/sa/TestJhsdbJstackLock.java 8220624   generic-all
 serviceability/sa/TestJhsdbJstackMixed.java 8220624   generic-all
 serviceability/sa/TestJmapCore.java 8220624   generic-all
-serviceability/sa/TestJmapCoreMetaspace.java 8219443   generic-all
+serviceability/sa/TestJmapCoreMetaspace.java 8220624   generic-all
 serviceability/sa/sadebugd/DebugdConnectTest.java 8220624   generic-all

8219443 [1] was closed as a dup of 8219405 [2], which is a very intermittent bug that occurs even without ZGC, so it should not be used to problem list this test for ZGC. However, the test should be ZGC problem listed due to 8220624 [3], just like TestJmapCore.java is.

[1] https://bugs.openjdk.java.net/browse/JDK-8219443
[2] https://bugs.openjdk.java.net/browse/JDK-8219405
[3] https://bugs.openjdk.java.net/browse/JDK-8220624

thanks,

Chris

From chris.plummer at oracle.com Wed Mar 18 04:59:00 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Mar 2020 21:59:00 -0700
Subject: RFR(XS) 8227340: Modify problem list entry for javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java
Message-ID: <0c2e3ed0-e35c-c555-b8ed-ba64eb08f6ae@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8227340

diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt
+++ b/test/jdk/ProblemList.txt
@@ -587,7 +587,7 @@
 java/lang/management/ThreadMXBean/AllThreadIds.java 8131745 generic-all
 
 javax/management/monitor/DerivedGaugeMonitorTest.java 8042211 generic-all
-javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8042215 generic-all
+javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8227337 generic-all

8042215 [1] used to be the correct CR to problem list this test under, but it was accidentally used for the fix of a different bug. 8042215 [1] has now been cloned to 8227337 [2] so the problem list needs to be updated also.
[1] https://bugs.openjdk.java.net/browse/JDK-8042215 [2] https://bugs.openjdk.java.net/browse/JDK-8227337 thanks, Chris From david.holmes at oracle.com Wed Mar 18 05:11:27 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Mar 2020 15:11:27 +1000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com> Message-ID: Hi Ralf, On 13/03/2020 9:43 pm, Schmelter, Ralf wrote: > Hi, > > I have updated the webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.1/ > > It has the following significant changes: > > - The jcmd now uses two separate flags. The -gz flag is now a boolean flag which toggles the compression on/off. And the new -gz-level flag can be used to change the compression level. If tried to change the jlong flag coding to allow the old behavior (only one flag, which acts both as a boolean flag and a jlong flag), but decided against it, since it changes the semantic of a jlong flag. And I don't expect the -gz-level flag to be used all that much. > > - I no longer use my own threads. Instead I use the WorkGang returned from CollectedHeap:: get_safepoint_workers(). This works fine, apart from Shenandoah GC, which runs into assertions when calling the CollectedHeap::object_iterate() method from a worker thread. I'm not sure if the assertion is too strong, but since the GC is currently experimental, I switch back to single threading in this case (as would be the case for serial GC or epsilon GC). Using the worker threads removes the problems the original code had regarding destruction of the monitor used. I'm glad to see you are no longer using your own threads, and I apologise that I have not yet been able to look further into the thread lifecycle issues you encountered. 
However I'm not clear how this solves the problem of destroying the monitor while it can still be being accessed - is the dumping occurring at a safepoint in the WorkGang threads? Thanks, David ----- > - The reported number of bytes is now the one written to disk. > > Best regards, > Ralf > > -----Original Message----- > From: Ioi Lam > Sent: Dienstag, 25. Februar 2020 18:03 > To: Langer, Christoph ; Schmelter, Ralf ; Yasumasa Suenaga ; serguei.spitsyn at oracle.com; hotspot-runtime-dev at openjdk.java.net runtime > Cc: serviceability-dev at openjdk.java.net > Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump > > Hi Christoph, > > This sounds fair. I will remove my objection :-) > > Thanks > - Ioi > From david.holmes at oracle.com Wed Mar 18 05:16:47 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Mar 2020 15:16:47 +1000 Subject: RFR(XS) 8227340: Modify problem list entry for javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java In-Reply-To: <0c2e3ed0-e35c-c555-b8ed-ba64eb08f6ae@oracle.com> References: <0c2e3ed0-e35c-c555-b8ed-ba64eb08f6ae@oracle.com> Message-ID: <0eb4d0c3-60fb-be22-238a-fd8f4b10ff9e@oracle.com> Hi Chris, On 18/03/2020 2:59 pm, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8227340 > > diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt > --- a/test/jdk/ProblemList.txt > +++ b/test/jdk/ProblemList.txt > @@ -587,7 +587,7 @@ > ?java/lang/management/ThreadMXBean/AllThreadIds.java 8131745 generic-all > > ?javax/management/monitor/DerivedGaugeMonitorTest.java 8042211 generic-all > -javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java > 8042215 generic-all > +javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java > 8227337 generic-all > > 8042215 [1] used to be the correct CR to problem list this test under, > but it was accidentally used to fix for a different bug. 
8042215 [1] has > now been cloned to 8227337 [2] so the problem list needs to be updated > also. Okay. The bugs themselves are in a bit of a muddle but this issue is okay. Thanks, David > [1] https://bugs.openjdk.java.net/browse/JDK-8042215 > [2] https://bugs.openjdk.java.net/browse/JDK-8227337 > > thanks, > > Chris > > From ralf.schmelter at sap.com Wed Mar 18 06:39:36 2020 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Wed, 18 Mar 2020 06:39:36 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com> , Message-ID: Hi David, > However I'm not clear how this solves the problem of destroying > the monitor while it can still be being accessed - is the dumping > occurring at a safepoint in the WorkGang threads? Because when the run_task() method returns, I can be sure none of the work gang threads still use the mutex. They have to exit the thread_loop() method to finish the task. And by exiting the method they have released the mutex. Best regards, Ralf From: David Holmes Sent: Wednesday, March 18, 2020 6:11 AM To: Schmelter, Ralf ; Ioi Lam ; Langer, Christoph ; Yasumasa Suenaga ; serguei.spitsyn at oracle.com ; hotspot-runtime-dev at openjdk.java.net runtime Cc: serviceability-dev at openjdk.java.net Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump Hi Ralf, On 13/03/2020 9:43 pm, Schmelter, Ralf wrote: > Hi, > > I have updated the webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.1/ > > It has the following significant changes: > > - The jcmd now uses two separate flags. The -gz flag is now a boolean flag which toggles the compression on/off. And the new -gz-level flag can be used to change the compression level.
I tried to change the jlong flag coding to allow the old behavior (only one flag, which acts both as a boolean flag and a jlong flag), but decided against it, since it changes the semantics of a jlong flag. And I don't expect the -gz-level flag to be used all that much. > > - I no longer use my own threads. Instead I use the WorkGang returned from CollectedHeap::get_safepoint_workers(). This works fine, apart from Shenandoah GC, which runs into assertions when calling the CollectedHeap::object_iterate() method from a worker thread. I'm not sure if the assertion is too strong, but since the GC is currently experimental, I switch back to single threading in this case (as would be the case for serial GC or epsilon GC). Using the worker threads removes the problems the original code had regarding destruction of the monitor used. I'm glad to see you are no longer using your own threads, and I apologise that I have not yet been able to look further into the thread lifecycle issues you encountered. However I'm not clear how this solves the problem of destroying the monitor while it can still be being accessed - is the dumping occurring at a safepoint in the WorkGang threads? Thanks, David ----- From david.holmes at oracle.com Wed Mar 18 06:43:22 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Mar 2020 16:43:22 +1000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com> Message-ID: <05ae5818-f1b7-2e86-e4dd-c09ff240748e@oracle.com> On 18/03/2020 4:39 pm, Schmelter, Ralf wrote: > Hi David, > >> However I'm not clear how this solves the problem of destroying >> the monitor while it can still be being accessed - is the dumping >> occurring at a safepoint in the WorkGang threads?
> > Because when the run_task() method returns, I can be sure none > of the work gang threads still use the mutex. They have to exit the > thread_loop() method to finish the task. And by exiting the method > they have released the mutex. All of which is happening via VM_HeapDumper::doit(). Got it. Thanks, David > Best regards, > Ralf > > > > > > > From: David Holmes > > Sent: Wednesday, March 18, 2020 6:11 AM > > To: Schmelter, Ralf ; Ioi Lam ; Langer, Christoph ; Yasumasa Suenaga ; serguei.spitsyn at oracle.com ; hotspot-runtime-dev at openjdk.java.net > runtime > > Cc: serviceability-dev at openjdk.java.net > > Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump > > > > > Hi Ralf, > > > > On 13/03/2020 9:43 pm, Schmelter, Ralf wrote: > >> Hi, > >> > >> I have updated the webrev: > http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.1/ > >> > >> It has the following significant changes: > >> > >> - The jcmd now uses two separate flags. The -gz flag is now a boolean flag which toggles the compression? on/off. And the new -gz-level flag can be used to change the compression level. If tried to change the jlong flag coding to allow the old behavior (only > one flag, which acts both as a boolean flag and a jlong flag), but decided against it, since it changes the semantic of a jlong flag. And I don't expect the -gz-level flag to be used all that much. > >> > >> - I no longer use my own threads. Instead I use the WorkGang returned from CollectedHeap:: get_safepoint_workers(). This works fine, apart from Shenandoah GC, which runs into assertions when calling the CollectedHeap::object_iterate() method from a worker > thread. I'm not sure if the assertion is too strong, but since the GC is currently experimental, I switch back to single threading in this case (as would be the case for serial GC or epsilon GC). Using the worker threads removes the problems the original code > had regarding destruction of the monitor used. 
> > > > I'm glad to see you are no longer using your own threads, and I > > apologise that I have not yet been able to look further into the thread > > lifecycle issues you encountered. However I'm not clear how this solves > > the problem of destroying the monitor while it can still be being > > accessed - is the dumping occurring at a safepoint in the WorkGang threads? > > > > Thanks, > > David > > ----- > > From david.holmes at oracle.com Wed Mar 18 07:27:30 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 18 Mar 2020 17:27:30 +1000 Subject: RFR 8240902: JDI shared memory connector can use already closed Handles In-Reply-To: References: Message-ID: Hi Patricio, On 18/03/2020 6:14 am, Patricio Chilano wrote: > Hi all, > > Please review the following patch: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8240902 > Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/ > > Calling closeConnection() on an already created/opened connection > includes calls to CloseHandle() on objects that can still be used by > other threads. This can lead to either undefined behavior or, as > detailed in the bug comments, changes of state of unrelated objects. This was a really great find! > This issue was found while debugging the reason behind some jshell test > failures seen after pushing 8230594. Not as important, but there are > also calls to closeStream() from createStream()/openStream() when > failing to create/open a stream that will return after executing > "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended > resources. Then, calling closeConnection() could assert if the reason of > the previous failure was that the stream's mutex failed to be > created/opened. These patch aims to address these issues too. Patch looks good in general. The internal reference count guards deletion of the internal resources, and is itself safe because we never actually delete the connection. Thanks for adding the comment about this aspect.
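
The reference-count guard described above can be sketched roughly as follows. This is only an illustration in portable C11, not the code from the webrev: the names Connection, conn_enter, conn_leave and conn_close are hypothetical, C11 atomics stand in for the Windows Interlocked* calls, and a boolean flag stands in for the CloseHandle() calls. A user registers in the count before checking the state, and the closer publishes the closed state before reading the count, so with sequentially consistent atomics at least one side notices the other.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names throughout; this is not the shared memory connector code. */
enum { CONN_OPEN = 1, CONN_CLOSED = 2 };

typedef struct {
    atomic_uint refcount;            /* threads currently using the resources */
    atomic_int  state;               /* CONN_OPEN or CONN_CLOSED */
    bool        resources_released;  /* stands in for the CloseHandle() calls */
} Connection;

static void conn_init(Connection *c) {
    atomic_init(&c->refcount, 0);
    atomic_init(&c->state, CONN_OPEN);
    c->resources_released = false;
}

/* Returns true if the caller may use the connection's resources. */
static bool conn_enter(Connection *c) {
    atomic_fetch_add(&c->refcount, 1);          /* register first ... */
    if (atomic_load(&c->state) != CONN_OPEN) {  /* ... then check the state */
        atomic_fetch_sub(&c->refcount, 1);      /* closing or closed: back out */
        return false;
    }
    return true;
}

static void conn_leave(Connection *c) {
    unsigned int prev = atomic_fetch_sub(&c->refcount, 1);
    assert(prev > 0);  /* every leave must pair with a successful enter */
    (void)prev;
}

/* Marks the connection closed, then releases the resources only once no
 * thread is inside. Returns false if users remained after `attempts` polls. */
static bool conn_close(Connection *c, int attempts) {
    atomic_store(&c->state, CONN_CLOSED);  /* refuse new users from now on */
    while (attempts > 0) {
        if (atomic_load(&c->refcount) == 0) {
            c->resources_released = true;  /* safe: nobody can still be inside */
            return true;
        }
        attempts--;  /* real code would sleep or yield between polls */
    }
    return false;    /* resources deliberately left untouched */
}
```

Because the Connection object itself is never freed, a late conn_enter() after the close can still read the state safely and back out. What a close should return when the count never reaches zero is a design decision; the sketch simply reports that case to the caller.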
A few items:

Please update copyright year before pushing.

Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way as STREAM_INVARIANT.

170 unsigned int refcount;
171 jint state;

I'm unclear about the use of stream->state and connection->state as guards - unless accessed under a mutex these would seem to at least need acquire/release semantics. Additionally the reads of refcount would also seem to need some form of memory synchronization - though the Windows docs for the Interlocked* API do not show how to simply read such a variable! Though I note that the RtlFirstEntrySList method for the "Interlocked Singly Linked Lists" API does state "Access to the list is synchronized on a multiprocessor system." which suggests a read of such a variable does require some form of memory synchronization!

413 while (attempts>0) {

spaces around >

If the loop at 413 never encounters a zero reference_count then it doesn't close the events or the mutex but still returns SYS_OK. That seems wrong but I'm not sure what the right behaviour is here.

And please wait for serviceability folk to review this.

Thanks,
David
-----

> Tested in mach5 with the current baseline, tiers1-3 and several runs of
> open/test/langtools/:tier1 which includes the jshell tests where this
> connector is used. I also applied patch
> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev
> mentioned in the comments of the bug, on top of the baseline and run the
> langtool tests with and without this fix. Without the fix running around
> 30 repetitions already shows failures in tests
> jdk/jshell/FailOverExecutionControlTest.java and
> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the fix
> I run several hundred runs and saw no failures. Let me know if there is
> any additional testing I should do.
> > As a side note, I see there are a couple of open issues related with > jshell failures (8209848) which could be related to this bug and > therefore might be fixed by this patch. > > Thanks, > Patricio > From stefan.karlsson at oracle.com Wed Mar 18 09:00:54 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 18 Mar 2020 10:00:54 +0100 Subject: RFR(XS) 8240906: Update ZGC ProblemList for serviceability/sa/TestJmapCoreMetaspace.java In-Reply-To: <2545b3cf-3136-c990-dcfc-b6b6b7cc7d9f@oracle.com> References: <2545b3cf-3136-c990-dcfc-b6b6b7cc7d9f@oracle.com> Message-ID: Looks good. StefanK On 2020-03-18 05:52, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8240906 > > diff --git a/test/hotspot/jtreg/ProblemList-zgc.txt > b/test/hotspot/jtreg/ProblemList-zgc.txt > --- a/test/hotspot/jtreg/ProblemList-zgc.txt > +++ b/test/hotspot/jtreg/ProblemList-zgc.txt > @@ -47,5 +47,5 @@ > ?serviceability/sa/TestJhsdbJstackLock.java 8220624?? generic-all > ?serviceability/sa/TestJhsdbJstackMixed.java 8220624?? generic-all > ?serviceability/sa/TestJmapCore.java 8220624?? generic-all > -serviceability/sa/TestJmapCoreMetaspace.java 8219443 generic-all > +serviceability/sa/TestJmapCoreMetaspace.java 8220624 generic-all > ?serviceability/sa/sadebugd/DebugdConnectTest.java 8220624 generic-all > > 8219443 [1] was closed as a dup of 8219405 [2], which is a very > intermittent bug that occurs even without ZGC, so should not be used > to problem list this test for ZGC. However it should be ZGC problem > listed due to 8220624 [3] just like TestJmapCore.java is. 
> > [1] https://bugs.openjdk.java.net/browse/JDK-8219443 > [2] https://bugs.openjdk.java.net/browse/JDK-8219405 > [3] https://bugs.openjdk.java.net/browse/JDK-8220624 > > thanks, > > Chris > From alexander.scherbatiy at bell-sw.com Wed Mar 18 11:57:28 2020 From: alexander.scherbatiy at bell-sw.com (Alexander Scherbatiy) Date: Wed, 18 Mar 2020 14:57:28 +0300 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> Message-ID: <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com> Hello, Could you review the updated fix: http://cr.openjdk.java.net/~alexsch/8240604/webrev.01 Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and the /native flag are added to the CustomLauncherTest.java test. I also added TEST_NATIVE_PATH to the Utils lib. I have not found any history of the CustomLauncherTest.sh script in launcher.c so I just updated the comment to "A minature launcher for use by CustomLauncherTest.java test" in the exelauncher.c file. I wrote the comment about removing the linux-* and solaris-* binary files because it is not clear to me what the right way is to include removed binary files into the webrev. Could I just use "hg remove binary-file" and run webrev to add the removed binary files into the webrev? Thanks, Alexander. On 17.03.2020 20:11, Igor Ignatyev wrote: > Hi Alexander, > > overall looks good to me, I have a few comments though: > - you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH > - CustomLauncherTest::findLibjvm can be simplified by use Platform::jvmLibDir > - exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh, could you please update the comment?
> - you have to add /native flag to @run action, otherwise jtreg won't exclude this test from runs w/ test.nativepath being unset > > I also have a question regarding your statement that >>> The changes for obsolete binary files <...> are not included into the webrev. They needs to be removed manually. > you are planning to remove these files as part of this patch, right? > > Thanks, > -- Igor > > >> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs wrote: >> >> Hi Alexander, >> >> Fixes to JMX & management agent are reviewed on the >> seviceability-dev (added in to:) these days. >> >> best regards, >> >> -- daniel >> >> On 05/03/2020 13:17, Alexander Scherbatiy wrote: >>> Hello, >>> Could you review a small enhancement where the test CustomLauncherTest is updated to build binary launcher file from launcher.c file. >>> The file launcher.c is renamed to exelauncher.c to follow the name convention for executable test files building by jdk make system. >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 >>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 >>> The changes for obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* are not included into the webrev. They needs to be removed manually. >>> The test is passed on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and Solaris x64 11.4 systems. >>> The test is excluded from Windows and Mac Os X systems. >>> Thanks, >>> Alexander. 
From alexander.scherbatiy at bell-sw.com Wed Mar 18 15:48:57 2020 From: alexander.scherbatiy at bell-sw.com (Alexander Scherbatiy) Date: Wed, 18 Mar 2020 18:48:57 +0300 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com> References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com> Message-ID: On 18.03.2020 14:57, Alexander Scherbatiy wrote: > Hello, > > Could you review the updated fix: > > ? http://cr.openjdk.java.net/~alexsch/8240604/webrev.01 > > Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are > added to the CustomLauncherTest.java test. I also included > TEST_NATIVE_PATH to the Utils lib. > > I have not found a history about CustomLauncherTest.sh script in > launcher.c so I just updated the comment as "A minature launcher for > use by CustomLauncherTest.java test" in the exelauncher.c file. ? I also updated the word with type 'minature' to 'miniature'. Thanks, Alexander. > > > The comment that I had about removing the linux-* and solaris-* binary > files I wrote because it is not clear for what is the right way to > include removed binary files into webrev. > > Could I just use "hg remove binary-fie" and run webrev to add the > removed binary files into webrev? > > > Thanks, > > Alexander. > > On 17.03.2020 20:11, Igor Ignatyev wrote: >> Hi Alexander, >> >> overall looks good to me, I have a few comments though: >> ? - you can use Utils.TEST_CLASSPATH instead of >> CustomLauncherTest.TEST_CLASSPATH >> - CustomLauncherTest::findLibjvm can be simplified by use >> Platform::jvmLibDir >> - exelauncher.c has a comment which refers to the test as >> CustomLauncherTest.sh, could you please update the comment? 
>> - you have to add /native flag to @run action, otherwise jtreg won't >> exclude this test from runs w/ test.nativepath being unset >> >> I also have a question regarding your statement that >>>> The changes for obsolete binary files <...> are not included into >>>> the webrev. They needs to be removed manually. >> you are planning to remove these files as part of this patch, right? >> >> Thanks, >> -- Igor >> >> >>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs >>> wrote: >>> >>> Hi Alexander, >>> >>> Fixes to JMX & management agent are reviewed on the >>> seviceability-dev (added in to:) these days. >>> >>> best regards, >>> >>> -- daniel >>> >>> On 05/03/2020 13:17, Alexander Scherbatiy wrote: >>>> Hello, >>>> Could you review a small enhancement where the test >>>> CustomLauncherTest is updated to build binary launcher file from >>>> launcher.c file. >>>> The file launcher.c is renamed to exelauncher.c to follow the name >>>> convention for executable test files building by jdk make system. >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 >>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 >>>> The changes for obsolete binary files from >>>> sun/management/jmxremote/bootstrap/linux-* and solaris-* are not >>>> included into the webrev. They needs to be removed manually. >>>> The test is passed on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, >>>> and Solaris x64 11.4 systems. >>>> The test is excluded from Windows and Mac Os X systems. >>>> Thanks, >>>> Alexander. 
From igor.ignatyev at oracle.com Wed Mar 18 16:00:49 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 18 Mar 2020 09:00:49 -0700 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com> References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com> Message-ID: Hi Alexander, > I also included TEST_NATIVE_PATH to the Utils lib. for the sake of clarity and ease of backporting, I'd prefer to have it added by a separate bug and commit. > Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev? IIRC correctly, webrev will just say 'a binary file got removed', in any case I'll take it as a 'yes, I'm going to remove these files as part of 8240604', so thumbs up. -- Igor > On Mar 18, 2020, at 4:57 AM, Alexander Scherbatiy wrote: > > Hello, > > Could you review the updated fix: > > http://cr.openjdk.java.net/~alexsch/8240604/webrev.01 > > Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are added to the CustomLauncherTest.java test. I also included TEST_NATIVE_PATH to the Utils lib. > > I have not found a history about CustomLauncherTest.sh script in launcher.c so I just updated the comment as "A minature launcher for use by CustomLauncherTest.java test" in the exelauncher.c file. > > > The comment that I had about removing the linux-* and solaris-* binary files I wrote because it is not clear for what is the right way to include removed binary files into webrev. > > Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev? > > > Thanks, > > Alexander. 
> > On 17.03.2020 20:11, Igor Ignatyev wrote: >> Hi Alexander, >> >> overall looks good to me, I have a few comments though: >> - you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH >> - CustomLauncherTest::findLibjvm can be simplified by use Platform::jvmLibDir >> - exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh, could you please update the comment? >> - you have to add /native flag to @run action, otherwise jtreg won't exclude this test from runs w/ test.nativepath being unset >> >> I also have a question regarding your statement that >>>> The changes for obsolete binary files <...> are not included into the webrev. They needs to be removed manually. >> you are planning to remove these files as part of this patch, right? >> >> Thanks, >> -- Igor >> >> >>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs wrote: >>> >>> Hi Alexander, >>> >>> Fixes to JMX & management agent are reviewed on the >>> seviceability-dev (added in to:) these days. >>> >>> best regards, >>> >>> -- daniel >>> >>> On 05/03/2020 13:17, Alexander Scherbatiy wrote: >>>> Hello, >>>> Could you review a small enhancement where the test CustomLauncherTest is updated to build binary launcher file from launcher.c file. >>>> The file launcher.c is renamed to exelauncher.c to follow the name convention for executable test files building by jdk make system. >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 >>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 >>>> The changes for obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* are not included into the webrev. They needs to be removed manually. >>>> The test is passed on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and Solaris x64 11.4 systems. >>>> The test is excluded from Windows and Mac Os X systems. >>>> Thanks, >>>> Alexander. 
From alexander.scherbatiy at bell-sw.com Wed Mar 18 16:54:27 2020 From: alexander.scherbatiy at bell-sw.com (Alexander Scherbatiy) Date: Wed, 18 Mar 2020 19:54:27 +0300 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com> Message-ID: <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com> On 18.03.2020 19:00, Igor Ignatyev wrote: > Hi Alexander, > >> I also included TEST_NATIVE_PATH to the Utils lib. > for the sake of clarity and ease of backporting, I'd prefer to have it added by a separate bug and commit. Here is the updated fix where TEST_NATIVE_PATH is not added to the Utils lib. http://cr.openjdk.java.net/~alexsch/8240604/webrev.02/ Thanks, Alexander. >> Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev? > IIRC correctly, webrev will just say 'a binary file got removed', in any case I'll take it as a 'yes, I'm going to remove these files as part of 8240604', so thumbs up. > > -- Igor > > >> On Mar 18, 2020, at 4:57 AM, Alexander Scherbatiy wrote: >> >> Hello, >> >> Could you review the updated fix: >> >> http://cr.openjdk.java.net/~alexsch/8240604/webrev.01 >> >> Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are added to the CustomLauncherTest.java test. I also included TEST_NATIVE_PATH to the Utils lib. >> >> I have not found a history about CustomLauncherTest.sh script in launcher.c so I just updated the comment as "A minature launcher for use by CustomLauncherTest.java test" in the exelauncher.c file. >> >> >> The comment that I had about removing the linux-* and solaris-* binary files I wrote because it is not clear for what is the right way to include removed binary files into webrev. >> >> Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev?
>> >> >> Thanks, >> >> Alexander. >> >> On 17.03.2020 20:11, Igor Ignatyev wrote: >>> Hi Alexander, >>> >>> overall looks good to me, I have a few comments though: >>> - you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH >>> - CustomLauncherTest::findLibjvm can be simplified by use Platform::jvmLibDir >>> - exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh, could you please update the comment? >>> - you have to add /native flag to @run action, otherwise jtreg won't exclude this test from runs w/ test.nativepath being unset >>> >>> I also have a question regarding your statement that >>>>> The changes for obsolete binary files <...> are not included into the webrev. They needs to be removed manually. >>> you are planning to remove these files as part of this patch, right? >>> >>> Thanks, >>> -- Igor >>> >>> >>>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs wrote: >>>> >>>> Hi Alexander, >>>> >>>> Fixes to JMX & management agent are reviewed on the >>>> seviceability-dev (added in to:) these days. >>>> >>>> best regards, >>>> >>>> -- daniel >>>> >>>> On 05/03/2020 13:17, Alexander Scherbatiy wrote: >>>>> Hello, >>>>> Could you review a small enhancement where the test CustomLauncherTest is updated to build binary launcher file from launcher.c file. >>>>> The file launcher.c is renamed to exelauncher.c to follow the name convention for executable test files building by jdk make system. >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 >>>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 >>>>> The changes for obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* are not included into the webrev. They needs to be removed manually. >>>>> The test is passed on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and Solaris x64 11.4 systems. >>>>> The test is excluded from Windows and Mac Os X systems. >>>>> Thanks, >>>>> Alexander. 
From rkennke at redhat.com Wed Mar 18 16:57:26 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 18 Mar 2020 17:57:26 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
Message-ID:

Hi Serguei,

Thanks for your review!

A quick update on my progress: The wrong condition was a good find! In fact, so much so that it led to the whole implementation not reporting any unloaded classes. I changed that back, and now it's very slow because of all those unloaded classes firing events. I'm trying to understand where it's losing time.

Some other findings:
- I can't keep the lock while calling into JVMTI, e.g. for GetTag() or SetTag(), otherwise it risks a deadlock.
- The current implementation doesn't seem to report any unloaded classes either (i.e. the bag returned by classTrack_processUnloads(JNIEnv *env) is always empty or NULL), at least not in my testcase. I'm investigating why this might be the case, or maybe I did something wrong.

Roman

> Sorry, forgot to complete my comments at the end (see below).
>
>
> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>> Hi Roman,
>>
>> Thank you for the update and sorry for the latency in review.
>>
>> Some comments are below.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>
>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>> 88 {
>> 89 debugMonitorEnter(deletedSignatureLock);
>> 90 if (currentClassTag == -1) {
>> 91 // Class tracking not initialized, nobody's interested
>> 92 debugMonitorExit(deletedSignatureLock);
>> 93 return;
>> 94 }
>> Just a question:
>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does the class tracking if class tracking has not been initialized?
>>
>> 70 static jlong currentClassTag;
>> I'm thinking if the name is better to be something like: lastClassTag or highestClassTag.
>>
>> 99 KlassNode* klass = *klass_ptr;
>> 100
>> 102 while (klass != NULL && klass->klass_tag != tag) {
>> 103 klass_ptr = &klass->next;
>> 104 klass = *klass_ptr;
>> 105 }
>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>> 107 debugMonitorExit(deletedSignatureLock);
>> 108 return;
>> 109 }
>>  It seems to me, something is wrong in the condition at L106 above.
>>  Should it be:
>>     if (klass == NULL || klass->klass_tag != tag)
>>
>>  Otherwise, how can the second check ever work correctly as the return will always happen when (klass != NULL)?
>>
>>
>> There are several places in this file with the indent:
>> 90 if (currentClassTag == -1) {
>> 91 // Class tracking not initialized, nobody's interested
>> 92 debugMonitorExit(deletedSignatureLock);
>> 93 return;
>> 94 }
>> ...
>> 152 if (currentClassTag == -1) {
>> 153 // Class tracking not initialized yet, nobody's interested
>> 154 debugMonitorExit(deletedSignatureLock);
>> 155 return;
>> 156 }
>> ...
>> 161 if (error != JVMTI_ERROR_NONE) {
>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>> 163 }
>> 164 if (tag != 0l) {
>> 165 debugMonitorExit(deletedSignatureLock);
>> 166 return; // Already added
>> 167 }
>> ...
>> 281 cleanDeleted(void *signatureVoid, void *arg)
>> 282 {
>> 283 char* sig = (char*)signatureVoid;
>> 284 jvmtiDeallocate(sig);
>> 285 return JNI_TRUE;
>> 286 }
>> ...
>> 291 void
>> 292 classTrack_reset(void)
>> 293 {
>> 294 int idx;
>> 295 debugMonitorEnter(deletedSignatureLock);
>> 296
>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>> 298 KlassNode* node = table[idx];
>> 299 while (node != NULL) {
>> 300 KlassNode* next = node->next;
>> 301 jvmtiDeallocate(node->signature);
>> 302 jvmtiDeallocate(node);
>> 303 node = next;
>> 304 }
>> 305 }
>> 306 jvmtiDeallocate(table);
>> 307
>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>> 309 bagDestroyBag(deletedSignatureBag);
>> 310
>> 311 currentClassTag = -1;
>> 312
>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>> 314 trackingEnv = NULL;
>> 315
>> 316 debugMonitorExit(deletedSignatureLock);
>>
>> Could you, please, fix several comments below?
>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>  The comma is not needed.
>>  Would it be better to replace: klass tags => klass_tag's ?
>>
>>
>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>  Maybe: Lock to guard ... or lock to keep integrity of ...
>>
>> 84 * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
>> It would be better to use words like "store" or "record", and "Find" should not start with a capital letter:
>> Invoke the callback when classes are freed, find and record the signature in deletedSignatureBag.
>>
>> 96 // Find deleted KlassNode
>> 133 // Class tracking not initialized, nobody's interested
>> 153 // Class tracking not initialized yet, nobody's interested
>> 158 /* Check this is not a duplicate */
>> A period is missing at the end.
>>
>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>> Conversely, the period is not needed here, as the comment does not start with a capital letter.
>> 111 // At this point we have the KlassNode corresponding to the tag
>> 112 // in klass, and the pointer to it in klass_node.
>
> The comment above can be better. Maybe, something like:
>   "At this point, we found the KlassNode matching the klass tag (and it is linked)."
>
>> 113 // Remember the unloaded signature.
>  Better: Record the signature of the unloaded class and unlink it.
>
> Thanks,
> Serguei
>
>> Thanks,
>> Serguei
>>
>> On 3/9/20 05:39, Roman Kennke wrote:
>>> Hello all,
>>>
>>> Can I please get reviews of this change? In the meantime, we've done more testing and also field-/torture-testing by a customer who is happy now. :-)
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>>> Hi Serguei,
>>>>
>>>> Thanks for reviewing!
>>>>
>>>> I updated the patch to reflect your suggestions, very good!
>>>> It also includes a fix to allow re-connecting an agent after disconnect, namely move setup of the trackingEnv and deletedSignatureBag to _activate() to ensure we have those structures after re-connect.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>
>>>> Let me know what you think!
>>>> Roman
>>>>
>>>>> Hi Roman,
>>>>>
>>>>> Thank you for taking care of this scalability issue!
>>>>>
>>>>> I have a couple of quick comments.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>
>>>>> 72 /*
>>>>> 73 * Lock to protect deletedSignatureBag
>>>>> 74 */
>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>> 76
>>>>> 77 /*
>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>> 79 * deletedTagLock,
>>>>> 80 */
>>>>> 81 struct bag* deletedSignatureBag;
>>>>>
>>>>>   The comments contradict each other.
>>>>>   I guess, the lock name at line 79 has to be deletedSignatureLock instead of deletedTagLock.
>>>>>   Also, the comma at the end must be replaced with a period.
>>>>>
>>>>>
>>>>> 101 // Tag not found? Ignore.
>>>>> 102 if (klass == NULL) {
>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>> 104 return;
>>>>> 105 }
>>>>> 106
>>>>> 107 // Scan linked-list.
>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>> 110 klass_ptr = &klass->next;
>>>>> 111 klass = *klass_ptr;
>>>>> 112 found_tag = klass->klass_tag;
>>>>> 113 }
>>>>> 114
>>>>> 115 // Tag not found? Ignore.
>>>>> 116 if (found_tag != tag) {
>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>> 118 return;
>>>>> 119 }
>>>>>
>>>>>
>>>>>  The code above can be simplified, so that the lines 101-105 are not needed anymore.
>>>>>  It can be something like this:
>>>>>
>>>>>     // Scan linked-list.
>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>         klass_ptr = &klass->next;
>>>>>         klass = *klass_ptr;
>>>>>     }
>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>         return;
>>>>>     }
>>>>>
>>>>> It will take more time before I get a chance to look at the rest.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>> Here comes an update that resolves some races that happen when disconnecting an agent. In particular, we need to take the lock on basically every operation, and also need to check whether or not class-tracking is active and return an appropriate result (e.g. an empty list) when we're not.
>>>>>>
>>>>>> Updated webrev:
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>> So, here comes the O(1) implementation:
>>>>>>>
>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we set up a listener to get notified when it is unloaded.
>>>>>>> - Prepared classes are kept in a data structure that is a table, with each entry being the head of a linked-list of KlassNode*.
>>>>>>>   The table is indexed by tag % slot-count, and we then simply prepend the new KlassNode*. This is an O(1) operation.
>>>>>>> - When we get notified of unloading a class, we look up the signature of the reported tag in that table, and remember it in a bag. The KlassNode* is then unlinked from the table and deallocated. This is a ~O(1) operation too, depending on the depth of the table. In my testcase which hammered the code with class-loads and unloads, I usually see depths of like 2-3, but not usually more. It should be ok.
>>>>>>> - When processUnloads() gets called, we simply hand out that bag, and allocate a new one.
>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the signatures and KlassNode* etc when debug agent gets detached and/or re-attached (was missing before).
>>>>>>> - I also added locks around data-structure-manipulation (was missing before).
>>>>>>> - Also, I only activate this whole process when an actual listener gets registered on EI_GC_FINISH. This seems to happen right when attaching a jdb, not sure why jdb does that though. This may be something to improve in the future?
>>>>>>>
>>>>>>> In my tests, the performance of class-tracking itself looks really good. The bottleneck now is clearly the actual synthesizing of the class-unload events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>
>>>>>>> Updated webrev:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>
>>>>>>> Please let me know what you think of it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>> Hi Chris,
>>>>>>>>>> I'll have a look at this, although it might not be for a few days.
>>>>>>>>>> In the meantime, maybe you can describe your new implementation in classTrack.c so it's easier to look through the changes.
>>>>>>>>> Sure.
>>>>>>>>>
>>>>>>>>> The purpose of this class-tracking is to be able to determine the signatures of unloaded classes when GC/class-unloading happened, so that we can generate the appropriate JDWP event.
>>>>>>>>>
>>>>>>>>> The current implementation does so by maintaining a table of currently prepared classes by building that table when classTrack is initialized, and then adding new classes whenever a class gets loaded. When unloading occurs, that cache is rebuilt into a new table, and compared with the old table, and whatever is in the old, but not in the new table gets returned. The problem is that when GCs happen frequently and/or many classes get loaded+unloaded, this amounts to O(classCount*gcCount) complexity.
>>>>>>>>>
>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also tracks unloads via the listener cbTrackingObjectFree(). Whenever an unload/GC occurs, the list of prepared classes is scanned, and classes that are also in the deletedTagBag are unlinked (thus maintaining the prepared-classes-list) and their signatures put in the list that gets returned.
>>>>>>>>>
>>>>>>>>> The implementation is not perfect. In order to determine whether or not a class is unloaded, it needs to scan the deletedTagBag. That process is therefore still O(unloadedClassCount). The assumption here is that unloadedClassCount << classCount. In my experiments this seems to be true, and also reasonable to expect.
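
[Editorial sketch, not part of the original messages.] The tag-indexed table that this thread converges on (a fixed array of singly linked KlassNode lists indexed by tag % slot-count, with a prepend when a class is prepared and an unlink when its tag is freed) could look roughly like the C below. It is a standalone illustration under stated assumptions, not the libjdwp code: plain malloc/free stand in for the agent's allocator, SLOT_COUNT and the function names track_add/track_remove are hypothetical, tags are assumed to be positive, and the locking and JVMTI calls discussed in the thread are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263 /* hypothetical table size, not taken from the webrev */

typedef struct KlassNode {
    long long klass_tag;    /* tag assigned when the class was prepared */
    char *signature;        /* owned copy of the class signature */
    struct KlassNode *next; /* next node in the same slot */
} KlassNode;

static KlassNode *table[SLOT_COUNT];

/* Map a (positive) tag to its slot. */
static KlassNode **slot_for(long long tag) {
    return &table[(size_t)((unsigned long long)tag % SLOT_COUNT)];
}

/* O(1): prepend a node for a newly prepared class. Returns 1 on success. */
int track_add(long long tag, const char *signature) {
    size_t len = strlen(signature) + 1;
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = tag;
    KlassNode **head = slot_for(tag);
    node->next = *head;
    *head = node;
    return 1;
}

/*
 * ~O(1), bounded by the slot depth: unlink the node for a freed tag and
 * hand its signature to the caller, who must free() it.
 * Returns NULL when the tag is not tracked.
 */
char *track_remove(long long tag) {
    KlassNode **klass_ptr = slot_for(tag);
    KlassNode *klass = *klass_ptr;
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) { /* klass not found - ignore */
        return NULL;
    }
    char *signature = klass->signature;
    *klass_ptr = klass->next; /* unlink from the slot's list */
    free(klass);
    return signature;
}
```

The removal loop uses the klass == NULL form of the check suggested in the review, so a lookup that runs off the end of a slot returns NULL instead of dereferencing a null node.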
>>>>>>>>>
>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it would be considerably more complex: have to maintain a (hash)table that maps tags -> KlassNode*, unlink them directly upon unload, and build the unloaded-signatures list there, but I don't currently see that it's worth the effort).
>>>>>>>>>
>>>>>>>>> In addition to all that, this process is only activated when there's an actual listener registered for EI_GC_FINISH.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>> Hello all,
>>>>>>>>>>>
>>>>>>>>>>> Issue:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>
>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids throwing away the class cache on GC, and instead keeps track of loaded/unloaded classes one-by-one.
>>>>>>>>>>>
>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent registers interest in EI_GC_FINISH.
>>>>>>>>>>>
>>>>>>>>>>> Webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>
>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>
>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>
>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>
>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From igor.ignatyev at oracle.com Wed Mar 18 17:02:04 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 18 Mar 2020 10:02:04 -0700 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com> References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com> <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com> Message-ID: +import static jdk.test.lib.Utils.TEST_CLASS_PATH; I'm not a huge fun of 'import static', yet don't insist on removing it either. + System.out.println(" libjvm : " + jvmLibDir.toString()); jvmLibDir doesn't point to libjvm, so you need either update message prefix or use the actual value which will be used as path to libjvm. I personally prefer the latter. btw, you don't need to explicitly call toString in string concatenation. -- Igor > On Mar 18, 2020, at 9:54 AM, Alexander Scherbatiy wrote: > > On 18.03.2020 19:00, Igor Ignatyev wrote: > >> Hi Alexander, >> >>> I also included TEST_NATIVE_PATH to the Utils lib. >> for the sake of clarity and ease of backporting, I'd prefer to have it added by a separate bug and commit. > > Here is the updated fix where TEST_NATIVE_PATH is not added to the Utils lib. > > http://cr.openjdk.java.net/~alexsch/8240604/webrev.02/ > > > Thanks, > > Alexander. > >>> Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev? >> IIRC correctly, webrev will just say 'a binary file got removed', in any case I'll take it as a 'yes, I'm going to remove these files as part of 8240604', so thumbs up. 
>> >> -- Igor >> >>> On Mar 18, 2020, at 4:57 AM, Alexander Scherbatiy wrote: >>> >>> Hello, >>> >>> Could you review the updated fix: >>> >>> http://cr.openjdk.java.net/~alexsch/8240604/webrev.01 >>> >>> Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are added to the CustomLauncherTest.java test. I also included TEST_NATIVE_PATH to the Utils lib. >>> >>> I have not found a history about CustomLauncherTest.sh script in launcher.c so I just updated the comment as "A minature launcher for use by CustomLauncherTest.java test" in the exelauncher.c file. >>> >>> >>> The comment that I had about removing the linux-* and solaris-* binary files I wrote because it is not clear for what is the right way to include removed binary files into webrev. >>> >>> Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev? >>> >>> >>> Thanks, >>> >>> Alexander. >>> >>> On 17.03.2020 20:11, Igor Ignatyev wrote: >>>> Hi Alexander, >>>> >>>> overall looks good to me, I have a few comments though: >>>> - you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH >>>> - CustomLauncherTest::findLibjvm can be simplified by use Platform::jvmLibDir >>>> - exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh, could you please update the comment? >>>> - you have to add /native flag to @run action, otherwise jtreg won't exclude this test from runs w/ test.nativepath being unset >>>> >>>> I also have a question regarding your statement that >>>>>> The changes for obsolete binary files <...> are not included into the webrev. They needs to be removed manually. >>>> you are planning to remove these files as part of this patch, right? >>>> >>>> Thanks, >>>> -- Igor >>>> >>>> >>>>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs wrote: >>>>> >>>>> Hi Alexander, >>>>> >>>>> Fixes to JMX & management agent are reviewed on the >>>>> seviceability-dev (added in to:) these days. 
>>>>>
>>>>> best regards,
>>>>>
>>>>> -- daniel
>>>>>
>>>>> On 05/03/2020 13:17, Alexander Scherbatiy wrote:
>>>>>> Hello,
>>>>>> Could you review a small enhancement where the test CustomLauncherTest is updated to build binary launcher file from launcher.c file.
>>>>>> The file launcher.c is renamed to exelauncher.c to follow the name convention for executable test files building by jdk make system.
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604
>>>>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00
>>>>>> The changes for obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* are not included into the webrev. They needs to be removed manually.
>>>>>> The test is passed on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and Solaris x64 11.4 systems.
>>>>>> The test is excluded from Windows and Mac Os X systems.
>>>>>> Thanks,
>>>>>> Alexander.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.plummer at oracle.com Wed Mar 18 17:16:33 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 10:16:33 -0700
Subject: RFR(XS) 8240906: Update ZGC ProblemList for serviceability/sa/TestJmapCoreMetaspace.java
In-Reply-To:
References: <2545b3cf-3136-c990-dcfc-b6b6b7cc7d9f@oracle.com>
Message-ID: <528e337b-9b7b-ffae-d28f-8baf03f6e3cd@oracle.com>

Thanks!

On 3/18/20 2:00 AM, Stefan Karlsson wrote:
> Looks good.
>
> StefanK
>
> On 2020-03-18 05:52, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8240906
>>
>> diff --git a/test/hotspot/jtreg/ProblemList-zgc.txt b/test/hotspot/jtreg/ProblemList-zgc.txt
>> --- a/test/hotspot/jtreg/ProblemList-zgc.txt
>> +++ b/test/hotspot/jtreg/ProblemList-zgc.txt
>> @@ -47,5 +47,5 @@
>>  serviceability/sa/TestJhsdbJstackLock.java 8220624 generic-all
>>  serviceability/sa/TestJhsdbJstackMixed.java 8220624 generic-all
>>  serviceability/sa/TestJmapCore.java 8220624 generic-all
>> -serviceability/sa/TestJmapCoreMetaspace.java 8219443 generic-all
>> +serviceability/sa/TestJmapCoreMetaspace.java 8220624 generic-all
>>  serviceability/sa/sadebugd/DebugdConnectTest.java 8220624 generic-all
>>
>> 8219443 [1] was closed as a dup of 8219405 [2], which is a very intermittent bug that occurs even without ZGC, so should not be used to problem list this test for ZGC. However it should be ZGC problem listed due to 8220624 [3] just like TestJmapCore.java is.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8219443
>> [2] https://bugs.openjdk.java.net/browse/JDK-8219405
>> [3] https://bugs.openjdk.java.net/browse/JDK-8220624
>>
>> thanks,
>>
>> Chris
>>
>

From daniil.x.titov at oracle.com Wed Mar 18 17:20:39 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 18 Mar 2020 10:20:39 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
In-Reply-To: <8f18db0e-e988-ba8f-2f55-07584cb4b7e0@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> <8f18db0e-e988-ba8f-2f55-07584cb4b7e0@oracle.com>
Message-ID:

Hi Alex,

Thank you for reviewing this change.

Best regards,
Daniil

On 3/17/20, 11:58 AM, "Alex Menkov" wrote:

LGTM

--alex

On 03/17/2020 11:40, Daniil Titov wrote:
> Hi Alex,
>
> Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case.
>
> Testing: Mach5 tests for sun/tools/jstatd/ successfully passed 100 times. Tier1-tier3 tests successfully passed.
> > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02 > [2] https://bugs.openjdk.java.net/browse/JDK-8240711 > > Thanks, > Daniil > > > > ?On 3/16/20, 5:38 PM, "Daniil Titov" wrote: > > Hi Alex, > > Yes, I did test the change by modifying the test to use the RMI port that is already in use > ( the stack trace in the original email was exact from this changed test) and then ensured that with the fix > the such issue is properly handled. > > I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case. > > Thanks! > > Best regards, > Daniil > > > > > ?On 3/16/20, 4:47 PM, "Alex Menkov" wrote: > > I don't agree. > The code handles exact the same "port in use" case for the same tool. > So it either works or doesn't. > And have 2 code blocks which suppose to do the same makes the code messy. > BTW did you tested the change (I mean craft the test to get "port in > use" error)? > > --alex > > On 03/16/2020 16:17, Daniil Titov wrote: > > Resending with the corrected subject ... > > > > Hi Alex, > > > > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use" > > case but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work. > > > > Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports > > might be subject to the "port in use" error and taking into account that it's hard to reproduce such case > > I found it safer to leave the original code and just augment it with what was missing for this specific > > case rather than completely replacing it. > > > > Best regards, > > Daniil > > > > ?On 3/16/20, 4:02 PM, "Alex Menkov" wrote: > > > > Hi Daniil, > > > > Looks like the test is supposed to handle "port in use" issue (see lines > > 103-114). > > I suppose in case "port in use" jstatd exits, but > > ProcessTools.startProcess() continue to wait for "jstatd started" message. 
> > > > --alex > > > > On 03/16/2020 12:00, Daniil Titov wrote: > > > Please review the change [1] that fixes the intermittent failure of the test. > > > > > > The problem here is that if the RMI port is in use than the test keep waiting for "jstatd started (bound to " to appear in the process output and in this case > > > It doesn't happen. > > > > > > at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232) > > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205) > > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133) > > > at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254) > > > at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153) > > > at jdk.test.lib.thread.XRun.run(XRun.java:40) > > > at java.lang.Thread.run(java.base at 15-internal/Thread.java:832) > > > at jdk.test.lib.thread.TestThread.run(TestThread.java:123) > > > > > > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress. 
> > > > > > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/ > > > [2] https://bugs.openjdk.java.net/browse/JDK-8240711 > > > > > > > > > Thank you, > > > Daniil > > > > > > > > > > > > > > > > > > > From alexander.scherbatiy at bell-sw.com Wed Mar 18 17:35:33 2020 From: alexander.scherbatiy at bell-sw.com (Alexander Scherbatiy) Date: Wed, 18 Mar 2020 20:35:33 +0300 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com> <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com> Message-ID: <1c1cea2c-182f-2f9f-bdd3-6d9776723956@bell-sw.com> On 18.03.2020 20:02, Igor Ignatyev wrote: > +import static jdk.test.lib.Utils.TEST_CLASS_PATH; > I'm not a huge fun of 'import static', yet don't insist on removing it > either. > > + System.out.println(" libjvm : " + jvmLibDir.toString()); > jvmLibDir doesn't point to libjvm, so you need either update message > prefix or use the actual value which will be used as path to libjvm. I > personally prefer the latter. > btw, you don't need to explicitly call toString in string concatenation. > ? Here is the updated fix where the static import is removed and libjvm path is used: ? http://cr.openjdk.java.net/~alexsch/8240604/webrev.03/ ? Thanks, ? Alexander. > -- Igor > >> On Mar 18, 2020, at 9:54 AM, Alexander Scherbatiy >> > > wrote: >> >> On 18.03.2020 19:00, Igor Ignatyev wrote: >> >>> Hi Alexander, >>> >>>> I also included TEST_NATIVE_PATH to the Utils lib. >>> for the sake of clarity and ease of backporting, I'd prefer to have >>> it added by a separate bug and commit. >> >> Here is the updated fix where TEST_NATIVE_PATH is not added to the >> Utils lib. >> >> http://cr.openjdk.java.net/~alexsch/8240604/webrev.02/ >> >> >> Thanks, >> >> Alexander. 
>> >>>> Could I just use "hg remove binary-fie" and run webrev to add the >>>> removed binary files into webrev? >>> IIRC correctly, webrev will just say 'a binary file got removed', in >>> any case I'll take it as a 'yes, I'm going to remove these files as >>> part of 8240604', so thumbs up. >>> >>> -- Igor >>> >>>> On Mar 18, 2020, at 4:57 AM, Alexander Scherbatiy >>>> >>> > wrote: >>>> >>>> Hello, >>>> >>>> Could you review the updated fix: >>>> >>>> http://cr.openjdk.java.net/~alexsch/8240604/webrev.01 >>>> >>>> Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are >>>> added to the CustomLauncherTest.java test. I also included >>>> TEST_NATIVE_PATH to the Utils lib. >>>> >>>> I have not found a history about CustomLauncherTest.sh script in >>>> launcher.c so I just updated the comment as "A minature launcher >>>> for use by CustomLauncherTest.java test" in the exelauncher.c file. >>>> >>>> >>>> The comment that I had about removing the linux-* and solaris-* >>>> binary files I wrote because it is not clear for what is the right >>>> way to include removed binary files into webrev. >>>> >>>> Could I just use "hg remove binary-fie" and run webrev to add the >>>> removed binary files into webrev? >>>> >>>> >>>> Thanks, >>>> >>>> Alexander. >>>> >>>> On 17.03.2020 20:11, Igor Ignatyev wrote: >>>>> Hi Alexander, >>>>> >>>>> overall looks good to me, I have a few comments though: >>>>> ?- you can use Utils.TEST_CLASSPATH instead of >>>>> CustomLauncherTest.TEST_CLASSPATH >>>>> - CustomLauncherTest::findLibjvm can be simplified by use >>>>> Platform::jvmLibDir >>>>> - exelauncher.c has a comment which refers to the test as >>>>> CustomLauncherTest.sh, could you please update the comment? 
>>>>> - you have to add /native flag to @run action, otherwise jtreg >>>>> won't exclude this test from runs w/ test.nativepath being unset >>>>> >>>>> I also have a question regarding your statement that >>>>>>> The changes for obsolete binary files <...> are not included >>>>>>> into the webrev. They needs to be removed manually. >>>>> you are planning to remove these files as part of this patch, right? >>>>> >>>>> Thanks, >>>>> -- Igor >>>>> >>>>> >>>>>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs >>>>> > wrote: >>>>>> >>>>>> Hi Alexander, >>>>>> >>>>>> Fixes to JMX & management agent are reviewed on the >>>>>> seviceability-dev (added in to:) these days. >>>>>> >>>>>> best regards, >>>>>> >>>>>> -- daniel >>>>>> >>>>>> On 05/03/2020 13:17, Alexander Scherbatiy wrote: >>>>>>> Hello, >>>>>>> Could you review a small enhancement where the test >>>>>>> CustomLauncherTest is updated to build binary launcher file from >>>>>>> launcher.c file. >>>>>>> The file launcher.c is renamed to exelauncher.c to follow the >>>>>>> name convention for executable test files building by jdk make >>>>>>> system. >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 >>>>>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 >>>>>>> The changes for obsolete binary files from >>>>>>> sun/management/jmxremote/bootstrap/linux-* and solaris-* are not >>>>>>> included into the webrev. They needs to be removed manually. >>>>>>> The test is passed on Ubuntu 18.04 x86-64, Solaris Sparc v9 >>>>>>> 11.2, and Solaris x64 11.4 systems. >>>>>>> The test is excluded from Windows and Mac Os X systems. >>>>>>> Thanks, >>>>>>> Alexander. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From chris.plummer at oracle.com Wed Mar 18 17:51:50 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 10:51:50 -0700
Subject: RFR(XS) 8227340: Modify problem list entry for javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java
In-Reply-To: <0eb4d0c3-60fb-be22-238a-fd8f4b10ff9e@oracle.com>
References: <0c2e3ed0-e35c-c555-b8ed-ba64eb08f6ae@oracle.com> <0eb4d0c3-60fb-be22-238a-fd8f4b10ff9e@oracle.com>
Message-ID:

Thanks!

Chris

On 3/17/20 10:16 PM, David Holmes wrote:
> Hi Chris,
>
> On 18/03/2020 2:59 pm, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8227340
>>
>> diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt
>> --- a/test/jdk/ProblemList.txt
>> +++ b/test/jdk/ProblemList.txt
>> @@ -587,7 +587,7 @@
>>  java/lang/management/ThreadMXBean/AllThreadIds.java 8131745 generic-all
>>
>>  javax/management/monitor/DerivedGaugeMonitorTest.java 8042211 generic-all
>> -javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8042215 generic-all
>> +javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8227337 generic-all
>>
>> 8042215 [1] used to be the correct CR to problem list this test under, but it was accidentally used for the fix of a different bug. 8042215 [1] has now been cloned to 8227337 [2] so the problem list needs to be updated also.
>
> Okay. The bugs themselves are in a bit of a muddle but this issue is okay.
> > Thanks, > David > >> [1] https://bugs.openjdk.java.net/browse/JDK-8042215 >> [2] https://bugs.openjdk.java.net/browse/JDK-8227337 >> >> thanks, >> >> Chris >> >> From igor.ignatyev at oracle.com Wed Mar 18 18:00:32 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 18 Mar 2020 11:00:32 -0700 Subject: jmx-dev RFR 8240604: Rewrite sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make binaries from source file In-Reply-To: <1c1cea2c-182f-2f9f-bdd3-6d9776723956@bell-sw.com> References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com> <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com> <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com> <1c1cea2c-182f-2f9f-bdd3-6d9776723956@bell-sw.com> Message-ID: <40924AA7-C53C-4F56-8CE4-25672C58540C@oracle.com> thanks! LGTM. -- Igor > On Mar 18, 2020, at 10:35 AM, Alexander Scherbatiy wrote: > > On 18.03.2020 20:02, Igor Ignatyev wrote: > >> +import static jdk.test.lib.Utils.TEST_CLASS_PATH; >> I'm not a huge fun of 'import static', yet don't insist on removing it either. >> >> + System.out.println(" libjvm : " + jvmLibDir.toString()); >> jvmLibDir doesn't point to libjvm, so you need either update message prefix or use the actual value which will be used as path to libjvm. I personally prefer the latter. >> btw, you don't need to explicitly call toString in string concatenation. >> > Here is the updated fix where the static import is removed and libjvm path is used: > > http://cr.openjdk.java.net/~alexsch/8240604/webrev.03/ > > Thanks, > > Alexander. > > > >> -- Igor >> >>> On Mar 18, 2020, at 9:54 AM, Alexander Scherbatiy > wrote: >>> >>> On 18.03.2020 19:00, Igor Ignatyev wrote: >>> >>>> Hi Alexander, >>>> >>>>> I also included TEST_NATIVE_PATH to the Utils lib. >>>> for the sake of clarity and ease of backporting, I'd prefer to have it added by a separate bug and commit. >>> >>> Here is the updated fix where TEST_NATIVE_PATH is not added to the Utils lib. 
>>> >>> http://cr.openjdk.java.net/~alexsch/8240604/webrev.02/ >>> >>> >>> Thanks, >>> >>> Alexander. >>> >>>>> Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev? >>>> IIRC correctly, webrev will just say 'a binary file got removed', in any case I'll take it as a 'yes, I'm going to remove these files as part of 8240604', so thumbs up. >>>> >>>> -- Igor >>>> >>>>> On Mar 18, 2020, at 4:57 AM, Alexander Scherbatiy > wrote: >>>>> >>>>> Hello, >>>>> >>>>> Could you review the updated fix: >>>>> >>>>> http://cr.openjdk.java.net/~alexsch/8240604/webrev.01 >>>>> >>>>> Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are added to the CustomLauncherTest.java test. I also included TEST_NATIVE_PATH to the Utils lib. >>>>> >>>>> I have not found a history about CustomLauncherTest.sh script in launcher.c so I just updated the comment as "A minature launcher for use by CustomLauncherTest.java test" in the exelauncher.c file. >>>>> >>>>> >>>>> The comment that I had about removing the linux-* and solaris-* binary files I wrote because it is not clear for what is the right way to include removed binary files into webrev. >>>>> >>>>> Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev? >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Alexander. >>>>> >>>>> On 17.03.2020 20:11, Igor Ignatyev wrote: >>>>>> Hi Alexander, >>>>>> >>>>>> overall looks good to me, I have a few comments though: >>>>>> - you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH >>>>>> - CustomLauncherTest::findLibjvm can be simplified by use Platform::jvmLibDir >>>>>> - exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh, could you please update the comment? 
>>>>>> - you have to add /native flag to @run action, otherwise jtreg won't exclude this test from runs w/ test.nativepath being unset >>>>>> >>>>>> I also have a question regarding your statement that >>>>>>>> The changes for obsolete binary files <...> are not included into the webrev. They needs to be removed manually. >>>>>> you are planning to remove these files as part of this patch, right? >>>>>> >>>>>> Thanks, >>>>>> -- Igor >>>>>> >>>>>> >>>>>>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs > wrote: >>>>>>> >>>>>>> Hi Alexander, >>>>>>> >>>>>>> Fixes to JMX & management agent are reviewed on the >>>>>>> seviceability-dev (added in to:) these days. >>>>>>> >>>>>>> best regards, >>>>>>> >>>>>>> -- daniel >>>>>>> >>>>>>> On 05/03/2020 13:17, Alexander Scherbatiy wrote: >>>>>>>> Hello, >>>>>>>> Could you review a small enhancement where the test CustomLauncherTest is updated to build binary launcher file from launcher.c file. >>>>>>>> The file launcher.c is renamed to exelauncher.c to follow the name convention for executable test files building by jdk make system. >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 >>>>>>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 >>>>>>>> The changes for obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* are not included into the webrev. They needs to be removed manually. >>>>>>>> The test is passed on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and Solaris x64 11.4 systems. >>>>>>>> The test is excluded from Windows and Mac Os X systems. >>>>>>>> Thanks, >>>>>>>> Alexander. >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From leonid.mesnik at oracle.com Wed Mar 18 19:37:17 2020 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Wed, 18 Mar 2020 12:37:17 -0700 Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible Message-ID: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> Hi Could you please review following fix which slightly refactor vmTestbase stress test harness. This refactoring helps to add virtual threads testing support. The Wicket uses plain sync/wait/notify mechanism which cause carrier thread starvation and should not be used in virtual threads. The ManagedThread is a subclass of Thread so it couldn't be virtual thread. Following fix changes Wicket to use locks/conditions to don't pin vthread to carrier thread while starting testing. ManagedThread is fixed to keep execution thread as the thread variable and isolate it's creation. Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.) Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests. 
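A minimal sketch of the two techniques described above, assuming a much simplified wicket. The class names SimpleWicket and FinishedCounter and all method names are hypothetical and are not taken from the webrev; the real nsk.share.Wicket has a larger API. The first class waits on a j.u.c.l.Condition instead of Object.wait(), which is the change this message says avoids pinning a virtual thread to its carrier. The second polls an AtomicInteger with sleep, so the waiting thread never has to call Condition.await(), the call in which this message reports OOME under stress GC tests.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified stand-in for a wicket; not the code under review.
class SimpleWicket {
    private final Lock lock = new ReentrantLock();
    private final Condition opened = lock.newCondition();
    private int count; // guarded by lock, so a plain int is enough

    SimpleWicket(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be positive");
        }
        this.count = count;
    }

    // Waits up to timeoutMillis for the wicket to open.
    // Returns true if it is open, false on timeout or interrupt.
    boolean waitFor(long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (count > 0) {
                if (nanos <= 0L) {
                    return false;
                }
                nanos = opened.awaitNanos(nanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    // Removes one lock; the last call wakes every waiter.
    void unlock() {
        lock.lock();
        try {
            if (count > 0 && --count == 0) {
                opened.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }
}

// Hypothetical analogue of the "finished" counter: the waiter polls an
// AtomicInteger and sleeps, so it never enters Condition.await().
class FinishedCounter {
    private final AtomicInteger remaining;

    FinishedCounter(int threads) {
        remaining = new AtomicInteger(threads);
    }

    void threadFinished() {
        remaining.decrementAndGet();
    }

    // Returns true once every thread has reported, false on timeout or interrupt.
    boolean awaitAll(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (remaining.get() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The trade-off in the second class is latency for robustness: the waiter notices completion only on its next poll, but the wait path itself stays very simple.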
webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8241123 Leonid From igor.ignatyev at oracle.com Wed Mar 18 19:48:50 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 18 Mar 2020 12:48:50 -0700 Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible In-Reply-To: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> Message-ID: Hi Leonid, I've started looking at your webrev, and so far have a couple questions: > Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.) can't you use just a volatile boolean field? > Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests. won't j.u.c.CountDownLatch be more appropriate and cleaner solution here? I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them. -- Igor > On Mar 18, 2020, at 12:37 PM, Leonid Mesnik wrote: > > Hi > > Could you please review following fix which slightly refactor vmTestbase stress test harness. This refactoring helps to add virtual threads testing support. > > The Wicket uses plain sync/wait/notify mechanism which cause carrier thread starvation and should not be used in virtual threads. The ManagedThread is a subclass of Thread so it couldn't be virtual thread. > > > Following fix changes Wicket to use locks/conditions to don't pin vthread to carrier thread while starting testing. > > ManagedThread is fixed to keep execution thread as the thread variable and isolate it's creation. > > Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. 
(The lock has a reference to thread which affects test.)
>
> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8241123
>
>
> Leonid
>

From chris.plummer at oracle.com Wed Mar 18 20:05:55 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 13:05:55 -0700
Subject: RFR(XS) 8241162: ProblemList serviceability/sa/TestHeapDumpForInvokeDynamic.java on OSX
Message-ID:

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8241162

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -131,7 +131,7 @@
 serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris-all
 serviceability/sa/TestDefaultMethods.java 8193639 solaris-all
 serviceability/sa/TestG1HeapRegion.java 8193639 solaris-all
-serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris-all
+serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639,8241158 solaris-all,macosx-x64
 serviceability/sa/TestHeapDumpForLargeArray.java 8193639 solaris-all
 serviceability/sa/TestInstanceKlassSize.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
 serviceability/sa/TestInstanceKlassSizeForInterface.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64

This test was recently re-enabled for OSX (along with a large number of other SA tests), but fails when using -XX:ArchiveRelocationMode=1. See JDK-8241158 [1]. Since a fix is not readily available, we need to problemlist for now.

[1] https://bugs.openjdk.java.net/browse/JDK-8241158

thanks,

Chris

From daniel.daugherty at oracle.com Wed Mar 18 20:07:42 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Mar 2020 16:07:42 -0400
Subject: RFR(XS) 8241162: ProblemList serviceability/sa/TestHeapDumpForInvokeDynamic.java on OSX
In-Reply-To:
References:
Message-ID: <19d22a23-9ab0-f64f-1ac9-4cd3a8dd13ee@oracle.com>

Thumbs up. Also, this fix is trivial.

Dan

On 3/18/20 4:05 PM, Chris Plummer wrote:
> Hello,
>
> Please review the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8241162
>
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -131,7 +131,7 @@
>  serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris-all
>  serviceability/sa/TestDefaultMethods.java 8193639 solaris-all
>  serviceability/sa/TestG1HeapRegion.java 8193639 solaris-all
> -serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris-all
> +serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639,8241158 solaris-all,macosx-x64
>  serviceability/sa/TestHeapDumpForLargeArray.java 8193639 solaris-all
>  serviceability/sa/TestInstanceKlassSize.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
>  serviceability/sa/TestInstanceKlassSizeForInterface.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
>
> This test was recently re-enabled for OSX (along with a large number
> of other SA tests), but fails when using -XX:ArchiveRelocationMode=1.
> See JDK-8241158 [1]. Since a fix is not readily available, we need to
> problemlist for now.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8241158
>
> thanks,
>
> Chris
>

From chris.plummer at oracle.com Wed Mar 18 20:23:27 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 13:23:27 -0700
Subject: RFR(XS) 8241162: ProblemList serviceability/sa/TestHeapDumpForInvokeDynamic.java on OSX
In-Reply-To: <19d22a23-9ab0-f64f-1ac9-4cd3a8dd13ee@oracle.com>
References: <19d22a23-9ab0-f64f-1ac9-4cd3a8dd13ee@oracle.com>
Message-ID:

Thanks!

On 3/18/20 1:07 PM, Daniel D. Daugherty wrote:
> Thumbs up. Also, this fix is trivial.
>
> Dan
>
>
> On 3/18/20 4:05 PM, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8241162
>>
>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt
>> +++ b/test/hotspot/jtreg/ProblemList.txt
>> @@ -131,7 +131,7 @@
>>  serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris-all
>>  serviceability/sa/TestDefaultMethods.java 8193639 solaris-all
>>  serviceability/sa/TestG1HeapRegion.java 8193639 solaris-all
>> -serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris-all
>> +serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639,8241158 solaris-all,macosx-x64
>>  serviceability/sa/TestHeapDumpForLargeArray.java 8193639 solaris-all
>>  serviceability/sa/TestInstanceKlassSize.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
>>  serviceability/sa/TestInstanceKlassSizeForInterface.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
>>
>> This test was recently re-enabled for OSX (along with a large number
>> of other SA tests), but fails when using -XX:ArchiveRelocationMode=1.
>> See JDK-8241158 [1]. Since a fix is not readily available, we need to
>> problemlist for now.
>> >> [1] https://bugs.openjdk.java.net/browse/JDK-8241158 >> >> thanks, >> >> Chris >> > From leonid.mesnik at oracle.com Wed Mar 18 20:29:22 2020 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Wed, 18 Mar 2020 13:29:22 -0700 Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible In-Reply-To: References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> Message-ID: <5344eb3a-b17a-09c1-0f1f-8c1462899fe3@oracle.com> On 3/18/20 12:48 PM, Igor Ignatyev wrote: > Hi Leonid, > > I've started looking at your webrev, and so far have a couple questions: > >> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.) > can't you use just a volatile boolean field? I can, but I don't see any benefits to use volatile fields instead of atomics. I prefer to use Atomic* anywhere because of it's clearer semantics. Using of explicit get/set and other similar accessors. > >> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests. > won't j.u.c.CountDownLatch be more appropriate and cleaner solution here? Unfortunately no. The CountDownLatch would be a nice solution but it is possible to get OOME in gc/lock (might be other) tests. I replaced Wicked by the same reason. Updating the AtomicInteger doesn't allocate any memory and don't cause OOME. Leonid > > I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them. > > -- Igor > >> On Mar 18, 2020, at 12:37 PM, Leonid Mesnik wrote: >> >> Hi >> >> Could you please review following fix which slightly refactor vmTestbase stress test harness. This refactoring helps to add virtual threads testing support. 
>>
>> The Wicket uses plain sync/wait/notify mechanism which cause carrier thread starvation and should not be used in virtual threads. The ManagedThread is a subclass of Thread so it couldn't be virtual thread.
>>
>>
>> Following fix changes Wicket to use locks/conditions to don't pin vthread to carrier thread while starting testing.
>>
>> ManagedThread is fixed to keep execution thread as the thread variable and isolate it's creation.
>>
>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
>>
>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8241123
>>
>>
>> Leonid
>>

From daniel.daugherty at oracle.com Wed Mar 18 20:30:43 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Mar 2020 16:30:43 -0400
Subject: RFR 8240902: JDI shared memory connector can use already closed Handles
In-Reply-To:
References:
Message-ID:

On 3/17/20 4:14 PM, Patricio Chilano wrote:
> Hi all,
>
> Please review the following patch:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/

src/jdk.jdi/share/native/libdt_shmem/shmemBase.c
    L411:     int attempts = 10;
    L420:         sysSleep(200);
        I presume that this is a 200 millisecond sleep so this new loop
        will delay a closeStream() call by at most 2 seconds. You may
        want those literals to be #define'ed values at the top of the
        file, e.g., like this one:

        #define MAX_GENERATION_RETRIES 20

        Your choice on the names of the new #defines if you choose to
        do that. You might even consider putting them close to
        "typedef struct SharedMemoryConnection".

        Update: Oh yuck! Now I see that there is existing code that
        does the same kind of looping with sysSleep() calls when the
        linger option is set. I revise my comment: You're following
        the existing style in the function so go with what you have.

    Don't forget to update the copyright year before you push.

    L379: closeStream(Stream *stream, jboolean linger, unsigned int *refcount )
        nit - please delete space before ')'.

    L412:     MemoryBarrier();     /* Prevent load of refcount to float above. */
        typo: s/to float/from floating/

    L413:     while (attempts>0) {
        nit - please add spaces around '>'.

    L415-418, L537, L541, L552:
        nit - indent should be four spaces instead of two spaces.

        The existing L546 and L549 should be indented four spaces instead
        of two spaces. Please fix since you are there.

I'm good with the code changes. I only have nits above so I don't need to see another webrev.

Dan

> Calling closeConnection() on an already created/opened connection
> includes calls to CloseHandle() on objects that can still be used by
> other threads. This can lead to either undefined behavior or, as
> detailed in the bug comments, changes of state of unrelated objects.
> This issue was found while debugging the reason behind some jshell
> test failures seen after pushing 8230594. Not as important, but there
> are also calls to closeStream() from createStream()/openStream() when
> failing to create/open a stream that will return after executing
> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended
> resources. Then, calling closeConnection() could assert if the reason
> of the previous failure was that the stream's mutex failed to be
> created/opened. These patch aims to address these issues too.
> > Tested in mach5 with the current baseline, tiers1-3 and several runs > of open/test/langtools/:tier1 which includes the jshell tests where > this connector is used. I also applied patch > http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev > mentioned in the comments of the bug, on top of the baseline and run > the langtool tests with and without this fix. Without the fix running > around 30 repetitions already shows failures in tests > jdk/jshell/FailOverExecutionControlTest.java and > jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the > fix I run several hundred runs and saw no failures. Let me know if > there is any additional testing I should do. > > As a side note, I see there are a couple of open issues related with > jshell failures (8209848) which could be related to this bug and > therefore might be fixed by this patch. > > Thanks, > Patricio > From patricio.chilano.mateo at oracle.com Wed Mar 18 20:44:51 2020 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Wed, 18 Mar 2020 17:44:51 -0300 Subject: RFR 8240902: JDI shared memory connector can use already closed Handles In-Reply-To: References: Message-ID: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com> Hi David, On 3/18/20 4:27 AM, David Holmes wrote: > Hi Patricio, > > On 18/03/2020 6:14 am, Patricio Chilano wrote: >> Hi all, >> >> Please review the following patch: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902 >> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/ >> >> Calling closeConnection() on an already created/opened connection >> includes calls to CloseHandle() on objects that can still be used by >> other threads. This can lead to either undefined behavior or, as >> detailed in the bug comments, changes of state of unrelated objects. > > This was a really great find! Thanks!? : ) >> This issue was found while debugging the reason behind some jshell >> test failures seen after pushing 8230594. 
Not as important, but there
>> are also calls to closeStream() from createStream()/openStream() when
>> failing to create/open a stream that will return after executing
>> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended
>> resources. Then, calling closeConnection() could assert if the reason
>> of the previous failure was that the stream's mutex failed to be
>> created/opened. These patch aims to address these issues too.
>
> Patch looks good in general. The internal reference count guards
> deletion of the internal resources, and is itself safe because never
> actually delete the connection. Thanks for adding the comment about
> this aspect.
>
> A few items:
>
> Please update copyright year before pushing.
Done.

> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way as
> STREAM_INVARIANT.
Done.

>  170 unsigned int refcount;
>  171     jint state;
>
> I'm unclear about the use of stream->state and connection->state as
> guards - unless accessed under a mutex these would seem to at least
> need acquire/release semantics.
>
> Additionally the reads of refcount would also seem to need to some
> form of memory synchronization - though the Windows docs for the
> Interlocked* API does not show how to simply read such a variable!
> Though I note that the RtlFirstEntrySList method for the "Interlocked
> Singly Linked Lists" API does state "Access to the list is
> synchronized on a multiprocessor system." which suggests a read of
> such a variable does require some form of memory synchronization!

In the case of the stream struct, the state field is protected by the mutex field. It is set to STATE_CLOSED while holding the mutex, and threads that read it must acquire the mutex first through sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't acquire the mutex, we will return something different than SYS_OK and the call will exit anyways. All this behaves as before, I didn't change it.
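The refcount and state handshake for the connection, which the following paragraph of this message walks through, can be sketched as a Java analogue. This is only an illustration under stated assumptions: the real code is C in shmemBase.c and uses InterlockedIncrement() and MemoryBarrier(); the class GuardedConnection and its methods are hypothetical, and the decrement on a failed enter() is an assumption about what ENTER_CONNECTION does rather than something stated in this thread. A volatile write followed by an atomic read stands in for "set state, full barrier, read refcount".

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java analogue of the refcount/state protocol; not the patch itself.
class GuardedConnection {
    private final AtomicInteger refcount = new AtomicInteger();
    private volatile boolean closed;   // stands in for state == STATE_CLOSED
    private boolean resourcesReleased; // only written by the closing thread

    // Analogue of ENTER_CONNECTION: increment the refcount first,
    // then read the state. Returns false if the connection is closed.
    boolean enter() {
        refcount.incrementAndGet();
        if (closed) {
            // Assumption: a failed enter gives its reference back.
            refcount.decrementAndGet();
            return false;
        }
        return true;
    }

    // Analogue of LEAVE_CONNECTION.
    void leave() {
        refcount.decrementAndGet();
    }

    // Closer: set the state first, then read the refcount. Resources are
    // released only if a zero refcount is observed within the given attempts.
    boolean close(int attempts, long sleepMillis) {
        closed = true; // volatile write, ordered before the reads below
        for (int i = 0; i < attempts; i++) {
            if (refcount.get() == 0) {
                resourcesReleased = true; // stands in for closing the handles
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false; // still in use; nothing was released
    }

    boolean isReleased() {
        return resourcesReleased;
    }
}
```

The ordering is what makes this safe: a user that increments and then sees "not closed" is guaranteed that any closer will subsequently read a refcount of at least 1, and a closer that reads 0 after setting "closed" is guaranteed that any later user will see the closed state. The false return from close() corresponds to the case raised in review where the loop never observes a zero refcount.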
The refcount and state that I added to the SharedMemoryConnection struct work together. For a thread closing the connection, setting the connection state to STATE_CLOSED has to happen before reading the refcount (more on the atomicity of that read later). That's why I added the MemoryBarrier() call; which I see it's better if I just move it to after setting the connection state to closed. For the threads accessing the connection, incrementing the refcount has to happen before reading the connection state. That's already provided by the InterlockedIncrement() which uses a full memory barrier. In this way if the thread closing the connection reads a refcount of 0, then we know it's safe to release the resources, since other threads accessing the connection will see that the state is closed after incrementing the refcount. If the read of refcount is not 0, then it could be that a thread is accessing the connection or not (it could have read a state connection of STATE_CLOSED after incrementing the refcount), we don't know, so we can't release anything. Similarly if the thread accessing the connection reads that the state is not closed, then we know it's safe to access the stream since anybody closing the connection will still have to read refcount which will be at least 1. As for the atomicity of the read of refcount, from https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, it states that "simple reads and writes to properly-aligned 32-bit variables are atomic operations". Maybe I should declare refcount explicitly as DWORD32? Instead of having a refcount we could have done something similar to the stream struct and protect access to the connection through a mutex. To avoid serializing all threads we could have used SRW locks and only the one closing the connection would do AcquireSRWLockExclusive(). It would change the state of the connection to STATE_CLOSED, close all handles, and then release the mutex. 
ENTER_CONNECTION() and LEAVE_CONNECTION() would acquire and release the mutex in shared mode. But other than maybe being easier to read, I don't think the change would be smaller.

>  413 while (attempts>0) {
>
> spaces around >
Done.

> If the loop at 413 never encounters a zero reference_count then it
> doesn't close the events or the mutex but still returns SYS_OK. That
> seems wrong but I'm not sure what the right behaviour is here.
I can change the return value to be SYS_ERR, but I don't think there is much we can do about it unless we want to wait forever until we can release those resources.

> And please wait for serviceability folk to review this.
Sounds good.

Thanks for looking at this David! I will move the MemoryBarrier() and change the refcount to be DWORD32 if you are okay with that.

Thanks,
Patricio

> Thanks,
> David
> -----
>
>> Tested in mach5 with the current baseline, tiers1-3 and several runs
>> of open/test/langtools/:tier1 which includes the jshell tests where
>> this connector is used. I also applied patch
>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev
>> mentioned in the comments of the bug, on top of the baseline and run
>> the langtool tests with and without this fix. Without the fix running
>> around 30 repetitions already shows failures in tests
>> jdk/jshell/FailOverExecutionControlTest.java and
>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the
>> fix I run several hundred runs and saw no failures. Let me know if
>> there is any additional testing I should do.
>>
>> As a side note, I see there are a couple of open issues related with
>> jshell failures (8209848) which could be related to this bug and
>> therefore might be fixed by this patch.
>>
>> Thanks,
>> Patricio
>>

From serguei.spitsyn at oracle.com Wed Mar 18 21:03:53 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Mar 2020 14:03:53 -0700
Subject: RFR 8240902: JDI shared memory connector can use already closed Handles
In-Reply-To:
References:
Message-ID: <161fe95b-4d48-fd61-15ec-7daecb193868@oracle.com>

An HTML attachment was scrubbed...
URL:

From igor.ignatyev at oracle.com Wed Mar 18 21:15:14 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 18 Mar 2020 14:15:14 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible
In-Reply-To: <5344eb3a-b17a-09c1-0f1f-8c1462899fe3@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> <5344eb3a-b17a-09c1-0f1f-8c1462899fe3@oracle.com>
Message-ID: <50110939-1AEF-40B5-969C-C5313633B1F9@oracle.com>

> On Mar 18, 2020, at 1:29 PM, Leonid Mesnik wrote:
>
>
> On 3/18/20 12:48 PM, Igor Ignatyev wrote:
>> Hi Leonid,
>>
>> I've started looking at your webrev, and so far have a couple questions:
>>
>>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
>> can't you use just a volatile boolean field?
> I can, but I don't see any benefits to use volatile fields instead of atomics. I prefer to use Atomic* anywhere because of it's clearer semantics. Using of explicit get/set and other similar accessors.
you aren't using any accessors other than plain get/set, which are semantically equal to setting/getting a volatile field, so I'm not sure how it's clearer. as for the benefits of a volatile field, the code is shorter (and arguably cleaner) and you save some heap space.
anyhow, I don't insist on usage of volatile boolean over AtomicBoolean, >> >>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests. >> won't j.u.c.CountDownLatch be more appropriate and cleaner solution here? > > Unfortunately no. The CountDownLatch would be a nice solution but it is possible to get OOME in gc/lock (might be other) tests. I replaced Wicked by the same reason. Updating the AtomicInteger doesn't allocate any memory and don't cause OOME. I see. > > Leonid > >> >> I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them. >> >> -- Igor >> >>> On Mar 18, 2020, at 12:37 PM, Leonid Mesnik wrote: >>> >>> Hi >>> >>> Could you please review following fix which slightly refactor vmTestbase stress test harness. This refactoring helps to add virtual threads testing support. >>> >>> The Wicket uses plain sync/wait/notify mechanism which cause carrier thread starvation and should not be used in virtual threads. The ManagedThread is a subclass of Thread so it couldn't be virtual thread. >>> >>> >>> Following fix changes Wicket to use locks/conditions to don't pin vthread to carrier thread while starting testing. >>> >>> ManagedThread is fixed to keep execution thread as the thread variable and isolate it's creation. >>> >>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.) >>> >>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests. 
>>> >>> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/ >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8241123 >>> >>> >>> Leonid >>> From igor.ignatyev at oracle.com Wed Mar 18 21:30:41 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 18 Mar 2020 14:30:41 -0700 Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible In-Reply-To: References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> Message-ID: <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com> > I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them. ok, now when I believe that I have enough understanding of Wicket, I have a few comments: 1. > 68 private Lock lock = new ReentrantLock(); > 69 private Condition condition = lock.newCondition(); it's better to make these fields final. 2. as all writes and reads of Wicket::count are guarded by lock.lock, there is no need for it to be atomic. 3. adding lock to getWaiters will also remove need for Wicket::waiters to be atomic. the rest looks good to me. Thanks, -- Igor > On Mar 18, 2020, at 12:48 PM, Igor Ignatyev wrote: > > Hi Leonid, > > I've started looking at your webrev, and so far have a couple questions: > >> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.) > can't you use just a volatile boolean field? > >> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests. > won't j.u.c.CountDownLatch be more appropriate and cleaner solution here? > > I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them. 
>
> -- Igor

From daniel.daugherty at oracle.com Wed Mar 18 21:37:30 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Mar 2020 17:37:30 -0400
Subject: RFR 8240902: JDI shared memory connector can use already closed Handles
In-Reply-To:
References:
Message-ID: <211b15ab-96da-7e1d-09dc-5e5f25388b09@oracle.com>

Patricio,

This is a separate follow up about the jshell tests. Since I have been
tracking these as part of my GK work and filed a number of those bugs,
I figured I would help analyze them...

JDK-8209848 test/langtools/jdk/jshell tests failed with Accept timed out
https://bugs.openjdk.java.net/browse/JDK-8209848

    This bug has been linked to sightings on Linux-X64, SPARC and Win-X64.
    While this fix should reduce the number of sightings on Win-X64, it
    won't help for Linux-X64 or SPARC.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8209848

JDK-8184445 JShell tests: fail intermittently if tests are run in high concurrent mode.
https://bugs.openjdk.java.net/browse/JDK-8184445

    The tests you mention below are also mentioned in this bug.

    This bug has been linked to sightings on Linux-X64, OSX, SPARC and
    Win-X64. While this fix should reduce the number of sightings on
    Win-X64, it won't help for Linux-X64, OSX, or SPARC.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8184445

JDK-8173079 JShell test: jdk/jshell/UserJdiUserRemoteTest.java fails intermittently
https://bugs.openjdk.java.net/browse/JDK-8173079

    I have high hopes that this bug (on Win*) will be addressed by
    this fix (8240902) because this test uses JDI...

    Most of the sightings for this bug are for Win-X64 and few SPARC.
    The bug also mentions Linux sightings, but I don't see any current
    links for Linux:

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8173079

    So this fix should help with Win* sightings, but not Linux or SPARC.

JDK-8190912 jdk/jshell/JdiHangingListenExecutionControlTest.java failed with
            timeout waiting for connection
https://bugs.openjdk.java.net/browse/JDK-8190912

    I have high hopes that this bug (on Win*) will be addressed by
    this fix (8240902) because this test uses JDI...

    Most of the sightings for this bug are for Win-X64; there is one
    Linux and one OSX.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8190912

    So this fix should help with Win* sightings, but not Linux or OSX.

JDK-8207166 langtools/jdk/jshell/JdiHangingLaunchExecutionControlTest.java
https://bugs.openjdk.java.net/browse/JDK-8207166

    I have high hopes that this bug will be addressed by this fix
    (8240902) because this test uses JDI...

    The one linked sighting with platform info is for Win-X64.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8207166

JDK-8235780 jdk/jshell/FailOverExecutionControlDyingLaunchTest.java fails during setup
https://bugs.openjdk.java.net/browse/JDK-8235780

    I have high hopes that this bug (on Win*) will be addressed by
    this fix (8240902) because this test uses JDI...

    This bug has been linked to sightings on Linux-X64 and Win-X64.
    While this fix should reduce the number of sightings on Win-X64,
    it won't help for Linux-X64.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8235780

JDK-8239930 jdk/jshell/UserJdiUserRemoteTest.java fails due to agentvm mode timeout
https://bugs.openjdk.java.net/browse/JDK-8239930

    I have high hopes that this bug will be addressed by this fix
    (8240902) because this test uses JDI...

    Both sightings linked to this bug are for Win-X64:

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8239930

JDK-8240531 jshell/FailOverExecutionControlDyingLaunchTest.java fails due to agentvm mode timeout
https://bugs.openjdk.java.net/browse/JDK-8240531

    I have high hopes that this bug will be addressed by this fix
    (8240902) because this test uses JDI...

    The three sightings linked to this bug are for Win-X64:

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8240531


Okay... that's it for the jshell bugs that I track that have been
spotted on Win-X64 (and usually other platforms too).

Dan


On 3/17/20 4:14 PM, Patricio Chilano wrote:
> Hi all, ...
>
> Please review the following patch:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>
> Calling closeConnection() on an already created/opened connection
> includes calls to CloseHandle() on objects that can still be used by
> other threads. This can lead to either undefined behavior or, as
> detailed in the bug comments, changes of state of unrelated objects.
> This issue was found while debugging the reason behind some jshell
> test failures seen after pushing 8230594. Not as important, but there
> are also calls to closeStream() from createStream()/openStream() when
> failing to create/open a stream that will return after executing
> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended
> resources. Then, calling closeConnection() could assert if the reason
> of the previous failure was that the stream's mutex failed to be
> created/opened. These patch aims to address these issues too.
>
> Tested in mach5 with the current baseline, tiers1-3 and several runs
> of open/test/langtools/:tier1 which includes the jshell tests where
> this connector is used. I also applied patch
> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev
> mentioned in the comments of the bug, on top of the baseline and run
> the langtool tests with and without this fix. Without the fix running
> around 30 repetitions already shows failures in tests
> jdk/jshell/FailOverExecutionControlTest.java and
> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the
> fix I run several hundred runs and saw no failures. Let me know if
> there is any additional testing I should do.
>
> As a side note, I see there are a couple of open issues related with
> jshell failures (8209848) which could be related to this bug and
> therefore might be fixed by this patch.
>
> Thanks,
> Patricio
>

From daniel.daugherty at oracle.com Wed Mar 18 21:40:28 2020
From: daniel.daugherty at oracle.com (Daniel D.
Daugherty)
Date: Wed, 18 Mar 2020 17:40:28 -0400
Subject: RFR 8240902: JDI shared memory connector can use already closed Handles
In-Reply-To: <211b15ab-96da-7e1d-09dc-5e5f25388b09@oracle.com>
References: <211b15ab-96da-7e1d-09dc-5e5f25388b09@oracle.com>
Message-ID: <152abc4d-b5e4-b8d0-27c1-72b51f56d0a3@oracle.com>

serviceability-dev at ... was included in this email by mistake. This was
only supposed to go to Patricio since the Mach5 links won't work outside
of Oracle.

Sigh... Sorry about the noise folks!

Dan

From patricio.chilano.mateo at oracle.com Wed Mar 18 21:48:45 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Wed, 18 Mar 2020 18:48:45 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed Handles
In-Reply-To:
References:
Message-ID: <2fe1dc87-641c-44f1-aae2-6b5c9da19227@oracle.com>

Hi Dan,

On 3/18/20 5:30 PM, Daniel D. Daugherty wrote:
> On 3/17/20 4:14 PM, Patricio Chilano wrote:
>> Hi all,
>>
>> Please review the following patch:
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>
> src/jdk.jdi/share/native/libdt_shmem/shmemBase.c
>     L411:     int attempts = 10;
>     L420:         sysSleep(200);
>         I presume that this is a 200 millisecond sleep so this new loop
>         will delay a closeStream() call by at most 2 seconds. You may
>         want those literals to be #define'ed values at the top of the
>         file, e.g., like this one:
>
>         #define MAX_GENERATION_RETRIES 20
>
>         Your choice on the names of the new #defines if you choose to
>         do that. You might even consider putting them close to
>         "typedef struct SharedMemoryConnection".
>
>         Update: Oh yuck! Now I see that there is existing code that
>         does the same kind of looping with sysSleep() calls when the
>         linger option is set. I revise my comment: You're following
>         the existing style in the function so go with what you have.

Ok, I left the loop as it is now.

> Don't forget to update the copyright year before you push.

Done.

> L379: closeStream(Stream *stream, jboolean linger, unsigned int *refcount )
>         nit - please delete space before ')'.

Done.

> L412:     MemoryBarrier();     /* Prevent load of refcount to float above. */
>         typo: s/to float/from floating/

After replying to David's review I realized that the enterMutex() call in closeStream() already provides acquire semantics, so the read of refcount cannot float above it. I removed the barrier.

> L413:     while (attempts>0) {
>         nit - please add spaces around '>'.

Done.

> L415-418, L537, L541, L552:
>         nit - indent should be four spaces instead of two spaces.

Done.

> The existing L546 and L549 should indented four spaces instead
>         of two spaces. Please fix since you there.

Done.

> I'm good with the code changes. I only have nits above so I don't need
> to see another webrev.

Thanks for reviewing this, Dan! I might send a v2 later.

Thanks,
Patricio

From patricio.chilano.mateo at oracle.com Wed Mar 18 22:02:00 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Wed, 18 Mar 2020 19:02:00 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed Handles
In-Reply-To: <161fe95b-4d48-fd61-15ec-7daecb193868@oracle.com>
References: <161fe95b-4d48-fd61-15ec-7daecb193868@oracle.com>
Message-ID:

Hi Serguei,

On 3/18/20 6:03 PM, serguei.spitsyn at oracle.com wrote:
> Hi Patricio,
>
> Good finding, thank you for taking care about this!
> The fix looks good in general.
>
> There are several spots with the wrong indent (must be 4, not 2):
>   64 #define ENTER_CONNECTION(connection) do { \
>   65   InterlockedIncrement(&connection->refcount); \
>   66   if (IS_STATE_CLOSED(connection->state)) { \
>   67     setLastErrorMsg("stream closed"); \
>   68     InterlockedDecrement(&connection->refcount); \
>   69     return SYS_ERR; \
>   70   } \
>   71 } while (0)
>   72
>   73 #define LEAVE_CONNECTION(connection) do { \
>   74   InterlockedDecrement(&connection->refcount); \
>   75 } while (0)
> I'd also suggest to move content left and use indent 4 from the side.

Done. I already aligned it the same way as STREAM_INVARIANT and I fixed the indent inside ENTER_CONNECTION().

>  414 if (*refcount == 0) {
>  415   sysEventClose(stream->hasData);
>  416   sysEventClose(stream->hasSpace);
>  417   sysIPMutexClose(stream->mutex);
>  418   break;
>  419 }
> ...
>  535 Stream * stream = &connection->outgoing;
>  536 if (stream->state == STATE_OPEN) {
>  537   (void)closeStream(stream, JNI_TRUE, &connection->refcount);
>  538 }
>  539 stream = &connection->incoming;
>  540 if (stream->state == STATE_OPEN) {
>  541   (void)closeStream(stream, JNI_FALSE, &connection->refcount);
>  542 }
> ...
>  551     if (connection->shutdown) {
>  552       sysEventClose(connection->shutdown);
>  553     }
>  554 }
> ...
> 1022 shmemBase_sendByte(SharedMemoryConnection *connection, jbyte data)
> 1023 {
> 1024   ENTER_CONNECTION(connection);
> 1025   jint rc = shmemBase_sendByte_internal(connection, data);
> 1026   LEAVE_CONNECTION(connection);
> 1027   return rc;
> 1028 }
> ...
>
> 1055 jint
> 1056 shmemBase_receiveByte(SharedMemoryConnection *connection, jbyte *data)
> 1057 {
> 1058   ENTER_CONNECTION(connection);
> 1059   jint rc = shmemBase_receiveByte_internal(connection, data);
> 1060   LEAVE_CONNECTION(connection);
> 1061   return rc;
> 1062 }
> ...
> 1136 jint
> 1137 shmemBase_sendPacket(SharedMemoryConnection *connection, const jdwpPacket *packet)
> 1138 {
> 1139   ENTER_CONNECTION(connection);
> 1140   jint rc = shmemBase_sendPacket_internal(connection, packet);
> 1141   LEAVE_CONNECTION(connection);
> 1142   return rc;
> 1143 }
> ...
> 1229 jint
> 1230 shmemBase_receivePacket(SharedMemoryConnection *connection, jdwpPacket *packet)
> 1231 {
> 1232   ENTER_CONNECTION(connection);
> 1233   jint rc = shmemBase_receivePacket_internal(connection, packet);
> 1234   LEAVE_CONNECTION(connection);
> 1235   return rc;
> 1236 }

Done. Fixed all of those.

> Some other nits were already commented by David and Dan.
>
> I'd suggest to test with tier-5 as well for more safety.

Thanks for looking at this, Serguei! I'll give it a new run in mach5 and add tier5.

Thanks,
Patricio

> Thanks,
> Serguei

From leonid.mesnik at oracle.com Wed Mar 18 22:18:43 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 18 Mar 2020 15:18:43 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible
In-Reply-To: <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com>
Message-ID: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>

On 3/18/20 2:30 PM, Igor Ignatyev wrote:
>> I need more time to get grasp of Wicket and your changes in it; will
>> come back to you after I understand them.
> ok, now when I believe that I have enough understanding of Wicket, I
> have a few comments:
> 1.
>> 68 private Lock lock = new ReentrantLock();
>> 69 private Condition condition = lock.newCondition();
> it's better to make these fields final.
>
> 2.
> as all writes and reads of Wicket::count are guarded by lock.lock,
> there is no need for it to be atomic.
> 3. adding lock to getWaiters will also remove need for Wicket::waiters
> to be atomic.

All 3 are fixed. Thanks for your suggestions.

Updated version:

http://cr.openjdk.java.net/~lmesnik/8241123/webrev.01/

Leonid

>
> the rest looks good to me.
>
> Thanks,
> -- Igor

From igor.ignatyev at oracle.com Wed Mar 18 22:22:56 2020
From: igor.ignatyev at oracle.com (Igor Ignatev)
Date: Wed, 18 Mar 2020 15:22:56 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible
In-Reply-To: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
References: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
Message-ID:

Reviewed.

-- Igor

From leonid.mesnik at oracle.com Wed Mar 18 22:51:15 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 18 Mar 2020 15:51:15 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible
In-Reply-To:
References: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
Message-ID:

Thank you for the review and feedback.

Leonid

On 3/18/20 3:22 PM, Igor Ignatev wrote:
> Reviewed.
>
> -- Igor
URL: From david.holmes at oracle.com Wed Mar 18 23:10:44 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 19 Mar 2020 09:10:44 +1000 Subject: RFR 8240902: JDI shared memory connector can use already closed Handles In-Reply-To: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com> References: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com> Message-ID: <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com> Hi Patricio, On 19/03/2020 6:44 am, Patricio Chilano wrote: > Hi David, > > On 3/18/20 4:27 AM, David Holmes wrote: >> Hi Patricio, >> >> On 18/03/2020 6:14 am, Patricio Chilano wrote: >>> Hi all, >>> >>> Please review the following patch: >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902 >>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/ >>> >>> Calling closeConnection() on an already created/opened connection >>> includes calls to CloseHandle() on objects that can still be used by >>> other threads. This can lead to either undefined behavior or, as >>> detailed in the bug comments, changes of state of unrelated objects. >> >> This was a really great find! > Thanks!? : ) > >>> This issue was found while debugging the reason behind some jshell >>> test failures seen after pushing 8230594. Not as important, but there >>> are also calls to closeStream() from createStream()/openStream() when >>> failing to create/open a stream that will return after executing >>> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended >>> resources. Then, calling closeConnection() could assert if the reason >>> of the previous failure was that the stream's mutex failed to be >>> created/opened. These patch aims to address these issues too. >> >> Patch looks good in general. The internal reference count guards >> deletion of the internal resources, and is itself safe because never >> actually delete the connection. Thanks for adding the comment about >> this aspect. 
>> >> A few items: >> >> Please update copyright year before pushing. > Done. > >> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way as >> STREAM_INVARIANT. > Done. > >> ?170 unsigned int refcount; >> ?171???? jint state; >> >> I'm unclear about the use of stream->state and connection->state as >> guards - unless accessed under a mutex these would seem to at least >> need acquire/release semantics. >> >> Additionally the reads of refcount would also seem to need to some >> form of memory synchronization - though the Windows docs for the >> Interlocked* API does not show how to simply read such a variable! >> Though I note that the RtlFirstEntrySList method for the "Interlocked >> Singly Linked Lists" API does state "Access to the list is >> synchronized on a multiprocessor system." which suggests a read of >> such a variable does require some form of memory synchronization! > In the case of the stream struct, the state field is protected by the > mutex field. It is set to STATE_CLOSED while holding the mutex, and > threads that read it must acquire the mutex first through > sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't acquire > the mutex, we will return something different than SYS_OK and the call > will exit anyways. All this behaves as before, I didn't change it. Thanks for clarifying. > The refcount and state that I added to the SharedMemoryConnection struct > work together. For a thread closing the connection, setting the > connection state to STATE_CLOSED has to happen before reading the > refcount (more on the atomicity of that read later). That's why I added > the MemoryBarrier() call; which I see it's better if I just move it to > after setting the connection state to closed. For the threads accessing > the connection, incrementing the refcount has to happen before reading > the connection state. That's already provided by the > InterlockedIncrement() which uses a full memory barrier. 
In this way if
> the thread closing the connection reads a refcount of 0, then we know
> it's safe to release the resources, since other threads accessing the
> connection will see that the state is closed after incrementing the
> refcount. If the read of refcount is not 0, then it could be that a
> thread is accessing the connection or not (it could have read a state
> connection of STATE_CLOSED after incrementing the refcount), we don't
> know, so we can't release anything. Similarly if the thread accessing
> the connection reads that the state is not closed, then we know it's
> safe to access the stream since anybody closing the connection will
> still have to read refcount which will be at least 1.
> As for the atomicity of the read of refcount, from
> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access,
> it states that "simple reads and writes to properly-aligned 32-bit
> variables are atomic operations". Maybe I should declare refcount
> explicitly as DWORD32?

The question with the naked read is not atomicity but visibility. Any
latency in the visibility of the store done by the Interlocked*()
function should be handled by the retry loop, but what is to stop the
C++ compiler from hoisting the read of refcount out of the loop? It
isn't even volatile (which has a stronger meaning in VS than in regular
C++).

> Instead of having a refcount we could have done something similar to the
> stream struct and protect access to the connection through a mutex. To
> avoid serializing all threads we could have used SRW locks and only the
> one closing the connection would do AcquireSRWLockExclusive(). It would
> change the state of the connection to STATE_CLOSED, close all handles,
> and then release the mutex. ENTER_CONNECTION() and LEAVE_CONNECTION()
> would acquire and release the mutex in shared mode. But other that maybe
> be more easy to read I don't think the change will be smaller.
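The refcount/state handshake described in the quoted explanation above can be illustrated with a small sketch. This is not the code from shmemBase.c: it assumes portable C11 <stdatomic.h> operations in place of the Win32 InterlockedIncrement() and MemoryBarrier() calls named in the thread, and Connection, connection_init(), enter_connection(), leave_connection(), close_connection() and the resources_released flag are hypothetical names chosen for illustration. The default (sequentially consistent) atomic operations are used so that "store state, then load refcount" on the closing side and "increment refcount, then load state" on the accessing side cannot both miss each other's write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum { STATE_OPEN = 0, STATE_CLOSED = 1 };

/* Hypothetical stand-in for the connection struct discussed in the thread. */
typedef struct {
    atomic_uint refcount;           /* number of threads currently inside */
    atomic_int  state;              /* STATE_OPEN or STATE_CLOSED */
    bool        resources_released; /* stands in for closing the handles */
} Connection;

static void connection_init(Connection *c) {
    assert(c != NULL);
    atomic_init(&c->refcount, 0);
    atomic_init(&c->state, STATE_OPEN);
    c->resources_released = false;
}

/* Accessor side: increment refcount first, then look at the state. */
static bool enter_connection(Connection *c) {
    assert(c != NULL);
    atomic_fetch_add(&c->refcount, 1);      /* seq_cst read-modify-write */
    if (atomic_load(&c->state) == STATE_CLOSED) {
        atomic_fetch_sub(&c->refcount, 1);  /* back out, connection is gone */
        return false;
    }
    return true;
}

static void leave_connection(Connection *c) {
    assert(c != NULL);
    atomic_fetch_sub(&c->refcount, 1);
}

/*
 * Closer side: publish STATE_CLOSED first, then poll refcount a bounded
 * number of times. Returns true if the resources were released, false if
 * the loop ran out of attempts (the resources are then leaked rather than
 * freed while another thread may still be using them).
 */
static bool close_connection(Connection *c, int attempts) {
    assert(c != NULL);
    atomic_store(&c->state, STATE_CLOSED);  /* seq_cst store before the loads below */
    while (attempts > 0) {
        if (atomic_load(&c->refcount) == 0) {
            c->resources_released = true;
            return true;
        }
        attempts--;
    }
    return false;
}
```

With this ordering, a closer that observes refcount == 0 knows that any later accessor will see STATE_CLOSED and back out, which is the argument made in the quoted paragraph. A real implementation would presumably pause between polls; the sketch omits that.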
> >> ?413 while (attempts>0) { >> >> spaces around > > Done. > >> If the loop at 413 never encounters a zero reference_count then it >> doesn't close the events or the mutex but still returns SYS_OK. That >> seems wrong but I'm not sure what the right behaviour is here. > I can change the return value to be SYS_ERR, but I don't think there is > much we can do about it unless we want to wait forever until we can > release those resources. SYS_ERR would look better, but I see now that the return value is completely ignored anyway. So we're just going to leak resources if the loop "times out". I guess this is the best we can do. Thanks, David > >> And please wait for serviceability folk to review this. > Sounds good. > > > Thanks for looking at this David! I will move the MemoryBarrier() and > change the refcount to be DWORD32 if you are okay with that. > > > Thanks, > Patricio >> Thanks, >> David >> ----- >> >>> Tested in mach5 with the current baseline, tiers1-3 and several runs >>> of open/test/langtools/:tier1 which includes the jshell tests where >>> this connector is used. I also applied patch >>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev >>> mentioned in the comments of the bug, on top of the baseline and run >>> the langtool tests with and without this fix. Without the fix running >>> around 30 repetitions already shows failures in tests >>> jdk/jshell/FailOverExecutionControlTest.java and >>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the >>> fix I run several hundred runs and saw no failures. Let me know if >>> there is any additional testing I should do. >>> >>> As a side note, I see there are a couple of open issues related with >>> jshell failures (8209848) which could be related to this bug and >>> therefore might be fixed by this patch. 
>>>
>>> Thanks,
>>> Patricio
>>>
>

From chris.plummer at oracle.com Thu Mar 19 02:35:16 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 19:35:16 -0700
Subject: RFR(XS) 8240543: Update problem list entry for serviceability/sa/TestRevPtrsForInvokeDynamic.java to reference JDK-8241235
Message-ID:

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8240543

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -140,7 +140,7 @@
 serviceability/sa/TestJmapCore.java 8193639 solaris-all
 serviceability/sa/TestJmapCoreMetaspace.java 8193639 solaris-all
 serviceability/sa/TestPrintMdo.java 8193639 solaris-all
-serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
+serviceability/sa/TestRevPtrsForInvokeDynamic.java 8241235 generic-all
 serviceability/sa/TestType.java 8193639 solaris-all
 serviceability/sa/TestUniverse.java#id0 8193639 solaris-all

8191270 [1] no longer seems to reproduce. Because of that I was hoping
to remove this test from the problem list, but found in a tier4 run that
this test fails for a different reason when a combination of
compiler-related flags is specified. I opened 8241235 [2] for that
failure and need to update ProblemList.txt to reference it instead. I
will close 8191270 [1] once this change is pushed.

[1] https://bugs.openjdk.java.net/browse/JDK-8191270
[2] https://bugs.openjdk.java.net/browse/JDK-8241235

thanks,

Chris

From david.holmes at oracle.com Thu Mar 19 03:42:31 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Mar 2020 20:42:31 -0700 (PDT)
Subject: RFR(XS) 8240543: Update problem list entry for serviceability/sa/TestRevPtrsForInvokeDynamic.java to reference JDK-8241235
In-Reply-To:
References:
Message-ID: <9f82c4d1-9f43-20e3-3f08-41f5531c46e4@oracle.com>

Looks good Chris.

Thanks,
David

On 19/03/2020 12:35 pm, Chris Plummer wrote:
> Hello,
>
> Please review the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8240543
>
> diff --git a/test/hotspot/jtreg/ProblemList.txt
> b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -140,7 +140,7 @@
>  serviceability/sa/TestJmapCore.java 8193639 solaris-all
>  serviceability/sa/TestJmapCoreMetaspace.java 8193639 solaris-all
>  serviceability/sa/TestPrintMdo.java 8193639 solaris-all
> -serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
> +serviceability/sa/TestRevPtrsForInvokeDynamic.java 8241235 generic-all
>  serviceability/sa/TestType.java 8193639 solaris-all
>  serviceability/sa/TestUniverse.java#id0 8193639 solaris-all
>
> 8191270 [1] no longer seems to reproduce. Because of that I was hoping
> to remove this test from the problem list, but found that in a tier4 run
> that this test fails for a different reason when a combination of
> compiler related flags is specified. I opened up a 8241235 [2] for that
> failure and need to update ProblemList.txt to reference it instead. I
> will close 8191270 [1] once this change is pushed.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8191270
> [2] https://bugs.openjdk.java.net/browse/JDK-8241235
>
> thanks,
>
> Chris
>

From chris.plummer at oracle.com Thu Mar 19 04:17:32 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 21:17:32 -0700
Subject: RFR(XS) 8240543: Update problem list entry for serviceability/sa/TestRevPtrsForInvokeDynamic.java to reference JDK-8241235
In-Reply-To: <9f82c4d1-9f43-20e3-3f08-41f5531c46e4@oracle.com>
References: <9f82c4d1-9f43-20e3-3f08-41f5531c46e4@oracle.com>
Message-ID: <258b9aca-58c7-dd1c-fb8e-aa69ea02706f@oracle.com>

Thanks!

On 3/18/20 8:42 PM, David Holmes wrote:
> Looks good Chris.
> > Thanks, > David > > On 19/03/2020 12:35 pm, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8240543 >> >> diff --git a/test/hotspot/jtreg/ProblemList.txt >> b/test/hotspot/jtreg/ProblemList.txt >> --- a/test/hotspot/jtreg/ProblemList.txt >> +++ b/test/hotspot/jtreg/ProblemList.txt >> @@ -140,7 +140,7 @@ >> ??serviceability/sa/TestJmapCore.java 8193639 solaris-all >> ??serviceability/sa/TestJmapCoreMetaspace.java 8193639 solaris-all >> ??serviceability/sa/TestPrintMdo.java 8193639 solaris-all >> -serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all >> +serviceability/sa/TestRevPtrsForInvokeDynamic.java 8241235 generic-all >> ??serviceability/sa/TestType.java 8193639 solaris-all >> ??serviceability/sa/TestUniverse.java#id0 8193639 solaris-all >> >> 8191270 [1] no longer seems to reproduce. Because of that I was >> hoping to remove this test from the problem list, but found that in a >> tier4 run that this test fails for a different reason when a >> combination of compiler related flags is specified. I opened up a >> 8241235 [2]? for that failure and need to update ProblemList.txt to >> reference it instead. I will close 8191270 [1] once this change is >> pushed. 
>> >> [1] https://bugs.openjdk.java.net/browse/JDK-8191270 >> [2] https://bugs.openjdk.java.net/browse/JDK-8241235 >> >> thanks, >> >> Chris >> From patricio.chilano.mateo at oracle.com Thu Mar 19 06:18:07 2020 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Thu, 19 Mar 2020 03:18:07 -0300 Subject: RFR 8240902: JDI shared memory connector can use already closed Handles In-Reply-To: <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com> References: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com> <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com> Message-ID: Hi David, On 3/18/20 8:10 PM, David Holmes wrote: > Hi Patricio, > > On 19/03/2020 6:44 am, Patricio Chilano wrote: >> Hi David, >> >> On 3/18/20 4:27 AM, David Holmes wrote: >>> Hi Patricio, >>> >>> On 18/03/2020 6:14 am, Patricio Chilano wrote: >>>> Hi all, >>>> >>>> Please review the following patch: >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902 >>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/ >>>> >>>> Calling closeConnection() on an already created/opened connection >>>> includes calls to CloseHandle() on objects that can still be used >>>> by other threads. This can lead to either undefined behavior or, as >>>> detailed in the bug comments, changes of state of unrelated objects. >>> >>> This was a really great find! >> Thanks!? : ) >> >>>> This issue was found while debugging the reason behind some jshell >>>> test failures seen after pushing 8230594. Not as important, but >>>> there are also calls to closeStream() from >>>> createStream()/openStream() when failing to create/open a stream >>>> that will return after executing "CHECK_ERROR(enterMutex(stream, >>>> NULL));" without closing the intended resources. Then, calling >>>> closeConnection() could assert if the reason of the previous >>>> failure was that the stream's mutex failed to be created/opened. >>>> These patch aims to address these issues too. >>> >>> Patch looks good in general. 
The internal reference count guards >>> deletion of the internal resources, and is itself safe because never >>> actually delete the connection. Thanks for adding the comment about >>> this aspect. >>> >>> A few items: >>> >>> Please update copyright year before pushing. >> Done. >> >>> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way >>> as STREAM_INVARIANT. >> Done. >> >>> ?170 unsigned int refcount; >>> ?171???? jint state; >>> >>> I'm unclear about the use of stream->state and connection->state as >>> guards - unless accessed under a mutex these would seem to at least >>> need acquire/release semantics. >>> >>> Additionally the reads of refcount would also seem to need to some >>> form of memory synchronization - though the Windows docs for the >>> Interlocked* API does not show how to simply read such a variable! >>> Though I note that the RtlFirstEntrySList method for the >>> "Interlocked Singly Linked Lists" API does state "Access to the list >>> is synchronized on a multiprocessor system." which suggests a read >>> of such a variable does require some form of memory synchronization! >> In the case of the stream struct, the state field is protected by the >> mutex field. It is set to STATE_CLOSED while holding the mutex, and >> threads that read it must acquire the mutex first through >> sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't >> acquire the mutex, we will return something different than SYS_OK and >> the call will exit anyways. All this behaves as before, I didn't >> change it. > > Thanks for clarifying. > >> The refcount and state that I added to the SharedMemoryConnection >> struct work together. For a thread closing the connection, setting >> the connection state to STATE_CLOSED has to happen before reading the >> refcount (more on the atomicity of that read later). That's why I >> added the MemoryBarrier() call; which I see it's better if I just >> move it to after setting the connection state to closed. 
For the >> threads accessing the connection, incrementing the refcount has to >> happen before reading the connection state. That's already provided >> by the InterlockedIncrement() which uses a full memory barrier. In >> this way if the thread closing the connection reads a refcount of 0, >> then we know it's safe to release the resources, since other threads >> accessing the connection will see that the state is closed after >> incrementing the refcount. If the read of refcount is not 0, then it >> could be that a thread is accessing the connection or not (it could >> have read a state connection of STATE_CLOSED after incrementing the >> refcount), we don't know, so we can't release anything. Similarly if >> the thread accessing the connection reads that the state is not >> closed, then we know it's safe to access the stream since anybody >> closing the connection will still have to read refcount which will be >> at least 1. >> As for the atomicity of the read of refcount, from >> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, >> it states that "simple reads and writes to properly-aligned 32-bit >> variables are atomic operations". Maybe I should declare refcount >> explicitly as DWORD32? > > It isn't the atomicity in question with the naked read but the > visibility. Any latency in the visibility of the store done by the > InterLocked*() function should be handled by the retry loop, but what > is to stop the C++ compiler from hoisting the read of refcount out of > the loop? It isn't even volatile (which has a stronger meaning in VS > than regular C+++). I see what you mean now, I was thinking on atomicity and order of operations but didn't consider the visibility of that read. Yes, if the compiler decides to be smart and hoist the read out of the loop we might never notice that it is safe to release those resources and we would leak them for no reason. 
I see from the Windows docs
(https://docs.microsoft.com/en-us/cpp/c-language/type-qualifiers) that
declaring it volatile, as you pointed out, should be enough to prevent
that.

>> Instead of having a refcount we could have done something similar to
>> the stream struct and protect access to the connection through a
>> mutex. To avoid serializing all threads we could have used SRW locks
>> and only the one closing the connection would do
>> AcquireSRWLockExclusive(). It would change the state of the
>> connection to STATE_CLOSED, close all handles, and then release the
>> mutex. ENTER_CONNECTION() and LEAVE_CONNECTION() would acquire and
>> release the mutex in shared mode. But other that maybe be more easy
>> to read I don't think the change will be smaller.
>>
>>> 413 while (attempts>0) {
>>>
>>> spaces around >
>> Done.
>>
>>> If the loop at 413 never encounters a zero reference_count then it
>>> doesn't close the events or the mutex but still returns SYS_OK. That
>>> seems wrong but I'm not sure what the right behaviour is here.
>> I can change the return value to be SYS_ERR, but I don't think there
>> is much we can do about it unless we want to wait forever until we
>> can release those resources.
>
> SYS_ERR would look better, but I see now that the return value is
> completely ignored anyway. So we're just going to leak resources if
> the loop "times out". I guess this is the best we can do.

Here is v2 with the corrections:

Full: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/webrev/
Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/
(not sure why the indent fixes are not highlighted as changes, but the
Frames view does show they changed)

I'll give it a run on mach5, adding tier5 as Serguei suggested.

Thanks,
Patricio

> Thanks,
> David
>
>>
>>> And please wait for serviceability folk to review this.
>> Sounds good.
>>
>>
>> Thanks for looking at this David!
I will move the MemoryBarrier() and >> change the refcount to be DWORD32 if you are okay with that. >> >> >> Thanks, >> Patricio >>> Thanks, >>> David >>> ----- >>> >>>> Tested in mach5 with the current baseline, tiers1-3 and several >>>> runs of open/test/langtools/:tier1 which includes the jshell tests >>>> where this connector is used. I also applied patch >>>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev >>>> mentioned in the comments of the bug, on top of the baseline and >>>> run the langtool tests with and without this fix. Without the fix >>>> running around 30 repetitions already shows failures in tests >>>> jdk/jshell/FailOverExecutionControlTest.java and >>>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the >>>> fix I run several hundred runs and saw no failures. Let me know if >>>> there is any additional testing I should do. >>>> >>>> As a side note, I see there are a couple of open issues related >>>> with jshell failures (8209848) which could be related to this bug >>>> and therefore might be fixed by this patch. >>>> >>>> Thanks, >>>> Patricio >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Thu Mar 19 07:50:55 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 19 Mar 2020 17:50:55 +1000 Subject: RFR 8240902: JDI shared memory connector can use already closed Handles In-Reply-To: References: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com> <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com> Message-ID: Hi Patricio, Incremental changes look good. 
Thanks, David On 19/03/2020 4:18 pm, Patricio Chilano wrote: > Hi David, > > On 3/18/20 8:10 PM, David Holmes wrote: >> Hi Patricio, >> >> On 19/03/2020 6:44 am, Patricio Chilano wrote: >>> Hi David, >>> >>> On 3/18/20 4:27 AM, David Holmes wrote: >>>> Hi Patricio, >>>> >>>> On 18/03/2020 6:14 am, Patricio Chilano wrote: >>>>> Hi all, >>>>> >>>>> Please review the following patch: >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902 >>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/ >>>>> >>>>> Calling closeConnection() on an already created/opened connection >>>>> includes calls to CloseHandle() on objects that can still be used >>>>> by other threads. This can lead to either undefined behavior or, as >>>>> detailed in the bug comments, changes of state of unrelated objects. >>>> >>>> This was a really great find! >>> Thanks!? : ) >>> >>>>> This issue was found while debugging the reason behind some jshell >>>>> test failures seen after pushing 8230594. Not as important, but >>>>> there are also calls to closeStream() from >>>>> createStream()/openStream() when failing to create/open a stream >>>>> that will return after executing "CHECK_ERROR(enterMutex(stream, >>>>> NULL));" without closing the intended resources. Then, calling >>>>> closeConnection() could assert if the reason of the previous >>>>> failure was that the stream's mutex failed to be created/opened. >>>>> These patch aims to address these issues too. >>>> >>>> Patch looks good in general. The internal reference count guards >>>> deletion of the internal resources, and is itself safe because never >>>> actually delete the connection. Thanks for adding the comment about >>>> this aspect. >>>> >>>> A few items: >>>> >>>> Please update copyright year before pushing. >>> Done. >>> >>>> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way >>>> as STREAM_INVARIANT. >>> Done. >>> >>>> ?170 unsigned int refcount; >>>> ?171???? 
jint state; >>>> >>>> I'm unclear about the use of stream->state and connection->state as >>>> guards - unless accessed under a mutex these would seem to at least >>>> need acquire/release semantics. >>>> >>>> Additionally the reads of refcount would also seem to need to some >>>> form of memory synchronization - though the Windows docs for the >>>> Interlocked* API does not show how to simply read such a variable! >>>> Though I note that the RtlFirstEntrySList method for the >>>> "Interlocked Singly Linked Lists" API does state "Access to the list >>>> is synchronized on a multiprocessor system." which suggests a read >>>> of such a variable does require some form of memory synchronization! >>> In the case of the stream struct, the state field is protected by the >>> mutex field. It is set to STATE_CLOSED while holding the mutex, and >>> threads that read it must acquire the mutex first through >>> sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't >>> acquire the mutex, we will return something different than SYS_OK and >>> the call will exit anyways. All this behaves as before, I didn't >>> change it. >> >> Thanks for clarifying. >> >>> The refcount and state that I added to the SharedMemoryConnection >>> struct work together. For a thread closing the connection, setting >>> the connection state to STATE_CLOSED has to happen before reading the >>> refcount (more on the atomicity of that read later). That's why I >>> added the MemoryBarrier() call; which I see it's better if I just >>> move it to after setting the connection state to closed. For the >>> threads accessing the connection, incrementing the refcount has to >>> happen before reading the connection state. That's already provided >>> by the InterlockedIncrement() which uses a full memory barrier. 
In >>> this way if the thread closing the connection reads a refcount of 0, >>> then we know it's safe to release the resources, since other threads >>> accessing the connection will see that the state is closed after >>> incrementing the refcount. If the read of refcount is not 0, then it >>> could be that a thread is accessing the connection or not (it could >>> have read a state connection of STATE_CLOSED after incrementing the >>> refcount), we don't know, so we can't release anything. Similarly if >>> the thread accessing the connection reads that the state is not >>> closed, then we know it's safe to access the stream since anybody >>> closing the connection will still have to read refcount which will be >>> at least 1. >>> As for the atomicity of the read of refcount, from >>> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, >>> it states that "simple reads and writes to properly-aligned 32-bit >>> variables are atomic operations". Maybe I should declare refcount >>> explicitly as DWORD32? >> >> It isn't the atomicity in question with the naked read but the >> visibility. Any latency in the visibility of the store done by the >> InterLocked*() function should be handled by the retry loop, but what >> is to stop the C++ compiler from hoisting the read of refcount out of >> the loop? It isn't even volatile (which has a stronger meaning in VS >> than regular C+++). > I see what you mean now, I was thinking on atomicity and order of > operations but didn't consider the visibility of that read. Yes, if the > compiler decides to be smart and hoist the read out of the loop we might > never notice that it is safe to release those resources and we would > leak them for no reason. I see from the windows > docs(https://docs.microsoft.com/en-us/cpp/c-language/type-qualifiers) > that declaring it volatile as you pointed out should be enough to > prevent that. 
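The hoisting concern raised above can be sketched as follows. This is only an illustration under stated assumptions, not the loop from the patch: poll_until_zero(), the poll_hook callback type and release_one() are hypothetical names, and the hook merely stands in for other threads dropping their references between polls. Because the counter is read through a volatile-qualified lvalue, the compiler has to perform a fresh load on every iteration instead of reusing a value read once before the loop. Note that in standard C, volatile by itself gives no inter-thread ordering guarantee; the thread relies on the stronger meaning the Microsoft compiler gives it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook invoked between polls (simulates other threads). */
typedef void (*poll_hook)(void *ctx);

/*
 * Polls *refcount up to `attempts` times.
 * Returns 1 as soon as a value of 0 is observed, 0 if attempts run out.
 * The volatile qualifier forces a new read of *refcount on each iteration.
 */
static int poll_until_zero(volatile unsigned int *refcount, int attempts,
                           poll_hook hook, void *ctx) {
    assert(refcount != NULL);
    while (attempts > 0) {
        if (*refcount == 0) {
            return 1;
        }
        if (hook != NULL) {
            hook(ctx);      /* something else may change the counter here */
        }
        attempts--;
    }
    return 0;
}

/* Example hook: drops one reference per call, never going below zero. */
static void release_one(void *ctx) {
    volatile unsigned int *refcount = ctx;
    if (*refcount > 0) {
        (*refcount)--;
    }
}
```

If the counter never reaches zero within the allowed attempts, the function reports failure and the caller has to decide whether to leak the resources, which matches the "times out" case discussed in the review.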
> >>> Instead of having a refcount we could have done something similar to >>> the stream struct and protect access to the connection through a >>> mutex. To avoid serializing all threads we could have used SRW locks >>> and only the one closing the connection would do >>> AcquireSRWLockExclusive(). It would change the state of the >>> connection to STATE_CLOSED, close all handles, and then release the >>> mutex. ENTER_CONNECTION() and LEAVE_CONNECTION() would acquire and >>> release the mutex in shared mode. But other that maybe be more easy >>> to read I don't think the change will be smaller. >>> >>>> ?413 while (attempts>0) { >>>> >>>> spaces around > >>> Done. >>> >>>> If the loop at 413 never encounters a zero reference_count then it >>>> doesn't close the events or the mutex but still returns SYS_OK. That >>>> seems wrong but I'm not sure what the right behaviour is here. >>> I can change the return value to be SYS_ERR, but I don't think there >>> is much we can do about it unless we want to wait forever until we >>> can release those resources. >> >> SYS_ERR would look better, but I see now that the return value is >> completely ignored anyway. So we're just going to leak resources if >> the loop "times out". I guess this is the best we can do. > Here is v2 with the corrections: > > Full: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/webrev/ > Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/ > ? (not sure > why the indent fixes are not highlighted as changes but the Frames view > does show they changed) > > I'll give it a run on mach5 adding tier5 as Serguei suggested. > > > Thanks, > Patricio >> Thanks, >> David >> >>> >>>> And please wait for serviceability folk to review this. >>> Sounds good. >>> >>> >>> Thanks for looking at this David! I will move the MemoryBarrier() and >>> change the refcount to be DWORD32 if you are okay with that. 
>>>
>>>
>>> Thanks,
>>> Patricio
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Tested in mach5 with the current baseline, tiers1-3 and several
>>>>> runs of open/test/langtools/:tier1 which includes the jshell tests
>>>>> where this connector is used. I also applied patch
>>>>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev
>>>>> mentioned in the comments of the bug, on top of the baseline and
>>>>> run the langtool tests with and without this fix. Without the fix
>>>>> running around 30 repetitions already shows failures in tests
>>>>> jdk/jshell/FailOverExecutionControlTest.java and
>>>>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the
>>>>> fix I run several hundred runs and saw no failures. Let me know if
>>>>> there is any additional testing I should do.
>>>>>
>>>>> As a side note, I see there are a couple of open issues related
>>>>> with jshell failures (8209848) which could be related to this bug
>>>>> and therefore might be fixed by this patch.
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>
>>>
>

From daniel.daugherty at oracle.com Thu Mar 19 14:22:01 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 19 Mar 2020 10:22:01 -0400
Subject: RFR 8240902: JDI shared memory connector can use already closed Handles
In-Reply-To:
References: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com> <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com>
Message-ID: <29500fec-2419-b49d-6493-dd66aca9caf2@oracle.com>

> Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/
>     (not sure why the indent fixes are not highlighted as changes but the Frames view does show they changed)

By default, webrev ignores leading and trailing whitespace changes.
Use:

    -b: Do not ignore changes in the amount of white space.

if you want to see them. I'm okay that they are not there in most of
the views. If you want to see them, look at the patch.

src/jdk.jdi/share/native/libdt_shmem/shmemBase.c
    No comments.

Thumbs up.

Dan On 3/19/20 2:18 AM, Patricio Chilano wrote: > Hi David, > > On 3/18/20 8:10 PM, David Holmes wrote: >> Hi Patricio, >> >> On 19/03/2020 6:44 am, Patricio Chilano wrote: >>> Hi David, >>> >>> On 3/18/20 4:27 AM, David Holmes wrote: >>>> Hi Patricio, >>>> >>>> On 18/03/2020 6:14 am, Patricio Chilano wrote: >>>>> Hi all, >>>>> >>>>> Please review the following patch: >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902 >>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/ >>>>> >>>>> Calling closeConnection() on an already created/opened connection >>>>> includes calls to CloseHandle() on objects that can still be used >>>>> by other threads. This can lead to either undefined behavior or, >>>>> as detailed in the bug comments, changes of state of unrelated >>>>> objects. >>>> >>>> This was a really great find! >>> Thanks!? : ) >>> >>>>> This issue was found while debugging the reason behind some jshell >>>>> test failures seen after pushing 8230594. Not as important, but >>>>> there are also calls to closeStream() from >>>>> createStream()/openStream() when failing to create/open a stream >>>>> that will return after executing "CHECK_ERROR(enterMutex(stream, >>>>> NULL));" without closing the intended resources. Then, calling >>>>> closeConnection() could assert if the reason of the previous >>>>> failure was that the stream's mutex failed to be created/opened. >>>>> These patch aims to address these issues too. >>>> >>>> Patch looks good in general. The internal reference count guards >>>> deletion of the internal resources, and is itself safe because >>>> never actually delete the connection. Thanks for adding the comment >>>> about this aspect. >>>> >>>> A few items: >>>> >>>> Please update copyright year before pushing. >>> Done. >>> >>>> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way >>>> as STREAM_INVARIANT. >>> Done. >>> >>>> ?170 unsigned int refcount; >>>> ?171???? 
jint state; >>>> >>>> I'm unclear about the use of stream->state and connection->state as >>>> guards - unless accessed under a mutex these would seem to at least >>>> need acquire/release semantics. >>>> >>>> Additionally the reads of refcount would also seem to need to some >>>> form of memory synchronization - though the Windows docs for the >>>> Interlocked* API does not show how to simply read such a variable! >>>> Though I note that the RtlFirstEntrySList method for the >>>> "Interlocked Singly Linked Lists" API does state "Access to the >>>> list is synchronized on a multiprocessor system." which suggests a >>>> read of such a variable does require some form of memory >>>> synchronization! >>> In the case of the stream struct, the state field is protected by >>> the mutex field. It is set to STATE_CLOSED while holding the mutex, >>> and threads that read it must acquire the mutex first through >>> sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't >>> acquire the mutex, we will return something different than SYS_OK >>> and the call will exit anyways. All this behaves as before, I didn't >>> change it. >> >> Thanks for clarifying. >> >>> The refcount and state that I added to the SharedMemoryConnection >>> struct work together. For a thread closing the connection, setting >>> the connection state to STATE_CLOSED has to happen before reading >>> the refcount (more on the atomicity of that read later). That's why >>> I added the MemoryBarrier() call; which I see it's better if I just >>> move it to after setting the connection state to closed. For the >>> threads accessing the connection, incrementing the refcount has to >>> happen before reading the connection state. That's already provided >>> by the InterlockedIncrement() which uses a full memory barrier. 
In >>> this way if the thread closing the connection reads a refcount of 0, >>> then we know it's safe to release the resources, since other threads >>> accessing the connection will see that the state is closed after >>> incrementing the refcount. If the read of refcount is not 0, then it >>> could be that a thread is accessing the connection or not (it could >>> have read a connection state of STATE_CLOSED after incrementing the >>> refcount), we don't know, so we can't release anything. Similarly if >>> the thread accessing the connection reads that the state is not >>> closed, then we know it's safe to access the stream since anybody >>> closing the connection will still have to read refcount which will >>> be at least 1. >>> As for the atomicity of the read of refcount, from >>> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, >>> it states that "simple reads and writes to properly-aligned 32-bit >>> variables are atomic operations". Maybe I should declare refcount >>> explicitly as DWORD32? >> >> It isn't the atomicity in question with the naked read but the >> visibility. Any latency in the visibility of the store done by the >> Interlocked*() function should be handled by the retry loop, but what >> is to stop the C++ compiler from hoisting the read of refcount out of >> the loop? It isn't even volatile (which has a stronger meaning in VS >> than regular C++). > I see what you mean now, I was thinking about atomicity and order of > operations but didn't consider the visibility of that read. Yes, if > the compiler decides to be smart and hoist the read out of the loop we > might never notice that it is safe to release those resources and we > would leak them for no reason. I see from the Windows > docs (https://docs.microsoft.com/en-us/cpp/c-language/type-qualifiers) > that declaring it volatile as you pointed out should be enough to > prevent that.
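For illustration only, here is a rough Java analogue of the ordering argument in the quoted exchange above. The actual patch is C code in shmemBase.c built on InterlockedIncrement() and MemoryBarrier(); the class name, the method names, the decrement on a failed enter, and the use of Thread.yield() between attempts are all assumptions of this sketch and are not taken from the webrev. It relies on Java's guarantee that a volatile store followed by a volatile load is not reordered, which plays the role of the full barrier discussed in the thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical analogue of the refcount/state handshake; not the shmemBase.c code.
class GuardedConnection {
    private static final int STATE_OPEN = 0;
    private static final int STATE_CLOSED = 1;

    private final AtomicInteger refcount = new AtomicInteger();
    private volatile int state = STATE_OPEN;
    private volatile boolean released;   // stands in for "handles were closed"

    /** Accessor side: increment the refcount first, then read the state. */
    boolean enter() {
        refcount.incrementAndGet();
        if (state == STATE_CLOSED) {
            // Assumption of this sketch: back out so the closer can see zero.
            refcount.decrementAndGet();
            return false;
        }
        return true;
    }

    /** Accessor side: drop the reference taken by a successful enter(). */
    void leave() {
        refcount.decrementAndGet();
    }

    /**
     * Closer side: publish STATE_CLOSED first, then re-read the refcount a
     * bounded number of times. Returns false if the loop "times out", in
     * which case the resources are deliberately left unreleased (leaked).
     */
    boolean close(int attempts) {
        state = STATE_CLOSED;            // volatile store, ordered before the reads below
        while (attempts > 0) {
            if (refcount.get() == 0) {   // volatile read, cannot be hoisted out of the loop
                released = true;         // placeholder for releasing the real resources
                return true;
            }
            Thread.yield();
            attempts--;
        }
        return false;
    }

    boolean isReleased() {
        return released;
    }
}
```

Because both fields are read through volatile semantics, the hoisting concern raised for the plain C read does not arise in this analogue; the bounded retry loop and the resulting leak on timeout mirror the trade-off accepted in the review.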
> >>> Instead of having a refcount we could have done something similar to >>> the stream struct and protect access to the connection through a >>> mutex. To avoid serializing all threads we could have used SRW locks >>> and only the one closing the connection would do >>> AcquireSRWLockExclusive(). It would change the state of the >>> connection to STATE_CLOSED, close all handles, and then release the >>> mutex. ENTER_CONNECTION() and LEAVE_CONNECTION() would acquire and >>> release the mutex in shared mode. But other than maybe being easier >>> to read, I don't think the change would be smaller. >>> >>>> 413 while (attempts>0) { >>>> >>>> spaces around > >>> Done. >>> >>>> If the loop at 413 never encounters a zero reference_count then it >>>> doesn't close the events or the mutex but still returns SYS_OK. >>>> That seems wrong but I'm not sure what the right behaviour is here. >>> I can change the return value to be SYS_ERR, but I don't think there >>> is much we can do about it unless we want to wait forever until we >>> can release those resources. >> >> SYS_ERR would look better, but I see now that the return value is >> completely ignored anyway. So we're just going to leak resources if >> the loop "times out". I guess this is the best we can do. > Here is v2 with the corrections: > > Full: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/webrev/ > Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/ > (not sure > why the indent fixes are not highlighted as changes but the Frames > view does show they changed) > > I'll give it a run on mach5 adding tier5 as Serguei suggested. > > > Thanks, > Patricio >> Thanks, >> David >> >>> >>>> And please wait for serviceability folk to review this. >>> Sounds good. >>> >>> >>> Thanks for looking at this David! I will move the MemoryBarrier() >>> and change the refcount to be DWORD32 if you are okay with that.
>>> >>> >>> Thanks, >>> Patricio >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Tested in mach5 with the current baseline, tiers1-3 and several >>>>> runs of open/test/langtools/:tier1 which includes the jshell tests >>>>> where this connector is used. I also applied patch >>>>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev >>>>> mentioned in the comments of the bug, on top of the baseline and >>>>> run the langtool tests with and without this fix. Without the fix >>>>> running around 30 repetitions already shows failures in tests >>>>> jdk/jshell/FailOverExecutionControlTest.java and >>>>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With >>>>> the fix I run several hundred runs and saw no failures. Let me >>>>> know if there is any additional testing I should do. >>>>> >>>>> As a side note, I see there are a couple of open issues related >>>>> with jshell failures (8209848) which could be related to this bug >>>>> and therefore might be fixed by this patch. >>>>> >>>>> Thanks, >>>>> Patricio >>>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patricio.chilano.mateo at oracle.com Thu Mar 19 14:40:08 2020 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Thu, 19 Mar 2020 11:40:08 -0300 Subject: RFR 8240902: JDI shared memory connector can use already closed Handles In-Reply-To: References: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com> <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com> Message-ID: <77a0b7aa-aa26-cde2-b465-a074c0cae240@oracle.com> Thanks David! Patricio On 3/19/20 4:50 AM, David Holmes wrote: > Hi Patricio, > > Incremental changes look good. 
>
> Thanks,
> David
From patricio.chilano.mateo at oracle.com Thu Mar 19 14:43:27 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Thu, 19 Mar 2020 11:43:27 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed Handles
In-Reply-To: <29500fec-2419-b49d-6493-dd66aca9caf2@oracle.com>
References: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com> <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com> <29500fec-2419-b49d-6493-dd66aca9caf2@oracle.com>
Message-ID:

On 3/19/20 11:22 AM, Daniel D. Daugherty wrote:
> > Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/
> >     (not sure why the indent fixes are not highlighted as changes but
> the Frames view does show they changed)
>
> By default, webrev ignores leading and trailing whitespace changes. Use:
>
>     -b: Do not ignore changes in the amount of white space.
>
> if you want to see them.
I'm okay that they are not there in most of
> the views. If you want to see them, look at the patch.
Thanks! I didn't know that option.
>
> src/jdk.jdi/share/native/libdt_shmem/shmemBase.c
>     No comments.
>
> Thumbs up.
Thanks for looking at this Dan!

Patricio
> Dan
From coleen.phillimore at oracle.com Thu Mar 19 19:46:09 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 19 Mar 2020 15:46:09 -0400
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is unused in the SA
Message-ID: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>

Summary: remove unused code that is changing in Hotspot for hidden classes.

Ran tier1-3 tests. See bug for more details.

open webrev at http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8241320

Thanks,
Coleen
From lois.foltan at oracle.com Thu Mar 19 20:12:04 2020
From: lois.foltan at oracle.com (Lois Foltan)
Date: Thu, 19 Mar 2020 16:12:04 -0400
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is unused in the SA
In-Reply-To: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
Message-ID:

Looks good.
Lois

On 3/19/2020 3:46 PM, coleen.phillimore at oracle.com wrote:
> Summary: remove unused code that is changing in Hotspot for hidden
> classes.
>
> Ran tier1-3 tests. See bug for more details.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>
> Thanks,
> Coleen
From coleen.phillimore at oracle.com Thu Mar 19 21:06:35 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 19 Mar 2020 17:06:35 -0400
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is unused in the SA
In-Reply-To:
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
Message-ID: <0a68d9e5-8500-3158-123c-e4253f07129f@oracle.com>

Thanks Lois!
Coleen

On 3/19/20 4:12 PM, Lois Foltan wrote:
> Looks good.
> Lois
>
> On 3/19/2020 3:46 PM, coleen.phillimore at oracle.com wrote:
>> Summary: remove unused code that is changing in Hotspot for hidden
>> classes.
>>
>> Ran tier1-3 tests. See bug for more details.
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>>
>> Thanks,
>> Coleen
>
From david.holmes at oracle.com Thu Mar 19 22:43:08 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 20 Mar 2020 08:43:08 +1000
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is unused in the SA
In-Reply-To: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
Message-ID:

Hi Coleen,

On 20/03/2020 5:46 am, coleen.phillimore at oracle.com wrote:
> Summary: remove unused code that is changing in Hotspot for hidden classes.

I'm not sure how to identify unused code in the SA given that it exposes a Java API for querying the JVM internals. You say getisUnsafeAnonymous() is unused because nothing in the SA calls it. But the same would seem to be true for other parts of the CLD API - for example

- ClassLoaderData::dictionary() is called from
- ClassLoaderData::allEntriesDo, is called from
- ClassLoaderDataGraph::allEntriesDo, is called from
- nowhere

???

David
-----

> Ran tier1-3 tests. See bug for more details.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>
> Thanks,
> Coleen
From serguei.spitsyn at oracle.com Fri Mar 20 01:03:10 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Mar 2020 18:03:10 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible
In-Reply-To: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com> <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
Message-ID: <240d6e82-d229-8a2d-6be2-3042a1537c11@oracle.com>

An HTML attachment was scrubbed...
URL: From leonid.mesnik at oracle.com Fri Mar 20 01:10:28 2020 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Thu, 19 Mar 2020 18:10:28 -0700 Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible In-Reply-To: <240d6e82-d229-8a2d-6be2-3042a1537c11@oracle.com> References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com> <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com> <240d6e82-d229-8a2d-6be2-3042a1537c11@oracle.com> Message-ID: Hi Thank you for review and feedback. See my comments inline. > On Mar 19, 2020, at 6:03 PM, serguei.spitsyn at oracle.com wrote: > > Hi Leonid, > > It looks good in general. > Just a couple of comments. > > > http://cr.openjdk.java.net/~lmesnik/8241123/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/share/Wicket.java.frames.html > 168 public int waitFor(long timeout) { > 169 if (timeout < 0) > 170 throw new IllegalArgumentException( > 171 "timeout value is negative: " + timeout); > 172 > 173 long id = System.currentTimeMillis(); > 174 > 175 try { > 176 lock.lock(); > 177 --waiters; > 178 if (debugOutput != null) { > 179 debugOutput.printf("Wicket %d %s: waitFor(). There are %d waiters totally now.\n", id, name, waiters); > 180 } > 181 > 182 long waitTime = timeout; > 183 long startTime = System.currentTimeMillis(); > 184 > 185 while (count > 0 && waitTime > 0) { > 186 try { > 187 condition.await(waitTime, TimeUnit.MILLISECONDS); > 188 } catch (InterruptedException e) { > 189 } > 190 waitTime = timeout - (System.currentTimeMillis() - startTime); > 191 } > 192 --waiters; > 193 return count; > 194 } finally { > 195 lock.unlock(); > 196 } > 197 } > > The waiters probably needs to be incremented instead of decremented at line: > 177 --waiters; Thank you, fixed. 
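
[Editorial sketch, not part of the original message.] The fix acknowledged above (increment `waiters` on entry, decrement on exit) can be illustrated with a minimal stand-in class. `WicketSketch` and its members are hypothetical names chosen for illustration; this is not the code from webrev.01. Two choices here are assumptions that differ from the quoted listing: `lock.lock()` is taken before the `try` block so that `finally` never unlocks a lock that was not acquired, and the timeout is tracked against a deadline so that an interrupt does not extend the total wait.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified stand-in for nsk.share.Wicket (illustration only). */
class WicketSketch {
    private final Lock lock = new ReentrantLock();
    private final Condition condition = lock.newCondition();
    private int count;    // guarded by lock
    private int waiters;  // guarded by lock

    WicketSketch(int count) {
        this.count = count;
    }

    /** Waits until count reaches 0 or the timeout expires; returns the remaining count. */
    int waitFor(long timeoutMillis) {
        if (timeoutMillis < 0) {
            throw new IllegalArgumentException("timeout value is negative: " + timeoutMillis);
        }
        lock.lock(); // acquired outside try, so finally only unlocks a held lock
        try {
            ++waiters; // the reviewed listing decremented here by mistake
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            while (count > 0) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // timed out
                }
                try {
                    condition.awaitNanos(remaining);
                } catch (InterruptedException e) {
                    // Keep waiting until the deadline, as the quoted code does.
                }
            }
            --waiters;
            return count;
        } finally {
            lock.unlock();
        }
    }

    /** Decrements the count and wakes all waiters once it reaches 0. */
    void unlock() {
        lock.lock();
        try {
            if (count > 0) {
                count--;
            }
            if (count == 0) {
                condition.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Reads waiters under the lock, so the field does not need to be atomic. */
    int getWaiters() {
        lock.lock();
        try {
            return waiters;
        } finally {
            lock.unlock();
        }
    }
}
```

With the increment and decrement paired like this, `getWaiters()` returns to 0 once every `waitFor` call has finished, whether it timed out or was released by `unlock()`.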
> > http://cr.openjdk.java.net/~lmesnik/8241123/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/share/runner/ThreadsRunner.java.udiff.html > private void waitForOtherThreads() { > if (shouldWait) { > shouldWait = false; > - finished.unlock(); > - finished.waitFor(); > + finished.decrementAndGet(); > + while (finished.get() != 0) { > + try { > + Thread.sleep(1000); > + } catch (InterruptedException ie) { > + } > + } > } else { > throw new TestBug("Waiting a second time is not premitted"); > } > } > > Should we use a shorter sleep, something like Thread.sleep(100)? > These tests executed 30 or 60 seconds now by default, so sleeping 1 sec doesn't increase overall time. But tI am fine to change it 100, it also should works fine. Leonid > > Thanks, > Serguei > > > On 3/18/20 15:18, Leonid Mesnik wrote: >> >> On 3/18/20 2:30 PM, Igor Ignatyev wrote: >>>> I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them. >>> ok, now when I believe that I have enough understanding of Wicket, I have a few comments: >>> 1. >>>> 68 private Lock lock = new ReentrantLock(); >>>> 69 private Condition condition = lock.newCondition(); >>> it's better to make these fields final. >>> >>> 2. as all writes and reads of Wicket::count are guarded by lock.lock, there is no need for it to be atomic. >>> 3. adding lock to getWaiters will also remove need for Wicket::waiters to be atomic. >> All 3 are fixed. Thanks for your suggestions. >> >> Updated version: >> >> http://cr.openjdk.java.net/~lmesnik/8241123/webrev.01/ >> Leonid >> >>> >>> the rest looks good to me. >>> >>> Thanks, >>> -- Igor >>> >>> >>> >>>> On Mar 18, 2020, at 12:48 PM, Igor Ignatyev > wrote: >>>> >>>> Hi Leonid, >>>> >>>> I've started looking at your webrev, and so far have a couple questions: >>>> >>>>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. 
(The lock has a reference to thread which affects test.) >>>> can't you use just a volatile boolean field? >>>> >>>>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests. >>>> won't j.u.c.CountDownLatch be more appropriate and cleaner solution here? >>>> >>>> I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them. >>>> >>>> -- Igor >>>> >>>>> On Mar 18, 2020, at 12:37 PM, Leonid Mesnik > wrote: >>>>> >>>>> Hi >>>>> >>>>> Could you please review following fix which slightly refactor vmTestbase stress test harness. This refactoring helps to add virtual threads testing support. >>>>> >>>>> The Wicket uses plain sync/wait/notify mechanism which cause carrier thread starvation and should not be used in virtual threads. The ManagedThread is a subclass of Thread so it couldn't be virtual thread. >>>>> >>>>> >>>>> Following fix changes Wicket to use locks/conditions to don't pin vthread to carrier thread while starting testing. >>>>> >>>>> ManagedThread is fixed to keep execution thread as the thread variable and isolate it's creation. >>>>> >>>>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.) >>>>> >>>>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests. >>>>> >>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/ >>>>> >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241123 >>>>> >>>>> >>>>> Leonid >>>>> >>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... 

From serguei.spitsyn at oracle.com Fri Mar 20 02:05:24 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Mar 2020 19:05:24 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and make creation of threads more flexible
In-Reply-To:
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com> <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com> <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com> <240d6e82-d229-8a2d-6be2-3042a1537c11@oracle.com>
Message-ID:

An HTML attachment was scrubbed...

From serguei.spitsyn at oracle.com Fri Mar 20 02:32:37 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Mar 2020 19:32:37 -0700
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is unused in the SA
In-Reply-To:
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
Message-ID:

+1

Thanks,
Serguei

On 3/19/20 13:12, Lois Foltan wrote:
> Looks good.
> Lois
>
> On 3/19/2020 3:46 PM, coleen.phillimore at oracle.com wrote:
>> Summary: remove unused code that is changing in Hotspot for hidden
>> classes.
>>
>> Ran tier1-3 tests. See bug for more details.
>> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8241320 >> >> Thanks, >> Coleen > From chris.plummer at oracle.com Fri Mar 20 03:25:45 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 19 Mar 2020 20:25:45 -0700 Subject: RFR(XS) 8241335: ProblemList serviceability/sa/ClhsdbPstack.java due to JDK-8240956 Message-ID: <6f4dcbfb-9174-44f4-1e1a-3e678778807e@oracle.com> Hello, Please review the following: diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt +++ b/test/hotspot/jtreg/ProblemList.txt @@ -115,7 +115,7 @@ ?serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all ?serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all ?serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all ?serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all ?serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64 ?serviceability/sa/ClhsdbSource.java 8193639 solaris-all I'm still waiting for a tier1 run to make sure the test isn't run on linux. I'll push once that is done if I've had a review by then. thanks, Chris From mikael.vidstedt at oracle.com Fri Mar 20 03:27:31 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 19 Mar 2020 20:27:31 -0700 Subject: RFR(XS) 8241335: ProblemList serviceability/sa/ClhsdbPstack.java due to JDK-8240956 In-Reply-To: <6f4dcbfb-9174-44f4-1e1a-3e678778807e@oracle.com> References: <6f4dcbfb-9174-44f4-1e1a-3e678778807e@oracle.com> Message-ID: Looks good, thanks for doing this! 

Cheers,
Mikael

> On Mar 19, 2020, at 8:25 PM, Chris Plummer wrote:
>
> Hello,
>
> Please review the following:
>
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -115,7 +115,7 @@
>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>
> I'm still waiting for a tier1 run to make sure the test isn't run on linux. I'll push once that is done if I've had a review by then.
>
> thanks,
>
> Chris
>

From serguei.spitsyn at oracle.com Fri Mar 20 04:41:35 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Mar 2020 21:41:35 -0700
Subject: RFR(XS) 8241335: ProblemList serviceability/sa/ClhsdbPstack.java due to JDK-8240956
In-Reply-To:
References: <6f4dcbfb-9174-44f4-1e1a-3e678778807e@oracle.com>
Message-ID:

+1

Thanks,
Serguei

On 3/19/20 20:27, Mikael Vidstedt wrote:
> Looks good, thanks for doing this!
> > Cheers, > Mikael > >> On Mar 19, 2020, at 8:25 PM, Chris Plummer wrote: >> >> ?Hello, >> >> Please review the following: >> >> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt >> --- a/test/hotspot/jtreg/ProblemList.txt >> +++ b/test/hotspot/jtreg/ProblemList.txt >> @@ -115,7 +115,7 @@ >> serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all >> serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all >> serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all >> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all >> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all >> serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all >> serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64 >> serviceability/sa/ClhsdbSource.java 8193639 solaris-all >> >> I'm still waiting for a tier1 run to make sure the test isn't run on linux. I'll push once that is done if I've had a review by then. >> >> thanks, >> >> Chris >> From chris.plummer at oracle.com Fri Mar 20 04:55:34 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 19 Mar 2020 21:55:34 -0700 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> Message-ID: Hi Yasumasa, The test has been problem listed so please add undoing this to your webrev. 
Here's the diff that problem listed it: diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt +++ b/test/hotspot/jtreg/ProblemList.txt @@ -115,7 +115,7 @@ ?serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all ?serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all ?serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all ?serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all ?serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64 ?serviceability/sa/ClhsdbSource.java 8193639 solaris-all thanks, Chris On 3/16/20 5:07 AM, Yasumasa Suenaga wrote: > Hi all, > > This webrev has passed submit repo > (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional > tests. > So please review it: > > ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 > ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ > > > Thanks, > > Yasumasa > > > On 2020/03/16 21:03, Yasumasa Suenaga wrote: >> Thank you so much, David! >> >> Yasumasa >> >> >> On 2020/03/16 21:01, David Holmes wrote: >>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>> Hi David, >>>>> >>>>> I missed loop condition, so I fixed it and pushed to submit repo. >>>>> Could you try again? >>>>> >>>>> ?? http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>> >>>>> webrev is here: >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>> >>>> Test job resubmitted. Will advise results if it completes before I >>>> go to bed :) >>> >>> Seems to have passed okay. >>> >>> David >>> >>>> David >>>> >>>>> >>>>> Thanks a lot! >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>> Sorry it is still crashing. 
>>>>>> >>>>>> # >>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>> # >>>>>> #? SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704 >>>>>> # >>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug >>>>>> build 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, >>>>>> sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>> # Problematic frame: >>>>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned >>>>>> long)+0x4e >>>>>> # >>>>>> >>>>>> Same as before. >>>>>> >>>>>> David >>>>>> ----- >>>>>> >>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>> BTW, if you submit it to the submit repo, we can then go and >>>>>>>>> run additional internal tests (and even more builds) using >>>>>>>>> that job. >>>>>>> >>>>>>> Thanks for that tip Chris! >>>>>>> >>>>>>>> I've pushed the change to submit repo, but I've not yet >>>>>>>> received the result. >>>>>>>> I will share you when I get job ID. >>>>>>> >>>>>>> We can see the id. Just need to wait for the builds to complete >>>>>>> before submitting the additional tests. >>>>>>> >>>>>>> David >>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> Thank you for testing it. >>>>>>>>>> >>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF >>>>>>>>>> has language personality routine or LSDA. >>>>>>>>>> Could you try it? >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>> >>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>> I've pushed it to submit repo. 
>>>>>>>>>> >>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>> Correction ... >>>>>>>>>>> >>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> I can't review this as I know nothing about the code, but >>>>>>>>>>>>> I'm putting the patch through our internal testing. >>>>>>>>>>>> >>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>> >>>>>>>>>>>> # >>>>>>>>>>>> # A fatal error has been detected by the Java Runtime >>>>>>>>>>>> Environment: >>>>>>>>>>>> # >>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, >>>>>>>>>>>> tid=16949 >>>>>>>>>>>> # >>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) >>>>>>>>>>>> (fastdebug build >>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, >>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 gc, >>>>>>>>>>>> linux-amd64) >>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>> # C? [libsaproc.so+0x494e] >>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>> >>>>>>>>>>>> in fact they seem worse as the test seems to always crash now. >>>>>>>>>>> >>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the >>>>>>>>>>> test in linux-x64. I don't see a pattern as to where it >>>>>>>>>>> fails versus passes. >>>>>>>>>>> >>>>>>>>>>> It doesn't fail for me locally. 
>>>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>> ?? webrev: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding >>>>>>>>>>>>>> native frames in jstack mixed mode. >>>>>>>>>>>>>> However some error has seen intermittently after that. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I investigated the cause of this, I found two concerns: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? A: lack of buffer (.eh_frame section data) range check >>>>>>>>>>>>>> ?? B: Language personality routine and Language Specific >>>>>>>>>>>>>> Data Area (LSDA) are not considered >>>>>>>>>>>>>> >>>>>>>>>>>>>> I addd range check for .eh_frame processing, and ignore >>>>>>>>>>>>>> personality routine and LSDA in this webrev. >>>>>>>>>>>>>> Also I added bailout code if DWARF processing is failed >>>>>>>>>>>>>> due to these concerns. >>>>>>>>>>>>>> >>>>>>>>>>>>>> This change has passed all tests on submit repo >>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 >>>>>>>>>>>>>> container. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> From suenaga at oss.nttdata.com Fri Mar 20 09:45:19 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Fri, 20 Mar 2020 18:45:19 +0900 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> Message-ID: <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> Hi Chris, I uploaded new webrev which includes reverting change for ProblemList: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/ I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301), but it has failed in ClhsdbJstackXcompStress.java. However I think it is not caused by this change because ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed mode, it would not parse DWARF. Thanks, Yasumasa On 2020/03/20 13:55, Chris Plummer wrote: > Hi Yasumasa, > > The test has been problem listed so please add undoing this to your webrev. 
Here's the diff that problem listed it: > > diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt > --- a/test/hotspot/jtreg/ProblemList.txt > +++ b/test/hotspot/jtreg/ProblemList.txt > @@ -115,7 +115,7 @@ > ?serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all > ?serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all > ?serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all > -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all > +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all > ?serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all > ?serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64 > ?serviceability/sa/ClhsdbSource.java 8193639 solaris-all > > thanks, > > Chris > > On 3/16/20 5:07 AM, Yasumasa Suenaga wrote: >> Hi all, >> >> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests. >> So please review it: >> >> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>> Thank you so much, David! >>> >>> Yasumasa >>> >>> >>> On 2020/03/16 21:01, David Holmes wrote: >>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>> Hi David, >>>>>> >>>>>> I missed loop condition, so I fixed it and pushed to submit repo. >>>>>> Could you try again? >>>>>> >>>>>> ?? http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>> >>>>>> webrev is here: >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>> >>>>> Test job resubmitted. Will advise results if it completes before I go to bed :) >>>> >>>> Seems to have passed okay. >>>> >>>> David >>>> >>>>> David >>>>> >>>>>> >>>>>> Thanks a lot! 
>>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>> Sorry it is still crashing. >>>>>>> >>>>>>> # >>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>> # >>>>>>> #? SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704 >>>>>>> # >>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>> # Problematic frame: >>>>>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>> # >>>>>>> >>>>>>> Same as before. >>>>>>> >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job. >>>>>>>> >>>>>>>> Thanks for that tip Chris! >>>>>>>> >>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result. >>>>>>>>> I will share you when I get job ID. >>>>>>>> >>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests. >>>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> Thank you for testing it. >>>>>>>>>>> >>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA. >>>>>>>>>>> Could you try it? >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>> >>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>> I've pushed it to submit repo. 
>>>>>>>>>>> >>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>> Correction ... >>>>>>>>>>>> >>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. >>>>>>>>>>>>> >>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>> >>>>>>>>>>>>> # >>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>> # >>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >>>>>>>>>>>>> # >>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>> >>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now. >>>>>>>>>>>> >>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes. >>>>>>>>>>>> >>>>>>>>>>>> It doesn't fail for me locally. >>>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ?? 
JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames in jstack mixed mode. >>>>>>>>>>>>>>> However some error has seen intermittently after that. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I investigated the cause of this, I found two concerns: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ?? A: lack of buffer (.eh_frame section data) range check >>>>>>>>>>>>>>> ?? B: Language personality routine and Language Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I addd range check for .eh_frame processing, and ignore personality routine and LSDA in this webrev. >>>>>>>>>>>>>>> Also I added bailout code if DWARF processing is failed due to these concerns. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> > > From coleen.phillimore at oracle.com Fri Mar 20 11:25:57 2020 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 20 Mar 2020 07:25:57 -0400 Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is unused in the SA In-Reply-To: References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com> Message-ID: <4f42fa70-77b8-76f7-85af-57ad15419812@oracle.com> Thanks Serguei! Coleen On 3/19/20 10:32 PM, serguei.spitsyn at oracle.com wrote: > +1 > > Thanks, > Serguei > > On 3/19/20 13:12, Lois Foltan wrote: >> Looks good. >> Lois >> >> On 3/19/2020 3:46 PM, coleen.phillimore at oracle.com wrote: >>> Summary: remove unused code that is changing in Hotspot for hidden >>> classes. >>> >>> Ran tier1-3 tests.? See bug for more details. 
>>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320 >>> >>> Thanks, >>> Coleen >> > From coleen.phillimore at oracle.com Fri Mar 20 11:28:26 2020 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 20 Mar 2020 07:28:26 -0400 Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is unused in the SA In-Reply-To: References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com> Message-ID: <496b7e2c-09f3-913c-df19-dd8a475e4b67@oracle.com> On 3/19/20 6:43 PM, David Holmes wrote: > Hi Coleen, > > On 20/03/2020 5:46 am, coleen.phillimore at oracle.com wrote: >> Summary: remove unused code that is changing in Hotspot for hidden >> classes. > > I'm not sure how to identify unused code in the SA given that it > exposes a Java API for querying the JVM internals. You say > getisUnsafeAnonymous() is unused because nothing in the SA calls it. > But the same would seem to be true for other parts of the CLD API - > for example > > - ClassLoaderData::dictionary() is called from > ? - ClassLoaderData::allEntriesDo, is called from > ??? - ClassLoaderDataGraph::allEntriesDo, is called from > ????? - nowhere ??? Actually I had a look at that too because, of course, I was trying to remove more.? I think there is a caller for that: utilities/soql/sa.js: sa.sysDict["allEntriesDo(sun.jvm.hotspot.classfile.ClassLoaderDataGraph.ClassAndLoaderVisitor)"](visitor); But I don't know what the java script interface to SA is.? So I thought I'd leave it for now.? It might actually be useful theoretically. Thanks, Coleen > > David > ----- > >> Ran tier1-3 tests.? See bug for more details. 
>> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8241320 >> >> Thanks, >> Coleen From serguei.spitsyn at oracle.com Fri Mar 20 15:11:53 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 20 Mar 2020 08:11:53 -0700 Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is unused in the SA In-Reply-To: <496b7e2c-09f3-913c-df19-dd8a475e4b67@oracle.com> References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com> <496b7e2c-09f3-913c-df19-dd8a475e4b67@oracle.com> Message-ID: <3eb26fb1-2292-f800-5245-31edebaa27b2@oracle.com> On 3/20/20 04:28, coleen.phillimore at oracle.com wrote: > > > On 3/19/20 6:43 PM, David Holmes wrote: >> Hi Coleen, >> >> On 20/03/2020 5:46 am, coleen.phillimore at oracle.com wrote: >>> Summary: remove unused code that is changing in Hotspot for hidden >>> classes. >> >> I'm not sure how to identify unused code in the SA given that it >> exposes a Java API for querying the JVM internals. You say >> getisUnsafeAnonymous() is unused because nothing in the SA calls it. >> But the same would seem to be true for other parts of the CLD API - >> for example >> >> - ClassLoaderData::dictionary() is called from >> ? - ClassLoaderData::allEntriesDo, is called from >> ??? - ClassLoaderDataGraph::allEntriesDo, is called from >> ????? - nowhere ??? > > Actually I had a look at that too because, of course, I was trying to > remove more.? I think there is a caller for that: > > utilities/soql/sa.js: > sa.sysDict["allEntriesDo(sun.jvm.hotspot.classfile.ClassLoaderDataGraph.ClassAndLoaderVisitor)"](visitor); > > But I don't know what the java script interface to SA is.? So I > thought I'd leave it for now.? It might actually be useful theoretically. We have a plan to remove the java script support from SA. Chris P. investigated this and, probably, can tell more. 

Thanks,
Serguei

>
> Thanks,
> Coleen
>
>>
>> David
>> -----
>>
>>> Ran tier1-3 tests. See bug for more details.
>>>
>>> open webrev at
>>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>>>
>>> Thanks,
>>> Coleen
>

From rkennke at redhat.com Fri Mar 20 15:30:24 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 20 Mar 2020 16:30:24 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
Message-ID: <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>

I believe I came up with a much simpler solution that also solves the
problems of the existing one, and the ones I proposed earlier.

It turns out that we can take advantage of the fact that we can use
*anything* as tags in JVMTI, even pointers to stuff (this is explicitly
mentioned in the JVMTI spec). This means we can simply stick a pointer
to the signature of a class into the tag, and pull it out again when we
get notified that the class gets unloaded.

This means we don't need an extra data structure to keep track of
classes and signatures, and it also makes the story around locking
*much* simpler. Performance-wise this is O(1), i.e. no scanning of all
classes is needed (as in the current implementation) and no searching of
a table is needed (as in my previous attempts).

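The tag-as-pointer idea described above can be sketched roughly as
follows. This is a minimal illustration only, not the code in the
webrev: it assumes a 64-bit tag type standing in for JVMTI's jlong so
that it compiles without jvmti.h, and the names tag_t, sig_to_tag,
tag_to_sig, on_class_prepare and on_object_free are hypothetical. In the
agent, the tag would presumably be handed to SetTag and come back through
the ObjectFree callback; here plain C functions simulate both sides.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JVMTI's jlong (assumption: a 64-bit signed integer). */
typedef int64_t tag_t;

/* Hypothetical helper: pack a signature pointer into a tag. */
static tag_t sig_to_tag(char *signature) {
    /* The pointer must fit into the 64-bit tag for this trick to work. */
    assert(sizeof(char *) <= sizeof(tag_t));
    return (tag_t)(intptr_t)signature;
}

/* Hypothetical helper: recover the signature pointer from a tag. */
static char *tag_to_sig(tag_t tag) {
    return (char *)(intptr_t)tag;
}

/* Minimal stand-in for the bag of unloaded-class signatures. */
#define MAX_DELETED 16
static char *deleted[MAX_DELETED];
static int deleted_count = 0;

/* What a class-prepare hook might do: copy the signature and return the
 * tag that would be attached to the class object. Returns 0 on failure;
 * 0 is also the value JVMTI uses for "not tagged". */
static tag_t on_class_prepare(const char *signature) {
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return sig_to_tag(copy);
}

/* What an ObjectFree-style callback might do: the tag already is the
 * signature pointer, so no table lookup or class scan is needed. */
static void on_object_free(tag_t tag) {
    char *sig = tag_to_sig(tag);
    if (sig == NULL) {
        return;                 /* untagged object, nothing to record */
    }
    if (deleted_count < MAX_DELETED) {
        deleted[deleted_count++] = sig;
    } else {
        free(sig);              /* sketch only: drop it rather than leak */
    }
}
```

The point of the sketch is that the unload path does constant work per
class: the only state touched is the bag itself, which is also the only
thing that would need a lock.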
Please review this new revision: http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ (Notice that there still appears to be a performance bottleneck with class-unloading when an actual debugger is attached. This doesn't seem to be related to the classTrack.c implementation though, but looks like a consequence of getting all those class-unload notifications over the wire. My testcase generates 1000s of them, and it's clogging up the buffers.) I am not sure why jdb needs to enable class-unload listener always. A simple hack disables it, and performance is brilliant, even when jdb is attached: http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch But this is not in the scope of this bug.) Roman On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: > Sorry, forgot to complete my comments at the end (see below). > > > On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >> Hi Roman, >> >> Thank you for the update and sorry for the latency in review. >> >> Some comments are below. >> >> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >> >> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >> 88 { >> 89 debugMonitorEnter(deletedSignatureLock); >> 90 if (currentClassTag == -1) { >> 91 // Class tracking not initialized, nobody's interested >> 92 debugMonitorExit(deletedSignatureLock); >> 93 return; >> 94 } >> Just a question: >> ? Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does >> ????? the class tracking if class tracking has not been initialized? >> >> 70 static jlong currentClassTag; I'm thinking if the name is better to >> be something like: lastClassTag or highestClassTag. >> >> 99 KlassNode* klass = *klass_ptr; >> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >> klass_ptr = &klass->next; 104 klass = *klass_ptr; >> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // klass not >> found - ignore. 
>> 107 debugMonitorExit(deletedSignatureLock); >> 108 return; >> 109 } >> ?It seems to me, something is wrong in the condition at L106 above. >> ?Should it be? : >> ??? if (klass == NULL || klass->klass_tag != tag) >> >> ?Otherwise, how can the second check ever work correctly as the return >> will always happen when (klass != NULL)? >> >> ? >> There are several places in this file with the the indent: >> 90 if (currentClassTag == -1) { >> 91 // Class tracking not initialized, nobody's interested >> 92 debugMonitorExit(deletedSignatureLock); >> 93 return; >> 94 } >> ... >> 152 if (currentClassTag == -1) { >> 153 // Class tracking not initialized yet, nobody's interested >> 154 debugMonitorExit(deletedSignatureLock); >> 155 return; >> 156 } >> ... >> 161 if (error != JVMTI_ERROR_NONE) { >> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >> 163 } >> 164 if (tag != 0l) { >> 165 debugMonitorExit(deletedSignatureLock); >> 166 return; // Already added >> 167 } >> ... >> 281 cleanDeleted(void *signatureVoid, void *arg) >> 282 { >> 283 char* sig = (char*)signatureVoid; >> 284 jvmtiDeallocate(sig); >> 285 return JNI_TRUE; >> 286 } >> ... >> 291 void >> 292 classTrack_reset(void) >> 293 { >> 294 int idx; >> 295 debugMonitorEnter(deletedSignatureLock); >> 296 >> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >> 298 KlassNode* node = table[idx]; >> 299 while (node != NULL) { >> 300 KlassNode* next = node->next; >> 301 jvmtiDeallocate(node->signature); >> 302 jvmtiDeallocate(node); >> 303 node = next; >> 304 } >> 305 } >> 306 jvmtiDeallocate(table); >> 307 >> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >> 309 bagDestroyBag(deletedSignatureBag); >> 310 >> 311 currentClassTag = -1; >> 312 >> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >> 314 trackingEnv = NULL; >> 315 >> 316 debugMonitorExit(deletedSignatureLock); >> >> Could you, please, fix several comments below? 
>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads >> ?The comma is not needed. >> ?Would it better to replace: klass tags => klass_tag's ? >> >> >> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >> consistent >> ?Maybe: Lock to guard ... or lock to keep integrity of ... >> >> 84 * Callback when classes are freed, Finds the signature and >> remembers it in deletedSignatureBag. Would be better to use words like >> "store" or "record", "Find" should not start from capital letter: >> Invoke the callback when classes are freed, find and record the >> signature in deletedSignatureBag. >> >> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >> nobody's interested 153 // Class tracking not initialized yet, >> nobody's interested 158 /* Check this is not a duplicate */ Missed dot >> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { // >> klass not found - ignore. In opposite, dot is not needed as the >> comment does not start from a capital letter. 111 // At this point we >> have the KlassNode corresponding to the tag >> 112 // in klass, and the pointer to it in klass_node. > > The comment above can be better. Maybe, something like: > ? " At this point, we found the KlassNode matching the klass tag(and it is > linked). > >> 113 // Remember the unloaded signature. > ?Better: Record the signature of the unloaded class and unlink it. > > Thanks, > Serguei > >> Thanks, >> Serguei >> >> On 3/9/20 05:39, Roman Kennke wrote: >>> Hello all, >>> >>> Can I please get reviews of this change? In the meantime, we've done >>> more testing and also field-/torture-testing by a customer who is happy >>> now. :-) >>> >>> Thanks, >>> Roman >>> >>> >>>> Hi Serguei, >>>> >>>> Thanks for reviewing! >>>> >>>> I updated the patch to reflect your suggestions, very good! 
>>>> It also includes a fix to allow re-connecting an agent after disconnect, >>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>> _activate() to ensure have those structures after re-connect. >>>> >>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>> >>>> Let me know what you think! >>>> Roman >>>> >>>>> Hi Roman, >>>>> >>>>> Thank you for taking care about this scalability issue! >>>>> >>>>> I have a couple of quick comments. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>> >>>>> 72 /* >>>>> 73 * Lock to protect deletedSignatureBag >>>>> 74 */ >>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>> 78 * A bag containing all the deleted classes' signatures. Must be >>>>> accessed under >>>>> 79 * deletedTagLock, >>>>> 80 */ >>>>> 81 struct bag* deletedSignatureBag; >>>>> >>>>> ? The comments contradict to each other. >>>>> ? I guess, the lock name at line 79 has to be deletedSignatureLock >>>>> instead of deletedTagLock. >>>>> ? Also, comma at the end must be replaced with dot. >>>>> >>>>> >>>>> 101 // Tag not found? Ignore. >>>>> 102 if (klass == NULL) { >>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>> 104 return; >>>>> 105 } >>>>> 106 >>>>> 107 // Scan linked-list. >>>>> 108 jlong found_tag = klass->klass_tag; >>>>> 109 while (klass != NULL && found_tag != tag) { >>>>> 110 klass_ptr = &klass->next; >>>>> 111 klass = *klass_ptr; >>>>> 112 found_tag = klass->klass_tag; >>>>> 113 } >>>>> 114 >>>>> 115 // Tag not found? Ignore. >>>>> 116 if (found_tag != tag) { >>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>> 118 return; >>>>> 119 } >>>>> >>>>> >>>>> ?The code above can be simplified, so that the lines 101-105 are not >>>>> needed anymore. >>>>> ?It can be something like this: >>>>> >>>>> // Scan linked-list. 
>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>> klass_ptr = &klass->next; >>>>> klass = *klass_ptr; >>>>> } >>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore. >>>>> debugMonitorExit(deletedSignatureLock); >>>>> return; >>>>> } >>>>> >>>>> It will take more time when I get a chance to look at the rest. >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> >>>>> >>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>> Here comes an update that resolves some races that happen when >>>>>> disconnecting an agent. In particular, we need to take the lock on >>>>>> basically every operation, and also need to check whether or not >>>>>> class-tracking is active and return an appropriate result (e.g. an empty >>>>>> list) when we're not. >>>>>> >>>>>> Updated webrev: >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>> >>>>>> Thanks, >>>>>> Roman >>>>>> >>>>>> >>>>>>> So, here comes the O(1) implementation: >>>>>>> >>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we >>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>> - Prepared classes are kept in a datastructure that is a table, which >>>>>>> each entry being the head of a linked-list of KlassNode*. The table is >>>>>>> indexed by tag % slot-count, and then simply prepend the new KlassNode*. >>>>>>> This is O(1) operation. >>>>>>> - When we get notified of unloading a class, we look up the signature of >>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode* >>>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation >>>>>>> too, depending on the depth of the table. In my testcase which hammered >>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3, >>>>>>> but not usually more. It should be ok. >>>>>>> - when processUnloads() gets called, we simply hand out that bag, and >>>>>>> allocate a new one. 
>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the >>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or >>>>>>> re-attached (was missing before). >>>>>>> - I also added locks around data-structure-manipulation (was missing >>>>>>> before). >>>>>>> - Also, I only activate this whole process when an actual listener gets >>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a >>>>>>> jdb, not sure why jdb does that though. This may be something to improve >>>>>>> in the future? >>>>>>> >>>>>>> In my tests, the performance of class-tracking itself looks really good. >>>>>>> The bottleneck now is clearly actual synthesizing the class-unload >>>>>>> events. I don't see how this can be helped when the debug agent asks for it? >>>>>>> >>>>>>> Updated webrev: >>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>> >>>>>>> Please let me know what you think of it. >>>>>>> >>>>>>> Thanks, >>>>>>> Roman >>>>>>> >>>>>>> >>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more >>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now. >>>>>>>> >>>>>>>> Thanks,Roman >>>>>>>> >>>>>>>> Hi Chris, >>>>>>>>>> I'll have a look at this, although it might not be for a few days. In >>>>>>>>>> the meantime, maybe you can describe your new implementation in >>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>> Sure. >>>>>>>>> >>>>>>>>> The purpose of this class-tracking is to be able to determine the >>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that >>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>> >>>>>>>>> The current implementation does so by maintaining a table of currently >>>>>>>>> prepared classes by building that table when classTrack is initialized, >>>>>>>>> and then add new classes whenever a class gets loaded. 
When unloading >>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the >>>>>>>>> old table, and whatever is in the old, but not in the new table gets >>>>>>>>> returned. The problem is that when GCs happen frequently and/or many >>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount) >>>>>>>>> complexity. >>>>>>>>> >>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also >>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an >>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes >>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the >>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned. >>>>>>>>> >>>>>>>>> The implementation is not perfect. In order to determine whether or not >>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is >>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that >>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be >>>>>>>>> true, and also reasonable to expect. >>>>>>>>> >>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it >>>>>>>>> would be considerably more complex: have to maintain a (hash)table that >>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the >>>>>>>>> unloaded-signatures list there, but I don't currently see that it's >>>>>>>>> worth the effort). >>>>>>>>> >>>>>>>>> In addition to all that, this process is only activated when there's an >>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Roman >>>>>>>>> >>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>> Hello all, >>>>>>>>>>> >>>>>>>>>>> Issue: >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>> >>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. 
It avoids
>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>
>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>
>>>>>>>>>>> Webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>
>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>
>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>
>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>
>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>
>

From chris.plummer at oracle.com Fri Mar 20 19:23:19 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Mar 2020 12:23:19 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

The failure is due to JDK-8231634, so not something you need to worry about.

thanks,

Chris

On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> I uploaded new webrev which includes reverting change for ProblemList:
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>
> I tested it on submit repo
> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
> but it has failed in ClhsdbJstackXcompStress.java.
> However I think it is not caused by this change because
> ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed mode, it
> would not parse DWARF.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/03/20 13:55, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> The test has been problem listed so please add undoing this to your
>> webrev. Here's the diff that problem listed it:
>>
>> diff --git a/test/hotspot/jtreg/ProblemList.txt
>> b/test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt
>> +++ b/test/hotspot/jtreg/ProblemList.txt
>> @@ -115,7 +115,7 @@
>>   serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>   serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>   serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956
>> solaris-all,linux-all
>>   serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639
>> solaris-all
>>   serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731
>> solaris-all,linux-x64,macosx-x64,windows-x64
>>   serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>
>> thanks,
>>
>> Chris
>>
>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> This webrev has passed submit repo
>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and
>>> additional tests.
>>> So please review it:
>>>
>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>>> Thank you so much, David! >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/03/16 21:01, David Holmes wrote: >>>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> I missed loop condition, so I fixed it and pushed to submit repo. >>>>>>> Could you try again? >>>>>>> >>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>> >>>>>>> webrev is here: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>> >>>>>> Test job resubmitted. Will advise results if it completes before >>>>>> I go to bed :) >>>>> >>>>> Seems to have passed okay. >>>>> >>>>> David >>>>> >>>>>> David >>>>>> >>>>>>> >>>>>>> Thanks a lot! >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>> Sorry it is still crashing. >>>>>>>> >>>>>>>> # >>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>> # >>>>>>>> #? SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704 >>>>>>>> # >>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) >>>>>>>> (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, >>>>>>>> sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>> # Problematic frame: >>>>>>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned >>>>>>>> long)+0x4e >>>>>>>> # >>>>>>>> >>>>>>>> Same as before. 
>>>>>>>> >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and >>>>>>>>>>> run additional internal tests (and even more builds) using >>>>>>>>>>> that job. >>>>>>>>> >>>>>>>>> Thanks for that tip Chris! >>>>>>>>> >>>>>>>>>> I've pushed the change to submit repo, but I've not yet >>>>>>>>>> received the result. >>>>>>>>>> I will share you when I get job ID. >>>>>>>>> >>>>>>>>> We can see the id. Just need to wait for the builds to >>>>>>>>> complete before submitting the additional tests. >>>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for testing it. >>>>>>>>>>>> >>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF >>>>>>>>>>>> has language personality routine or LSDA. >>>>>>>>>>>> Could you try it? >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>> >>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>> >>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>> Correction ... >>>>>>>>>>>>> >>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I can't review this as I know nothing about the code, >>>>>>>>>>>>>>> but I'm putting the patch through our internal testing. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>> >>>>>>>>>>>>>> # >>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime >>>>>>>>>>>>>> Environment: >>>>>>>>>>>>>> # >>>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, >>>>>>>>>>>>>> tid=16949 >>>>>>>>>>>>>> # >>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) >>>>>>>>>>>>>> (fastdebug build >>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, >>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 gc, >>>>>>>>>>>>>> linux-amd64) >>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] >>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>> >>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash >>>>>>>>>>>>>> now. >>>>>>>>>>>>> >>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the >>>>>>>>>>>>> test in linux-x64. I don't see a pattern as to where it >>>>>>>>>>>>> fails versus passes. >>>>>>>>>>>>> >>>>>>>>>>>>> It doesn't fail for me locally. >>>>>>>>>>>>> >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>> ?? webrev: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding >>>>>>>>>>>>>>>> native frames in jstack mixed mode. >>>>>>>>>>>>>>>> However some error has seen intermittently after that. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I investigated the cause of this, I found two concerns: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ?? A: lack of buffer (.eh_frame section data) range check >>>>>>>>>>>>>>>> ?? B: Language personality routine and Language >>>>>>>>>>>>>>>> Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I addd range check for .eh_frame processing, and ignore >>>>>>>>>>>>>>>> personality routine and LSDA in this webrev. >>>>>>>>>>>>>>>> Also I added bailout code if DWARF processing is failed >>>>>>>>>>>>>>>> due to these concerns. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This change has passed all tests on submit repo >>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux >>>>>>>>>>>>>>>> 7.7 container. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >> >> From coleen.phillimore at oracle.com Fri Mar 20 19:28:36 2020 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 20 Mar 2020 15:28:36 -0400 Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is unused in the SA In-Reply-To: <3eb26fb1-2292-f800-5245-31edebaa27b2@oracle.com> References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com> <496b7e2c-09f3-913c-df19-dd8a475e4b67@oracle.com> <3eb26fb1-2292-f800-5245-31edebaa27b2@oracle.com> Message-ID: <599f317f-2330-248e-d456-3fb506090abf@oracle.com> On 3/20/20 11:11 AM, serguei.spitsyn at oracle.com wrote: > > On 3/20/20 04:28, coleen.phillimore at oracle.com wrote: >> >> >> On 3/19/20 6:43 PM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 20/03/2020 5:46 am, coleen.phillimore at oracle.com wrote: >>>> Summary: remove unused code that is changing in Hotspot for hidden >>>> classes. >>> >>> I'm not sure how to identify unused code in the SA given that it >>> exposes a Java API for querying the JVM internals. 
You say
>>> getisUnsafeAnonymous() is unused because nothing in the SA calls it.
>>> But the same would seem to be true for other parts of the CLD API -
>>> for example
>>>
>>> - ClassLoaderData::dictionary() is called from
>>>   - ClassLoaderData::allEntriesDo, is called from
>>>     - ClassLoaderDataGraph::allEntriesDo, is called from
>>>       - nowhere ???
>>
>> Actually I had a look at that too because, of course, I was trying to
>> remove more. I think there is a caller for that:
>>
>> utilities/soql/sa.js:
>> sa.sysDict["allEntriesDo(sun.jvm.hotspot.classfile.ClassLoaderDataGraph.ClassAndLoaderVisitor)"](visitor);
>>
>> But I don't know what the java script interface to SA is. So I
>> thought I'd leave it for now. It might actually be useful
>> theoretically.
>

I had second thoughts about it being useful from SA. I think if we
wanted to see what classes were loaded in the system dictionary for each
loader, we could write a pretty simple Python script from within gdb to
do so.

Coleen

> We have a plan to remove the java script support from SA.
> Chris P. investigated this and, probably, can tell more.
>
> Thanks,
> Serguei
>
>>
>> Thanks,
>> Coleen
>>
>>>
>>> David
>>> -----
>>>
>>>> Ran tier1-3 tests. See bug for more details.
>>>>
>>>> open webrev at
>>>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>>>>
>>>> Thanks,
>>>> Coleen
>>
>

From chris.plummer at oracle.com Fri Mar 20 19:52:30 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Mar 2020 12:52:30 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
Message-ID: <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>

On 3/20/20 8:30 AM, Roman Kennke wrote:
> I believe I came up with a much simpler solution that also solves the
> problems of the existing one, and the ones I proposed earlier.
>
> It turns out that we can take advantage of the fact that we can use
> *anything* as tags in JVMTI, even pointers to stuff (this is explicitely
> mentioned in the JVMTI spec). This means we can simply stick a pointer
> to the signature of a class into the tag, and pull it out again when we
> get notified that the class gets unloaded.
>
> This means we don't need an extra data-structure to keep track of
> classes and signatures, and it also makes the story around locking
> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
> classes needed (as in the current implementation) and no searching of
> table needed (like in my previous attempts).
> > Please review this new revision: > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ I'll have a look at this. > > (Notice that there still appears to be a performance bottleneck with > class-unloading when an actual debugger is attached. This doesn't seem > to be related to the classTrack.c implementation though, but looks like > a consequence of getting all those class-unload notifications over the > wire. My testcase generates 1000s of them, and it's clogging up the > buffers.) At least this is only a one-shot hit when the classes are unloaded, and the performance hit is based on the number of classes being unloaded. The main issue is happening every GC, and is O(n) where n is the number of loaded classes. > I am not sure why jdb needs to enable class-unload listener always. A > simple hack disables it, and performance is brilliant, even when jdb is > attached: > http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch This is JDI, not jdb. It looks like it needs ClassUnload events so it can maintain typesBySignature, which is used by public APIs like allClasses(). So we have caching of loaded classes both in the debug agent and in JDI. Chris > But this is not in the scope of this bug.) > > Roman > > > On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >> Sorry, forgot to complete my comments at the end (see below). >> >> >> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>> Hi Roman, >>> >>> Thank you for the update and sorry for the latency in review. >>> >>> Some comments are below. >>> >>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>> >>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>> 88 { >>> 89 debugMonitorEnter(deletedSignatureLock); >>> 90 if (currentClassTag == -1) { >>> 91 // Class tracking not initialized, nobody's interested >>> 92 debugMonitorExit(deletedSignatureLock); >>> 93 return; >>> 94 } >>> Just a question: >>> ? 
Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does >>> the class tracking if class tracking has not been initialized? >>> >>> 70 static jlong currentClassTag; I'm wondering whether a name >>> like lastClassTag or highestClassTag would be better. >>> >>> 99 KlassNode* klass = *klass_ptr; >>> 100 >>> 102 while (klass != NULL && klass->klass_tag != tag) { >>> 103 klass_ptr = &klass->next; >>> 104 klass = *klass_ptr; >>> 105 } >>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore. >>> 107 debugMonitorExit(deletedSignatureLock); >>> 108 return; >>> 109 } >>> It seems to me that something is wrong in the condition at L106 above. >>> Should it be: >>> if (klass == NULL || klass->klass_tag != tag) >>> >>> Otherwise, how can the second check ever work correctly, as the return >>> will always happen when (klass != NULL)? >>> >>> >>> There are several places in this file with the indent: >>> 90 if (currentClassTag == -1) { >>> 91 // Class tracking not initialized, nobody's interested >>> 92 debugMonitorExit(deletedSignatureLock); >>> 93 return; >>> 94 } >>> ... >>> 152 if (currentClassTag == -1) { >>> 153 // Class tracking not initialized yet, nobody's interested >>> 154 debugMonitorExit(deletedSignatureLock); >>> 155 return; >>> 156 } >>> ... >>> 161 if (error != JVMTI_ERROR_NONE) { >>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>> 163 } >>> 164 if (tag != 0l) { >>> 165 debugMonitorExit(deletedSignatureLock); >>> 166 return; // Already added >>> 167 } >>> ... >>> 281 cleanDeleted(void *signatureVoid, void *arg) >>> 282 { >>> 283 char* sig = (char*)signatureVoid; >>> 284 jvmtiDeallocate(sig); >>> 285 return JNI_TRUE; >>> 286 } >>> ... 
>>> 291 void >>> 292 classTrack_reset(void) >>> 293 { >>> 294 int idx; >>> 295 debugMonitorEnter(deletedSignatureLock); >>> 296 >>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>> 298 KlassNode* node = table[idx]; >>> 299 while (node != NULL) { >>> 300 KlassNode* next = node->next; >>> 301 jvmtiDeallocate(node->signature); >>> 302 jvmtiDeallocate(node); >>> 303 node = next; >>> 304 } >>> 305 } >>> 306 jvmtiDeallocate(table); >>> 307 >>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>> 309 bagDestroyBag(deletedSignatureBag); >>> 310 >>> 311 currentClassTag = -1; >>> 312 >>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>> 314 trackingEnv = NULL; >>> 315 >>> 316 debugMonitorExit(deletedSignatureLock); >>> >>> Could you, please, fix several comments below? >>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads >>> ?The comma is not needed. >>> ?Would it better to replace: klass tags => klass_tag's ? >>> >>> >>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>> consistent >>> ?Maybe: Lock to guard ... or lock to keep integrity of ... >>> >>> 84 * Callback when classes are freed, Finds the signature and >>> remembers it in deletedSignatureBag. Would be better to use words like >>> "store" or "record", "Find" should not start from capital letter: >>> Invoke the callback when classes are freed, find and record the >>> signature in deletedSignatureBag. >>> >>> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >>> nobody's interested 153 // Class tracking not initialized yet, >>> nobody's interested 158 /* Check this is not a duplicate */ Missed dot >>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { // >>> klass not found - ignore. In opposite, dot is not needed as the >>> comment does not start from a capital letter. 111 // At this point we >>> have the KlassNode corresponding to the tag >>> 112 // in klass, and the pointer to it in klass_node. 
>> The comment above can be better. Maybe, something like: >> ? " At this point, we found the KlassNode matching the klass tag(and it is >> linked). >> >>> 113 // Remember the unloaded signature. >> ?Better: Record the signature of the unloaded class and unlink it. >> >> Thanks, >> Serguei >> >>> Thanks, >>> Serguei >>> >>> On 3/9/20 05:39, Roman Kennke wrote: >>>> Hello all, >>>> >>>> Can I please get reviews of this change? In the meantime, we've done >>>> more testing and also field-/torture-testing by a customer who is happy >>>> now. :-) >>>> >>>> Thanks, >>>> Roman >>>> >>>> >>>>> Hi Serguei, >>>>> >>>>> Thanks for reviewing! >>>>> >>>>> I updated the patch to reflect your suggestions, very good! >>>>> It also includes a fix to allow re-connecting an agent after disconnect, >>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>> _activate() to ensure have those structures after re-connect. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>> >>>>> Let me know what you think! >>>>> Roman >>>>> >>>>>> Hi Roman, >>>>>> >>>>>> Thank you for taking care about this scalability issue! >>>>>> >>>>>> I have a couple of quick comments. >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>> >>>>>> 72 /* >>>>>> 73 * Lock to protect deletedSignatureBag >>>>>> 74 */ >>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>> 78 * A bag containing all the deleted classes' signatures. Must be >>>>>> accessed under >>>>>> 79 * deletedTagLock, >>>>>> 80 */ >>>>>> 81 struct bag* deletedSignatureBag; >>>>>> >>>>>> ? The comments contradict to each other. >>>>>> ? I guess, the lock name at line 79 has to be deletedSignatureLock >>>>>> instead of deletedTagLock. >>>>>> ? Also, comma at the end must be replaced with dot. >>>>>> >>>>>> >>>>>> 101 // Tag not found? Ignore. 
>>>>>> 102 if (klass == NULL) { >>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>> 104 return; >>>>>> 105 } >>>>>> 106 >>>>>> 107 // Scan linked-list. >>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>> 110 klass_ptr = &klass->next; >>>>>> 111 klass = *klass_ptr; >>>>>> 112 found_tag = klass->klass_tag; >>>>>> 113 } >>>>>> 114 >>>>>> 115 // Tag not found? Ignore. >>>>>> 116 if (found_tag != tag) { >>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>> 118 return; >>>>>> 119 } >>>>>> >>>>>> >>>>>> ?The code above can be simplified, so that the lines 101-105 are not >>>>>> needed anymore. >>>>>> ?It can be something like this: >>>>>> >>>>>> // Scan linked-list. >>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>> klass_ptr = &klass->next; >>>>>> klass = *klass_ptr; >>>>>> } >>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore. >>>>>> debugMonitorExit(deletedSignatureLock); >>>>>> return; >>>>>> } >>>>>> >>>>>> It will take more time when I get a chance to look at the rest. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>> Here comes an update that resolves some races that happen when >>>>>>> disconnecting an agent. In particular, we need to take the lock on >>>>>>> basically every operation, and also need to check whether or not >>>>>>> class-tracking is active and return an appropriate result (e.g. an empty >>>>>>> list) when we're not. >>>>>>> >>>>>>> Updated webrev: >>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>> >>>>>>> Thanks, >>>>>>> Roman >>>>>>> >>>>>>> >>>>>>>> So, here comes the O(1) implementation: >>>>>>>> >>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we >>>>>>>> set-up a listener to get notified when it is unloaded. 
>>>>>>>> - Prepared classes are kept in a data structure that is a table, with >>>>>>>> each entry being the head of a linked list of KlassNode*. The table is >>>>>>>> indexed by tag % slot-count, and a new KlassNode* is simply prepended to the list in its slot. >>>>>>>> This is an O(1) operation. >>>>>>>> - When we get notified of unloading a class, we look up the signature of >>>>>>>> the reported tag in that table, and record it in a bag. The KlassNode* >>>>>>>> is then unlinked from the table and deallocated. This is a ~O(1) operation >>>>>>>> too, depending on the depth of the table. In my testcase, which hammered >>>>>>>> the code with class-loads and unloads, I usually see depths of about 2-3, >>>>>>>> rarely more. It should be ok. >>>>>>>> - When processUnloads() gets called, we simply hand out that bag, and >>>>>>>> allocate a new one. >>>>>>>> - I also added cleanup code in classTrack_reset() to avoid leaking the >>>>>>>> signatures, KlassNode* etc. when the debug agent gets detached and/or >>>>>>>> re-attached (was missing before). >>>>>>>> - I also added locks around data-structure manipulation (was missing >>>>>>>> before). >>>>>>>> - Also, I only activate this whole process when an actual listener gets >>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching >>>>>>>> jdb, not sure why jdb does that though. This may be something to improve >>>>>>>> in the future? >>>>>>>> >>>>>>>> In my tests, the performance of class-tracking itself looks really good. >>>>>>>> The bottleneck now is clearly the actual synthesizing of the class-unload >>>>>>>> events. I don't see how this can be helped when the debug agent asks for it? >>>>>>>> >>>>>>>> Updated webrev: >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>> >>>>>>>> Please let me know what you think of it. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Roman >>>>>>>> >>>>>>>> >>>>>>>>> Alright, the perfectionist in me got me. 
I am implementing the even more >>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now. >>>>>>>>> >>>>>>>>> Thanks,Roman >>>>>>>>> >>>>>>>>> Hi Chris, >>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In >>>>>>>>>>> the meantime, maybe you can describe your new implementation in >>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>> Sure. >>>>>>>>>> >>>>>>>>>> The purpose of this class-tracking is to be able to determine the >>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that >>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>> >>>>>>>>>> The current implementation does so by maintaining a table of currently >>>>>>>>>> prepared classes by building that table when classTrack is initialized, >>>>>>>>>> and then add new classes whenever a class gets loaded. When unloading >>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the >>>>>>>>>> old table, and whatever is in the old, but not in the new table gets >>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many >>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount) >>>>>>>>>> complexity. >>>>>>>>>> >>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also >>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an >>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes >>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the >>>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned. >>>>>>>>>> >>>>>>>>>> The implementation is not perfect. In order to determine whether or not >>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is >>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that >>>>>>>>>> unloadedClassCount << classCount. 
In my experiments this seems to be >>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>> >>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it >>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that >>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the >>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's >>>>>>>>>> worth the effort). >>>>>>>>>> >>>>>>>>>> In addition to all that, this process is only activated when there's an >>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Roman >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>> Hello all, >>>>>>>>>>>> >>>>>>>>>>>> Issue: >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>> >>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids >>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of >>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>> >>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent >>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>> >>>>>>>>>>>> Webrev: >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>> >>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing. >>>>>>>>>>>> >>>>>>>>>>>> Eg with the testcase provided here: >>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>> >>>>>>>>>>>> I am getting those numbers: >>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>> patched: no debug: 85s with debug: 95s >>>>>>>>>>>> >>>>>>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>>>>>> >>>>>>>>>>>> Can I please get a review? 
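For illustration, the tag-indexed table discussed in this thread (index by tag % slot-count, prepend on prepare, unlink on unload) could be sketched roughly as below. This is a standalone sketch under stated assumptions, not the classTrack.c code from any webrev: `SLOT_COUNT`, `copy_string`, `track_add` and `track_remove` are hypothetical names, plain `malloc`/`free` stand in for the agent's JVMTI allocation helpers, and the locking the thread calls for is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263            /* hypothetical table size */

typedef struct KlassNode {
    long long klass_tag;          /* tag assigned when the class was prepared */
    char *signature;              /* owned copy of the class signature */
    struct KlassNode *next;       /* next node in the same slot */
} KlassNode;

KlassNode *table[SLOT_COUNT];     /* each entry heads a linked list */

/* Duplicate a string with malloc (hypothetical helper). */
char *copy_string(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL) {
        memcpy(p, s, n);
    }
    return p;
}

/* Register a prepared class: O(1) prepend to its slot. Returns 0 on success. */
int track_add(long long tag, const char *signature) {
    assert(tag > 0 && signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }
    node->signature = copy_string(signature);
    if (node->signature == NULL) {
        free(node);
        return -1;
    }
    node->klass_tag = tag;
    size_t slot = (size_t)(tag % SLOT_COUNT);
    node->next = table[slot];
    table[slot] = node;
    return 0;
}

/* Handle an unload notification: unlink and free the node for this tag.
 * Returns the signature (caller frees it) or NULL if the tag is unknown. */
char *track_remove(long long tag) {
    if (tag <= 0) {
        return NULL;
    }
    KlassNode **link = &table[(size_t)(tag % SLOT_COUNT)];
    while (*link != NULL && (*link)->klass_tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {          /* klass not found - ignore */
        return NULL;
    }
    KlassNode *node = *link;
    char *signature = node->signature;
    *link = node->next;
    free(node);
    return signature;
}
```

Walking a pointer-to-pointer (`link`) keeps the unlink step free of a special case for the list head; the cost of a removal is the length of one slot's list, which matches the "~O(1), depending on depth" remark in the thread.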
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Roman >>>>>>>>>>>> From suenaga at oss.nttdata.com Sat Mar 21 00:55:04 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Sat, 21 Mar 2020 09:55:04 +0900 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> Message-ID: <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> Thanks Chris! I'm waiting for reviewers for this change. Yasumasa On 2020/03/21 4:23, Chris Plummer wrote: > Hi Yasumasa, > > The failure is due to JDK-8231634, so not something you need to worry about. > > thanks, > > Chris > > On 3/20/20 2:45 AM, Yasumasa Suenaga wrote: >> Hi Chris, >> >> I uploaded new webrev which includes reverting change for ProblemList: >> >> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/ >> >> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301), >> but it has failed in ClhsdbJstackXcompStress.java. >> However I think it is not caused by this change because ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed mode, it would not parse DWARF. >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/03/20 13:55, Chris Plummer wrote: >>> Hi Yasumasa, >>> >>> The test has been problem listed so please add undoing this to your webrev. 
Here's the diff that problem listed it: >>> >>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt >>> --- a/test/hotspot/jtreg/ProblemList.txt >>> +++ b/test/hotspot/jtreg/ProblemList.txt >>> @@ -115,7 +115,7 @@ >>> ??serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all >>> ??serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all >>> ??serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all >>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all >>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all >>> ??serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all >>> ??serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64 >>> ??serviceability/sa/ClhsdbSource.java 8193639 solaris-all >>> >>> thanks, >>> >>> Chris >>> >>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote: >>>> Hi all, >>>> >>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests. >>>> So please review it: >>>> >>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>>>> Thank you so much, David! >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/03/16 21:01, David Holmes wrote: >>>>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> I missed loop condition, so I fixed it and pushed to submit repo. >>>>>>>> Could you try again? >>>>>>>> >>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>>> >>>>>>>> webrev is here: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>> >>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :) >>>>>> >>>>>> Seems to have passed okay. 
>>>>>> >>>>>> David >>>>>> >>>>>>> David >>>>>>> >>>>>>>> >>>>>>>> Thanks a lot! >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>>> Sorry it is still crashing. >>>>>>>>> >>>>>>>>> # >>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>> # >>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704 >>>>>>>>> # >>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>> # Problematic frame: >>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>> # >>>>>>>>> >>>>>>>>> Same as before. >>>>>>>>> >>>>>>>>> David >>>>>>>>> ----- >>>>>>>>> >>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job. >>>>>>>>>> >>>>>>>>>> Thanks for that tip Chris! >>>>>>>>>> >>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result. >>>>>>>>>>> I will share it with you when I get the job ID. >>>>>>>>>> >>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests. >>>>>>>>>> >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for testing it. >>>>>>>>>>>>> >>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA. >>>>>>>>>>>>> Could you try it? 
>>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>> >>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7. >>>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>>> >>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>> Correction ... >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # >>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>>>> # >>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >>>>>>>>>>>>>>> # >>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes. >>>>>>>>>>>>>> >>>>>>>>>>>>>> It doesn't fail for me locally. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode. >>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I investigated the cause of this and found two concerns: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> A: lack of a range check on the buffer (.eh_frame section data) >>>>>>>>>>>>>>>>> B: Language personality routine and Language Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev. >>>>>>>>>>>>>>>>> I also added bailout code for when DWARF processing fails due to these concerns. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>>>> and I also tested it on my Fedora 31 box and Oracle Linux 7.7 container. 
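Concern A above, the missing range check on the .eh_frame buffer, can be illustrated with a small bounds-checked reader. This is only a sketch of the general technique and not the libsaproc code from the webrev: `Reader`, `read_u8` and `read_uleb128` are hypothetical names, and the real parser's structure may well differ. Every read fails instead of running past the end of the buffer, so a caller can bail out rather than crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cursor over a byte buffer such as mapped .eh_frame data (hypothetical type). */
typedef struct {
    const uint8_t *cur;   /* next byte to read */
    const uint8_t *end;   /* one past the last valid byte */
} Reader;

/* Read one byte; returns false instead of reading beyond the buffer. */
bool read_u8(Reader *r, uint8_t *out) {
    assert(r != NULL && out != NULL);
    if (r->cur >= r->end) {
        return false;
    }
    *out = *r->cur++;
    return true;
}

/* Decode an unsigned LEB128 value, the variable-length integer encoding
 * used by DWARF. Fails on truncated input or on a value wider than 64 bits. */
bool read_uleb128(Reader *r, uint64_t *out) {
    uint64_t result = 0;
    unsigned shift = 0;
    uint8_t byte;
    do {
        if (shift >= 64 || !read_u8(r, &byte)) {
            return false;
        }
        result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    *out = result;
    return true;
}
```

A caller would treat a `false` return as the signal to stop DWARF processing for that frame, which is the kind of bailout the message describes.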
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>> >>> > > From daniil.x.titov at oracle.com Sun Mar 22 22:29:27 2020 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Sun, 22 Mar 2020 15:29:27 -0700 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com> Message-ID: Hi Yasumasa, Serguei and Alex, Please review a new version of the webrev that merges SADebugDTest.java with changes done in [2]. Also the CSR [3] and the help message for debug server in SALauncher.java were updated to specify that the '--hostname' option could be a hostname or an IPv4/IPv6 address. > Ok, but I think it might be more simply with TestLibrary. > For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java . TestLibrary::getUnusedRandomPort() doesn't allow specifying which ports are reserved, and it uses a hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, the test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be directly used in test/hotspot/jtreg/serviceability/* tests (it doesn't compile). Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int... reservedPorts) from SADebugDTest.java to jdk.test.lib.Utils in /test/lib. Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
[1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/ [2] https://bugs.openjdk.java.net/browse/JDK-8238268 [3] https://bugs.openjdk.java.net/browse/JDK-8239831 Thank you, Daniil ?On 3/13/20, 7:23 PM, "Yasumasa Suenaga" wrote: Hi Daniil, On 2020/03/14 7:05, Daniil Titov wrote: > Hi Yasumasa, Serguei and Alex, > > Please review a new version of the webrev that includes the changes Yasumasa suggested. > >> Shutdown hook is already registered in c'tor of HotSpotAgent. >> It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need a > the shutdown hook for remote server being added in SALauncher. I changed it to use the lambda expression. > > 101 public HotSpotAgent() { > 102 // for non-server add shutdown hook to clean-up debugger in case > 103 // of forced exit. For remote server, shutdown hook is added by > 104 // DebugServer. > 105 Runtime.getRuntime().addShutdownHook(new java.lang.Thread( > 106 new Runnable() { > 107 public void run() { > 108 synchronized (HotSpotAgent.this) { > 109 if (!isServer) { > 110 detach(); > 111 } > 112 } > 113 } > 114 })); > 115 } I missed it, thanks! >>> Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains >>> `exclusiveAccess.dirs=.` to avoid concurrent execution > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests. Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays. Ok, but I think it might be more simply with TestLibrary. For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java . 
Thanks, Yasumasa > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/ > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 > > Thank you, > Daniil > > ?On 3/6/20, 6:15 PM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > On 2020/03/07 3:38, Daniil Titov wrote: > > Hi Yasumasa, > > > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect ( it throws an exception if parameters are incorrect) and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name . > > Ok, but I prefer to leave comment it. > > > > > SADebugDTest > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. > > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. > > Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. > If you do not think this error check, test code is more simply. > > > > I will include your other suggestion in the new version of the webrev. > > Sorry, I have one more comment: > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > Shutdown hook is already registered in c'tor of HotSpotAgent. > It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > Thanks, > > Yasumasa > > > > Thanks! 
> > Daniil > > > > ?On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > > > Hi Daniil, > > > > > > - SALauncher.java > > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > - SADebugDTest.java > > - Please add bug ID to @bug. > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > > > > Thanks, > > > > Yasumasa > > > > > > On 2020/03/06 10:15, Daniil Titov wrote: > > > Hi Yasumasa, Serguei and Alex, > > > > > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > > > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these > > > last two settings could be specified using the system properties but the system properties have the following disadvantages > > > comparing to the command line options: > > > - It?s hard to know about them: they are not listed in tool?s help. > > > - They have long names that hard to remember > > > - It is easy to mistype them in the command line and you will not get any warning about it. > > > > > > The CSR [2] was also updated and needs to be reviewed. > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > Thank you, > > > Daniil > > > > > > ?On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > > > > > Hi Daniil, > > > > > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. > > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply. > > > > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > > > But you can use same port number as RMI registry (1099). > > > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > On 2020/02/24 13:21, Daniil Titov wrote: > > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > > > // delegate to the actual SA debug server. > > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. 
> > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > > but I would prefer to address it in a separate issue. > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > container and connecting to it with the GUI debugger. > > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > > > > Thank you, > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > From suenaga at oss.nttdata.com Mon Mar 23 06:13:53 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Mon, 23 Mar 2020 15:13:53 +0900 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com> Message-ID: Hi Daniil, Looks good! Yasumasa On 2020/03/23 7:29, Daniil Titov wrote: > Hi Yasumasa, Serguei and Alex, > > Please review a new version of the webrev that merges SADebugDTest.java with changes done in [2]. > > Also the CRS [3] and the help message for debug server in SALauncher.java were updated to specify that '--hostname' > option could be a hostname or an IPv4/IPv6 address. > > > Ok, but I think it might be more simply with TestLibrary. > > For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java . 
> > TestLibrary:: getUnusedRandomPort() doesn't allow to specify what ports are reserved and it uses some hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be directly used in test/hotspot/jtreg/serviceability/* tests (it doesn't compile). > > Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int .. reservedPorts) from SADebugTest.java to jdk.test.lib.Utils in /test/lib. > > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/ > [2] https://bugs.openjdk.java.net/browse/JDK-8238268 > [3] https://bugs.openjdk.java.net/browse/JDK-8239831 > > Thank you, > Daniil > > ?On 3/13/20, 7:23 PM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > On 2020/03/14 7:05, Daniil Titov wrote: > > Hi Yasumasa, Serguei and Alex, > > > > Please review a new version of the webrev that includes the changes Yasumasa suggested. > > > >> Shutdown hook is already registered in c'tor of HotSpotAgent. > >> It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need a > > the shutdown hook for remote server being added in SALauncher. I changed it to use the lambda expression. > > > > 101 public HotSpotAgent() { > > 102 // for non-server add shutdown hook to clean-up debugger in case > > 103 // of forced exit. For remote server, shutdown hook is added by > > 104 // DebugServer. > > 105 Runtime.getRuntime().addShutdownHook(new java.lang.Thread( > > 106 new Runnable() { > > 107 public void run() { > > 108 synchronized (HotSpotAgent.this) { > > 109 if (!isServer) { > > 110 detach(); > > 111 } > > 112 } > > 113 } > > 114 })); > > 115 } > > I missed it, thanks! > > > >>> Hmm... 
I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains > >>> `exclusiveAccess.dirs=.` to avoid concurrent execution > > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests. Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays. > > Ok, but I think it might be more simply with TestLibrary. > For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java . > > > Thanks, > > Yasumasa > > > > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/ > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > Thank you, > > Daniil > > > > ?On 3/6/20, 6:15 PM, "Yasumasa Suenaga" wrote: > > > > Hi Daniil, > > > > On 2020/03/07 3:38, Daniil Titov wrote: > > > Hi Yasumasa, > > > > > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect ( it throws an exception if parameters are incorrect) and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name . > > > > Ok, but I prefer to leave comment it. > > > > > > > > SADebugDTest > > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. 
> > > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. > > > > Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. > > If you do not think this error check, test code is more simply. > > > > > > > I will include your other suggestion in the new version of the webrev. > > > > Sorry, I have one more comment: > > > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > Shutdown hook is already registered in c'tor of HotSpotAgent. > > It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > > > > Thanks, > > > > Yasumasa > > > > > > > Thanks! > > > Daniil > > > > > > ?On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > > > > > Hi Daniil, > > > > > > > > > - SALauncher.java > > > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > > > - SADebugDTest.java > > > - Please add bug ID to @bug. > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > On 2020/03/06 10:15, Daniil Titov wrote: > > > > Hi Yasumasa, Serguei and Alex, > > > > > > > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > > > > port option introduces two more options to specify RMI registry port and RMI connector host name. 
Currently, these
> > > > last two settings could be specified using the system properties but the system properties have the following disadvantages
> > > > comparing to the command line options:
> > > > - It's hard to know about them: they are not listed in tool's help.
> > > > - They have long names that hard to remember
> > > > - It is easy to mistype them in the command line and you will not get any warning about it.
> > > >
> > > > The CSR [2] was also updated and needs to be reviewed.
> > > >
> > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
> > > > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> > > >
> > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
> > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
> > > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
> > > >
> > > > Thank you,
> > > > Daniil
> > > >
> > > > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote:
> > > >
> > > > Hi Daniil,
> > > >
> > > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments.
> > > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply.
> > > >
> > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
> > > > But you can use same port number as RMI registry (1099).
> > > > It is same as relation between jmxremote.port and jmxremote.rmi.port.
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Yasumasa
> > > >
> > > >
> > > > On 2020/02/24 13:21, Daniil Titov wrote:
> > > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
> > > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
> > > > > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > > > > > // delegate to the actual SA debug server. > > > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > > > but I would prefer to address it in a separate issue. > > > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > > container and connecting to it with the GUI debugger. > > > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > > > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > > > > > > Thank you, > > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From serguei.spitsyn at oracle.com Mon Mar 23 17:18:37 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 23 Mar 2020 10:18:37 -0700 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> Message-ID: <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Mar 23 18:22:38 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 23 Mar 2020 11:22:38 -0700 Subject: RFR 8240902: JDI shared memory connector can use already closed Handles In-Reply-To: References: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com> <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com> Message-ID: <68e2e091-7bfd-8501-f8bf-60b55d64b9af@oracle.com> An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Mon Mar 23 18:45:08 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 23 Mar 2020 11:45:08 -0700 Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:" In-Reply-To: <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From magnus.ihse.bursie at oracle.com Mon Mar 23 19:03:26 2020 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Mon, 23 Mar 2020 20:03:26 +0100 Subject: RFR: JDK-8241463 Move build tools to respective modules Message-ID: The build tools (small java tools that are run during the build to generate source code, or data, needed in the JDK) have historically been placed in the "make" directory. This maybe made sense long time ago, but does not do so anymore. Instead, the build tools source code should move the the module that needs them. For instance, compilefontconfig should move to java.desktop, etc. There are multiple reasons for this: * Currently we build *all* build tools at once, which mean that we cannot compile java.base until e.g. the compilefontconfig tool is compiled, even though it is not needed. * If a build tool, e.g. compilefontconfig is modified, all build tools are recompiled, which triggers a rebuild of more or less the entire JDK. This makes development of the build tools unnecessary tedious. * When the build tools are modified, the group owning the corresponding module is the proper review instance, not the build team. But since they reside under "make", the review mails often include build-dev, but this is mostly noise for us. 
With this move, the ownership is made clear.

In this patch, I have not modified how and when the build tools are compiled, but this shuffle is the prerequisite for continuing with that in a follow-up patch.

I have also moved the build tools to the org.openjdk.buildtools.* package name space (inspired by Skara), instead of the strangely named build.tools.* name space.

A few build tools are not moved in this patch. Two of them, charsetmapping and cldrconverter, are shared between two modules. (I think they should move to modules nevertheless, but they need some more thought to make sure I do this right.) The rest are tools that are needed for the build in general, like linking or javadoc support. I'll move this to a better location too, but in a separate patch.

Bug: https://bugs.openjdk.java.net/browse/JDK-8241463
WebRev: http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01

/Magnus

From daniil.x.titov at oracle.com Mon Mar 23 19:05:21 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 23 Mar 2020 12:05:21 -0700
Subject: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
In-Reply-To:
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
Message-ID: <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com>

Hi Serguei,

In this case tryToSetupJstatdProcess() on line 346 returns null, and the test will try to find a new pair of ports and start the jstatd process.

Best regards,
Daniil

From: "serguei.spitsyn at oracle.com"
Date: Monday, March 23, 2020 at 11:45 AM
To: Daniil Titov , Alex Menkov , serviceability-dev
Subject: Re: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"

Hi Daniil,

It looks Okay in general.
But I've got a question.

329             while (jstatdThread == null) {
330                 if (!useDefaultPort) {
331                     port = String.valueOf(Utils.getFreePort());
332                 }
333
334                 if (!useDefaultRmiPort) {
335                     rmiPort = String.valueOf(Utils.getFreePort());
336                 }
337
338                 if (withExternalRegistry) {
339                     Registry registry = startRegistry();
340                     if (registry == null) {
341                         // The port is already in use. Cancel and try with a new one.
342                         continue;
343                     }
344                 }
345
346                 jstatdThread = tryToSetupJstatdProcess();
347             }

What is going to happen if all ports that we try are already in use?
Does the test report this situation?

Thanks,
Serguei

On 3/17/20 11:40, Daniil Titov wrote:

Hi Alex,

Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case.

Testing: Mach5 tests for sun/tools/jstatd/ successfully passed 100 times. Tier1-tier3 tests successfully passed.

[1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02
[2] https://bugs.openjdk.java.net/browse/JDK-8240711

Thanks,
Daniil

On 3/16/20, 5:38 PM, "Daniil Titov" wrote:

    Hi Alex,

    Yes, I did test the change by modifying the test to use the RMI port that is already in use
    ( the stack trace in the original email was exact from this changed test) and then ensured that with the fix
    the such issue is properly handled.

    I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case.

    Thanks!

    Best regards,
    Daniil

    On 3/16/20, 4:47 PM, "Alex Menkov" wrote:

        I don't agree.
        The code handles exact the same "port in use" case for the same tool.
        So it either works or doesn't.
        And have 2 code blocks which suppose to do the same makes the code messy.
        BTW did you tested the change (I mean craft the test to get "port in
        use" error)?

        --alex

        On 03/16/2020 16:17, Daniil Titov wrote:
        > Resending with the corrected subject ...
        >
        > Hi Alex,
        >
        > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use"
        > case but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.
        >
        > Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports
        > might be subject to the "port in use" error and taking into account that it's hard to reproduce such case
        > I found it safer to leave the original code and just augment it with what was missing for this specific
        > case rather than completely replacing it.
        >
        > Best regards,
        > Daniil
        >
        > On 3/16/20, 4:02 PM, "Alex Menkov" wrote:
        >
        >      Hi Daniil,
        >
        >      Looks like the test is supposed to handle "port in use" issue (see lines
        >      103-114).
        >      I suppose in case "port in use" jstatd exits, but
        >      ProcessTools.startProcess() continue to wait for "jstatd started" message.
        >
        >      --alex
        >
        >      On 03/16/2020 12:00, Daniil Titov wrote:
        >      > Please review the change [1] that fixes the intermittent failure of the test.
        >      >
        >      > The problem here is that if the RMI port is in use than the test keep waiting for "jstatd started (bound to " to appear in the process output and in this case
        >      > It doesn't happen.
        >      >
        >      >         at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
        >      >         at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
        >      >         at jdk.test.lib.thread.XRun.run(XRun.java:40)
        >      >         at java.lang.Thread.run(java.base at 15-internal/Thread.java:832)
        >      >         at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
        >      >
        >      > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress.
        >      >
        >      > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
        >      > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
        >      >
        >      >
        >      > Thank you,
        >      > Daniil
        >      >
        >

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Mon Mar 23 19:13:12 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 12:13:12 -0700
Subject: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
In-Reply-To: <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com>
Message-ID: <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com>

An HTML attachment was scrubbed...
URL:

From daniil.x.titov at oracle.com Mon Mar 23 19:32:21 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 23 Mar 2020 12:32:21 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
In-Reply-To: <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com> <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com>
Message-ID: <1A634ECF-141A-4144-8777-A87A5FC41234@oracle.com>

Hi Serguei,

I don't think that in any real environment the loop would be unable to find a pair of free ports before it is killed by JTREG due to a timeout. But if you think that we need to limit the number of attempts here, I could create a new issue for that.

Thanks!
--Daniil

From: "serguei.spitsyn at oracle.com"
Date: Monday, March 23, 2020 at 12:13 PM
To: Daniil Titov , Alex Menkov , serviceability-dev
Subject: Re: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"

On 3/23/20 12:05, Daniil Titov wrote:

Hi Serguei,

In this case tryToSetupJstatdProcess() on line 346 return null and the test will try to find a new pair of ports and start jstatd process.

I understand this.
My question if this loop can be endless.
What happens if there is no new pair of ports that we did not check yet?
Do we fail with a timeout in such a case?
If so, would it better to report that unused free port was not found?
Is it possible to detect this situation?

Thanks,
Serguei

Best regards,
Daniil

From: "serguei.spitsyn at oracle.com"
Date: Monday, March 23, 2020 at 11:45 AM
To: Daniil Titov , Alex Menkov , serviceability-dev
Subject: Re: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"

Hi Daniil,

It looks Okay in general.
But I've got a question.

329             while (jstatdThread == null) {
330                 if (!useDefaultPort) {
331                     port = String.valueOf(Utils.getFreePort());
332                 }
333
334                 if (!useDefaultRmiPort) {
335                     rmiPort = String.valueOf(Utils.getFreePort());
336                 }
337
338                 if (withExternalRegistry) {
339                     Registry registry = startRegistry();
340                     if (registry == null) {
341                         // The port is already in use. Cancel and try with a new one.
342                         continue;
343                     }
344                 }
345
346                 jstatdThread = tryToSetupJstatdProcess();
347             }

What is going to happen if all ports that we try are already in use?
Does the test report this situation?

Thanks,
Serguei

On 3/17/20 11:40, Daniil Titov wrote:

Hi Alex,

Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case.

Testing: Mach5 tests for sun/tools/jstatd/ successfully passed 100 times. Tier1-tier3 tests successfully passed.

[1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02
[2] https://bugs.openjdk.java.net/browse/JDK-8240711

Thanks,
Daniil

On 3/16/20, 5:38 PM, "Daniil Titov" wrote:

    Hi Alex,

    Yes, I did test the change by modifying the test to use the RMI port that is already in use
    ( the stack trace in the original email was exact from this changed test) and then ensured that with the fix
    the such issue is properly handled.

    I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case.

    Thanks!

    Best regards,
    Daniil

    On 3/16/20, 4:47 PM, "Alex Menkov" wrote:

        I don't agree.
        The code handles exact the same "port in use" case for the same tool.
        So it either works or doesn't.
        And have 2 code blocks which suppose to do the same makes the code messy.
        BTW did you tested the change (I mean craft the test to get "port in
        use" error)?

        --alex

        On 03/16/2020 16:17, Daniil Titov wrote:
        > Resending with the corrected subject ...
        >
        > Hi Alex,
        >
        > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use"
        > case but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.
        >
        > Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports
        > might be subject to the "port in use" error and taking into account that it's hard to reproduce such case
        > I found it safer to leave the original code and just augment it with what was missing for this specific
        > case rather than completely replacing it.
        >
        > Best regards,
        > Daniil
        >
        > On 3/16/20, 4:02 PM, "Alex Menkov" wrote:
        >
        >      Hi Daniil,
        >
        >      Looks like the test is supposed to handle "port in use" issue (see lines
        >      103-114).
        >      I suppose in case "port in use" jstatd exits, but
        >      ProcessTools.startProcess() continue to wait for "jstatd started" message.
        >
        >      --alex
        >
        >      On 03/16/2020 12:00, Daniil Titov wrote:
        >      > Please review the change [1] that fixes the intermittent failure of the test.
        >      >
        >      > The problem here is that if the RMI port is in use than the test keep waiting for "jstatd started (bound to " to appear in the process output and in this case
        >      > It doesn't happen.
        >      >
        >      >         at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
        >      >         at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
        >      >         at jdk.test.lib.thread.XRun.run(XRun.java:40)
        >      >         at java.lang.Thread.run(java.base at 15-internal/Thread.java:832)
        >      >         at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
        >      >
        >      > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed. Tier1-tier3 tests are still in progress.
        >      >
        >      > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
        >      > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
        >      >
        >      >
        >      > Thank you,
        >      > Daniil
        >      >
        >

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Mon Mar 23 19:48:52 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 12:48:52 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"
In-Reply-To: <1A634ECF-141A-4144-8777-A87A5FC41234@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com> <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com> <1A634ECF-141A-4144-8777-A87A5FC41234@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL: From erik.joelsson at oracle.com Mon Mar 23 19:54:44 2020 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 23 Mar 2020 12:54:44 -0700 Subject: RFR: JDK-8241463 Move build tools to respective modules In-Reply-To: References: Message-ID: <705f74a1-8f64-166c-63d1-7174c89443cd@oracle.com> Looks good. /Erik On 2020-03-23 12:03, Magnus Ihse Bursie wrote: > The build tools (small java tools that are run during the build to > generate source code, or data, needed in the JDK) have historically > been placed in the "make" directory. This maybe made sense long time > ago, but does not do so anymore. > > Instead, the build tools source code should move the the module that > needs them. For instance, compilefontconfig should move to > java.desktop, etc. > > There are multiple reasons for this: > > * Currently we build *all* build tools at once, which mean that we > cannot compile java.base until e.g. the compilefontconfig tool is > compiled, even though it is not needed. > > * If a build tool, e.g. compilefontconfig is modified, all build tools > are recompiled, which triggers a rebuild of more or less the entire > JDK. This makes development of the build tools unnecessary tedious. > > * When the build tools are modified, the group owning the > corresponding module is the proper review instance, not the build > team. But since they reside under "make", the review mails often > include build-dev, but this is mostly noise for us. With this move, > the ownership is made clear. > > In this patch, I have not modified how and when the build tools are > compiled, but this shuffle is the prerequisite for continuing with > that in a follow-up patch. > > I have also moved the build tools to the org.openjdk.buildtools.* > package name space (inspired by Skara), instead of the strangely named > build.tools.* name space. > > A few build tools are not moved in this patch. Two of them, > charsetmapping and cldrconverter, are shared between two modules. 
(I
> think they should move to modules nevertheless, but they need some
> more thought to make sure I do this right.) The rest are tools that
> are needed for the build in general, like linking or javadoc support.
> I'll move this to a better location too, but in a separate patch.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8241463
> WebRev:
> http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01
>
> /Magnus
>

From mandy.chung at oracle.com Mon Mar 23 20:19:13 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 23 Mar 2020 13:19:13 -0700
Subject: RFR: JDK-8241463 Move build tools to respective modules
In-Reply-To:
References:
Message-ID: <53d7119b-5e7e-22fe-97c1-0382f2d94fbc@oracle.com>

Hi Magnus,

Modularizing the build tools is a good move. This patch suggests placing the build tools under

    src/$MODULE/share/tools/$PACKAGE/*.java

I think the modular source location of the build tools needs more discussion, and jigsaw-dev should be included in that discussion.

The JDK source as specified in JEP 201 is under:

    src/$MODULE/{share,$OS}/classes/$PACKAGE/*.java

The source files compiled from the `src` directory are the intermediate input to build the resulting image. Build tools are used to generate additional intermediate input (that is not part of the `src` directory) to build the image. So I wonder if make/$MODULE/share/tools or make/tools/$MODULE may be a better location for the build tools.

Mandy

On 3/23/20 12:03 PM, Magnus Ihse Bursie wrote:
> The build tools (small java tools that are run during the build to
> generate source code, or data, needed in the JDK) have historically
> been placed in the "make" directory. This maybe made sense long time
> ago, but does not do so anymore.
>
> Instead, the build tools source code should move the the module that
> needs them. For instance, compilefontconfig should move to
> java.desktop, etc.
> > There are multiple reasons for this: > > * Currently we build *all* build tools at once, which mean that we > cannot compile java.base until e.g. the compilefontconfig tool is > compiled, even though it is not needed. > > * If a build tool, e.g. compilefontconfig is modified, all build tools > are recompiled, which triggers a rebuild of more or less the entire > JDK. This makes development of the build tools unnecessary tedious. > > * When the build tools are modified, the group owning the > corresponding module is the proper review instance, not the build > team. But since they reside under "make", the review mails often > include build-dev, but this is mostly noise for us. With this move, > the ownership is made clear. > > In this patch, I have not modified how and when the build tools are > compiled, but this shuffle is the prerequisite for continuing with > that in a follow-up patch. > > I have also moved the build tools to the org.openjdk.buildtools.* > package name space (inspired by Skara), instead of the strangely named > build.tools.* name space. > > A few build tools are not moved in this patch. Two of them, > charsetmapping and cldrconverter, are shared between two modules. (I > think they should move to modules nevertheless, but they need some > more thought to make sure I do this right.) The rest are tools that > are needed for the build in general, like linking or javadoc support. > I'll move this to a better location too, but in a separate patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8241463 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01 > > /Magnus > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Alan.Bateman at oracle.com Mon Mar 23 20:33:50 2020 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 23 Mar 2020 20:33:50 +0000 Subject: RFR: JDK-8241463 Move build tools to respective modules In-Reply-To: References: Message-ID: <20158b4b-f6f1-92e1-59da-d0f6c07a85ca@oracle.com> On 23/03/2020 19:03, Magnus Ihse Bursie wrote: > The build tools (small java tools that are run during the build to > generate source code, or data, needed in the JDK) have historically > been placed in the "make" directory. This maybe made sense long time > ago, but does not do so anymore. > > Instead, the build tools source code should move the the module that > needs them. For instance, compilefontconfig should move to > java.desktop, etc. > > There are multiple reasons for this: > > * Currently we build *all* build tools at once, which mean that we > cannot compile java.base until e.g. the compilefontconfig tool is > compiled, even though it is not needed. > > * If a build tool, e.g. compilefontconfig is modified, all build tools > are recompiled, which triggers a rebuild of more or less the entire > JDK. This makes development of the build tools unnecessary tedious. > > * When the build tools are modified, the group owning the > corresponding module is the proper review instance, not the build > team. But since they reside under "make", the review mails often > include build-dev, but this is mostly noise for us. With this move, > the ownership is made clear. > > In this patch, I have not modified how and when the build tools are > compiled, but this shuffle is the prerequisite for continuing with > that in a follow-up patch. > > I have also moved the build tools to the org.openjdk.buildtools.* > package name space (inspired by Skara), instead of the strangely named > build.tools.* name space. > > A few build tools are not moved in this patch. Two of them, > charsetmapping and cldrconverter, are shared between two modules. 
(I > think they should move to modules nevertheless, but they need some > more thought to make sure I do this right.) The rest are tools that > are needed for the build in general, like linking or javadoc support. > I'll move this to a better location too, but in a separate patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8241463 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01 I think this will require further discussion, maybe even an update to JEP 201. I think it would be useful to see what other options were exploring, in particular options that organize the tools by module in the make tree (as it will confuse people to put them in the src tree). -Alan From serguei.spitsyn at oracle.com Mon Mar 23 21:59:56 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 23 Mar 2020 14:59:56 -0700 Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:" In-Reply-To: References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com> <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com> <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com> <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com> <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com> <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com> <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com> <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com> <1A634ECF-141A-4144-8777-A87A5FC41234@oracle.com> Message-ID: An HTML attachment was scrubbed... 
URL: From naoto.sato at oracle.com Mon Mar 23 22:15:31 2020 From: naoto.sato at oracle.com (naoto.sato at oracle.com) Date: Mon, 23 Mar 2020 15:15:31 -0700 Subject: RFR: JDK-8241463 Move build tools to respective modules In-Reply-To: References: Message-ID: Hi Magnus, I looked at i18n related changes: make/CopyInterimTZDB.gmk make/ToolsJdk.gmk make/gendata/Gendata-java.base.gmk make/gendata/GendataBreakIterator.gmk make/gendata/GendataTZDB.gmk make/gensrc/GensrcCharacterData.gmk make/gensrc/GensrcEmojiData.gmk They look ok to me. The *.java changes should have copyright year update. As to charsetmapping and cldrconverter, I believe they can reside in java.base, as jdk.charsets and jdk.localedata modules depend on it. Naoto On 3/23/20 12:03 PM, Magnus Ihse Bursie wrote: > The build tools (small java tools that are run during the build to > generate source code, or data, needed in the JDK) have historically been > placed in the "make" directory. This maybe made sense long time ago, but > does not do so anymore. > > Instead, the build tools source code should move the the module that > needs them. For instance, compilefontconfig should move to java.desktop, > etc. > > There are multiple reasons for this: > > * Currently we build *all* build tools at once, which mean that we > cannot compile java.base until e.g. the compilefontconfig tool is > compiled, even though it is not needed. > > * If a build tool, e.g. compilefontconfig is modified, all build tools > are recompiled, which triggers a rebuild of more or less the entire JDK. > This makes development of the build tools unnecessary tedious. > > * When the build tools are modified, the group owning the corresponding > module is the proper review instance, not the build team. But since they > reside under "make", the review mails often include build-dev, but this > is mostly noise for us. With this move, the ownership is made clear. 
> > In this patch, I have not modified how and when the build tools are > compiled, but this shuffle is the prerequisite for continuing with that > in a follow-up patch. > > I have also moved the build tools to the org.openjdk.buildtools.* > package name space (inspired by Skara), instead of the strangely named > build.tools.* name space. > > A few build tools are not moved in this patch. Two of them, > charsetmapping and cldrconverter, are shared between two modules. (I > think they should move to modules nevertheless, but they need some more > thought to make sure I do this right.) The rest are tools that are > needed for the build in general, like linking or javadoc support. I'll > move this to a better location too, but in a separate patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8241463 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01 > > > /Magnus > From serguei.spitsyn at oracle.com Mon Mar 23 22:39:51 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 23 Mar 2020 15:39:51 -0700 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> Message-ID: <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> An HTML attachment was scrubbed... 
URL: From suenaga at oss.nttdata.com Tue Mar 24 00:08:09 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Tue, 24 Mar 2020 09:08:09 +0900 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> Message-ID: <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> Hi Serguei, Thanks for your comment! I uploaded new webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/ Also I pushed it to submit repo: http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1 On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > The mach5 tier5 testing looks good. > The serviceability/sa/ClhsdbPstack.java is failed without fix and is not failed with it. > > Thanks, > Serguei > > > On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote: >> Hi Yasumasa, >> >> I looked at you changes. >> It is hard to understand if this fully solves the issue. 
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>
>> @@ -34,10 +34,11 @@
>>
>>    public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>      Address libptr = dbg.findLibPtrByAddress(rip);
>>      Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>      DwarfParser dwarf = null;
>> +    boolean unsupportedDwarf = false;
>>
>>      if (libptr != null) { // Native frame
>>        try {
>>          dwarf = new DwarfParser(libptr);
>>          dwarf.processDwarf(rip);
>> ----------------
>> @@ -45,24 +46,33 @@
>>                 !dwarf.isBPOffsetAvailable())
>>            ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>            : context.getRegisterAsAddress(dwarf.getCFARegister())
>>                     .addOffsetTo(dwarf.getCFAOffset());
>>        } catch (DebuggerException e) {
>> -        // Bail out to Java frame case
>> +        if (dwarf != null) {
>> +          // DWARF processing should succeed when the frame is native
>> +          // but it might fail if CIE has language personality routine
>> +          // and/or LSDA.
>> +          dwarf = null;
>> +          unsupportedDwarf = true;
>> +        } else {
>> +          throw e;
>> +        }
>>        }
>>      }
>>
>>      return (cfa == null) ? null
>> -                         : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>> +                         : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>    }
>>
>> @@ -121,13 +131,25 @@
>>      }
>>
>>      return isValidFrame(nextCFA, context) ? nextCFA : null;
>>    }
>>
>> -  private DwarfParser getNextDwarf(Address nextPC) {
>> -    DwarfParser nextDwarf = null;
>> +  @Override
>> +  public CFrame sender(ThreadProxy thread) {
>> +    if (!possibleNext) {
>> +      return null;
>> +    }
>> +
>> +    ThreadContext context = thread.getContext();
>> +
>> +    Address nextPC = getNextPC(dwarf != null);
>> +    if (nextPC == null) {
>> +      return null;
>> +    }
>>
>> +    DwarfParser nextDwarf = null;
>> +    boolean unsupportedDwarf = false;
>>      if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>        nextDwarf = dwarf;
>>      } else {
>>        Address libptr = dbg.findLibPtrByAddress(nextPC);
>>        if (libptr != null) {
>> ----------------
>> @@ -138,33 +160,29 @@
>>          }
>>        }
>>      }
>>
>>      if (nextDwarf != null) {
>> +      try {
>>          nextDwarf.processDwarf(nextPC);
>> +      } catch (DebuggerException e) {
>> +        // DWARF processing should succeed when the frame is native
>> +        // but it might fail if CIE has language personality routine
>> +        // and/or LSDA.
>> +        nextDwarf = null;
>> +        unsupportedDwarf = true;
>>        }
>>
>> This fix looks like a hack.
>> Should we just propagate the Debugging exception instead of trying to maintain unsupportedDwarf flag?

DwarfParser::processDwarf would throw DebuggerException if it cannot find DWARF which relates to PC.
PC at this point is for next frame. So current frame (`this` object) is valid, and it should be processed.

>> Also, I don't like that DWARF-specific abbreviations (like CIE, IDE, LSDA, etc.) are used without any comments explaining them.
>> The code has to be generally readable without looking into the DWARF spec each time.

I added comments for them in this webrev.

Thanks, Yasumasa >> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix. >> >> Thanks, >> Serguei >> >> >> On 3/20/20 17:55, Yasumasa Suenaga wrote: >>> Thanks Chris! >>> I'm waiting for reviewers for this change. >>> >>> >>> Yasumasa >>> >>> >>> On 2020/03/21 4:23, Chris Plummer wrote: >>>> Hi Yasumasa, >>>> >>>> The failure is due to JDK-8231634, so not something you need to worry about. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote: >>>>> Hi Chris, >>>>> >>>>> I uploaded new webrev which includes reverting change for ProblemList: >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/ >>>>> >>>>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301), >>>>> but it has failed in ClhsdbJstackXcompStress.java. >>>>> However I think it is not caused by this change because ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed mode, it would not parse DWARF. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/03/20 13:55, Chris Plummer wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> The test has been problem listed so please add undoing this to your webrev. 
Here's the diff that problem listed it:
>>>>>>
>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>> @@ -115,7 +115,7 @@
>>>>>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
>>>>>>> So please review it:
>>>>>>>
>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>> Thank you so much, David!
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit repo.
>>>>>>>>>>> Could you try again?
>>>>>>>>>>> >>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>>>>>> >>>>>>>>>>> webrev is here: >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>> >>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :) >>>>>>>>> >>>>>>>>> Seems to have passed okay. >>>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks a lot! >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>>>>>> Sorry it is still crashing. >>>>>>>>>>>> >>>>>>>>>>>> # >>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>> # >>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704 >>>>>>>>>>>> # >>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>> # >>>>>>>>>>>> >>>>>>>>>>>> Same as before. >>>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> ----- >>>>>>>>>>>> >>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for that tip Chris! >>>>>>>>>>>>> >>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result. >>>>>>>>>>>>>> I will share you when I get job ID. >>>>>>>>>>>>> >>>>>>>>>>>>> We can see the id. 
Just need to wait for the builds to complete before submitting the additional tests. >>>>>>>>>>>>> >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thank you for testing it. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA. >>>>>>>>>>>>>>>> Could you try it? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>>>>> Correction ... >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>> #? 
SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> It doesn't fail for me locally. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames in jstack mixed mode. >>>>>>>>>>>>>>>>>>>> However some error has seen intermittently after that. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I investigated the cause of this, I found two concerns: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> ?? A: lack of buffer (.eh_frame section data) range check >>>>>>>>>>>>>>>>>>>> ?? 
B: Language personality routine and Language Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I addd range check for .eh_frame processing, and ignore personality routine and LSDA in this webrev. >>>>>>>>>>>>>>>>>>>> Also I added bailout code if DWARF processing is failed due to these concerns. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>> >>>> >>>> >> > From chris.plummer at oracle.com Tue Mar 24 06:34:38 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 23 Mar 2020 23:34:38 -0700 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com> Message-ID: Hi Roman, I assume JVMTI maintains separate tagging data for each agent so having two agents doing tagging won't result in confusion. I didn't actually find this in the spec. Would be nice to confirm that it is the case. 
However, your implementation does seem to conflict with other uses of tagging in the debug agent:

- During the execution of ObjectReference.ReferringObjects, the object being checked is tagged. If this happens to be a Class instance, the tag you set up will end up being cleared.

- During the execution of VirtualMachine.InstanceCounts, each Class instance being counted is tagged. So that means your tag is cleared for any Class passed to this API.

- SetTag is used in commonRef.c. I believe that for any Object for which an objectID is created and sent to the front end (debugger), a weakref to that object is created and tagged with a RefNode*. So you will have many Weakref objects with tags. When these are freed, they are passed to cbTrackingObjectFree() and these tags are incorrectly added to deletedSignatures(). This means you end up treating a RefNode* as a char* in synthesizeUnloadEvent(), and a ClassUnload event gets created with garbage for the classname. I also think this could cause issues when eventually this RefNode* is passed to jvmtiDeallocate().

However, I think you have a bug where you never actually free up signatures for Classes that get unloaded. Only signatures for loaded classes seem to get deleted, and that is done when the agent detaches.

What would cause classTrack_addPreparedClass() to be called for a Class you've already seen? I don't understand the need for the "tag != 0l" check.

thanks,

Chris

On 3/20/20 12:52 PM, Chris Plummer wrote:
> On 3/20/20 8:30 AM, Roman Kennke wrote:
>> I believe I came up with a much simpler solution that also solves the
>> problems of the existing one, and the ones I proposed earlier.
>>
>> It turns out that we can take advantage of the fact that we can use
>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>> to the signature of a class into the tag, and pull it out again when we
>> get notified that the class gets unloaded.
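
[Editorial sketch, not part of the original message.] The pointer-in-tag idea described in the quoted paragraph can be illustrated roughly as follows. This is a minimal stand-alone sketch, not the code from the webrev: `object_tag`, `tag_from_signature` and `signature_from_tag` are hypothetical names, `int64_t` stands in for JVMTI's `jlong`, and the real agent would pass the value through JVMTI's SetTag and receive it back in the ObjectFree callback. It also assumes that every tag seen by the callback was produced this way, which is exactly the assumption the reply above questions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JVMTI's 64-bit jlong tag type (assumption: no jvmti.h here). */
typedef int64_t object_tag;

/* Copy the signature to the heap and encode the copy's address as a tag.
 * Returns 0 (the "untagged" value) if allocation fails. */
object_tag tag_from_signature(const char *signature) {
    assert(signature != NULL);
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return (object_tag)(intptr_t)copy;
}

/* Decode a tag produced by tag_from_signature(); the caller owns the string
 * and must free it. Passing any other tag value is undefined behavior. */
char *signature_from_tag(object_tag tag) {
    return (char *)(intptr_t)tag;
}
```

The round trip is only safe when nothing else in the same JVMTI environment tags objects with unrelated values, since the decoder cannot tell a signature pointer from any other 64-bit number.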
>> >> This means we don't need an extra data-structure to keep track of >> classes and signatures, and it also makes the story around locking >> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all >> classes needed (as in the current implementation) and no searching of >> table needed (like in my previous attempts). >> >> Please review this new revision: >> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ > I'll have a look at this. >> >> (Notice that there still appears to be a performance bottleneck with >> class-unloading when an actual debugger is attached. This doesn't seem >> to be related to the classTrack.c implementation though, but looks like >> a consequence of getting all those class-unload notifications over the >> wire. My testcase generates 1000s of them, and it's clogging up the >> buffers.) > At least this is only a one-shot hit when the classes are unloaded, > and the performance hit is based on the number of classes being > unloaded. The main issue is happening every GC, and is O(n) where n is > the number of loaded classes. >> I am not sure why jdb needs to enable class-unload listener always. A >> simple hack disables it, and performance is brilliant, even when jdb is >> attached: >> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch > This is JDI, not jdb. It looks like it needs ClassUnload events so it > can maintain typesBySignature, which is used by public APIs like > allClasses(). So we have caching of loaded classes both in the debug > agent and in JDI. > > Chris >> But this is not in the scope of this bug.) >> >> Roman >> >> >> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>> Sorry, forgot to complete my comments at the end (see below). >>> >>> >>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>> Hi Roman, >>>> >>>> Thank you for the update and sorry for the latency in review. >>>> >>>> Some comments are below. 
>>>> >>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>> >>>> >>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>> ?? 88 { >>>> 89 debugMonitorEnter(deletedSignatureLock); >>>> 90 if (currentClassTag == -1) { >>>> 91 // Class tracking not initialized, nobody's interested >>>> 92 debugMonitorExit(deletedSignatureLock); >>>> 93 return; >>>> ?? 94???? } >>>> Just a question: >>>> ?? Q1: Should the ObjectFree events be disabled for the jvmtiEnv >>>> that does >>>> ?????? the class tracking if class tracking has not been initialized? >>>> >>>> 70 static jlong currentClassTag; I'm thinking if the name is better to >>>> be something like: lastClassTag or highestClassTag. >>>> >>>> 99 KlassNode* klass = *klass_ptr; >>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // klass not >>>> found - ignore. >>>> 107 debugMonitorExit(deletedSignatureLock); >>>> 108 return; >>>> ? 109???? } >>>> ??It seems to me, something is wrong in the condition at L106 above. >>>> ??Should it be? : >>>> ???? if (klass == NULL || klass->klass_tag != tag) >>>> >>>> ??Otherwise, how can the second check ever work correctly as the >>>> return >>>> will always happen when (klass != NULL)? >>>> >>>> ? There are several places in this file with the the indent: >>>> 90 if (currentClassTag == -1) { >>>> 91 // Class tracking not initialized, nobody's interested >>>> 92 debugMonitorExit(deletedSignatureLock); >>>> 93 return; >>>> ?? 94???? } >>>> ? ... >>>> 152 if (currentClassTag == -1) { >>>> 153 // Class tracking not initialized yet, nobody's interested >>>> 154 debugMonitorExit(deletedSignatureLock); >>>> 155 return; >>>> ? 156???? } >>>> ? ... >>>> 161 if (error != JVMTI_ERROR_NONE) { >>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>> ? 163???? 
} >>>> 164 if (tag != 0l) { >>>> 165 debugMonitorExit(deletedSignatureLock); >>>> 166 return; // Already added >>>> ? 167???? } >>>> ? ... >>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>> 282 { >>>> 283 char* sig = (char*)signatureVoid; >>>> 284 jvmtiDeallocate(sig); >>>> 285 return JNI_TRUE; >>>> ? 286 } >>>> ? ... >>>> ? 291 void >>>> ? 292 classTrack_reset(void) >>>> ? 293 { >>>> 294 int idx; >>>> 295 debugMonitorEnter(deletedSignatureLock); >>>> 296 >>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>> 298 KlassNode* node = table[idx]; >>>> 299 while (node != NULL) { >>>> 300 KlassNode* next = node->next; >>>> 301 jvmtiDeallocate(node->signature); >>>> 302 jvmtiDeallocate(node); >>>> 303 node = next; >>>> 304 } >>>> 305 } >>>> 306 jvmtiDeallocate(table); >>>> 307 >>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>> 309 bagDestroyBag(deletedSignatureBag); >>>> 310 >>>> 311 currentClassTag = -1; >>>> 312 >>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>> 314 trackingEnv = NULL; >>>> 315 >>>> 316 debugMonitorExit(deletedSignatureLock); >>>> >>>> Could you, please, fix several comments below? >>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>> class-unloads >>>> ??The comma is not needed. >>>> ??Would it better to replace: klass tags => klass_tag's ? >>>> >>>> >>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>> consistent >>>> ??Maybe: Lock to guard ... or lock to keep integrity of ... >>>> >>>> 84 * Callback when classes are freed, Finds the signature and >>>> remembers it in deletedSignatureBag. Would be better to use words like >>>> "store" or "record", "Find" should not start from capital letter: >>>> Invoke the callback when classes are freed, find and record the >>>> signature in deletedSignatureBag. 
>>>> >>>> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >>>> nobody's interested 153 // Class tracking not initialized yet, >>>> nobody's interested 158 /* Check this is not a duplicate */ Missed dot >>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { // >>>> klass not found - ignore. In opposite, dot is not needed as the >>>> comment does not start from a capital letter. 111 // At this point we >>>> have the KlassNode corresponding to the tag >>>> 112 // in klass, and the pointer to it in klass_node. >>> ? The comment above can be better. Maybe, something like: >>> ? ? " At this point, we found the KlassNode matching the klass >>> tag(and it is >>> linked). >>> >>>> 113 // Remember the unloaded signature. >>> ??Better: Record the signature of the unloaded class and unlink it. >>> >>> Thanks, >>> Serguei >>> >>>> Thanks, >>>> Serguei >>>> >>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>> Hello all, >>>>> >>>>> Can I please get reviews of this change? In the meantime, we've done >>>>> more testing and also field-/torture-testing by a customer who is >>>>> happy >>>>> now. :-) >>>>> >>>>> Thanks, >>>>> Roman >>>>> >>>>> >>>>>> Hi Serguei, >>>>>> >>>>>> Thanks for reviewing! >>>>>> >>>>>> I updated the patch to reflect your suggestions, very good! >>>>>> It also includes a fix to allow re-connecting an agent after >>>>>> disconnect, >>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>> _activate() to ensure have those structures after re-connect. >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>> >>>>>> Let me know what you think! >>>>>> Roman >>>>>> >>>>>>> Hi Roman, >>>>>>> >>>>>>> Thank you for taking care about this scalability issue! >>>>>>> >>>>>>> I have a couple of quick comments. 
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>
>>>>>>>
>>>>>>>  72 /*
>>>>>>>  73  * Lock to protect deletedSignatureBag
>>>>>>>  74  */
>>>>>>>  75 static jrawMonitorID deletedSignatureLock;
>>>>>>>  76
>>>>>>>  77 /*
>>>>>>>  78  * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>  79  * deletedTagLock,
>>>>>>>  80  */
>>>>>>>  81 struct bag* deletedSignatureBag;
>>>>>>>
>>>>>>> The comments contradict each other.
>>>>>>> I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>> instead of deletedTagLock.
>>>>>>> Also, the comma at the end must be replaced with a dot.
>>>>>>>
>>>>>>>
>>>>>>> 101     // Tag not found? Ignore.
>>>>>>> 102     if (klass == NULL) {
>>>>>>> 103         debugMonitorExit(deletedSignatureLock);
>>>>>>> 104         return;
>>>>>>> 105     }
>>>>>>> 106
>>>>>>> 107     // Scan linked-list.
>>>>>>> 108     jlong found_tag = klass->klass_tag;
>>>>>>> 109     while (klass != NULL && found_tag != tag) {
>>>>>>> 110         klass_ptr = &klass->next;
>>>>>>> 111         klass = *klass_ptr;
>>>>>>> 112         found_tag = klass->klass_tag;
>>>>>>> 113     }
>>>>>>> 114
>>>>>>> 115     // Tag not found? Ignore.
>>>>>>> 116     if (found_tag != tag) {
>>>>>>> 117         debugMonitorExit(deletedSignatureLock);
>>>>>>> 118         return;
>>>>>>> 119     }
>>>>>>>
>>>>>>>
>>>>>>> The code above can be simplified, so that the lines 101-105 are not
>>>>>>> needed anymore.
>>>>>>> It can be something like this:
>>>>>>>
>>>>>>>     // Scan linked-list.
>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>         klass_ptr = &klass->next;
>>>>>>>         klass = *klass_ptr;
>>>>>>>     }
>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>         return;
>>>>>>>     }
>>>>>>>
>>>>>>> It will take more time before I get a chance to look at the rest.
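The scan-and-unlink pattern discussed in the review comment above can be sketched in isolation. This is a minimal illustration, not the code from any webrev: `Node`, `list_prepend` and `list_remove` are hypothetical names, and `int64_t` stands in for `jlong`. Because the loop only exits when the node is NULL or its tag matches, a single NULL check is enough after the loop, and no dereference happens on a NULL node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the KlassNode used in classTrack.c. */
typedef struct Node {
    int64_t tag;          /* stand-in for the jlong klass_tag */
    struct Node *next;
} Node;

/* Prepend a node carrying `tag`; returns 0 on success, -1 if malloc fails. */
static int list_prepend(Node **head, int64_t tag) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) {
        return -1;
    }
    n->tag = tag;
    n->next = *head;
    *head = n;
    return 0;
}

/* Unlink and free the node carrying `tag`; returns 1 if found, 0 otherwise. */
static int list_remove(Node **head, int64_t tag) {
    Node **link = head;   /* address of the pointer that refers to `node` */
    Node *node = *link;

    /* The NULL test comes first, so node->tag is never read on NULL. */
    while (node != NULL && node->tag != tag) {
        link = &node->next;
        node = *link;
    }
    if (node == NULL) {   /* tag not found: nothing to do */
        return 0;
    }
    *link = node->next;   /* works for head, middle and tail alike */
    free(node);
    return 1;
}
```

Keeping a pointer to the link rather than to the previous node avoids a special case for removing the head of the list.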
>>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>> Here comes an update that resolves some races that happen when >>>>>>>> disconnecting an agent. In particular, we need to take the lock on >>>>>>>> basically every operation, and also need to check whether or not >>>>>>>> class-tracking is active and return an appropriate result (e.g. >>>>>>>> an empty >>>>>>>> list) when we're not. >>>>>>>> >>>>>>>> Updated webrev: >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Roman >>>>>>>> >>>>>>>> >>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>> >>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, >>>>>>>>> and we >>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>> table, which >>>>>>>>> each entry being the head of a linked-list of KlassNode*. The >>>>>>>>> table is >>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>> KlassNode*. >>>>>>>>> This is O(1) operation. >>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>> signature of >>>>>>>>> the reported tag in that table, and remember it in a bag. The >>>>>>>>> KlassNode* >>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) >>>>>>>>> operation >>>>>>>>> too, depending on the depth of the table. In my testcase which >>>>>>>>> hammered >>>>>>>>> the code with class-loads and unloads, I usually see depths of >>>>>>>>> like 2-3, >>>>>>>>> but not usually more. It should be ok. >>>>>>>>> - when processUnloads() gets called, we simply hand out that >>>>>>>>> bag, and >>>>>>>>> allocate a new one. 
>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>> leaking the >>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>> and/or >>>>>>>>> re-attached (was missing before). >>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>> missing >>>>>>>>> before). >>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>> listener gets >>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>> attaching a >>>>>>>>> jdb, not sure why jdb does that though. This may be something >>>>>>>>> to improve >>>>>>>>> in the future? >>>>>>>>> >>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>> really good. >>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>> class-unload >>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>> agent asks for it? >>>>>>>>> >>>>>>>>> Updated webrev: >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>> >>>>>>>>> Please let me know what you think of it. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Roman >>>>>>>>> >>>>>>>>> >>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>> the even more >>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for >>>>>>>>>> now. >>>>>>>>>> >>>>>>>>>> Thanks,Roman >>>>>>>>>> >>>>>>>>>> ? Hi Chris, >>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>> few days. In >>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>> implementation in >>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>> Sure. >>>>>>>>>>> >>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>> determine the >>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>> happened, so that >>>>>>>>>>> we can generate the appropriate JDWP event. 
>>>>>>>>>>> >>>>>>>>>>> The current implementation does so by maintaining a table of >>>>>>>>>>> currently >>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>> initialized, >>>>>>>>>>> and then add new classes whenever a class gets loaded. When >>>>>>>>>>> unloading >>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared >>>>>>>>>>> with the >>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>> table gets >>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>> and/or many >>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>> complexity. >>>>>>>>>>> >>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>> classes, and also >>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>> Whenever an >>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>> and classes >>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>> maintaining the >>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>> that gets returned. >>>>>>>>>>> >>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>> whether or not >>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>> That process is >>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>> is that >>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>> seems to be >>>>>>>>>>> true, and also reasonable to expect. 
>>>>>>>>>>>
>>>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>>>>>> worth the effort).
>>>>>>>>>>>
>>>>>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>
>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>

From serguei.spitsyn at oracle.com Tue Mar 24 06:56:49 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 23:56:49 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port
In-Reply-To:
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
Message-ID: <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com>

Hi Daniil,

It looks pretty good in general.
It looks like you removed the last call site of DebugServer.main.
Do we need to remove DebugServer.java as well?

Thanks,
Serguei

On 3/22/20 15:29, Daniil Titov wrote:
> Hi Yasumasa, Serguei and Alex,
>
> Please review a new version of the webrev that merges SADebugDTest.java with changes done in [2].
>
> Also the CSR [3] and the help message for debug server in SALauncher.java were updated to specify that '--hostname'
> option could be a hostname or an IPv4/IPv6 address.
>
> > Ok, but I think it might be simpler with TestLibrary.
> > For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
>
> TestLibrary::getUnusedRandomPort() doesn't allow specifying which ports are reserved, and it uses a hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, the test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be directly used in test/hotspot/jtreg/serviceability/* tests (it doesn't compile).
>
> Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int... reservedPorts) from SADebugDTest.java to jdk.test.lib.Utils in /test/lib.
>
> Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> > [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/ > [2] https://bugs.openjdk.java.net/browse/JDK-8238268 > [3] https://bugs.openjdk.java.net/browse/JDK-8239831 > > Thank you, > Daniil > > ?On 3/13/20, 7:23 PM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > On 2020/03/14 7:05, Daniil Titov wrote: > > Hi Yasumasa, Serguei and Alex, > > > > Please review a new version of the webrev that includes the changes Yasumasa suggested. > > > >> Shutdown hook is already registered in c'tor of HotSpotAgent. > >> It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need a > > the shutdown hook for remote server being added in SALauncher. I changed it to use the lambda expression. > > > > 101 public HotSpotAgent() { > > 102 // for non-server add shutdown hook to clean-up debugger in case > > 103 // of forced exit. For remote server, shutdown hook is added by > > 104 // DebugServer. > > 105 Runtime.getRuntime().addShutdownHook(new java.lang.Thread( > > 106 new Runnable() { > > 107 public void run() { > > 108 synchronized (HotSpotAgent.this) { > > 109 if (!isServer) { > > 110 detach(); > > 111 } > > 112 } > > 113 } > > 114 })); > > 115 } > > I missed it, thanks! > > > >>> Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains > >>> `exclusiveAccess.dirs=.` to avoid concurrent execution > > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests. Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays. > > Ok, but I think it might be more simply with TestLibrary. > For example, can you use TestLibrary::getUnusedRandomPort ? 
It is used in test/jdk/java/rmi/testlibrary/RMID.java . > > > Thanks, > > Yasumasa > > > > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/ > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > Thank you, > > Daniil > > > > ?On 3/6/20, 6:15 PM, "Yasumasa Suenaga" wrote: > > > > Hi Daniil, > > > > On 2020/03/07 3:38, Daniil Titov wrote: > > > Hi Yasumasa, > > > > > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect ( it throws an exception if parameters are incorrect) and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name . > > > > Ok, but I prefer to leave comment it. > > > > > > > > SADebugDTest > > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. > > > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. > > > > Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. > > If you do not think this error check, test code is more simply. > > > > > > > I will include your other suggestion in the new version of the webrev. > > > > Sorry, I have one more comment: > > > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. 
> > > > Shutdown hook is already registered in c'tor of HotSpotAgent. > > It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > > > > Thanks, > > > > Yasumasa > > > > > > > Thanks! > > > Daniil > > > > > > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > > > > > Hi Daniil, > > > > > > > > > - SALauncher.java > > > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > > > - SADebugDTest.java > > > - Please add bug ID to @bug. > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > On 2020/03/06 10:15, Daniil Titov wrote: > > > > Hi Yasumasa, Serguei and Alex, > > > > > > > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > > > > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these > > > > last two settings could be specified using the system properties but the system properties have the following disadvantages > > > > compared to the command line options: > > > > - It's hard to know about them: they are not listed in the tool's help. > > > > - They have long names that are hard to remember. > > > > - It is easy to mistype them in the command line and you will not get any warning about it. > > > > > > > > The CSR [2] was also updated and needs to be reviewed. > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > container and connecting to it with the GUI debugger.
Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > > > Thank you, > > > > Daniil > > > > > > > > ?On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > > > > > > > Hi Daniil, > > > > > > > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. > > > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply. > > > > > > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > > > > But you can use same port number as RMI registry (1099). > > > > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > > > > > > > > > > Thanks, > > > > > > > > Yasumasa > > > > > > > > > > > > On 2020/02/24 13:21, Daniil Titov wrote: > > > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > > > > > // delegate to the actual SA debug server. 
> > > > > DebugServer.main(newArgArray.toArray(new String[0]));
> > > > >
> > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options, and that prevents efficiently adding new options to the tool.
> > > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher
> > > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as deprecated
> > > > > but I would prefer to address it in a separate issue.
> > > > >
> > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
> > > > > container and connecting to it with the GUI debugger.
> > > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> > > > >
> > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
> > > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
> > > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
> > > > >
> > > > > Thank you,
> > > > > Daniil

From rkennke at redhat.com Tue Mar 24 08:56:17 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 24 Mar 2020 09:56:17 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To:
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
Message-ID: <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com>

Hi Chris,

> I assume JVMTI maintains separate tagging data for each agent so having
> two agents doing tagging won't result in confusion. I didn't actually
> find this in the spec. Would be nice to confirm that it is the case.
> However, your implementation does seem to conflict with other uses of
> tagging in the debug agent:

The tagging data is per-jvmtiEnv. We create and use our own env (private
to class-tracking), so this wouldn't conflict with other uses of tags.
Could it be a problem that we have a single trackingEnv per JVM, though?
/me scratches head.

> What would cause classTrack_addPreparedClass() to be called for a Class
> you've already seen? I don't understand the need for the "tag != 0l" check.

It's probably not needed; it may be a left-over from previous installments
of this implementation. I will check it, and turn it into an assert or so.

Thanks,
Roman

> thanks,
>
> Chris
>
> On 3/20/20 12:52 PM, Chris Plummer wrote:
>> On 3/20/20 8:30 AM, Roman Kennke wrote:
>>> I believe I came up with a much simpler solution that also solves the
>>> problems of the existing one, and the ones I proposed earlier.
>>>
>>> It turns out that we can take advantage of the fact that we can use
>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>> to the signature of a class into the tag, and pull it out again when we
>>> get notified that the class gets unloaded.
>>>
>>> This means we don't need an extra data-structure to keep track of
>>> classes and signatures, and it also makes the story around locking
>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>> classes needed (as in the current implementation) and no searching of
>>> table needed (like in my previous attempts).
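The pointer-as-tag idea described above can be sketched without any JVMTI dependency. This is only an illustration under stated assumptions, not the code from webrev.06: `tag_t` stands in for `jlong` (a 64-bit signed integer), and `tag_from_signature` and `signature_from_tag` are hypothetical helper names. In the agent, the tag would presumably be attached to the class object when the class is prepared and handed back through the object-free callback, where the signature can be recovered with a cast instead of a table lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JNI's jlong; wide enough to hold a pointer on common platforms. */
typedef int64_t tag_t;

/*
 * Copy `signature` to the heap and return the copy's address as a tag.
 * Returns 0 if allocation fails; 0 doubles as "no tag" in this sketch.
 */
static tag_t tag_from_signature(const char *signature) {
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return (tag_t)(intptr_t)copy;
}

/*
 * Recover the signature pointer stored in a tag.
 * The caller takes ownership and must free() the result.
 */
static char *signature_from_tag(tag_t tag) {
    return (char *)(intptr_t)tag;
}
```

The round trip through `intptr_t` keeps the pointer-to-integer conversion explicit, and ownership of the copied string moves to whoever unpacks the tag.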
>>> >>> Please review this new revision: >>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >> I'll have a look at this. >>> >>> (Notice that there still appears to be a performance bottleneck with >>> class-unloading when an actual debugger is attached. This doesn't seem >>> to be related to the classTrack.c implementation though, but looks like >>> a consequence of getting all those class-unload notifications over the >>> wire. My testcase generates 1000s of them, and it's clogging up the >>> buffers.) >> At least this is only a one-shot hit when the classes are unloaded, >> and the performance hit is based on the number of classes being >> unloaded. The main issue is happening every GC, and is O(n) where n is >> the number of loaded classes. >>> I am not sure why jdb needs to enable class-unload listener always. A >>> simple hack disables it, and performance is brilliant, even when jdb is >>> attached: >>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >> This is JDI, not jdb. It looks like it needs ClassUnload events so it >> can maintain typesBySignature, which is used by public APIs like >> allClasses(). So we have caching of loaded classes both in the debug >> agent and in JDI. >> >> Chris >>> But this is not in the scope of this bug.) >>> >>> Roman >>> >>> >>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>> Sorry, forgot to complete my comments at the end (see below). >>>> >>>> >>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>> Hi Roman, >>>>> >>>>> Thank you for the update and sorry for the latency in review. >>>>> >>>>> Some comments are below. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>> >>>>> >>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>> ?? 
88 { >>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>> 90 if (currentClassTag == -1) { >>>>> 91 // Class tracking not initialized, nobody's interested >>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>> 93 return; >>>>> ?? 94???? } >>>>> Just a question: >>>>> ?? Q1: Should the ObjectFree events be disabled for the jvmtiEnv >>>>> that does >>>>> ?????? the class tracking if class tracking has not been initialized? >>>>> >>>>> 70 static jlong currentClassTag; I'm thinking if the name is better to >>>>> be something like: lastClassTag or highestClassTag. >>>>> >>>>> 99 KlassNode* klass = *klass_ptr; >>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // klass not >>>>> found - ignore. >>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>> 108 return; >>>>> ? 109???? } >>>>> ??It seems to me, something is wrong in the condition at L106 above. >>>>> ??Should it be? : >>>>> ???? if (klass == NULL || klass->klass_tag != tag) >>>>> >>>>> ??Otherwise, how can the second check ever work correctly as the >>>>> return >>>>> will always happen when (klass != NULL)? >>>>> >>>>> ? There are several places in this file with the the indent: >>>>> 90 if (currentClassTag == -1) { >>>>> 91 // Class tracking not initialized, nobody's interested >>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>> 93 return; >>>>> ?? 94???? } >>>>> ? ... >>>>> 152 if (currentClassTag == -1) { >>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>> 155 return; >>>>> ? 156???? } >>>>> ? ... >>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>> ? 163???? } >>>>> 164 if (tag != 0l) { >>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>> 166 return; // Already added >>>>> ? 167???? } >>>>> ? ... 
>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>> 282 { >>>>> 283 char* sig = (char*)signatureVoid; >>>>> 284 jvmtiDeallocate(sig); >>>>> 285 return JNI_TRUE; >>>>> ? 286 } >>>>> ? ... >>>>> ? 291 void >>>>> ? 292 classTrack_reset(void) >>>>> ? 293 { >>>>> 294 int idx; >>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>> 296 >>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>> 298 KlassNode* node = table[idx]; >>>>> 299 while (node != NULL) { >>>>> 300 KlassNode* next = node->next; >>>>> 301 jvmtiDeallocate(node->signature); >>>>> 302 jvmtiDeallocate(node); >>>>> 303 node = next; >>>>> 304 } >>>>> 305 } >>>>> 306 jvmtiDeallocate(table); >>>>> 307 >>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>> 310 >>>>> 311 currentClassTag = -1; >>>>> 312 >>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>> 314 trackingEnv = NULL; >>>>> 315 >>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>> >>>>> Could you, please, fix several comments below? >>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>> class-unloads >>>>> ??The comma is not needed. >>>>> ??Would it better to replace: klass tags => klass_tag's ? >>>>> >>>>> >>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>>> consistent >>>>> ??Maybe: Lock to guard ... or lock to keep integrity of ... >>>>> >>>>> 84 * Callback when classes are freed, Finds the signature and >>>>> remembers it in deletedSignatureBag. Would be better to use words like >>>>> "store" or "record", "Find" should not start from capital letter: >>>>> Invoke the callback when classes are freed, find and record the >>>>> signature in deletedSignatureBag. >>>>> >>>>> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>> nobody's interested 158 /* Check this is not a duplicate */ Missed dot >>>>> at the end. 
106 if (klass != NULL || klass->klass_tag != tag) { // >>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>> comment does not start from a capital letter. 111 // At this point we >>>>> have the KlassNode corresponding to the tag >>>>> 112 // in klass, and the pointer to it in klass_node. >>>> ? The comment above can be better. Maybe, something like: >>>> ? ? " At this point, we found the KlassNode matching the klass >>>> tag(and it is >>>> linked). >>>> >>>>> 113 // Remember the unloaded signature. >>>> ??Better: Record the signature of the unloaded class and unlink it. >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>> Hello all, >>>>>> >>>>>> Can I please get reviews of this change? In the meantime, we've done >>>>>> more testing and also field-/torture-testing by a customer who is >>>>>> happy >>>>>> now. :-) >>>>>> >>>>>> Thanks, >>>>>> Roman >>>>>> >>>>>> >>>>>>> Hi Serguei, >>>>>>> >>>>>>> Thanks for reviewing! >>>>>>> >>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>> disconnect, >>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>> >>>>>>> Let me know what you think! >>>>>>> Roman >>>>>>> >>>>>>>> Hi Roman, >>>>>>>> >>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>> >>>>>>>> I have a couple of quick comments. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>> >>>>>>>> >>>>>>>> 72 /* >>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>> 74 */ >>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>> 78 * A bag containing all the deleted classes' signatures. 
Must be >>>>>>>> accessed under >>>>>>>> 79 * deletedTagLock, >>>>>>>> ?? 80? */ >>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>> >>>>>>>> ?? The comments contradict to each other. >>>>>>>> ?? I guess, the lock name at line 79 has to be deletedSignatureLock >>>>>>>> instead of deletedTagLock. >>>>>>>> ?? Also, comma at the end must be replaced with dot. >>>>>>>> >>>>>>>> >>>>>>>> 101 // Tag not found? Ignore. >>>>>>>> 102 if (klass == NULL) { >>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>> 104 return; >>>>>>>> 105 } >>>>>>>> ? 106 >>>>>>>> 107 // Scan linked-list. >>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>> 111 klass = *klass_ptr; >>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>> ? 113???? } >>>>>>>> 114 >>>>>>>> 115 // Tag not found? Ignore. >>>>>>>> 116 if (found_tag != tag) { >>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>> 118 return; >>>>>>>> ? 119???? } >>>>>>>> >>>>>>>> >>>>>>>> ??The code above can be simplified, so that the lines 101-105 >>>>>>>> are not >>>>>>>> needed anymore. >>>>>>>> ??It can be something like this: >>>>>>>> >>>>>>>> // Scan linked-list. >>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>> klass_ptr = &klass->next; >>>>>>>> klass = *klass_ptr; >>>>>>>> ????? } >>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>> found - ignore. >>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>> return; >>>>>>>> ????? } >>>>>>>> >>>>>>>> It will take more time when I get a chance to look at the rest. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>> Here comes an update that resolves some races that happen when >>>>>>>>> disconnecting an agent. 
In particular, we need to take the lock on >>>>>>>>> basically every operation, and also need to check whether or not >>>>>>>>> class-tracking is active and return an appropriate result (e.g. >>>>>>>>> an empty >>>>>>>>> list) when we're not. >>>>>>>>> >>>>>>>>> Updated webrev: >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Roman >>>>>>>>> >>>>>>>>> >>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>> >>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, >>>>>>>>>> and we >>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>> table, which >>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The >>>>>>>>>> table is >>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>>> KlassNode*. >>>>>>>>>> This is O(1) operation. >>>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>>> signature of >>>>>>>>>> the reported tag in that table, and remember it in a bag. The >>>>>>>>>> KlassNode* >>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) >>>>>>>>>> operation >>>>>>>>>> too, depending on the depth of the table. In my testcase which >>>>>>>>>> hammered >>>>>>>>>> the code with class-loads and unloads, I usually see depths of >>>>>>>>>> like 2-3, >>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>> - when processUnloads() gets called, we simply hand out that >>>>>>>>>> bag, and >>>>>>>>>> allocate a new one. >>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>> leaking the >>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>>> and/or >>>>>>>>>> re-attached (was missing before). >>>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>>> missing >>>>>>>>>> before). 
>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>> listener gets >>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>> attaching a >>>>>>>>>> jdb, not sure why jdb does that though. This may be something >>>>>>>>>> to improve >>>>>>>>>> in the future? >>>>>>>>>> >>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>> really good. >>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>> class-unload >>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>> agent asks for it? >>>>>>>>>> >>>>>>>>>> Updated webrev: >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>> >>>>>>>>>> Please let me know what you think of it. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Roman >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>> the even more >>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for >>>>>>>>>>> now. >>>>>>>>>>> >>>>>>>>>>> Thanks,Roman >>>>>>>>>>> >>>>>>>>>>> ? Hi Chris, >>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>> few days. In >>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>> implementation in >>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>> Sure. >>>>>>>>>>>> >>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>> determine the >>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>> happened, so that >>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>> >>>>>>>>>>>> The current implementation does so by maintaining a table of >>>>>>>>>>>> currently >>>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>>> initialized, >>>>>>>>>>>> and then add new classes whenever a class gets loaded. 
When >>>>>>>>>>>> unloading >>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared >>>>>>>>>>>> with the >>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>> table gets >>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>> and/or many >>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>> complexity. >>>>>>>>>>>> >>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>> classes, and also >>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>> Whenever an >>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>>> and classes >>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>> maintaining the >>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>> that gets returned. >>>>>>>>>>>> >>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>> whether or not >>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>> That process is >>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>> is that >>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>> seems to be >>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>> >>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>> (hash)table that >>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>> and build the >>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>> that it's >>>>>>>>>>>> worth the effort). >>>>>>>>>>>> >>>>>>>>>>>> In addition to all that, this process is only activated when >>>>>>>>>>>> there's an >>>>>>>>>>>> actual listener registered for EI_GC_FINISH. 
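
The table idea mentioned in this thread (a table indexed by tag % slot-count, where each slot heads a linked list of KlassNode*, with direct unlinking on unload) can be sketched roughly as below. This is only an illustrative sketch under stated assumptions, not the code from any of the webrevs: SLOT_COUNT, track_add(), track_remove(), and the KlassNode field names are hypothetical, a plain long long stands in for the JVMTI jlong tag, and the locking that the thread calls for is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch only: names and sizes are assumptions, not classTrack.c. */
#define SLOT_COUNT 263              /* assumed number of table slots */

typedef struct KlassNode {
    long long tag;                  /* stand-in for the JVMTI jlong tag */
    char *signature;                /* owned copy of the class signature */
    struct KlassNode *next;         /* next node chained in the same slot */
} KlassNode;

static KlassNode *table[SLOT_COUNT];

/* Map a tag to its slot index. */
static size_t slot_of(long long tag) {
    return (size_t)((unsigned long long)tag % SLOT_COUNT);
}

/* Register a prepared class by prepending it to its slot: O(1).
 * Returns 0 on success, -1 if allocation fails. */
static int track_add(long long tag, const char *signature) {
    assert(signature != NULL);
    KlassNode *node = malloc(sizeof(*node));
    if (node == NULL) {
        return -1;
    }
    size_t len = strlen(signature) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return -1;
    }
    memcpy(node->signature, signature, len);
    node->tag = tag;
    node->next = table[slot_of(tag)];
    table[slot_of(tag)] = node;
    return 0;
}

/* Unlink the node carrying the given tag and hand back its signature.
 * The caller owns (and frees) the result; NULL means the tag is unknown.
 * Cost is proportional to the chain length of one slot. */
static char *track_remove(long long tag) {
    KlassNode **link = &table[slot_of(tag)];
    while (*link != NULL && (*link)->tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return NULL;                /* tag not found - ignore */
    }
    KlassNode *node = *link;
    *link = node->next;             /* unlink from the chain */
    char *signature = node->signature;
    free(node);
    return signature;
}
```

Walking a pointer-to-pointer (link) keeps the unlink step uniform for the head of a chain and for interior nodes, and the loop tests for NULL before it ever reads the tag. In a real agent the returned signature would presumably be added to the bag of deleted signatures while holding the relevant lock.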
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL:

From magnus.ihse.bursie at oracle.com Tue Mar 24 12:12:45 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Tue, 24 Mar 2020 13:12:45 +0100
Subject: RFR: JDK-8241463 Move build tools to respective modules
In-Reply-To:
References:
Message-ID: <01a1c812-1adb-a4a2-3db0-327f76c0b21c@oracle.com>

On 2020-03-23 23:15, naoto.sato at oracle.com wrote:
> Hi Magnus,
>
> I looked at i18n related changes:
>
> make/CopyInterimTZDB.gmk
> make/ToolsJdk.gmk
> make/gendata/Gendata-java.base.gmk
> make/gendata/GendataBreakIterator.gmk
> make/gendata/GendataTZDB.gmk
> make/gensrc/GensrcCharacterData.gmk
> make/gensrc/GensrcEmojiData.gmk
>
> They look ok to me.

Thank you!

>
> The *.java changes should have copyright year update.

Ok, I'll update them.

>
> As to charsetmapping and cldrconverter, I believe they can reside in
> java.base, as jdk.charsets and jdk.localedata modules depend on it.

Okay. It's not ideal, but I think you're right. I'll move them as well.

I'll publish an updated webrev with these changes when there's agreement
on where in the source code tree to move the files.

/Magnus

>
> Naoto
>
> On 3/23/20 12:03 PM, Magnus Ihse Bursie wrote:
>> The build tools (small java tools that are run during the build to
>> generate source code, or data, needed in the JDK) have historically
>> been placed in the "make" directory. This may have made sense a long
>> time ago, but does not do so anymore.
>>
>> Instead, the build tools source code should move to the module that
>> needs them. For instance, compilefontconfig should move to
>> java.desktop, etc.
>>
>> There are multiple reasons for this:
>>
>> * Currently we build *all* build tools at once, which means that we
>> cannot compile java.base until e.g. the compilefontconfig tool is
>> compiled, even though it is not needed.
>>
>> * If a build tool, e.g.
compilefontconfig is modified, all build
>> tools are recompiled, which triggers a rebuild of more or less the
>> entire JDK. This makes development of the build tools unnecessarily
>> tedious.
>>
>> * When the build tools are modified, the group owning the
>> corresponding module is the proper review instance, not the build
>> team. But since they reside under "make", the review mails often
>> include build-dev, but this is mostly noise for us. With this move,
>> the ownership is made clear.
>>
>> In this patch, I have not modified how and when the build tools are
>> compiled, but this shuffle is the prerequisite for continuing with
>> that in a follow-up patch.
>>
>> I have also moved the build tools to the org.openjdk.buildtools.*
>> package name space (inspired by Skara), instead of the strangely
>> named build.tools.* name space.
>>
>> A few build tools are not moved in this patch. Two of them,
>> charsetmapping and cldrconverter, are shared between two modules. (I
>> think they should move to modules nevertheless, but they need some
>> more thought to make sure I do this right.) The rest are tools that
>> are needed for the build in general, like linking or javadoc support.
>> I'll move these to a better location too, but in a separate patch.
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8241463 >> WebRev: >> http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01 >> >> >> /Magnus >> From serguei.spitsyn at oracle.com Tue Mar 24 16:39:34 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 24 Mar 2020 09:39:34 -0700 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> Message-ID: Hi Yasumasa, I'm okay with this update. My mach5 test run for this patch is passed. Thanks, Serguei On 3/23/20 17:08, Yasumasa Suenaga wrote: > Hi Serguei, > > Thanks for your comment! > I uploaded new webrev: > > ? http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/ > > Also I pushed it to submit repo: > > ? http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1 > > On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote: >> Hi Yasumasa, >> >> The mach5 tier5 testing looks good. >> The serviceability/sa/ClhsdbPstack.java is failed without fix and is >> not failed with it. >> >> Thanks, >> Serguei >> >> >> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote: >>> Hi Yasumasa, >>> >>> I looked at you changes. >>> It is hard to understand if this fully solves the issue. 
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>
>>>
>>> @@ -34,10 +34,11 @@
>>>
>>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>>         Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>         DwarfParser dwarf = null;
>>> +       boolean unsupportedDwarf = false;
>>>
>>>         if (libptr != null) { // Native frame
>>>           try {
>>>             dwarf = new DwarfParser(libptr);
>>>             dwarf.processDwarf(rip);
>>>
>>> @@ -45,24 +46,33 @@
>>>                    !dwarf.isBPOffsetAvailable())
>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>                                .addOffsetTo(dwarf.getCFAOffset());
>>>           } catch (DebuggerException e) {
>>> -           // Bail out to Java frame case
>>> +           if (dwarf != null) {
>>> +             // DWARF processing should succeed when the frame is native
>>> +             // but it might fail if CIE has language personality routine
>>> +             // and/or LSDA.
>>> +             dwarf = null;
>>> +             unsupportedDwarf = true;
>>> +           } else {
>>> +             throw e;
>>> +           }
>>>           }
>>>         }
>>>
>>>         return (cfa == null) ? null
>>> -                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>> +                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>      }
>>>
>>> @@ -121,13 +131,25 @@
>>>        }
>>>
>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>      }
>>>
>>> -    private DwarfParser getNextDwarf(Address nextPC) {
>>> -      DwarfParser nextDwarf = null;
>>> +    @Override
>>> +    public CFrame sender(ThreadProxy thread) {
>>> +      if (!possibleNext) {
>>> +        return null;
>>> +      }
>>> +
>>> +      ThreadContext context = thread.getContext();
>>> +
>>> +      Address nextPC = getNextPC(dwarf != null);
>>> +      if (nextPC == null) {
>>> +        return null;
>>> +      }
>>>
>>> +      DwarfParser nextDwarf = null;
>>> +      boolean unsupportedDwarf = false;
>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>          nextDwarf = dwarf;
>>>        } else {
>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>          if (libptr != null) {
>>>
>>> @@ -138,33 +160,29 @@
>>>            }
>>>          }
>>>        }
>>>
>>>        if (nextDwarf != null) {
>>> +        try {
>>>          nextDwarf.processDwarf(nextPC);
>>> +        } catch (DebuggerException e) {
>>> +          // DWARF processing should succeed when the frame is native
>>> +          // but it might fail if CIE has language personality routine
>>> +          // and/or LSDA.
>>> +          nextDwarf = null;
>>> +          unsupportedDwarf = true;
>>>        }
>>>
>>> This fix looks like a hack.
>>> Should we just propagate the Debugging exception instead of trying
>>> to maintain unsupportedDwarf flag?
>
> DwarfParser::processDwarf would throw DebuggerException if it cannot
> find DWARF which relates to PC.
> PC at this point is for next frame. So current frame (`this` object)
> is valid, and it should be processed.
>
>
>>> Also, I don't like that DWARF-specific abbreviations (like CIE,
>>> IDE, LSDA, etc.) are used without any comments explaining them.
>>> The code has to be generally readable without looking into the DWARF >>> spec each time. > > I added comments for them in this webrev. > > > Thanks, > > Yasumasa > > >>> I'm submitting mach5 jobs to make sure the issue has been resolved >>> with your fix. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 3/20/20 17:55, Yasumasa Suenaga wrote: >>>> Thanks Chris! >>>> I'm waiting for reviewers for this change. >>>> >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/03/21 4:23, Chris Plummer wrote: >>>>> Hi Yasumasa, >>>>> >>>>> The failure is due to JDK-8231634, so not something you need to >>>>> worry about. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote: >>>>>> Hi Chris, >>>>>> >>>>>> I uploaded new webrev which includes reverting change for >>>>>> ProblemList: >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/ >>>>>> >>>>>> I tested it on submit repo >>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301), >>>>>> but it has failed in ClhsdbJstackXcompStress.java. >>>>>> However I think it is not caused by this change because >>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed >>>>>> mode, it would not parse DWARF. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2020/03/20 13:55, Chris Plummer wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> The test has been problem listed so please add undoing this to >>>>>>> your webrev. 
Here's the diff that problem listed it: >>>>>>> >>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt >>>>>>> b/test/hotspot/jtreg/ProblemList.txt >>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt >>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt >>>>>>> @@ -115,7 +115,7 @@ >>>>>>> ??serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all >>>>>>> ??serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all >>>>>>> ??serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all >>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all >>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 >>>>>>> solaris-all,linux-all >>>>>>> ??serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java >>>>>>> 8193639 solaris-all >>>>>>> ??serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 >>>>>>> solaris-all,linux-x64,macosx-x64,windows-x64 >>>>>>> ??serviceability/sa/ClhsdbSource.java 8193639 solaris-all >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> This webrev has passed submit repo >>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and >>>>>>>> additional tests. >>>>>>>> So please review it: >>>>>>>> >>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>> ? webrev: >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>>>>>>>> Thank you so much, David! >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/03/16 21:01, David Holmes wrote: >>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit >>>>>>>>>>>> repo. >>>>>>>>>>>> Could you try again? 
>>>>>>>>>>>> >>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>>>>>>> >>>>>>>>>>>> webrev is here: >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>> >>>>>>>>>>> Test job resubmitted. Will advise results if it completes >>>>>>>>>>> before I go to bed :) >>>>>>>>>> >>>>>>>>>> Seems to have passed okay. >>>>>>>>>> >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks a lot! >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>>>>>>> Sorry it is still crashing. >>>>>>>>>>>>> >>>>>>>>>>>>> # >>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime >>>>>>>>>>>>> Environment: >>>>>>>>>>>>> # >>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, >>>>>>>>>>>>> tid=13704 >>>>>>>>>>>>> # >>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) >>>>>>>>>>>>> (fastdebug build >>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed >>>>>>>>>>>>> mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>> # C? [libsaproc.so+0x494e] >>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>> # >>>>>>>>>>>>> >>>>>>>>>>>>> Same as before. >>>>>>>>>>>>> >>>>>>>>>>>>> David >>>>>>>>>>>>> ----- >>>>>>>>>>>>> >>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then >>>>>>>>>>>>>>>> go and run additional internal tests (and even more >>>>>>>>>>>>>>>> builds) using that job. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for that tip Chris! 
>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet >>>>>>>>>>>>>>> received the result. >>>>>>>>>>>>>>> I will share you when I get job ID. >>>>>>>>>>>>>> >>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to >>>>>>>>>>>>>> complete before submitting the additional tests. >>>>>>>>>>>>>> >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thank you for testing it. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when >>>>>>>>>>>>>>>>> DWARF has language personality routine or LSDA. >>>>>>>>>>>>>>>>> Could you try it? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>>>>>> Correction ... >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the >>>>>>>>>>>>>>>>>>>> code, but I'm putting the patch through our >>>>>>>>>>>>>>>>>>>> internal testing. 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java >>>>>>>>>>>>>>>>>>> Runtime Environment: >>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, >>>>>>>>>>>>>>>>>>> pid=16948, tid=16949 >>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment >>>>>>>>>>>>>>>>>>> (15.0) (fastdebug build >>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM >>>>>>>>>>>>>>>>>>> (fastdebug >>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, >>>>>>>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 gc, >>>>>>>>>>>>>>>>>>> linux-amd64) >>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] >>>>>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always >>>>>>>>>>>>>>>>>>> crash now. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs >>>>>>>>>>>>>>>>>> of the test in linux-x64. I don't see a pattern as to >>>>>>>>>>>>>>>>>> where it fails versus passes. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> It doesn't fail for me locally. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> ?? JBS: >>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>>>>>>> ?? 
webrev: >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for >>>>>>>>>>>>>>>>>>>>> unwinding native frames in jstack mixed mode. >>>>>>>>>>>>>>>>>>>>> However some error has seen intermittently after >>>>>>>>>>>>>>>>>>>>> that. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I investigated the cause of this, I found two >>>>>>>>>>>>>>>>>>>>> concerns: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> ?? A: lack of buffer (.eh_frame section data) >>>>>>>>>>>>>>>>>>>>> range check >>>>>>>>>>>>>>>>>>>>> ?? B: Language personality routine and Language >>>>>>>>>>>>>>>>>>>>> Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I addd range check for .eh_frame processing, and >>>>>>>>>>>>>>>>>>>>> ignore personality routine and LSDA in this webrev. >>>>>>>>>>>>>>>>>>>>> Also I added bailout code if DWARF processing is >>>>>>>>>>>>>>>>>>>>> failed due to these concerns. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo >>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle >>>>>>>>>>>>>>>>>>>>> Linux 7.7 container. 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>> >> From daniil.x.titov at oracle.com Tue Mar 24 17:00:29 2020 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 24 Mar 2020 10:00:29 -0700 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com> References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com> <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com> Message-ID: <202C5C21-BA69-4ACF-9421-A9B5D6704C8C@oracle.com> Hi Serguei, > It looks like you removed the last call site of DebugServer.main. Yes. It is correct. > Do we need to remove the DebugServer.java as well? I was considering this but since it is a public class I think it needs to be deprecated first. I also think that it would be better to do in a separate issue since a CSR for deprecation needs to be filed for that. If you agree I will create a new issue for that. Thanks, Daniil ?On 3/23/20, 11:56 PM, "serguei.spitsyn at oracle.com" wrote: Hi Daniil, It looks pretty good in general. It looks like you removed the last call site of DebugServer.main. Do we need to remove the DebugServer.java as well? Thanks, Serguei On 3/22/20 15:29, Daniil Titov wrote: > Hi Yasumasa, Serguei and Alex, > > Please review a new version of the webrev that merges SADebugDTest.java with changes done in [2]. > > Also the CRS [3] and the help message for debug server in SALauncher.java were updated to specify that '--hostname' > option could be a hostname or an IPv4/IPv6 address. > > > Ok, but I think it might be more simply with TestLibrary. > > For example, can you use TestLibrary::getUnusedRandomPort ? 
It is used in test/jdk/java/rmi/testlibrary/RMID.java . > > TestLibrary:: getUnusedRandomPort() doesn't allow to specify what ports are reserved and it uses some hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be directly used in test/hotspot/jtreg/serviceability/* tests (it doesn't compile). > > Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int .. reservedPorts) from SADebugTest.java to jdk.test.lib.Utils in /test/lib. > > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/ > [2] https://bugs.openjdk.java.net/browse/JDK-8238268 > [3] https://bugs.openjdk.java.net/browse/JDK-8239831 > > Thank you, > Daniil > > ?On 3/13/20, 7:23 PM, "Yasumasa Suenaga" wrote: > > Hi Daniil, > > On 2020/03/14 7:05, Daniil Titov wrote: > > Hi Yasumasa, Serguei and Alex, > > > > Please review a new version of the webrev that includes the changes Yasumasa suggested. > > > >> Shutdown hook is already registered in c'tor of HotSpotAgent. > >> It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need a > > the shutdown hook for remote server being added in SALauncher. I changed it to use the lambda expression. > > > > 101 public HotSpotAgent() { > > 102 // for non-server add shutdown hook to clean-up debugger in case > > 103 // of forced exit. For remote server, shutdown hook is added by > > 104 // DebugServer. > > 105 Runtime.getRuntime().addShutdownHook(new java.lang.Thread( > > 106 new Runnable() { > > 107 public void run() { > > 108 synchronized (HotSpotAgent.this) { > > 109 if (!isServer) { > > 110 detach(); > > 111 } > > 112 } > > 113 } > > 114 })); > > 115 } > > I missed it, thanks! > > > >>> Hmm... 
I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains > >>> `exclusiveAccess.dirs=.` to avoid concurrent execution > > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests. Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays. > > Ok, but I think it might be more simply with TestLibrary. > For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java . > > > Thanks, > > Yasumasa > > > > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/ > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > Thank you, > > Daniil > > > > ?On 3/6/20, 6:15 PM, "Yasumasa Suenaga" wrote: > > > > Hi Daniil, > > > > On 2020/03/07 3:38, Daniil Titov wrote: > > > Hi Yasumasa, > > > > > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect ( it throws an exception if parameters are incorrect) and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name . > > > > Ok, but I prefer to leave comment it. > > > > > > > > SADebugDTest > > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. 
> > > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. > > > > Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. > > If you do not think this error check, test code is more simply. > > > > > > > I will include your other suggestion in the new version of the webrev. > > > > Sorry, I have one more comment: > > > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > Shutdown hook is already registered in c'tor of HotSpotAgent. > > It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > > > > Thanks, > > > > Yasumasa > > > > > > > Thanks! > > > Daniil > > > > > > ?On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > > > > > Hi Daniil, > > > > > > > > > - SALauncher.java > > > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > > > - SADebugDTest.java > > > - Please add bug ID to @bug. > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > On 2020/03/06 10:15, Daniil Titov wrote: > > > > Hi Yasumasa, Serguei and Alex, > > > > > > > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > > > > port option introduces two more options to specify RMI registry port and RMI connector host name. 
Currently, these > > > > last two settings could be specified using the system properties but the system properties have the following disadvantages > > > > comparing to the command line options: > > > > - It's hard to know about them: they are not listed in tool's help. > > > > - They have long names that hard to remember > > > > - It is easy to mistype them in the command line and you will not get any warning about it. > > > > > > > > The CSR [2] was also updated and needs to be reviewed. > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > > > Thank you, > > > > Daniil > > > > > > > > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > > > > > > > Hi Daniil, > > > > > > > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. > > > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply. > > > > > > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > > > > But you can use same port number as RMI registry (1099). > > > > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > > > > > > > > > > Thanks, > > > > > > > > Yasumasa > > > > > > > > > > > > On 2020/02/24 13:21, Daniil Titov wrote: > > > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
> > > > > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > > > > > // delegate to the actual SA debug server. > > > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > > > but I would prefer to address it in a separate issue. > > > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > > container and connecting to it with the GUI debugger. > > > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > > > > > > Thank you, > > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From daniel.daugherty at oracle.com Tue Mar 24 17:01:29 2020 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Tue, 24 Mar 2020 13:01:29 -0400 Subject: RFR(T): 8241532: ProblemList tests from 8241530 on OSX Message-ID: <59631faa-ee70-c95f-7c9f-91f24cd0d1de@oracle.com>

Greetings,

I have a trivial review for ProblemListing some tests.

We're having some network issues with the new OSX 10.15 machines that are being addressed. In the mean time, I'm trying to reduce the noise in the CI in Tier5 and Tier6 so I'm ProblemListing the affected tests:

$ hg diff
diff -r 23dab0354eb0 test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt    Tue Mar 24 17:39:52 2020 +0100
+++ b/test/jdk/ProblemList.txt    Tue Mar 24 12:57:43 2020 -0400
@@ -604,6 +604,10 @@
 com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java 8030957 aix-all
 com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java 8030957 aix-all

+sun/management/jdp/JdpDefaultsTest.java 8241530 macosx-all
+sun/management/jdp/JdpJmxRemoteDynamicPortTest.java 8241530 macosx-all
+sun/management/jdp/JdpSpecificAddressTest.java 8241530 macosx-all
+
 ############################################################################

 # jdk_jmx
@@ -924,6 +928,9 @@

 com/sun/jdi/InvokeHangTest.java 8218463 linux-all

+com/sun/jdi/JdwpAttachTest.java 8241530 macosx-all
+com/sun/jdi/JdwpListenTest.java 8241530 macosx-all
+
 ############################################################################

 # jdk_time

Thanks, in advance, for any comments, questions or suggestions.

Dan

From christian.tornqvist at oracle.com Tue Mar 24 17:03:31 2020 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Tue, 24 Mar 2020 10:03:31 -0700 Subject: RFR(T): 8241532: ProblemList tests from 8241530 on OSX In-Reply-To: <59631faa-ee70-c95f-7c9f-91f24cd0d1de@oracle.com> References: <59631faa-ee70-c95f-7c9f-91f24cd0d1de@oracle.com> Message-ID:

Looks good, thanks for doing this.

Thanks,
Christian

> On Mar 24, 2020, at 10:01 AM, Daniel D. Daugherty wrote:
>
> Greetings,
>
> I have a trivial review for ProblemListing some tests.
> > We're having some network issues with the new OSX 10.15 machines that > are being addressed. In the mean time, I'm trying to reduce the noise > in the CI in Tier5 and Tier6 so I'm ProblemListing the affected tests: > > $ hg diff > diff -r 23dab0354eb0 test/jdk/ProblemList.txt > --- a/test/jdk/ProblemList.txt Tue Mar 24 17:39:52 2020 +0100 > +++ b/test/jdk/ProblemList.txt Tue Mar 24 12:57:43 2020 -0400 > @@ -604,6 +604,10 @@ > com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java 8030957 aix-all > com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java 8030957 aix-all > > +sun/management/jdp/JdpDefaultsTest.java 8241530 macosx-all > +sun/management/jdp/JdpJmxRemoteDynamicPortTest.java 8241530 macosx-all > +sun/management/jdp/JdpSpecificAddressTest.java 8241530 macosx-all > + > ############################################################################ > > # jdk_jmx > @@ -924,6 +928,9 @@ > > com/sun/jdi/InvokeHangTest.java 8218463 linux-all > > +com/sun/jdi/JdwpAttachTest.java 8241530 macosx-all > +com/sun/jdi/JdwpListenTest.java 8241530 macosx-all > + > ############################################################################ > > # jdk_time > > Thanks, in advance, for any comments, questions or suggestions. > > Dan > From daniel.daugherty at oracle.com Tue Mar 24 17:04:23 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 24 Mar 2020 13:04:23 -0400 Subject: RFR(T): 8241532: ProblemList tests from 8241530 on OSX In-Reply-To: References: <59631faa-ee70-c95f-7c9f-91f24cd0d1de@oracle.com> Message-ID: <5e7bf2c8-ad77-949d-c984-b63cf0ace03a@oracle.com> Thanks for the fast review! Dan On 3/24/20 1:03 PM, Christian Tornqvist wrote: > Looks good, thanks for doing this. > > Thanks, > Christian > >> On Mar 24, 2020, at 10:01 AM, Daniel D. Daugherty wrote: >> >> Greetings, >> >> I have a trivial review for ProblemListing some tests. >> >> We're having some network issues with the new OSX 10.15 machines that >> are being addressed. 
>> >> Dan >> From chris.plummer at oracle.com Tue Mar 24 20:35:01 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 24 Mar 2020 13:35:01 -0700 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com> <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com> Message-ID: <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com> Hi Roman, On 3/24/20 1:56 AM, Roman Kennke wrote: > Hi Chris, > >> I assume JVMTI maintains separate tagging data for each agent so having >> two agents doing tagging won't result in confusion. I didn't actually >> find this in the spec. Would be nice to confirm that it is the case. >> However, your implementation does seem to conflict with other uses of >> tagging in the debug agent: > The tagging data is per-jvmtiEnv. We create and use our own env (private > to class-tracking), so this wouldn't conflict with other uses of tags. > Could it be a problem that we have a single trackingEnv per JVM, though? > /me scratches head. Ok. This is an area I'm not familiar with, but the spec does say: "Each call to GetEnv creates a new JVM TI connection and thus a new JVM TI environment." So it looks like what you are doing should be ok. I still think you have a bug where you are not deallocating signatures of classes that are unloaded. 
If you think otherwise please point out where this is done. thanks, Chris >> What would cause classTrack_addPreparedClass() to be called for a Class >> you've already seen? I don't understand the need for the "tag != 0l" check. > It's probably not needed, may be a left-over from previous installments > of this implementation. I will check it, and turn into an assert or so. > > Thanks, > Roman > >> thanks, >> >> Chris >> >> On 3/20/20 12:52 PM, Chris Plummer wrote: >>> On 3/20/20 8:30 AM, Roman Kennke wrote: >>>> I believe I came up with a much simpler solution that also solves the >>>> problems of the existing one, and the ones I proposed earlier. >>>> >>>> It turns out that we can take advantage of the fact that we can use >>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitely >>>> mentioned in the JVMTI spec). This means we can simply stick a pointer >>>> to the signature of a class into the tag, and pull it out again when we >>>> get notified that the class gets unloaded. >>>> >>>> This means we don't need an extra data-structure to keep track of >>>> classes and signatures, and it also makes the story around locking >>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all >>>> classes needed (as in the current implementation) and no searching of >>>> table needed (like in my previous attempts). >>>> >>>> Please review this new revision: >>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>> I'll have a look at this. >>>> (Notice that there still appears to be a performance bottleneck with >>>> class-unloading when an actual debugger is attached. This doesn't seem >>>> to be related to the classTrack.c implementation though, but looks like >>>> a consequence of getting all those class-unload notifications over the >>>> wire. My testcase generates 1000s of them, and it's clogging up the >>>> buffers.) 
>>> At least this is only a one-shot hit when the classes are unloaded,
>>> and the performance hit is based on the number of classes being
>>> unloaded. The main issue is happening every GC, and is O(n) where n is
>>> the number of loaded classes.
>>>> I am not sure why jdb needs to enable class-unload listener always. A
>>>> simple hack disables it, and performance is brilliant, even when jdb is
>>>> attached:
>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>> This is JDI, not jdb. It looks like it needs ClassUnload events so it
>>> can maintain typesBySignature, which is used by public APIs like
>>> allClasses(). So we have caching of loaded classes both in the debug
>>> agent and in JDI.
>>>
>>> Chris
>>>> But this is not in the scope of this bug.)
>>>>
>>>> Roman
>>>>
>>>>
>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>
>>>>>
>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Roman,
>>>>>>
>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>
>>>>>> Some comments are below.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>
>>>>>>
>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>    88 {
>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>> 90 if (currentClassTag == -1) {
>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>> 93 return;
>>>>>>    94     }
>>>>>> Just a question:
>>>>>>    Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>>        the class tracking if class tracking has not been initialized?
>>>>>>
>>>>>> 70 static jlong currentClassTag;
>>>>>> I'm thinking if the name is better to be something like: lastClassTag or highestClassTag.
>>>>>>
>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>> 100
>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>> 103 klass_ptr = &klass->next;
>>>>>> 104 klass = *klass_ptr;
>>>>>> 105 }
>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>> 108 return;
>>>>>>   109     }
>>>>>>   It seems to me, something is wrong in the condition at L106 above.
>>>>>>   Should it be? :
>>>>>>      if (klass == NULL || klass->klass_tag != tag)
>>>>>>
>>>>>>   Otherwise, how can the second check ever work correctly as the return
>>>>>> will always happen when (klass != NULL)?
>>>>>>
>>>>>>   There are several places in this file with the indent:
>>>>>> 90 if (currentClassTag == -1) {
>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>> 93 return;
>>>>>>    94     }
>>>>>>   ...
>>>>>> 152 if (currentClassTag == -1) {
>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>> 155 return;
>>>>>>   156     }
>>>>>>   ...
>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>   163     }
>>>>>> 164 if (tag != 0l) {
>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>> 166 return; // Already added
>>>>>>   167     }
>>>>>>   ...
>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>> 282 {
>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>> 284 jvmtiDeallocate(sig);
>>>>>> 285 return JNI_TRUE;
>>>>>>   286 }
>>>>>>   ...
>>>>>>   291 void
>>>>>>   292 classTrack_reset(void)
>>>>>>
293 {
>>>>>> 294 int idx;
>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>> 296
>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>> 298 KlassNode* node = table[idx];
>>>>>> 299 while (node != NULL) {
>>>>>> 300 KlassNode* next = node->next;
>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>> 302 jvmtiDeallocate(node);
>>>>>> 303 node = next;
>>>>>> 304 }
>>>>>> 305 }
>>>>>> 306 jvmtiDeallocate(table);
>>>>>> 307
>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>> 310
>>>>>> 311 currentClassTag = -1;
>>>>>> 312
>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>> 314 trackingEnv = NULL;
>>>>>> 315
>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>
>>>>>> Could you, please, fix several comments below?
>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>   The comma is not needed.
>>>>>>   Would it better to replace: klass tags => klass_tag's ?
>>>>>>
>>>>>>
>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>
>>>>>> 84 * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
>>>>>> Would be better to use words like "store" or "record", "Find" should not start from capital letter:
>>>>>> Invoke the callback when classes are freed, find and record the signature in deletedSignatureBag.
>>>>>>
>>>>>> 96 // Find deleted KlassNode
>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>> 158 /* Check this is not a duplicate */
>>>>>> Missed dot at the end.
>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>> In opposite, dot is not needed as the comment does not start from a capital letter.
111 // At this point we have the KlassNode corresponding to the tag
>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>   The comment above can be better. Maybe, something like:
>>>>>     " At this point, we found the KlassNode matching the klass tag(and it is linked).
>>>>>
>>>>>> 113 // Remember the unloaded signature.
>>>>>   Better: Record the signature of the unloaded class and unlink it.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>> Hello all,
>>>>>>>
>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>> more testing and also field-/torture-testing by a customer who is happy
>>>>>>> now. :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> Thanks for reviewing!
>>>>>>>>
>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>>>> _activate() to ensure have those structures after re-connect.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>
>>>>>>>> Let me know what you think!
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>> Hi Roman,
>>>>>>>>>
>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>
>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 72 /*
>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>> 74 */
>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>> 76
>>>>>>>>> 77 /*
>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>    80
 */
>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>
>>>>>>>>>    The comments contradict to each other.
>>>>>>>>>    I guess, the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>> instead of deletedTagLock.
>>>>>>>>>    Also, comma at the end must be replaced with dot.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 104 return;
>>>>>>>>> 105 }
>>>>>>>>>   106
>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>   113     }
>>>>>>>>> 114
>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 118 return;
>>>>>>>>>   119     }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   The code above can be simplified, so that the lines 101-105 are not
>>>>>>>>> needed anymore.
>>>>>>>>>   It can be something like this:
>>>>>>>>>
>>>>>>>>> // Scan linked-list.
>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>> klass_ptr = &klass->next;
>>>>>>>>> klass = *klass_ptr;
>>>>>>>>>       }
>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>>>>> return;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> It will take more time when I get a chance to look at the rest.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>> disconnecting an agent.
In particular, we need to take the lock on >>>>>>>>>> basically every operation, and also need to check whether or not >>>>>>>>>> class-tracking is active and return an appropriate result (e.g. >>>>>>>>>> an empty >>>>>>>>>> list) when we're not. >>>>>>>>>> >>>>>>>>>> Updated webrev: >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Roman >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>> >>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, >>>>>>>>>>> and we >>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>> table, which >>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The >>>>>>>>>>> table is >>>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>>>> KlassNode*. >>>>>>>>>>> This is O(1) operation. >>>>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>>>> signature of >>>>>>>>>>> the reported tag in that table, and remember it in a bag. The >>>>>>>>>>> KlassNode* >>>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) >>>>>>>>>>> operation >>>>>>>>>>> too, depending on the depth of the table. In my testcase which >>>>>>>>>>> hammered >>>>>>>>>>> the code with class-loads and unloads, I usually see depths of >>>>>>>>>>> like 2-3, >>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>> - when processUnloads() gets called, we simply hand out that >>>>>>>>>>> bag, and >>>>>>>>>>> allocate a new one. >>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>> leaking the >>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>>>> and/or >>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>>>> missing >>>>>>>>>>> before). 
>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>> listener gets >>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>>> attaching a >>>>>>>>>>> jdb, not sure why jdb does that though. This may be something >>>>>>>>>>> to improve >>>>>>>>>>> in the future? >>>>>>>>>>> >>>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>>> really good. >>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>> class-unload >>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>> agent asks for it? >>>>>>>>>>> >>>>>>>>>>> Updated webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>> >>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>> the even more >>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for >>>>>>>>>>>> now. >>>>>>>>>>>> >>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>> >>>>>>>>>>>> ? Hi Chris, >>>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>>> Sure. >>>>>>>>>>>>> >>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>> determine the >>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>> happened, so that >>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>> >>>>>>>>>>>>> The current implementation does so by maintaining a table of >>>>>>>>>>>>> currently >>>>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>>>> initialized, >>>>>>>>>>>>> and then add new classes whenever a class gets loaded. 
When >>>>>>>>>>>>> unloading >>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared >>>>>>>>>>>>> with the >>>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>>> table gets >>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>> and/or many >>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>> complexity. >>>>>>>>>>>>> >>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>> classes, and also >>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>> Whenever an >>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>>>> and classes >>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>> maintaining the >>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>> >>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>> whether or not >>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>> That process is >>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>>> is that >>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>> seems to be >>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>> >>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>>> and build the >>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>> that it's >>>>>>>>>>>>> worth the effort). 
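The tag-to-KlassNode table idea mentioned in the quoted paragraph above can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the code from any of the webrevs: SLOT_COUNT, table_add and table_remove are hypothetical names, int64_t stands in for jlong, and malloc/free stand in for the debug agent's own allocation routines. The removal loop uses the pointer-to-pointer scan with the `klass == NULL` check that the review asks for.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263              /* hypothetical table size, not taken from the patch */

typedef struct KlassNode {
    int64_t klass_tag;              /* tag assigned when the class was prepared */
    char *signature;                /* heap copy of the class signature */
    struct KlassNode *next;         /* next node in the same slot */
} KlassNode;

static KlassNode *table[SLOT_COUNT];    /* zero-initialized: every slot starts empty */

/* Prepend a node for (tag, signature) to the slot tag % SLOT_COUNT. O(1). */
static void table_add(int64_t tag, const char *signature)
{
    assert(tag > 0 && signature != NULL);
    KlassNode *node = malloc(sizeof(*node));
    assert(node != NULL);
    node->klass_tag = tag;
    node->signature = malloc(strlen(signature) + 1);
    assert(node->signature != NULL);
    strcpy(node->signature, signature);

    KlassNode **slot = &table[tag % SLOT_COUNT];
    node->next = *slot;
    *slot = node;
}

/*
 * Unlink the node carrying the given tag and return its signature.
 * The caller owns the returned string and must free() it.
 * Returns NULL when the tag is not in the table.
 */
static char *table_remove(int64_t tag)
{
    if (tag <= 0) {
        return NULL;
    }
    KlassNode **klass_ptr = &table[tag % SLOT_COUNT];
    KlassNode *klass = *klass_ptr;

    /* Scan the slot's list, remembering the link that points at klass. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {            /* klass not found - ignore */
        return NULL;
    }

    *klass_ptr = klass->next;       /* unlink without a special case for the head */
    char *signature = klass->signature;
    free(klass);
    return signature;
}
```

A caller would invoke table_add() when a class is prepared and table_remove() from the unload notification, collecting the returned signatures for later reporting. Cost per unload is proportional to the length of one slot's chain rather than to the number of loaded classes.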
>>>>>>>>>>>>> >>>>>>>>>>>>> In addition to all that, this process is only activated when >>>>>>>>>>>>> there's an >>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. >>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an >>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>> timing. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Eg with the testcase provided here: >>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am getting those numbers: >>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>>>>> patched:?? no debug: 85s with debug: 95s >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Can I please get a review? 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>> >> From rkennke at redhat.com Tue Mar 24 20:45:16 2020 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 24 Mar 2020 21:45:16 +0100 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com> <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com> <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com> Message-ID: <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com> >>> I assume JVMTI maintains separate tagging data for each agent so having >>> two agents doing tagging won't result in confusion. I didn't actually >>> find this in the spec. Would be nice to confirm that it is the case. >>> However, your implementation does seem to conflict with other uses of >>> tagging in the debug agent: >> The tagging data is per-jvmtiEnv. We create and use our own env (private >> to class-tracking), so this wouldn't conflict with other uses of tags. >> Could it be a problem that we have a single trackingEnv per JVM, though? >> /me scratches head. > Ok. This is an area I'm not familiar with, but the spec does say: > > "Each call to GetEnv creates a new JVM TI connection and thus a new JVM > TI environment." > > So it looks like what you are doing should be ok. 
I still think you have > a bug where you are not deallocating signatures of classes that are > unloaded. If you think otherwise please point out where this is done. Signatures that make it out of processUnloading() are deallocated in eventHandler.c, in synthesizeUnload(), right after it has been used. http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527 Pending signatures on debug-agent-disconnect are deallocated in classTrack.c, in the reset() routine. Thanks, Roman > thanks, > > Chris >>> What would cause classTrack_addPreparedClass() to be called for a Class >>> you've already seen? I don't understand the need for the "tag != 0l" >>> check. >> It's probably not needed, may be a left-over from previous installments >> of this implementation. I will check it, and turn into an assert or so. >> >> Thanks, >> Roman >> >>> thanks, >>> >>> Chris >>> >>> On 3/20/20 12:52 PM, Chris Plummer wrote: >>>> On 3/20/20 8:30 AM, Roman Kennke wrote: >>>>> I believe I came up with a much simpler solution that also solves the >>>>> problems of the existing one, and the ones I proposed earlier. >>>>> >>>>> It turns out that we can take advantage of the fact that we can use >>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>> explicitely >>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer >>>>> to the signature of a class into the tag, and pull it out again >>>>> when we >>>>> get notified that the class gets unloaded. >>>>> >>>>> This means we don't need an extra data-structure to keep track of >>>>> classes and signatures, and it also makes the story around locking >>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all >>>>> classes needed (as in the current implementation) and no searching of >>>>> table needed (like in my previous attempts). 
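A minimal sketch of the pointer-as-tag idea described above, not the code from webrev.06. It runs without a JVM: `jlong_tag` stands in for JVMTI's jlong, the helper names `signature_to_tag` and `tag_to_signature` are hypothetical, and malloc/free stand in for the agent's allocation routines. In the real agent the tag would be handed to SetTag when the class is prepared and received back in the ObjectFree callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int64_t jlong_tag;  /* stands in for JVMTI's jlong in this sketch */

/* On class prepare: copy the signature and encode the pointer as a tag.
 * Returns 0 on allocation failure; 0 is also the "untagged" value in JVMTI. */
jlong_tag signature_to_tag(const char *signature) {
    assert(signature != NULL);
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return (jlong_tag)(intptr_t)copy;
}

/* On object free: the tag is the only thing reported, so decode it back
 * into the signature pointer. The caller owns and frees the string. */
char *tag_to_signature(jlong_tag tag) {
    return (char *)(intptr_t)tag;
}
```

No lookup structure is involved: the signature travels with the tag, which is why the unload path is O(1). The ownership rule (whoever consumes the signature frees it) matches the deallocation points named in the reply above.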
>>>>> >>>>> Please review this new revision: >>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>> I'll have a look at this. >>>>> (Notice that there still appears to be a performance bottleneck with >>>>> class-unloading when an actual debugger is attached. This doesn't seem >>>>> to be related to the classTrack.c implementation though, but looks >>>>> like >>>>> a consequence of getting all those class-unload notifications over the >>>>> wire. My testcase generates 1000s of them, and it's clogging up the >>>>> buffers.) >>>> At least this is only a one-shot hit when the classes are unloaded, >>>> and the performance hit is based on the number of classes being >>>> unloaded. The main issue is happening every GC, and is O(n) where n is >>>> the number of loaded classes. >>>>> I am not sure why jdb needs to enable class-unload listener always. A >>>>> simple hack disables it, and performance is brilliant, even when >>>>> jdb is >>>>> attached: >>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>> This is JDI, not jdb. It looks like it needs ClassUnload events so it >>>> can maintain typesBySignature, which is used by public APIs like >>>> allClasses(). So we have caching of loaded classes both in the debug >>>> agent and in JDI. >>>> >>>> Chris >>>>> But this is not in the scope of this bug.) >>>>> >>>>> Roman >>>>> >>>>> >>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>> >>>>>> >>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Roman, >>>>>>> >>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>> >>>>>>> Some comments are below. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>> >>>>>>> >>>>>>> >>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>> ??? 
88 { >>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>> 90 if (currentClassTag == -1) { >>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>> 93 return; >>>>>>> ??? 94???? } >>>>>>> Just a question: >>>>>>> ??? Q1: Should the ObjectFree events be disabled for the jvmtiEnv >>>>>>> that does >>>>>>> ??????? the class tracking if class tracking has not been >>>>>>> initialized? >>>>>>> >>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is >>>>>>> better to >>>>>>> be something like: lastClassTag or highestClassTag. >>>>>>> >>>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>> klass not >>>>>>> found - ignore. >>>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>>> 108 return; >>>>>>> ?? 109???? } >>>>>>> ???It seems to me, something is wrong in the condition at L106 >>>>>>> above. >>>>>>> ???Should it be? : >>>>>>> ????? if (klass == NULL || klass->klass_tag != tag) >>>>>>> >>>>>>> ???Otherwise, how can the second check ever work correctly as the >>>>>>> return >>>>>>> will always happen when (klass != NULL)? >>>>>>> >>>>>>> ?? There are several places in this file with the the indent: >>>>>>> 90 if (currentClassTag == -1) { >>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>> 93 return; >>>>>>> ??? 94???? } >>>>>>> ?? ... >>>>>>> 152 if (currentClassTag == -1) { >>>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>>> 155 return; >>>>>>> ?? 156???? } >>>>>>> ?? ... >>>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>>>> ?? 163???? 
} >>>>>>> 164 if (tag != 0l) { >>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>> 166 return; // Already added >>>>>>> ?? 167???? } >>>>>>> ?? ... >>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>> 282 { >>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>> 284 jvmtiDeallocate(sig); >>>>>>> 285 return JNI_TRUE; >>>>>>> ?? 286 } >>>>>>> ?? ... >>>>>>> ?? 291 void >>>>>>> ?? 292 classTrack_reset(void) >>>>>>> ?? 293 { >>>>>>> 294 int idx; >>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>> 296 >>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>> 298 KlassNode* node = table[idx]; >>>>>>> 299 while (node != NULL) { >>>>>>> 300 KlassNode* next = node->next; >>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>> 302 jvmtiDeallocate(node); >>>>>>> 303 node = next; >>>>>>> 304 } >>>>>>> 305 } >>>>>>> 306 jvmtiDeallocate(table); >>>>>>> 307 >>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>> 310 >>>>>>> 311 currentClassTag = -1; >>>>>>> 312 >>>>>>> 313 >>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>> 314 trackingEnv = NULL; >>>>>>> 315 >>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>> >>>>>>> Could you, please, fix several comments below? >>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>> class-unloads >>>>>>> ???The comma is not needed. >>>>>>> ???Would it better to replace: klass tags => klass_tag's ? >>>>>>> >>>>>>> >>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>>>>> consistent >>>>>>> ???Maybe: Lock to guard ... or lock to keep integrity of ... >>>>>>> >>>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>>> remembers it in deletedSignatureBag. 
Would be better to use words >>>>>>> like >>>>>>> "store" or "record", "Find" should not start from capital letter: >>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>> signature in deletedSignatureBag. >>>>>>> >>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>> Missed dot >>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>> point we >>>>>>> have the KlassNode corresponding to the tag >>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>> ?? The comment above can be better. Maybe, something like: >>>>>> ?? ? " At this point, we found the KlassNode matching the klass >>>>>> tag(and it is >>>>>> linked). >>>>>> >>>>>>> 113 // Remember the unloaded signature. >>>>>> ???Better: Record the signature of the unloaded class and unlink it. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>> Hello all, >>>>>>>> >>>>>>>> Can I please get reviews of this change? In the meantime, we've >>>>>>>> done >>>>>>>> more testing and also field-/torture-testing by a customer who is >>>>>>>> happy >>>>>>>> now. :-) >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Roman >>>>>>>> >>>>>>>> >>>>>>>>> Hi Serguei, >>>>>>>>> >>>>>>>>> Thanks for reviewing! >>>>>>>>> >>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>> disconnect, >>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>>>>> _activate() to ensure have those structures after re-connect. 
>>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>> >>>>>>>>> Let me know what you think! >>>>>>>>> Roman >>>>>>>>> >>>>>>>>>> Hi Roman, >>>>>>>>>> >>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>> >>>>>>>>>> I have a couple of quick comments. >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 72 /* >>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>> 74 */ >>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>> Must be >>>>>>>>>> accessed under >>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>> ??? 80? */ >>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>> >>>>>>>>>> ??? The comments contradict to each other. >>>>>>>>>> ??? I guess, the lock name at line 79 has to be >>>>>>>>>> deletedSignatureLock >>>>>>>>>> instead of deletedTagLock. >>>>>>>>>> ??? Also, comma at the end must be replaced with dot. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>> 104 return; >>>>>>>>>> 105 } >>>>>>>>>> ?? 106 >>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>> ?? 113???? } >>>>>>>>>> 114 >>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>> 118 return; >>>>>>>>>> ?? 119???? } >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> ???The code above can be simplified, so that the lines 101-105 >>>>>>>>>> are not >>>>>>>>>> needed anymore. 
>>>>>>>>>> ???It can be something like this: >>>>>>>>>> >>>>>>>>>> // Scan linked-list. >>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>> klass = *klass_ptr; >>>>>>>>>> ?????? } >>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>> found - ignore. >>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>> return; >>>>>>>>>> ?????? } >>>>>>>>>> >>>>>>>>>> It will take more time when I get a chance to look at the rest. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>> Here comes an update that resolves some races that happen when >>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>> lock on >>>>>>>>>>> basically every operation, and also need to check whether or not >>>>>>>>>>> class-tracking is active and return an appropriate result (e.g. >>>>>>>>>>> an empty >>>>>>>>>>> list) when we're not. >>>>>>>>>>> >>>>>>>>>>> Updated webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>> >>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, >>>>>>>>>>>> and we >>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>> table, which >>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The >>>>>>>>>>>> table is >>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>>>>> KlassNode*. >>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>>>>> signature of >>>>>>>>>>>> the reported tag in that table, and remember it in a bag. 
The >>>>>>>>>>>> KlassNode* >>>>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) >>>>>>>>>>>> operation >>>>>>>>>>>> too, depending on the depth of the table. In my testcase which >>>>>>>>>>>> hammered >>>>>>>>>>>> the code with class-loads and unloads, I usually see depths of >>>>>>>>>>>> like 2-3, >>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that >>>>>>>>>>>> bag, and >>>>>>>>>>>> allocate a new one. >>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>> leaking the >>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>>>>> and/or >>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>>>>> missing >>>>>>>>>>>> before). >>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>> listener gets >>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>>>> attaching a >>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something >>>>>>>>>>>> to improve >>>>>>>>>>>> in the future? >>>>>>>>>>>> >>>>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>>>> really good. >>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>> class-unload >>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>> agent asks for it? >>>>>>>>>>>> >>>>>>>>>>>> Updated webrev: >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>> >>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Roman >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>>> the even more >>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for >>>>>>>>>>>>> now. 
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When unloading
>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The implementation is not perfect.
In order to determine >>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>>> That process is >>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>>>> is that >>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>> >>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>>>> and build the >>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>>> that it's >>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>> >>>>>>>>>>>>>> In addition to all that, this process is only activated when >>>>>>>>>>>>>> there's an >>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. >>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an >>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>
>>>
>
>

From chris.plummer at oracle.com Tue Mar 24 21:39:46 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Mar 2020 14:39:46 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com> <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com> <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com>
<1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com> Message-ID: <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com> On 3/24/20 1:45 PM, Roman Kennke wrote: >>>> I assume JVMTI maintains separate tagging data for each agent so having >>>> two agents doing tagging won't result in confusion. I didn't actually >>>> find this in the spec. Would be nice to confirm that it is the case. >>>> However, your implementation does seem to conflict with other uses of >>>> tagging in the debug agent: >>> The tagging data is per-jvmtiEnv. We create and use our own env (private >>> to class-tracking), so this wouldn't conflict with other uses of tags. >>> Could it be a problem that we have a single trackingEnv per JVM, though? >>> /me scratches head. >> Ok. This is an area I'm not familiar with, but the spec does say: >> >> "Each call to GetEnv creates a new JVM TI connection and thus a new JVM >> TI environment." >> >> So it looks like what you are doing should be ok. I still think you have >> a bug where you are not deallocating signatures of classes that are >> unloaded. If you think otherwise please point out where this is done. > Signatures that make it out of processUnloading() are deallocated in > eventHandler.c, in synthesizeUnload(), right after it has been used. > > http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527 Ok. Good to know. Not the best of designs, but that's not your fault. I'll make another pass over the changes, but I think in general it looks good. I don't think I've seen another reviewer yet, so hopefully someone jumps in. Chris > Pending signatures on debug-agent-disconnect are deallocated in > classTrack.c, in the reset() routine. > > Thanks, > Roman > >> thanks, >> >> Chris >>>> What would cause classTrack_addPreparedClass() to be called for a Class >>>> you've already seen? I don't understand the need for the "tag != 0l" >>>> check. 
>>> It's probably not needed, may be a left-over from previous installments >>> of this implementation. I will check it, and turn into an assert or so. >>> >>> Thanks, >>> Roman >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 3/20/20 12:52 PM, Chris Plummer wrote: >>>>> On 3/20/20 8:30 AM, Roman Kennke wrote: >>>>>> I believe I came up with a much simpler solution that also solves the >>>>>> problems of the existing one, and the ones I proposed earlier. >>>>>> >>>>>> It turns out that we can take advantage of the fact that we can use >>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>>> explicitely >>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer >>>>>> to the signature of a class into the tag, and pull it out again >>>>>> when we >>>>>> get notified that the class gets unloaded. >>>>>> >>>>>> This means we don't need an extra data-structure to keep track of >>>>>> classes and signatures, and it also makes the story around locking >>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all >>>>>> classes needed (as in the current implementation) and no searching of >>>>>> table needed (like in my previous attempts). >>>>>> >>>>>> Please review this new revision: >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>> I'll have a look at this. >>>>>> (Notice that there still appears to be a performance bottleneck with >>>>>> class-unloading when an actual debugger is attached. This doesn't seem >>>>>> to be related to the classTrack.c implementation though, but looks >>>>>> like >>>>>> a consequence of getting all those class-unload notifications over the >>>>>> wire. My testcase generates 1000s of them, and it's clogging up the >>>>>> buffers.) >>>>> At least this is only a one-shot hit when the classes are unloaded, >>>>> and the performance hit is based on the number of classes being >>>>> unloaded. 
The main issue is happening every GC, and is O(n) where n is >>>>> the number of loaded classes. >>>>>> I am not sure why jdb needs to enable class-unload listener always. A >>>>>> simple hack disables it, and performance is brilliant, even when >>>>>> jdb is >>>>>> attached: >>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>> This is JDI, not jdb. It looks like it needs ClassUnload events so it >>>>> can maintain typesBySignature, which is used by public APIs like >>>>> allClasses(). So we have caching of loaded classes both in the debug >>>>> agent and in JDI. >>>>> >>>>> Chris >>>>>> But this is not in the scope of this bug.) >>>>>> >>>>>> Roman >>>>>> >>>>>> >>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>> >>>>>>> >>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Roman, >>>>>>>> >>>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>>> >>>>>>>> Some comments are below. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>>> ??? 88 { >>>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>> 93 return; >>>>>>>> ??? 94???? } >>>>>>>> Just a question: >>>>>>>> ??? Q1: Should the ObjectFree events be disabled for the jvmtiEnv >>>>>>>> that does >>>>>>>> ??????? the class tracking if class tracking has not been >>>>>>>> initialized? >>>>>>>> >>>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is >>>>>>>> better to >>>>>>>> be something like: lastClassTag or highestClassTag. 
>>>>>>>> >>>>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>>> klass not >>>>>>>> found - ignore. >>>>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>>>> 108 return; >>>>>>>> ?? 109???? } >>>>>>>> ???It seems to me, something is wrong in the condition at L106 >>>>>>>> above. >>>>>>>> ???Should it be? : >>>>>>>> ????? if (klass == NULL || klass->klass_tag != tag) >>>>>>>> >>>>>>>> ???Otherwise, how can the second check ever work correctly as the >>>>>>>> return >>>>>>>> will always happen when (klass != NULL)? >>>>>>>> >>>>>>>> ?? There are several places in this file with the the indent: >>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>> 93 return; >>>>>>>> ??? 94???? } >>>>>>>> ?? ... >>>>>>>> 152 if (currentClassTag == -1) { >>>>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>>>> 155 return; >>>>>>>> ?? 156???? } >>>>>>>> ?? ... >>>>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>>>>> ?? 163???? } >>>>>>>> 164 if (tag != 0l) { >>>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>>> 166 return; // Already added >>>>>>>> ?? 167???? } >>>>>>>> ?? ... >>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>>> 282 { >>>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>>> 284 jvmtiDeallocate(sig); >>>>>>>> 285 return JNI_TRUE; >>>>>>>> ?? 286 } >>>>>>>> ?? ... >>>>>>>> ?? 291 void >>>>>>>> ?? 292 classTrack_reset(void) >>>>>>>> ?? 
293 { >>>>>>>> 294 int idx; >>>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>>> 296 >>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>>> 298 KlassNode* node = table[idx]; >>>>>>>> 299 while (node != NULL) { >>>>>>>> 300 KlassNode* next = node->next; >>>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>>> 302 jvmtiDeallocate(node); >>>>>>>> 303 node = next; >>>>>>>> 304 } >>>>>>>> 305 } >>>>>>>> 306 jvmtiDeallocate(table); >>>>>>>> 307 >>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>>> 310 >>>>>>>> 311 currentClassTag = -1; >>>>>>>> 312 >>>>>>>> 313 >>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>>> 314 trackingEnv = NULL; >>>>>>>> 315 >>>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>>> >>>>>>>> Could you, please, fix several comments below? >>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>>> class-unloads >>>>>>>> ???The comma is not needed. >>>>>>>> ???Would it better to replace: klass tags => klass_tag's ? >>>>>>>> >>>>>>>> >>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>>>>>> consistent >>>>>>>> ???Maybe: Lock to guard ... or lock to keep integrity of ... >>>>>>>> >>>>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>>>> remembers it in deletedSignatureBag. Would be better to use words >>>>>>>> like >>>>>>>> "store" or "record", "Find" should not start from capital letter: >>>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>>> signature in deletedSignatureBag. >>>>>>>> >>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>>> Missed dot >>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>>> klass not found - ignore. 
In opposite, dot is not needed as the >>>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>>> point we >>>>>>>> have the KlassNode corresponding to the tag >>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>> ?? The comment above can be better. Maybe, something like: >>>>>>> ?? ? " At this point, we found the KlassNode matching the klass >>>>>>> tag(and it is >>>>>>> linked). >>>>>>> >>>>>>>> 113 // Remember the unloaded signature. >>>>>>> ???Better: Record the signature of the unloaded class and unlink it. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>> Hello all, >>>>>>>>> >>>>>>>>> Can I please get reviews of this change? In the meantime, we've >>>>>>>>> done >>>>>>>>> more testing and also field-/torture-testing by a customer who is >>>>>>>>> happy >>>>>>>>> now. :-) >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Roman >>>>>>>>> >>>>>>>>> >>>>>>>>>> Hi Serguei, >>>>>>>>>> >>>>>>>>>> Thanks for reviewing! >>>>>>>>>> >>>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>> disconnect, >>>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>> >>>>>>>>>> Let me know what you think! >>>>>>>>>> Roman >>>>>>>>>> >>>>>>>>>>> Hi Roman, >>>>>>>>>>> >>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>> >>>>>>>>>>> I have a couple of quick comments. 
>>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 72 /* >>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>> 74 */ >>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>> Must be >>>>>>>>>>> accessed under >>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>> ??? 80? */ >>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>> >>>>>>>>>>> ??? The comments contradict to each other. >>>>>>>>>>> ??? I guess, the lock name at line 79 has to be >>>>>>>>>>> deletedSignatureLock >>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>> ??? Also, comma at the end must be replaced with dot. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 104 return; >>>>>>>>>>> 105 } >>>>>>>>>>> ?? 106 >>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>> ?? 113???? } >>>>>>>>>>> 114 >>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 118 return; >>>>>>>>>>> ?? 119???? } >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> ???The code above can be simplified, so that the lines 101-105 >>>>>>>>>>> are not >>>>>>>>>>> needed anymore. >>>>>>>>>>> ???It can be something like this: >>>>>>>>>>> >>>>>>>>>>> // Scan linked-list. >>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>> ?????? 
} >>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>> found - ignore. >>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> return; >>>>>>>>>>> ?????? } >>>>>>>>>>> >>>>>>>>>>> It will take more time when I get a chance to look at the rest. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>> Here comes an update that resolves some races that happen when >>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>> lock on >>>>>>>>>>>> basically every operation, and also need to check whether or not >>>>>>>>>>>> class-tracking is active and return an appropriate result (e.g. >>>>>>>>>>>> an empty >>>>>>>>>>>> list) when we're not. >>>>>>>>>>>> >>>>>>>>>>>> Updated webrev: >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Roman >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>> >>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, >>>>>>>>>>>>> and we >>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>> table, which >>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The >>>>>>>>>>>>> table is >>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>>>>>> signature of >>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The >>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) >>>>>>>>>>>>> operation >>>>>>>>>>>>> too, depending on the depth of the table. 
In my testcase which >>>>>>>>>>>>> hammered >>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths of >>>>>>>>>>>>> like 2-3, >>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that >>>>>>>>>>>>> bag, and >>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>> leaking the >>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>>>>>> and/or >>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>>>>>> missing >>>>>>>>>>>>> before). >>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>> listener gets >>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>>>>> attaching a >>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something >>>>>>>>>>>>> to improve >>>>>>>>>>>>> in the future? >>>>>>>>>>>>> >>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>>>>> really good. >>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>> class-unload >>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>> >>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>> >>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>>>> the even more >>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for >>>>>>>>>>>>>> now. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>> ?? Hi Chris, >>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>>>>> few days. 
In >>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The current implementation does so by maintaining a table of >>>>>>>>>>>>>>> currently >>>>>>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When >>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared >>>>>>>>>>>>>>> with the >>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. 
>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In addition to all that, this process is only activated when >>>>>>>>>>>>>>> there's an >>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. >>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an >>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>>>> timing. 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>>

From serguei.spitsyn at oracle.com Tue Mar 24 21:46:33 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Mar 2020 14:46:33 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com> <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com> <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com> <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com> <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com>
Message-ID: <8e6f2f0c-ceee-2ef1-50b5-cea01bb31893@oracle.com>

On 3/24/20 14:39, Chris Plummer wrote:
> On 3/24/20 1:45 PM, Roman Kennke wrote:
>>>>> I assume JVMTI maintains separate tagging data for each agent so having
>>>>> two agents doing tagging won't result in confusion.
>>>>> I didn't actually find this in the spec. Would be nice to confirm that
>>>>> it is the case. However, your implementation does seem to conflict
>>>>> with other uses of tagging in the debug agent:
>>>> The tagging data is per-jvmtiEnv. We create and use our own env (private
>>>> to class-tracking), so this wouldn't conflict with other uses of tags.
>>>> Could it be a problem that we have a single trackingEnv per JVM, though?
>>>> /me scratches head.
>>> Ok. This is an area I'm not familiar with, but the spec does say:
>>>
>>> "Each call to GetEnv creates a new JVM TI connection and thus a new JVM
>>> TI environment."
>>>
>>> So it looks like what you are doing should be ok. I still think you have
>>> a bug where you are not deallocating signatures of classes that are
>>> unloaded. If you think otherwise please point out where this is done.
>> Signatures that make it out of processUnloading() are deallocated in
>> eventHandler.c, in synthesizeUnload(), right after they have been used.
>>
>> http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527
>>
> Ok. Good to know. Not the best of designs, but that's not your fault.
> I'll make another pass over the changes, but I think in general it
> looks good. I don't think I've seen another reviewer yet, so hopefully
> someone jumps in.

As I understand it, Roman has already resolved my previous comments.
So, I will do another pass for v6.

Thanks,
Serguei

>
> Chris
>> Pending signatures on debug-agent-disconnect are deallocated in
>> classTrack.c, in the reset() routine.
>>
>> Thanks,
>> Roman
>>
>>> thanks,
>>>
>>> Chris
>>>>> What would cause classTrack_addPreparedClass() to be called for a Class
>>>>> you've already seen? I don't understand the need for the "tag != 0l"
>>>>> check.
>>>> It's probably not needed, may be a left-over from previous installments
>>>> of this implementation. I will check it, and turn it into an assert or so.
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/20/20 12:52 PM, Chris Plummer wrote:
>>>>>> On 3/20/20 8:30 AM, Roman Kennke wrote:
>>>>>>> I believe I came up with a much simpler solution that also solves the
>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>
>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>>> to the signature of a class into the tag, and pull it out again when we
>>>>>>> get notified that the class gets unloaded.
>>>>>>>
>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>>>>> classes needed (as in the current implementation) and no searching of
>>>>>>> a table needed (like in my previous attempts).
>>>>>>>
>>>>>>> Please review this new revision:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>> I'll have a look at this.
>>>>>>> (Notice that there still appears to be a performance bottleneck with
>>>>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>>>>> to be related to the classTrack.c implementation though, but looks like
>>>>>>> a consequence of getting all those class-unload notifications over the
>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>>> buffers.)
>>>>>> At least this is only a one-shot hit when the classes are unloaded,
>>>>>> and the performance hit is based on the number of classes being
>>>>>> unloaded. The main issue is happening every GC, and is O(n) where n is
>>>>>> the number of loaded classes.
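
A minimal sketch of the pointer-as-tag idea described in the quoted message above, written without jvmti.h so that it compiles on its own. It is an illustration under stated assumptions, not the code from webrev.06: `tag_t` stands in for JVMTI's 64-bit `jlong` tag, and `tag_from_signature`, `signature_from_tag`, `on_object_free`, `copy_signature` and the fixed-size `deleted` array are hypothetical names. In the agent, the tag would be attached to the class object with the JVMTI SetTag function, and the ObjectFree event handler would receive only the tag value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JVMTI's 64-bit jlong tag (assumption: jvmti.h is not used here). */
typedef int64_t tag_t;

/* Encode a heap-allocated signature string as a tag value. */
static tag_t tag_from_signature(char *signature) {
    return (tag_t)(intptr_t)signature;
}

/* Recover the signature pointer from a tag value. */
static char *signature_from_tag(tag_t tag) {
    return (char *)(intptr_t)tag;
}

/* Hypothetical fixed-size stand-in for the bag of unloaded signatures. */
#define MAX_DELETED 16
static char *deleted[MAX_DELETED];
static int deleted_count = 0;

/* Hypothetical unload callback. It sees only the tag, which is enough:
 * the tag *is* the signature pointer, so no table lookup is needed.
 * A full array silently drops the entry in this sketch. */
static void on_object_free(tag_t tag) {
    assert(tag != 0); /* a zero tag means "untagged" in JVMTI */
    if (deleted_count < MAX_DELETED) {
        deleted[deleted_count++] = signature_from_tag(tag);
    }
}

/* Copy a signature onto the heap so that the tag owns its own storage. */
static char *copy_signature(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = (char *)malloc(n);
    if (p != NULL) {
        memcpy(p, s, n);
    }
    return p;
}
```

Whoever drains `deleted` owns the strings and has to free them, which matches the ownership discussion earlier in this thread (signatures are released after the unload event has been synthesized, or on reset).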
>>>>>>> I am not sure why jdb needs to enable class-unload listener always. A
>>>>>>> simple hack disables it, and performance is brilliant, even when jdb is
>>>>>>> attached:
>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>> This is JDI, not jdb. It looks like it needs ClassUnload events so it
>>>>>> can maintain typesBySignature, which is used by public APIs like
>>>>>> allClasses(). So we have caching of loaded classes both in the debug
>>>>>> agent and in JDI.
>>>>>>
>>>>>> Chris
>>>>>>> But this is not in the scope of this bug.)
>>>>>>>
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Roman,
>>>>>>>>>
>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>
>>>>>>>>> Some comments are below.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>
>>>>>>>>>   87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>   88 {
>>>>>>>>>   89     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>   90     if (currentClassTag == -1) {
>>>>>>>>>   91         // Class tracking not initialized, nobody's interested
>>>>>>>>>   92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>   93         return;
>>>>>>>>>   94     }
>>>>>>>>>
>>>>>>>>> Just a question:
>>>>>>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that
>>>>>>>>>       does the class tracking if class tracking has not been
>>>>>>>>>       initialized?
>>>>>>>>>
>>>>>>>>>   70 static jlong currentClassTag;
>>>>>>>>>
>>>>>>>>> I'm wondering whether a better name would be something like
>>>>>>>>> lastClassTag or highestClassTag.
>>>>>>>>> >>>>>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>>>> klass not >>>>>>>>> found - ignore. >>>>>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>>>>> 108 return; >>>>>>>>> ??? 109???? } >>>>>>>>> ????It seems to me, something is wrong in the condition at L106 >>>>>>>>> above. >>>>>>>>> ????Should it be? : >>>>>>>>> ?????? if (klass == NULL || klass->klass_tag != tag) >>>>>>>>> >>>>>>>>> ????Otherwise, how can the second check ever work correctly as >>>>>>>>> the >>>>>>>>> return >>>>>>>>> will always happen when (klass != NULL)? >>>>>>>>> >>>>>>>>> ??? There are several places in this file with the the indent: >>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>> 93 return; >>>>>>>>> ???? 94???? } >>>>>>>>> ??? ... >>>>>>>>> 152 if (currentClassTag == -1) { >>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>>>>> 155 return; >>>>>>>>> ??? 156???? } >>>>>>>>> ??? ... >>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>>>>>> ??? 163???? } >>>>>>>>> 164 if (tag != 0l) { >>>>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>>>> 166 return; // Already added >>>>>>>>> ??? 167???? } >>>>>>>>> ??? ... >>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>>>> 282 { >>>>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>>>> 284 jvmtiDeallocate(sig); >>>>>>>>> 285 return JNI_TRUE; >>>>>>>>> ??? 286 } >>>>>>>>> ??? ... >>>>>>>>> ??? 291 void >>>>>>>>> ??? 292 classTrack_reset(void) >>>>>>>>> ??? 
293 { >>>>>>>>> 294 int idx; >>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>>>> 296 >>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>>>> 298 KlassNode* node = table[idx]; >>>>>>>>> 299 while (node != NULL) { >>>>>>>>> 300 KlassNode* next = node->next; >>>>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>>>> 302 jvmtiDeallocate(node); >>>>>>>>> 303 node = next; >>>>>>>>> 304 } >>>>>>>>> 305 } >>>>>>>>> 306 jvmtiDeallocate(table); >>>>>>>>> 307 >>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>>>> 310 >>>>>>>>> 311 currentClassTag = -1; >>>>>>>>> 312 >>>>>>>>> 313 >>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>>>> >>>>>>>>> 314 trackingEnv = NULL; >>>>>>>>> 315 >>>>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>>>> >>>>>>>>> Could you, please, fix several comments below? >>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>>>> class-unloads >>>>>>>>> ????The comma is not needed. >>>>>>>>> ????Would it better to replace: klass tags => klass_tag's ? >>>>>>>>> >>>>>>>>> >>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>>>>>>> consistent >>>>>>>>> ????Maybe: Lock to guard ... or lock to keep integrity of ... >>>>>>>>> >>>>>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>>>>> remembers it in deletedSignatureBag. Would be better to use words >>>>>>>>> like >>>>>>>>> "store" or "record", "Find" should not start from capital letter: >>>>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>>>> signature in deletedSignatureBag. >>>>>>>>> >>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not >>>>>>>>> initialized, >>>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>>>> Missed dot >>>>>>>>> at the end. 
106 if (klass != NULL || klass->klass_tag != tag) >>>>>>>>> { // >>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>>>> point we >>>>>>>>> have the KlassNode corresponding to the tag >>>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>>> ??? The comment above can be better. Maybe, something like: >>>>>>>> ??? ? " At this point, we found the KlassNode matching the klass >>>>>>>> tag(and it is >>>>>>>> linked). >>>>>>>> >>>>>>>>> 113 // Remember the unloaded signature. >>>>>>>> ????Better: Record the signature of the unloaded class and >>>>>>>> unlink it. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>>> Hello all, >>>>>>>>>> >>>>>>>>>> Can I please get reviews of this change? In the meantime, we've >>>>>>>>>> done >>>>>>>>>> more testing and also field-/torture-testing by a customer >>>>>>>>>> who is >>>>>>>>>> happy >>>>>>>>>> now. :-) >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Roman >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Hi Serguei, >>>>>>>>>>> >>>>>>>>>>> Thanks for reviewing! >>>>>>>>>>> >>>>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>> disconnect, >>>>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>> >>>>>>>>>>> Let me know what you think! >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>>> Hi Roman, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>> >>>>>>>>>>>> I have a couple of quick comments. 
>>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 72 /* >>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>> 74 */ >>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>>> Must be >>>>>>>>>>>> accessed under >>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>> ???? 80? */ >>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>> >>>>>>>>>>>> ???? The comments contradict to each other. >>>>>>>>>>>> ???? I guess, the lock name at line 79 has to be >>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>> ???? Also, comma at the end must be replaced with dot. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 104 return; >>>>>>>>>>>> 105 } >>>>>>>>>>>> ??? 106 >>>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>> ??? 113???? } >>>>>>>>>>>> 114 >>>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 118 return; >>>>>>>>>>>> ??? 119???? } >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> ????The code above can be simplified, so that the lines >>>>>>>>>>>> 101-105 >>>>>>>>>>>> are not >>>>>>>>>>>> needed anymore. >>>>>>>>>>>> ????It can be something like this: >>>>>>>>>>>> >>>>>>>>>>>> // Scan linked-list. 
>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>> ??????? } >>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>> found - ignore. >>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> return; >>>>>>>>>>>> ??????? } >>>>>>>>>>>> >>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>> rest. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>> Here comes an update that resolves some races that happen >>>>>>>>>>>>> when >>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>>> lock on >>>>>>>>>>>>> basically every operation, and also need to check whether >>>>>>>>>>>>> or not >>>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>>> (e.g. >>>>>>>>>>>>> an empty >>>>>>>>>>>>> list) when we're not. >>>>>>>>>>>>> >>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>> >>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a >>>>>>>>>>>>>> tag, >>>>>>>>>>>>>> and we >>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>> table, which >>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. >>>>>>>>>>>>>> The >>>>>>>>>>>>>> table is >>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>> This is O(1) operation. 
>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is a ~O(1) operation
>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that bag, and
>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or
>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>>>>>>>> The bottleneck now is clearly actually synthesizing the class-unload
>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug
>>>>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please let me know what you think of it.
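
The table of chained KlassNode entries described in the list above, combined with the simplified scan suggested in the review, can be sketched roughly as follows. This is a standalone illustration rather than the code from webrev.03: `SLOT_COUNT` (the thread calls it CT_SLOT_COUNT, value not shown there), `slot_for`, `track_class` and `untrack_class` are hypothetical names, `int64_t` stands in for `jlong`, plain `malloc`/`free` replace the agent's own allocation helpers, and the locking discussed in the thread is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 8 /* hypothetical slot count, chosen small for illustration */

typedef struct KlassNode {
    int64_t klass_tag;      /* tag assigned when the class was prepared */
    char *signature;        /* heap copy of the class signature */
    struct KlassNode *next; /* next node in the same slot */
} KlassNode;

static KlassNode *table[SLOT_COUNT];

static size_t slot_for(int64_t tag) {
    return (size_t)((uint64_t)tag % SLOT_COUNT);
}

/* Register a prepared class: prepend to its slot's list, an O(1) step.
 * Returns 1 on success, 0 if allocation failed. */
static int track_class(int64_t tag, const char *signature) {
    size_t len = strlen(signature) + 1;
    size_t slot = slot_for(tag);
    KlassNode *node = (KlassNode *)malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    node->signature = (char *)malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = tag;
    node->next = table[slot];
    table[slot] = node;
    return 1;
}

/* Handle an unload: find the node for tag, unlink it, and hand back its
 * signature (the caller frees it). Returns NULL if the tag is unknown. */
static char *untrack_class(int64_t tag) {
    KlassNode **klass_ptr = &table[slot_for(tag)];
    KlassNode *klass = *klass_ptr;
    char *signature;

    /* Scan the slot's list; klass_ptr always points at the link to klass. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    /* After the loop klass is either NULL or the match, so testing for
     * NULL first can never dereference a null pointer. */
    if (klass == NULL) {
        return NULL; /* klass not found - ignore */
    }
    *klass_ptr = klass->next; /* unlink */
    signature = klass->signature;
    free(klass);
    return signature;
}
```

The cost of `untrack_class` is proportional to the length of one slot's list, which is why the message above calls the unload path "~O(1)" as long as list depths stay small.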
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off >>>>>>>>>>>>>>> reviewing for >>>>>>>>>>>>>>> now. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ??? Hi Chris, >>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The current implementation does so by maintaining a >>>>>>>>>>>>>>>> table of >>>>>>>>>>>>>>>> currently >>>>>>>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. >>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>> compared >>>>>>>>>>>>>>>> with the >>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>> complexity. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>> there's an >>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. 
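The linked-list variant described above (scan the prepared-classes list and unlink every class whose tag was reported as freed) might look roughly like this. It is only a sketch under assumptions: the deleted tags are modelled as a plain array rather than the agent's bag type, and `PreparedClass`, `tag_was_deleted` and `collect_unloaded` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct PreparedClass {
    int64_t               tag;        /* tag set when the class was prepared */
    const char           *signature;  /* class signature, owned elsewhere in this sketch */
    struct PreparedClass *next;
} PreparedClass;

/* Linear scan of the deleted tags: this is the O(unloadedClassCount) step. */
static bool tag_was_deleted(const int64_t *deleted, size_t n_deleted, int64_t tag)
{
    for (size_t i = 0; i < n_deleted; i++) {
        if (deleted[i] == tag) {
            return true;
        }
    }
    return false;
}

/*
 * Walk the prepared-classes list once, unlink every node whose tag is in
 * deleted[], and store its signature in out[]. Returns the number of
 * signatures written (at most out_cap).
 */
static size_t collect_unloaded(PreparedClass **head,
                               const int64_t *deleted, size_t n_deleted,
                               const char **out, size_t out_cap)
{
    assert(head != NULL && out != NULL);
    size_t found = 0;
    PreparedClass **link = head;

    while (*link != NULL && found < out_cap) {
        PreparedClass *node = *link;
        if (tag_was_deleted(deleted, n_deleted, node->tag)) {
            out[found++] = node->signature;
            *link = node->next;   /* unlink; node storage stays with the caller here */
        } else {
            link = &node->next;
        }
    }
    return found;
}
```

Each GC then costs one pass over the prepared classes times the number of freed tags, which is why the message relies on unloadedClassCount being much smaller than classCount.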
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>> >>> > > From chris.plummer at oracle.com Tue Mar 24 21:47:54 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 24 Mar 2020 14:47:54 -0700 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <8e6f2f0c-ceee-2ef1-50b5-cea01bb31893@oracle.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com> <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com> <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com> <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com> <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com> <8e6f2f0c-ceee-2ef1-50b5-cea01bb31893@oracle.com> Message-ID: On 3/24/20 2:46 PM, serguei.spitsyn at oracle.com wrote: > On 3/24/20 14:39, Chris Plummer wrote: >> On 3/24/20 1:45 PM, Roman Kennke wrote: >>>>>> I assume JVMTI maintains separate tagging data for each agent so >>>>>> having >>>>>> two agents doing tagging won't result in confusion. I didn't >>>>>> actually >>>>>> find this in the spec. Would be nice to confirm that it is the case. >>>>>> However, your implementation does seem to conflict with other >>>>>> uses of >>>>>> tagging in the debug agent: >>>>> The tagging data is per-jvmtiEnv. We create and use our own env >>>>> (private >>>>> to class-tracking), so this wouldn't conflict with other uses of >>>>> tags. 
>>>>> Could it be a problem that we have a single trackingEnv per JVM, >>>>> though? >>>>> /me scratches head. >>>> Ok. This is an area I'm not familiar with, but the spec does say: >>>> >>>> "Each call to GetEnv creates a new JVM TI connection and thus a new >>>> JVM >>>> TI environment." >>>> >>>> So it looks like what you are doing should be ok. I still think you >>>> have >>>> a bug where you are not deallocating signatures of classes that are >>>> unloaded. If you think otherwise please point out where this is done. >>> Signatures that make it out of processUnloading() are deallocated in >>> eventHandler.c, in synthesizeUnload(), right after it has been used. >>> >>> http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527 >>> >> Ok. Good to know. Not the best of designs, but that's not your fault. >> I'll make another pass over the changes, but I think in general it >> looks good. I don't think I've seen another reviewer yet, so >> hopefully someone jumps in. > > As I understand, Roman already resolved my previous comments. > So, I will do another pass for v6. I think it's pretty much a rewrite since you last reviewed it. Chris > > Thanks, > Serguei > >> >> Chris >>> Pending signatures on debug-agent-disconnect are deallocated in >>> classTrack.c, in the reset() routine. >>> >>> Thanks, >>> Roman >>> >>>> thanks, >>>> >>>> Chris >>>>>> What would cause classTrack_addPreparedClass() to be called for a >>>>>> Class >>>>>> you've already seen? I don't understand the need for the "tag != 0l" >>>>>> check. >>>>> It's probably not needed, may be a left-over from previous >>>>> installments >>>>> of this implementation. I will check it, and turn into an assert >>>>> or so. 
>>>>> >>>>> Thanks, >>>>> Roman >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 3/20/20 12:52 PM, Chris Plummer wrote: >>>>>>> On 3/20/20 8:30 AM, Roman Kennke wrote: >>>>>>>> I believe I came up with a much simpler solution that also >>>>>>>> solves the >>>>>>>> problems of the existing one, and the ones I proposed earlier. >>>>>>>> >>>>>>>> It turns out that we can take advantage of the fact that we can >>>>>>>> use >>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>>>>> explicitely >>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a >>>>>>>> pointer >>>>>>>> to the signature of a class into the tag, and pull it out again >>>>>>>> when we >>>>>>>> get notified that the class gets unloaded. >>>>>>>> >>>>>>>> This means we don't need an extra data-structure to keep track of >>>>>>>> classes and signatures, and it also makes the story around locking >>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning >>>>>>>> of all >>>>>>>> classes needed (as in the current implementation) and no >>>>>>>> searching of >>>>>>>> table needed (like in my previous attempts). >>>>>>>> >>>>>>>> Please review this new revision: >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>>> I'll have a look at this. >>>>>>>> (Notice that there still appears to be a performance bottleneck >>>>>>>> with >>>>>>>> class-unloading when an actual debugger is attached. This >>>>>>>> doesn't seem >>>>>>>> to be related to the classTrack.c implementation though, but looks >>>>>>>> like >>>>>>>> a consequence of getting all those class-unload notifications >>>>>>>> over the >>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up >>>>>>>> the >>>>>>>> buffers.) >>>>>>> At least this is only a one-shot hit when the classes are unloaded, >>>>>>> and the performance hit is based on the number of classes being >>>>>>> unloaded. 
The main issue is happening every GC, and is O(n) >>>>>>> where n is >>>>>>> the number of loaded classes. >>>>>>>> I am not sure why jdb needs to enable class-unload listener >>>>>>>> always. A >>>>>>>> simple hack disables it, and performance is brilliant, even when >>>>>>>> jdb is >>>>>>>> attached: >>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>>>> This is JDI, not jdb. It looks like it needs ClassUnload events >>>>>>> so it >>>>>>> can maintain typesBySignature, which is used by public APIs like >>>>>>> allClasses(). So we have caching of loaded classes both in the >>>>>>> debug >>>>>>> agent and in JDI. >>>>>>> >>>>>>> Chris >>>>>>>> But this is not in the scope of this bug.) >>>>>>>> >>>>>>>> Roman >>>>>>>> >>>>>>>> >>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Hi Roman, >>>>>>>>>> >>>>>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>>>>> >>>>>>>>>> Some comments are below. >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>>>>> ???? 88 { >>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>> 93 return; >>>>>>>>>> ???? 94???? } >>>>>>>>>> Just a question: >>>>>>>>>> ???? Q1: Should the ObjectFree events be disabled for the >>>>>>>>>> jvmtiEnv >>>>>>>>>> that does >>>>>>>>>> ???????? the class tracking if class tracking has not been >>>>>>>>>> initialized? 
>>>>>>>>>>
>>>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>>>>
>>>>>>>>>> I'm thinking if the name is better to be something like:
>>>>>>>>>> lastClassTag or highestClassTag.
>>>>>>>>>>
>>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>>> 100
>>>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>>>> 105 }
>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 108 return;
>>>>>>>>>> 109     }
>>>>>>>>>>     It seems to me, something is wrong in the condition at L106 above.
>>>>>>>>>>     Should it be? :
>>>>>>>>>>       if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>
>>>>>>>>>>     Otherwise, how can the second check ever work correctly as the return
>>>>>>>>>>     will always happen when (klass != NULL)?
>>>>>>>>>>
>>>>>>>>>>     There are several places in this file with the the indent:
>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 93 return;
>>>>>>>>>> 94     }
>>>>>>>>>>     ...
>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 155 return;
>>>>>>>>>> 156     }
>>>>>>>>>>     ...
>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>> 163     }
>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 166 return; // Already added
>>>>>>>>>> 167     }
>>>>>>>>>>     ...
>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>> 282 {
>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>> 286 }
>>>>>>>>>>     ...
>>>>>>>>>> 291 void
>>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>>> 293 {
>>>>>>>>>> 294 int idx;
>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>> 296
>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>> 303 node = next;
>>>>>>>>>> 304 }
>>>>>>>>>> 305 }
>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>> 307
>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>> 310
>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>> 312
>>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>> 315
>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>
>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>
>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>>     The comma is not needed.
>>>>>>>>>>     Would it better to replace: klass tags => klass_tag's ?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>>     Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>
>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
Would be better to use >>>>>>>>>> words >>>>>>>>>> like >>>>>>>>>> "store" or "record", "Find" should not start from capital >>>>>>>>>> letter: >>>>>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>>>>> signature in deletedSignatureBag. >>>>>>>>>> >>>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not >>>>>>>>>> initialized, >>>>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>>>>> Missed dot >>>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) >>>>>>>>>> { // >>>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>>>>> point we >>>>>>>>>> have the KlassNode corresponding to the tag >>>>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>>>> ??? The comment above can be better. Maybe, something like: >>>>>>>>> ??? ? " At this point, we found the KlassNode matching the klass >>>>>>>>> tag(and it is >>>>>>>>> linked). >>>>>>>>> >>>>>>>>>> 113 // Remember the unloaded signature. >>>>>>>>> ????Better: Record the signature of the unloaded class and >>>>>>>>> unlink it. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>>>> Hello all, >>>>>>>>>>> >>>>>>>>>>> Can I please get reviews of this change? In the meantime, we've >>>>>>>>>>> done >>>>>>>>>>> more testing and also field-/torture-testing by a customer >>>>>>>>>>> who is >>>>>>>>>>> happy >>>>>>>>>>> now. :-) >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for reviewing! >>>>>>>>>>>> >>>>>>>>>>>> I updated the patch to reflect your suggestions, very good! 
>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>>> disconnect, >>>>>>>>>>>> namely move setup of the trackingEnv and >>>>>>>>>>>> deletedSignatureBag to >>>>>>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>>> >>>>>>>>>>>> Let me know what you think! >>>>>>>>>>>> Roman >>>>>>>>>>>> >>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>>> >>>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 72 /* >>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>>> 74 */ >>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>>>> Must be >>>>>>>>>>>>> accessed under >>>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>>> ???? 80? */ >>>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>>> >>>>>>>>>>>>> ???? The comments contradict to each other. >>>>>>>>>>>>> ???? I guess, the lock name at line 79 has to be >>>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>>> ???? Also, comma at the end must be replaced with dot. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>> 104 return; >>>>>>>>>>>>> 105 } >>>>>>>>>>>>> ??? 106 >>>>>>>>>>>>> 107 // Scan linked-list. 
>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>>> ??? 113???? } >>>>>>>>>>>>> 114 >>>>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>> 118 return; >>>>>>>>>>>>> ??? 119???? } >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> ????The code above can be simplified, so that the lines >>>>>>>>>>>>> 101-105 >>>>>>>>>>>>> are not >>>>>>>>>>>>> needed anymore. >>>>>>>>>>>>> ????It can be something like this: >>>>>>>>>>>>> >>>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>>> ??????? } >>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>>> found - ignore. >>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>> return; >>>>>>>>>>>>> ??????? } >>>>>>>>>>>>> >>>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>>> rest. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>>> Here comes an update that resolves some races that happen >>>>>>>>>>>>>> when >>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>>>> lock on >>>>>>>>>>>>>> basically every operation, and also need to check whether >>>>>>>>>>>>>> or not >>>>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>>>> (e.g. >>>>>>>>>>>>>> an empty >>>>>>>>>>>>>> list) when we're not. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with >>>>>>>>>>>>>>> a tag, >>>>>>>>>>>>>>> and we >>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>>> table, which >>>>>>>>>>>>>>> each entry being the head of a linked-list of >>>>>>>>>>>>>>> KlassNode*. The >>>>>>>>>>>>>>> table is >>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend the >>>>>>>>>>>>>>> new >>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>>>>>>>> signature of >>>>>>>>>>>>>>> the reported tag in that table, and remember it in a >>>>>>>>>>>>>>> bag. The >>>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is >>>>>>>>>>>>>>> ~O(1) >>>>>>>>>>>>>>> operation >>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase >>>>>>>>>>>>>>> which >>>>>>>>>>>>>>> hammered >>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see >>>>>>>>>>>>>>> depths of >>>>>>>>>>>>>>> like 2-3, >>>>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out >>>>>>>>>>>>>>> that >>>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets >>>>>>>>>>>>>>> detached >>>>>>>>>>>>>>> and/or >>>>>>>>>>>>>>> re-attached (was missing before). 
>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation >>>>>>>>>>>>>>> (was >>>>>>>>>>>>>>> missing >>>>>>>>>>>>>>> before). >>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be >>>>>>>>>>>>>>> something >>>>>>>>>>>>>>> to improve >>>>>>>>>>>>>>> in the future? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>>>>>>> really good. >>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off >>>>>>>>>>>>>>>> reviewing for >>>>>>>>>>>>>>>> now. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ??? Hi Chris, >>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>>>>>>> Sure. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The current implementation does so by maintaining a >>>>>>>>>>>>>>>>> table of >>>>>>>>>>>>>>>>> currently >>>>>>>>>>>>>>>>> prepared classes by building that table when >>>>>>>>>>>>>>>>> classTrack is >>>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. >>>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>>> compared >>>>>>>>>>>>>>>>> with the >>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is >>>>>>>>>>>>>>>>> scanned, >>>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. 
>>>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption >>>>>>>>>>>>>>>>> here >>>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon >>>>>>>>>>>>>>>>> unload, >>>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In addition to all that, this process is only >>>>>>>>>>>>>>>>> activated when >>>>>>>>>>>>>>>>> there's an >>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of >>>>>>>>>>>>>>>>>>> classTrack.c. >>>>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance >>>>>>>>>>>>>>>>>>> until an >>>>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>
>>>>
>>
>>
>
From serguei.spitsyn at oracle.com Tue Mar 24 21:50:25 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Mar 2020 14:50:25 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To:
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com> <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com> <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com> <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com> <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com> <8e6f2f0c-ceee-2ef1-50b5-cea01bb31893@oracle.com>
Message-ID: <8c996072-9b3b-0b64-6aea-bf477dced13c@oracle.com>
On 3/24/20 14:47, Chris Plummer wrote: > On 3/24/20 2:46 PM, serguei.spitsyn at oracle.com wrote: >> On 3/24/20 14:39, Chris Plummer wrote: >>> On 3/24/20 1:45 PM, Roman Kennke wrote: >>>>>>> I assume JVMTI maintains separate tagging data for each agent so >>>>>>> having >>>>>>> two agents doing tagging won't result in confusion. I didn't >>>>>>> actually >>>>>>> find this in the spec. Would be nice to confirm that it is the >>>>>>> case. >>>>>>> However, your implementation does seem to conflict with other >>>>>>> uses of >>>>>>> tagging in the debug agent: >>>>>> The tagging data is per-jvmtiEnv. We create and use our own env >>>>>> (private >>>>>> to class-tracking), so this wouldn't conflict with other uses of >>>>>> tags. >>>>>> Could it be a problem that we have a single trackingEnv per JVM, >>>>>> though? >>>>>> /me scratches head. >>>>> Ok. This is an area I'm not familiar with, but the spec does say: >>>>> >>>>> "Each call to GetEnv creates a new JVM TI connection and thus a >>>>> new JVM >>>>> TI environment." >>>>> >>>>> So it looks like what you are doing should be ok. I still think >>>>> you have >>>>> a bug where you are not deallocating signatures of classes that are >>>>> unloaded. If you think otherwise please point out where this is done. >>>> Signatures that make it out of processUnloading() are deallocated in >>>> eventHandler.c, in synthesizeUnload(), right after it has been used. >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527 >>>> >>> Ok. Good to know. Not the best of designs, but that's not your >>> fault. I'll make another pass over the changes, but I think in >>> general it looks good. I don't think I've seen another reviewer yet, >>> so hopefully someone jumps in. >> >> As I understand, Roman already resolved my previous comments. >> So, I will do another pass for v6. > I think it's pretty much a rewrite since you last reviewed it. 
Yes, I'm expecting this as some performance related issues were discovered. Thanks, Serguei > > Chris >> >> Thanks, >> Serguei >> >>> >>> Chris >>>> Pending signatures on debug-agent-disconnect are deallocated in >>>> classTrack.c, in the reset() routine. >>>> >>>> Thanks, >>>> Roman >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>>>> What would cause classTrack_addPreparedClass() to be called for >>>>>>> a Class >>>>>>> you've already seen? I don't understand the need for the "tag != >>>>>>> 0l" >>>>>>> check. >>>>>> It's probably not needed, may be a left-over from previous >>>>>> installments >>>>>> of this implementation. I will check it, and turn into an assert >>>>>> or so. >>>>>> >>>>>> Thanks, >>>>>> Roman >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 3/20/20 12:52 PM, Chris Plummer wrote: >>>>>>>> On 3/20/20 8:30 AM, Roman Kennke wrote: >>>>>>>>> I believe I came up with a much simpler solution that also >>>>>>>>> solves the >>>>>>>>> problems of the existing one, and the ones I proposed earlier. >>>>>>>>> >>>>>>>>> It turns out that we can take advantage of the fact that we >>>>>>>>> can use >>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>>>>>> explicitely >>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a >>>>>>>>> pointer >>>>>>>>> to the signature of a class into the tag, and pull it out again >>>>>>>>> when we >>>>>>>>> get notified that the class gets unloaded. >>>>>>>>> >>>>>>>>> This means we don't need an extra data-structure to keep track of >>>>>>>>> classes and signatures, and it also makes the story around >>>>>>>>> locking >>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no >>>>>>>>> scanning of all >>>>>>>>> classes needed (as in the current implementation) and no >>>>>>>>> searching of >>>>>>>>> table needed (like in my previous attempts). 
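The pointer-in-the-tag idea quoted above can be illustrated without a JVM. The sketch below only demonstrates the round trip between a heap-allocated signature string and a 64-bit tag value; it is not the webrev.06 code. `tag_t` stands in for JVMTI's `jlong`, and `on_class_prepared` / `on_object_free` are hypothetical names for the places where a real agent would call `SetTag` and receive the `ObjectFree` event.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int64_t tag_t;  /* stand-in for the 64-bit jlong used for JVMTI tags */

/* Encode a signature pointer as a tag. Only valid if a pointer fits in the tag. */
static tag_t signature_to_tag(char *signature)
{
    assert(sizeof(intptr_t) <= sizeof(tag_t));
    return (tag_t)(intptr_t)signature;
}

/* Decode a tag produced by signature_to_tag() back into the pointer. */
static char *tag_to_signature(tag_t tag)
{
    return (char *)(intptr_t)tag;
}

/*
 * Hypothetical "class prepared" path: copy the signature and return the value
 * that would be attached to the class object as its tag. Returns 0 on
 * allocation failure; JVMTI treats a zero tag as "not tagged".
 */
static tag_t on_class_prepared(const char *signature)
{
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return signature_to_tag(copy);
}

/*
 * Hypothetical "object free" path: the callback receives only the tag, which
 * already is the signature. The caller takes ownership and must free it.
 */
static char *on_object_free(tag_t tag)
{
    return tag_to_signature(tag);
}
```

Because the callback gets the signature directly from the tag, no table lookup is needed at unload time, which is the O(1) property claimed in the message. The cost is that the string is owned by whoever last holds the tag, so it has to be freed exactly once after the unload event has been reported.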
>>>>>>>>> >>>>>>>>> Please review this new revision: >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>>>> I'll have a look at this. >>>>>>>>> (Notice that there still appears to be a performance >>>>>>>>> bottleneck with >>>>>>>>> class-unloading when an actual debugger is attached. This >>>>>>>>> doesn't seem >>>>>>>>> to be related to the classTrack.c implementation though, but >>>>>>>>> looks >>>>>>>>> like >>>>>>>>> a consequence of getting all those class-unload notifications >>>>>>>>> over the >>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging >>>>>>>>> up the >>>>>>>>> buffers.) >>>>>>>> At least this is only a one-shot hit when the classes are >>>>>>>> unloaded, >>>>>>>> and the performance hit is based on the number of classes being >>>>>>>> unloaded. The main issue is happening every GC, and is O(n) >>>>>>>> where n is >>>>>>>> the number of loaded classes. >>>>>>>>> I am not sure why jdb needs to enable class-unload listener >>>>>>>>> always. A >>>>>>>>> simple hack disables it, and performance is brilliant, even when >>>>>>>>> jdb is >>>>>>>>> attached: >>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>>>>>> >>>>>>>> This is JDI, not jdb. It looks like it needs ClassUnload events >>>>>>>> so it >>>>>>>> can maintain typesBySignature, which is used by public APIs like >>>>>>>> allClasses(). So we have caching of loaded classes both in the >>>>>>>> debug >>>>>>>> agent and in JDI. >>>>>>>> >>>>>>>> Chris >>>>>>>>> But this is not in the scope of this bug.) >>>>>>>>> >>>>>>>>> Roman >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Hi Roman, >>>>>>>>>>> >>>>>>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>>>>>> >>>>>>>>>>> Some comments are below. 
>>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>>>>>> ???? 88 { >>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 93 return; >>>>>>>>>>> ???? 94???? } >>>>>>>>>>> Just a question: >>>>>>>>>>> ???? Q1: Should the ObjectFree events be disabled for the >>>>>>>>>>> jvmtiEnv >>>>>>>>>>> that does >>>>>>>>>>> ???????? the class tracking if class tracking has not been >>>>>>>>>>> initialized? >>>>>>>>>>> >>>>>>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is >>>>>>>>>>> better to >>>>>>>>>>> be something like: lastClassTag or highestClassTag. >>>>>>>>>>> >>>>>>>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>>>>>> klass not >>>>>>>>>>> found - ignore. >>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 108 return; >>>>>>>>>>> ??? 109???? } >>>>>>>>>>> ????It seems to me, something is wrong in the condition at L106 >>>>>>>>>>> above. >>>>>>>>>>> ????Should it be? : >>>>>>>>>>> ?????? if (klass == NULL || klass->klass_tag != tag) >>>>>>>>>>> >>>>>>>>>>> ????Otherwise, how can the second check ever work correctly >>>>>>>>>>> as the >>>>>>>>>>> return >>>>>>>>>>> will always happen when (klass != NULL)? >>>>>>>>>>> >>>>>>>>>>> ??? 
There are several places in this file with the wrong indent: >>>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 93 return; >>>>>>>>>>> 94 } >>>>>>>>>>> ... >>>>>>>>>>> 152 if (currentClassTag == -1) { >>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 155 return; >>>>>>>>>>> 156 } >>>>>>>>>>> ... >>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>>>>>>>> 163 } >>>>>>>>>>> 164 if (tag != 0l) { >>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 166 return; // Already added >>>>>>>>>>> 167 } >>>>>>>>>>> ... >>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>>>>>> 282 { >>>>>>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>>>>>> 284 jvmtiDeallocate(sig); >>>>>>>>>>> 285 return JNI_TRUE; >>>>>>>>>>> 286 } >>>>>>>>>>> ... >>>>>>>>>>> 291 void >>>>>>>>>>> 292 classTrack_reset(void) >>>>>>>>>>> 
293 { >>>>>>>>>>> 294 int idx; >>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>> 296 >>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>>>>>> 298 KlassNode* node = table[idx]; >>>>>>>>>>> 299 while (node != NULL) { >>>>>>>>>>> 300 KlassNode* next = node->next; >>>>>>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>>>>>> 302 jvmtiDeallocate(node); >>>>>>>>>>> 303 node = next; >>>>>>>>>>> 304 } >>>>>>>>>>> 305 } >>>>>>>>>>> 306 jvmtiDeallocate(table); >>>>>>>>>>> 307 >>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>>>>>> 310 >>>>>>>>>>> 311 currentClassTag = -1; >>>>>>>>>>> 312 >>>>>>>>>>> 313 >>>>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>>>>>> >>>>>>>>>>> 314 trackingEnv = NULL; >>>>>>>>>>> 315 >>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> >>>>>>>>>>> Could you, please, fix several comments below? >>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>>>>>> class-unloads >>>>>>>>>>> ????The comma is not needed. >>>>>>>>>>> ????Would it better to replace: klass tags => klass_tag's ? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 73 * Lock to keep table, currentClassTag and >>>>>>>>>>> deletedSignatureBag >>>>>>>>>>> consistent >>>>>>>>>>> ????Maybe: Lock to guard ... or lock to keep integrity of ... >>>>>>>>>>> >>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>>>>>>> remembers it in deletedSignatureBag. Would be better to use >>>>>>>>>>> words >>>>>>>>>>> like >>>>>>>>>>> "store" or "record", "Find" should not start from capital >>>>>>>>>>> letter: >>>>>>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>>>>>> signature in deletedSignatureBag. 
>>>>>>>>>>> >>>>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not >>>>>>>>>>> initialized, >>>>>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>>>>>> Missed dot >>>>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != >>>>>>>>>>> tag) { // >>>>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>>>>>> point we >>>>>>>>>>> have the KlassNode corresponding to the tag >>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>>>>> ??? The comment above can be better. Maybe, something like: >>>>>>>>>> ??? ? " At this point, we found the KlassNode matching the klass >>>>>>>>>> tag(and it is >>>>>>>>>> linked). >>>>>>>>>> >>>>>>>>>>> 113 // Remember the unloaded signature. >>>>>>>>>> ????Better: Record the signature of the unloaded class and >>>>>>>>>> unlink it. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>>>>> Hello all, >>>>>>>>>>>> >>>>>>>>>>>> Can I please get reviews of this change? In the meantime, >>>>>>>>>>>> we've >>>>>>>>>>>> done >>>>>>>>>>>> more testing and also field-/torture-testing by a customer >>>>>>>>>>>> who is >>>>>>>>>>>> happy >>>>>>>>>>>> now. :-) >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Roman >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for reviewing! >>>>>>>>>>>>> >>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>>>> disconnect, >>>>>>>>>>>>> namely move setup of the trackingEnv and >>>>>>>>>>>>> deletedSignatureBag to >>>>>>>>>>>>> _activate() to ensure have those structures after re-connect. 
>>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>>>> >>>>>>>>>>>>> Let me know what you think! >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 72 /* >>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>>>> 74 */ >>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>>>>> Must be >>>>>>>>>>>>>> accessed under >>>>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>>>> ???? 80? */ >>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>>>> >>>>>>>>>>>>>> ???? The comments contradict to each other. >>>>>>>>>>>>>> ???? I guess, the lock name at line 79 has to be >>>>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>>>> ???? Also, comma at the end must be replaced with dot. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>> 104 return; >>>>>>>>>>>>>> 105 } >>>>>>>>>>>>>> ??? 106 >>>>>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>>>> ??? 113???? } >>>>>>>>>>>>>> 114 >>>>>>>>>>>>>> 115 // Tag not found? Ignore. 
>>>>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>> 118 return; >>>>>>>>>>>>>> ??? 119???? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> ????The code above can be simplified, so that the lines >>>>>>>>>>>>>> 101-105 >>>>>>>>>>>>>> are not >>>>>>>>>>>>>> needed anymore. >>>>>>>>>>>>>> ????It can be something like this: >>>>>>>>>>>>>> >>>>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>>>> found - ignore. >>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>> return; >>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>>>> rest. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>>>> Here comes an update that resolves some races that >>>>>>>>>>>>>>> happen when >>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>>>>> lock on >>>>>>>>>>>>>>> basically every operation, and also need to check >>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>> class-tracking is active and return an appropriate >>>>>>>>>>>>>>> result (e.g. >>>>>>>>>>>>>>> an empty >>>>>>>>>>>>>>> list) when we're not. 
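The simplified list scan suggested in the review comments quoted above can be sketched as a small standalone function. This is an illustration under assumptions, not the webrev code: the KlassNode layout is reduced to the fields the scan needs, int64_t stands in for jlong, and unlink_by_tag is a hypothetical name. Locking is left out; in classTrack.c the caller would hold deletedSignatureLock.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced node layout (assumption): tag, signature, next pointer. */
typedef struct KlassNode {
    int64_t klass_tag;
    char *signature;
    struct KlassNode *next;
} KlassNode;

/* Find the node carrying `tag` in the list rooted at *head, unlink it,
 * and return it. Returns NULL if no node has that tag. */
static KlassNode *unlink_by_tag(KlassNode **head, int64_t tag) {
    assert(head != NULL);
    KlassNode **klass_ptr = head;   /* address of the link we may rewrite */
    KlassNode *klass = *klass_ptr;

    /* Scan linked-list. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {            /* klass not found - ignore */
        return NULL;
    }
    *klass_ptr = klass->next;       /* splice the node out of the list */
    klass->next = NULL;
    return klass;
}
```

Once the loop exits, klass is either NULL or a node whose tag matches, so the NULL test alone is sufficient. Testing klass->klass_tag after the loop is safe only behind a klass == NULL short-circuit, which is why the "klass != NULL ||" form flagged in the later review dereferences NULL when the tag is absent.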
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with >>>>>>>>>>>>>>>> a tag, >>>>>>>>>>>>>>>> and we >>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>>>> table, which >>>>>>>>>>>>>>>> each entry being the head of a linked-list of >>>>>>>>>>>>>>>> KlassNode*. The >>>>>>>>>>>>>>>> table is >>>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend >>>>>>>>>>>>>>>> the new >>>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> signature of >>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a >>>>>>>>>>>>>>>> bag. The >>>>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This >>>>>>>>>>>>>>>> is ~O(1) >>>>>>>>>>>>>>>> operation >>>>>>>>>>>>>>>> too, depending on the depth of the table. In my >>>>>>>>>>>>>>>> testcase which >>>>>>>>>>>>>>>> hammered >>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see >>>>>>>>>>>>>>>> depths of >>>>>>>>>>>>>>>> like 2-3, >>>>>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out >>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets >>>>>>>>>>>>>>>> detached >>>>>>>>>>>>>>>> and/or >>>>>>>>>>>>>>>> re-attached (was missing before). 
>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation >>>>>>>>>>>>>>>> (was >>>>>>>>>>>>>>>> missing >>>>>>>>>>>>>>>> before). >>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right >>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be >>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>> to improve >>>>>>>>>>>>>>>> in the future? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself >>>>>>>>>>>>>>>> looks >>>>>>>>>>>>>>>> really good. >>>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am >>>>>>>>>>>>>>>>> implementing >>>>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off >>>>>>>>>>>>>>>>> reviewing for >>>>>>>>>>>>>>>>> now. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ??? Hi Chris, >>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be >>>>>>>>>>>>>>>>>>> for a >>>>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the >>>>>>>>>>>>>>>>>>> changes. >>>>>>>>>>>>>>>>>> Sure. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a >>>>>>>>>>>>>>>>>> table of >>>>>>>>>>>>>>>>>> currently >>>>>>>>>>>>>>>>>> prepared classes by building that table when >>>>>>>>>>>>>>>>>> classTrack is >>>>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets >>>>>>>>>>>>>>>>>> loaded. When >>>>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>>>> compared >>>>>>>>>>>>>>>>>> with the >>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the >>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is >>>>>>>>>>>>>>>>>> scanned, >>>>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. 
>>>>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption >>>>>>>>>>>>>>>>>> here >>>>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon >>>>>>>>>>>>>>>>>> unload, >>>>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently >>>>>>>>>>>>>>>>>> see >>>>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> In addition to all that, this process is only >>>>>>>>>>>>>>>>>> activated when >>>>>>>>>>>>>>>>>> there's an >>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of >>>>>>>>>>>>>>>>>>>> classTrack.c. >>>>>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance >>>>>>>>>>>>>>>>>>>> until an >>>>>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. 
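The tag-indexed table discussed in this thread (a fixed number of slots, each the head of a KlassNode list, indexed by tag % slot-count, with new nodes prepended) might look roughly like the sketch below. The names table_add and table_find, the reduced KlassNode layout, and the SLOT_COUNT value are assumptions made for illustration; the thread does not state the value of CT_SLOT_COUNT.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOT_COUNT 263  /* hypothetical slot count */

typedef struct KlassNode {
    int64_t klass_tag;
    struct KlassNode *next;
} KlassNode;

/* Each slot is the head of a singly linked list of nodes. */
static KlassNode *table[SLOT_COUNT];

/* Register a tag: O(1), because the node is prepended to its slot. */
static KlassNode *table_add(int64_t tag) {
    assert(tag > 0);                /* sketch assumes positive tags */
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    size_t slot = (size_t)(tag % SLOT_COUNT);
    node->klass_tag = tag;
    node->next = table[slot];
    table[slot] = node;
    return node;
}

/* Look up a tag: cost is proportional to the chain length in one slot. */
static KlassNode *table_find(int64_t tag) {
    if (tag <= 0) {
        return NULL;
    }
    KlassNode *node = table[(size_t)(tag % SLOT_COUNT)];
    while (node != NULL && node->klass_tag != tag) {
        node = node->next;
    }
    return node;
}
```

Lookup cost depends on chain depth rather than on the total number of loaded classes, which matches the "~O(1), depending on the depth of the table" description given for that revision.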
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>>>>>>> timing. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here: >>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I am getting those numbers: >>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>>>>>>>>>> patched:?? no debug: 85s with debug: 95s >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Can I please get a review? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>>> >>>>> >>> >>> >> > > From suenaga at oss.nttdata.com Tue Mar 24 23:47:28 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Wed, 25 Mar 2020 08:47:28 +0900 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> Message-ID: Thanks Serguei! I will push it when I get second reviewer. Yasumasa On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > I'm okay with this update. > My mach5 test run for this patch is passed. 
> > Thanks, > Serguei > > > On 3/23/20 17:08, Yasumasa Suenaga wrote: >> Hi Serguei, >> >> Thanks for your comment! >> I uploaded a new webrev: >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/ >> >> I also pushed it to the submit repo: >> >> http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1 >> >> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote: >>> Hi Yasumasa, >>> >>> The mach5 tier5 testing looks good. >>> The serviceability/sa/ClhsdbPstack.java test fails without the fix and passes with it. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote: >>>> Hi Yasumasa, >>>> >>>> I looked at your changes. >>>> It is hard to understand if this fully solves the issue. >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>> >>>> @@ -34,10 +34,11 @@ >>>> public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) { >>>> Address libptr = dbg.findLibPtrByAddress(rip); >>>> Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); >>>> DwarfParser dwarf = null; >>>> + boolean unsupportedDwarf = false; >>>> if (libptr != null) { // Native frame >>>> try { >>>> dwarf = new DwarfParser(libptr); >>>> 
dwarf.processDwarf(rip); >>>> ---------- >>>> >>>> @@ -45,24 +46,33 @@ >>>> !dwarf.isBPOffsetAvailable()) >>>> ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>> : context.getRegisterAsAddress(dwarf.getCFARegister()) >>>> .addOffsetTo(dwarf.getCFAOffset()); >>>> } catch (DebuggerException e) { >>>> - // Bail out to Java frame case >>>> + if (dwarf != null) { >>>> + // DWARF processing should succeed when the frame is native >>>> + // but it might fail if CIE has language personality routine >>>> + // and/or LSDA. >>>> + dwarf = null; >>>> + unsupportedDwarf = true; >>>> + } else { >>>> + throw e; >>>> + } >>>> } >>>> } >>>> return (cfa == null) ? null >>>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf); >>>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf); >>>> } >>>> >>>> @@ -121,13 +131,25 @@ >>>> } >>>> return isValidFrame(nextCFA, context) ? 
nextCFA : null; >>>> } >>>> - private DwarfParser getNextDwarf(Address nextPC) { >>>> - DwarfParser nextDwarf = null; >>>> + @Override >>>> + public CFrame sender(ThreadProxy thread) { >>>> + if (!possibleNext) { >>>> + return null; >>>> + } >>>> + >>>> + ThreadContext context = thread.getContext(); >>>> + >>>> + Address nextPC = getNextPC(dwarf != null); >>>> + if (nextPC == null) { >>>> + return null; >>>> + } >>>> + DwarfParser nextDwarf = null; >>>> + boolean unsupportedDwarf = false; >>>> if ((dwarf != null) && dwarf.isIn(nextPC)) { >>>> nextDwarf = dwarf; >>>> } else { >>>> Address libptr = dbg.findLibPtrByAddress(nextPC); >>>> if (libptr != null) { >>>> ---------- >>>> >>>> @@ -138,33 +160,29 @@ >>>> } >>>> } >>>> } >>>> if (nextDwarf != null) { >>>> + try { >>>> 
nextDwarf.processDwarf(nextPC); >>>> + } catch (DebuggerException e) { >>>> + // DWARF processing should succeed when the frame is native >>>> + // but it might fail if CIE has language personality routine >>>> + // and/or LSDA. >>>> + nextDwarf = null; >>>> + unsupportedDwarf = true; >>>> } >>>> >>>> This fix looks like a hack. >>>> Should we just propagate the DebuggerException instead of trying to maintain the unsupportedDwarf flag? >> >> DwarfParser::processDwarf throws DebuggerException if it cannot find DWARF information related to the PC. >> The PC at this point is for the next frame, so the current frame (the `this` object) is valid and should be processed. >> >> >>>> Also, I don't like that DWARF-specific abbreviations (like CIE, FDE, LSDA, etc.) are used without any comments explaining them. >>>> The code has to be generally readable without looking into the DWARF spec each time. >> >> I added comments for them in this webrev. >> >> >> Thanks, >> >> Yasumasa >> >> >>>> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 3/20/20 17:55, Yasumasa Suenaga wrote: >>>>> Thanks Chris! >>>>> I'm waiting for reviewers for this change. >>>>> >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2020/03/21 4:23, Chris Plummer wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> The failure is due to JDK-8231634, so not something you need to worry about. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> I uploaded a new webrev which also reverts the ProblemList change: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/ >>>>>>> >>>>>>> I tested it on the submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301), >>>>>>> but it failed in ClhsdbJstackXcompStress.java.
>>>>>>> However I think it is not caused by this change because ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed mode, it would not parse DWARF. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2020/03/20 13:55, Chris Plummer wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it: >>>>>>>> >>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt >>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt >>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt >>>>>>>> @@ -115,7 +115,7 @@ >>>>>>>> ??serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all >>>>>>>> ??serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all >>>>>>>> ??serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all >>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all >>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all >>>>>>>> ??serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all >>>>>>>> ??serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64 >>>>>>>> ??serviceability/sa/ClhsdbSource.java 8193639 solaris-all >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests. >>>>>>>>> So please review it: >>>>>>>>> >>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>>>>>>>>> Thank you so much, David! 
>>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote: >>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit repo. >>>>>>>>>>>>> Could you try again? >>>>>>>>>>>>> >>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>>>>>>>> >>>>>>>>>>>>> webrev is here: >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>>> >>>>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :) >>>>>>>>>>> >>>>>>>>>>> Seems to have passed okay. >>>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks a lot! >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>>>>>>>> Sorry it is still crashing. >>>>>>>>>>>>>> >>>>>>>>>>>>>> # >>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>>> # >>>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704 >>>>>>>>>>>>>> # >>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>> # >>>>>>>>>>>>>> >>>>>>>>>>>>>> Same as before. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> David >>>>>>>>>>>>>> ----- >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for that tip Chris! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result. >>>>>>>>>>>>>>>> I will share you when I get job ID. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thank you for testing it. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA. >>>>>>>>>>>>>>>>>> Could you try it? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>>>>>>> Correction ... 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> It doesn't fail for me locally. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>>>>>>>> ?? 
webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>   A: There is no range check on the buffer (.eh_frame section data)
>>>>>>>>>>>>>>>>>>>>>>   B: The language personality routine and the Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and this webrev ignores the personality routine and LSDA.
>>>>>>>>>>>>>>>>>>>>>> I also added bailout code for when DWARF processing fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671).
>>>>>>>>>>>>>>>>>>>>>> I also tested it on my Fedora 31 box and in an Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>> > > From serguei.spitsyn at oracle.com Wed Mar 25 09:44:13 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 25 Mar 2020 02:44:13 -0700 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> Message-ID: An HTML attachment was scrubbed... 
URL: From rkennke at redhat.com Wed Mar 25 13:00:45 2020 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 25 Mar 2020 14:00:45 +0100 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> Message-ID: <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> Hi Sergei, > The fix looks pretty clean now. > I also like new name of the lock.:) Thank you! > Just one comment below. > > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html > > 110 if (tag != 0l) { > 111 return; // Already added > 112 } > > ?It is better to use a named constant or macro instead. > ?Also, it'd be nice to add a short comment about this value is. As I replied to Chris earlier, this whole block can be turned into an assert. I also made a constant for the value 0, which should be pretty much self-explaining. http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/ > How do you test the fix? 
I am using a manual test that is provided in this bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1751985 "Script to compare performance of GC with and without debugger, when many classes are loaded and classes are being unloaded": https://bugzilla.redhat.com/attachment.cgi?id=1640688 I am also using this test and manually attach/detach jdb a couple of times in a row to check that disconnecting and reconnecting works well (this tended to deadlock or crash with an earlier version of the patch, and is now looking good). I am also running tier1 and tier2 tests locally, and as soon as we all agree that the fix is reasonable, I will push it to the submit repo. I am not sure if any of those tests actually exercise that code, though. Let me know if you want me to run any specific tests. Thank you, Roman > Thanks, > Serguei > > > On 3/20/20 08:30, Roman Kennke wrote: >> I believe I came up with a much simpler solution that also solves the >> problems of the existing one, and the ones I proposed earlier. >> >> It turns out that we can take advantage of the fact that we can use >> *anything* as tags in JVMTI, even pointers to stuff (this is explicitely >> mentioned in the JVMTI spec). This means we can simply stick a pointer >> to the signature of a class into the tag, and pull it out again when we >> get notified that the class gets unloaded. >> >> This means we don't need an extra data-structure to keep track of >> classes and signatures, and it also makes the story around locking >> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all >> classes needed (as in the current implementation) and no searching of >> table needed (like in my previous attempts). >> >> Please review this new revision: >> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >> >> (Notice that there still appears to be a performance bottleneck with >> class-unloading when an actual debugger is attached. 
This doesn't seem >> to be related to the classTrack.c implementation though, but looks like >> a consequence of getting all those class-unload notifications over the >> wire. My testcase generates 1000s of them, and it's clogging up the >> buffers.) >> >> I am not sure why jdb needs to enable class-unload listener always. A >> simple hack disables it, and performance is brilliant, even when jdb is >> attached: >> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >> >> But this is not in the scope of this bug.) >> >> Roman >> >> >> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>> Sorry, forgot to complete my comments at the end (see below). >>> >>> >>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>> Hi Roman, >>>> >>>> Thank you for the update and sorry for the latency in review. >>>> >>>> Some comments are below. >>>> >>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>> >>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>> 88 { >>>> 89 debugMonitorEnter(deletedSignatureLock); >>>> 90 if (currentClassTag == -1) { >>>> 91 // Class tracking not initialized, nobody's interested >>>> 92 debugMonitorExit(deletedSignatureLock); >>>> 93 return; >>>> 94 } >>>> Just a question: >>>> ? Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does >>>> ????? the class tracking if class tracking has not been initialized? >>>> >>>> 70 static jlong currentClassTag; I'm thinking if the name is better to >>>> be something like: lastClassTag or highestClassTag. >>>> >>>> 99 KlassNode* klass = *klass_ptr; >>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // klass not >>>> found - ignore. 
>>>> 107 debugMonitorExit(deletedSignatureLock); >>>> 108 return; >>>> 109 } >>>> ?It seems to me, something is wrong in the condition at L106 above. >>>> ?Should it be? : >>>> ??? if (klass == NULL || klass->klass_tag != tag) >>>> >>>> ?Otherwise, how can the second check ever work correctly as the return >>>> will always happen when (klass != NULL)? >>>> >>>> ? >>>> There are several places in this file with the the indent: >>>> 90 if (currentClassTag == -1) { >>>> 91 // Class tracking not initialized, nobody's interested >>>> 92 debugMonitorExit(deletedSignatureLock); >>>> 93 return; >>>> 94 } >>>> ... >>>> 152 if (currentClassTag == -1) { >>>> 153 // Class tracking not initialized yet, nobody's interested >>>> 154 debugMonitorExit(deletedSignatureLock); >>>> 155 return; >>>> 156 } >>>> ... >>>> 161 if (error != JVMTI_ERROR_NONE) { >>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>> 163 } >>>> 164 if (tag != 0l) { >>>> 165 debugMonitorExit(deletedSignatureLock); >>>> 166 return; // Already added >>>> 167 } >>>> ... >>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>> 282 { >>>> 283 char* sig = (char*)signatureVoid; >>>> 284 jvmtiDeallocate(sig); >>>> 285 return JNI_TRUE; >>>> 286 } >>>> ... 
>>>> 291 void >>>> 292 classTrack_reset(void) >>>> 293 { >>>> 294 int idx; >>>> 295 debugMonitorEnter(deletedSignatureLock); >>>> 296 >>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>> 298 KlassNode* node = table[idx]; >>>> 299 while (node != NULL) { >>>> 300 KlassNode* next = node->next; >>>> 301 jvmtiDeallocate(node->signature); >>>> 302 jvmtiDeallocate(node); >>>> 303 node = next; >>>> 304 } >>>> 305 } >>>> 306 jvmtiDeallocate(table); >>>> 307 >>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>> 309 bagDestroyBag(deletedSignatureBag); >>>> 310 >>>> 311 currentClassTag = -1; >>>> 312 >>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>> 314 trackingEnv = NULL; >>>> 315 >>>> 316 debugMonitorExit(deletedSignatureLock); >>>> >>>> Could you, please, fix several comments below? >>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads >>>> ?The comma is not needed. >>>> ?Would it better to replace: klass tags => klass_tag's ? >>>> >>>> >>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>> consistent >>>> ?Maybe: Lock to guard ... or lock to keep integrity of ... >>>> >>>> 84 * Callback when classes are freed, Finds the signature and >>>> remembers it in deletedSignatureBag. Would be better to use words like >>>> "store" or "record", "Find" should not start from capital letter: >>>> Invoke the callback when classes are freed, find and record the >>>> signature in deletedSignatureBag. >>>> >>>> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >>>> nobody's interested 153 // Class tracking not initialized yet, >>>> nobody's interested 158 /* Check this is not a duplicate */ Missed dot >>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { // >>>> klass not found - ignore. In opposite, dot is not needed as the >>>> comment does not start from a capital letter. 
111 // At this point we >>>> have the KlassNode corresponding to the tag >>>> 112 // in klass, and the pointer to it in klass_node. >>> The comment above can be better. Maybe, something like: >>> ? " At this point, we found the KlassNode matching the klass tag(and it is >>> linked). >>> >>>> 113 // Remember the unloaded signature. >>> ?Better: Record the signature of the unloaded class and unlink it. >>> >>> Thanks, >>> Serguei >>> >>>> Thanks, >>>> Serguei >>>> >>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>> Hello all, >>>>> >>>>> Can I please get reviews of this change? In the meantime, we've done >>>>> more testing and also field-/torture-testing by a customer who is happy >>>>> now. :-) >>>>> >>>>> Thanks, >>>>> Roman >>>>> >>>>> >>>>>> Hi Serguei, >>>>>> >>>>>> Thanks for reviewing! >>>>>> >>>>>> I updated the patch to reflect your suggestions, very good! >>>>>> It also includes a fix to allow re-connecting an agent after disconnect, >>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>> _activate() to ensure have those structures after re-connect. >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>> >>>>>> Let me know what you think! >>>>>> Roman >>>>>> >>>>>>> Hi Roman, >>>>>>> >>>>>>> Thank you for taking care about this scalability issue! >>>>>>> >>>>>>> I have a couple of quick comments. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>> >>>>>>> 72 /* >>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>> 74 */ >>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be >>>>>>> accessed under >>>>>>> 79 * deletedTagLock, >>>>>>> 80 */ >>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>> >>>>>>> ? The comments contradict to each other. >>>>>>> ? 
I guess, the lock name at line 79 has to be deletedSignatureLock >>>>>>> instead of deletedTagLock. >>>>>>> ? Also, comma at the end must be replaced with dot. >>>>>>> >>>>>>> >>>>>>> 101 // Tag not found? Ignore. >>>>>>> 102 if (klass == NULL) { >>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>> 104 return; >>>>>>> 105 } >>>>>>> 106 >>>>>>> 107 // Scan linked-list. >>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>> 110 klass_ptr = &klass->next; >>>>>>> 111 klass = *klass_ptr; >>>>>>> 112 found_tag = klass->klass_tag; >>>>>>> 113 } >>>>>>> 114 >>>>>>> 115 // Tag not found? Ignore. >>>>>>> 116 if (found_tag != tag) { >>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>> 118 return; >>>>>>> 119 } >>>>>>> >>>>>>> >>>>>>> ?The code above can be simplified, so that the lines 101-105 are not >>>>>>> needed anymore. >>>>>>> ?It can be something like this: >>>>>>> >>>>>>> // Scan linked-list. >>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>> klass_ptr = &klass->next; >>>>>>> klass = *klass_ptr; >>>>>>> } >>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore. >>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>> return; >>>>>>> } >>>>>>> >>>>>>> It will take more time when I get a chance to look at the rest. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>> Here comes an update that resolves some races that happen when >>>>>>>> disconnecting an agent. In particular, we need to take the lock on >>>>>>>> basically every operation, and also need to check whether or not >>>>>>>> class-tracking is active and return an appropriate result (e.g. an empty >>>>>>>> list) when we're not. 
>>>>>>>> >>>>>>>> Updated webrev: >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Roman >>>>>>>> >>>>>>>> >>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>> >>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we >>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>> - Prepared classes are kept in a datastructure that is a table, which >>>>>>>>> each entry being the head of a linked-list of KlassNode*. The table is >>>>>>>>> indexed by tag % slot-count, and then simply prepend the new KlassNode*. >>>>>>>>> This is O(1) operation. >>>>>>>>> - When we get notified of unloading a class, we look up the signature of >>>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode* >>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation >>>>>>>>> too, depending on the depth of the table. In my testcase which hammered >>>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3, >>>>>>>>> but not usually more. It should be ok. >>>>>>>>> - when processUnloads() gets called, we simply hand out that bag, and >>>>>>>>> allocate a new one. >>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the >>>>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or >>>>>>>>> re-attached (was missing before). >>>>>>>>> - I also added locks around data-structure-manipulation (was missing >>>>>>>>> before). >>>>>>>>> - Also, I only activate this whole process when an actual listener gets >>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a >>>>>>>>> jdb, not sure why jdb does that though. This may be something to improve >>>>>>>>> in the future? >>>>>>>>> >>>>>>>>> In my tests, the performance of class-tracking itself looks really good. >>>>>>>>> The bottleneck now is clearly actual synthesizing the class-unload >>>>>>>>> events. 
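The table-based tracking described in the bullets above could be sketched roughly as follows. This is only an illustration, not the code from webrev.03: the slot count, the names track_add and track_remove, and the use of malloc/free in place of the agent's own allocator are assumptions, and the locking that the real code needs around the table is left out.

```c
#include <stdlib.h>
#include <string.h>

#define CT_SLOT_COUNT 263  /* assumed slot count, not taken from the webrev */

typedef struct KlassNode {
    long long         klass_tag;  /* tag assigned when the class was prepared */
    char             *signature;  /* heap-allocated copy of the class signature */
    struct KlassNode *next;       /* next node in the same slot */
} KlassNode;

static KlassNode *table[CT_SLOT_COUNT];

/* Register a prepared class by prepending it to its slot: O(1).
 * Returns 1 on success, 0 if allocation failed. */
static int track_add(long long tag, const char *signature)
{
    KlassNode *node = malloc(sizeof(*node));
    if (node == NULL) {
        return 0;
    }
    size_t len = strlen(signature) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = tag;

    size_t slot = (size_t)((unsigned long long)tag % CT_SLOT_COUNT);
    node->next = table[slot];
    table[slot] = node;
    return 1;
}

/* Handle an unload notification: find the node for the tag, unlink it,
 * and hand its signature to the caller (who must free it).
 * Returns NULL if the tag is not in the table. */
static char *track_remove(long long tag)
{
    size_t slot = (size_t)((unsigned long long)tag % CT_SLOT_COUNT);
    KlassNode **klass_ptr = &table[slot];
    KlassNode *klass = *klass_ptr;

    /* Scan the slot's linked list. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {
        return NULL;  /* klass not found - ignore */
    }

    *klass_ptr = klass->next;  /* unlink */
    char *signature = klass->signature;
    free(klass);
    return signature;
}
```

The cost of track_remove depends on how many nodes share a slot, which matches the "~O(1), depending on the depth of the table" remark above.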
I don't see how this can be helped when the debug agent asks for it? >>>>>>>>> >>>>>>>>> Updated webrev: >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>> >>>>>>>>> Please let me know what you think of it. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Roman >>>>>>>>> >>>>>>>>> >>>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more >>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now. >>>>>>>>>> >>>>>>>>>> Thanks,Roman >>>>>>>>>> >>>>>>>>>> Hi Chris, >>>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In >>>>>>>>>>>> the meantime, maybe you can describe your new implementation in >>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>> Sure. >>>>>>>>>>> >>>>>>>>>>> The purpose of this class-tracking is to be able to determine the >>>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that >>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>> >>>>>>>>>>> The current implementation does so by maintaining a table of currently >>>>>>>>>>> prepared classes by building that table when classTrack is initialized, >>>>>>>>>>> and then add new classes whenever a class gets loaded. When unloading >>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the >>>>>>>>>>> old table, and whatever is in the old, but not in the new table gets >>>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many >>>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount) >>>>>>>>>>> complexity. >>>>>>>>>>> >>>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also >>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). 
Whenever an >>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes >>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the >>>>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned. >>>>>>>>>>> >>>>>>>>>>> The implementation is not perfect. In order to determine whether or not >>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is >>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that >>>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be >>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>> >>>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it >>>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that >>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the >>>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's >>>>>>>>>>> worth the effort). >>>>>>>>>>> >>>>>>>>>>> In addition to all that, this process is only activated when there's an >>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>> Hello all, >>>>>>>>>>>>> >>>>>>>>>>>>> Issue: >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>> >>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids >>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of >>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>> >>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent >>>>>>>>>>>>> registers interest in EI_GC_FINISH. 
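For comparison, the one-by-one tracking with a linked list of prepared classes and a bag of deleted tags, as described earlier in this thread, might look roughly like the sketch below. It is an assumption-laden illustration rather than the code from webrev.01: the names is_deleted and process_unloads, the plain array standing in for the deleted-tag bag, and the caller-owned nodes are all hypothetical.

```c
#include <stddef.h>

typedef struct KlassNode {
    long long         tag;        /* tag assigned when the class was prepared */
    const char       *signature;  /* class signature, owned by the caller here */
    struct KlassNode *next;
} KlassNode;

/* Linear membership test over the deleted tags: O(unloadedClassCount). */
static int is_deleted(long long tag, const long long *deleted, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (deleted[i] == tag) {
            return 1;
        }
    }
    return 0;
}

/* Walk the prepared-class list once, unlink every node whose tag is in
 * deleted[], and store its signature in out[] (capacity n).
 * Returns the number of signatures stored. Nodes are not freed here. */
static size_t process_unloads(KlassNode **head, const long long *deleted,
                              size_t n, const char **out)
{
    size_t count = 0;
    KlassNode **link = head;

    while (*link != NULL) {
        KlassNode *node = *link;
        if (count < n && is_deleted(node->tag, deleted, n)) {
            out[count++] = node->signature;
            *link = node->next;  /* unlink; link itself stays in place */
        } else {
            link = &node->next;
        }
    }
    return count;
}
```

Each call scans the whole list and checks every node against the deleted tags, so it is cheap only while the number of unloaded classes stays small relative to the number of prepared classes.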
>>>>>>>>>>>>>
>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Testing: manual testing of the provided test scenarios, with timing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> E.g. with the testcase provided here:
>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am getting these numbers:
>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>
>>>>>>>>>>>>> I also tested successfully through the jdk/submit repo.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>

From leonid.mesnik at oracle.com Wed Mar 25 15:55:36 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 08:55:36 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
Message-ID: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>

Hi

Could you please review the following fix, which changes LingeredApp to prepend VM options to java/vm.test.opts when startApp is used, and provides startAppVmOpts to override the options completely.

The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. I also fixed some tests where the intention was to append VM options rather than to override them.
webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/

bug: https://bugs.openjdk.java.net/browse/JDK-8240698

Leonid

From igor.ignatyev at oracle.com Wed Mar 25 16:40:15 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 25 Mar 2020 09:40:15 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
Message-ID: <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>

Hi Leonid,

I have briefly looked at the patch; a few comments so far:

test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
 - at L#114, could you please call the static method using the class name (as opposed to using an instance)? Or was it meant to be theApp.runAppVmOpts(vmArgs)?

test/lib/jdk/test/lib/apps/LingeredApp.java:
 - it seems that the code indent of startApp(LingeredApp, String[]) isn't correct
 - I don't like the startAppVmOpts name, but unfortunately don't have a better suggestion (yet)

Thanks,
-- Igor

> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik wrote:
>
> Hi
>
> Could you please review the following fix, which changes LingeredApp to prepend VM options to java/vm.test.opts when startApp is used, and provides startAppVmOpts to override the options completely.
>
> The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. I also fixed some tests where the intention was to append VM options rather than to override them.
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>
> Leonid
>

From ioi.lam at oracle.com Wed Mar 25 16:46:07 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 25 Mar 2020 09:46:07 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
Message-ID:

Hi Leonid,

Thanks for fixing this.

If you just look at a test case, it's not very obvious what the difference is between

    LingeredApp.startApp(myApp, "-XX:Xyz=123");
    LingeredApp.startAppVmOpts(myApp, "-XX:Xyz=123");

How about renaming startAppVmOpts/runAppVmOpts -> startAppExactVmOpts/runAppExactVmOpts?

===

 415     public static void startApp(LingeredApp theApp, String... additonalVMOpts) throws IOException {
 416             String[] vmOpts = additonalVMOpts == null ? Utils.getTestJavaOpts() : Utils.appendTestJavaOpts(additonalVMOpts);
 417             startAppVmOpts(theApp, vmOpts);
 418     }

I think there's no need to check for additonalVMOpts == null. If the caller passes no arguments, additonalVMOpts will be an empty array (but not null). You will get a null for additonalVMOpts only if the caller explicitly passes in a null, like this

      LingeredApp.startApp(theApp, null);

but this is not good programming style and you will get a javac warning:

public class DotDotDot {
  public static void main(String args[]) {
    doit();
    doit(null);
  }
  static void doit(String ...args) {
    System.out.println(args);
  }
}

$ javac DotDotDot.java
DotDotDot.java:4: warning: non-varargs call of varargs method with inexact argument type for last parameter;
    doit(null);
         ^
  cast to String for a varargs call
  cast to String[] for a non-varargs call and to suppress this warning
1 warning

Thanks!
- Ioi On 3/25/20 8:55 AM, Leonid Mesnik wrote: > Hi > > Could you please review following fix which change LingeredApp to > prepend vm options to java/vm.test.opts when startApp is used and > provide startAppVmOpts to override options completely. > > The intention is to avoid issue like in this bug where test/jtreg > options were ignored by tests. Also I fixed some tests where intention > was to append vm options rather than to override them. > > webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/ > > bug: https://bugs.openjdk.java.net/browse/JDK-8240698 > > Leonid > From stefan.karlsson at oracle.com Wed Mar 25 17:14:03 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 25 Mar 2020 18:14:03 +0100 Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified In-Reply-To: <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> Message-ID: <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> On 2020-03-25 17:40, Igor Ignatyev wrote: > Hi Leonid, > > I have briefly looked at the patch, a few comments so far: > > test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: > - at L#114, could you please call static method using class name (as the opposite of using instance)? or was it meant to be theApp.runAppVmOpts(vmArgs) ? > > test/lib/jdk/test/lib/apps/LingeredApp.java: > - it seems that code indent of startApp(LingeredApp, String[]) isn't correct > - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet) I was going to say the same. Jtreg has the concept of "java options" and "vm options". We have had a fair share of bugs and wasted time when tests have been using the "vm options" part (VM_OPTIONS, test.vm.options, etc), and we've been moving away from using that way to pass options. 
I recently cleaned up some of this with:

8237111: LingeredApp should be started with getTestJavaOpts

Because of this, I would prefer if we used a name that doesn't include "VmOpts", because it's too similar to the other concept. Some suggestions:
 startAppJavaOptions
 startAppUsingJavaOptions
 startAppWithJavaOptions
 startAppExactJavaOptions
 startAppJvmOptions

Thanks,
StefanK

> Thanks,
> -- Igor
>
>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik wrote:
>>
>> Hi
>>
>> Could you please review the following fix, which changes LingeredApp to prepend VM options to java/vm.test.opts when startApp is used, and provides startAppVmOpts to override the options completely.
>>
>> The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. I also fixed some tests where the intention was to append VM options rather than to override them.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>
>> Leonid
>>

From leonid.mesnik at oracle.com Wed Mar 25 17:52:18 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 10:52:18 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To: <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
Message-ID:

Igor, Stefan

Thank you for the feedback, see my comments inline.

On 3/25/20 10:14 AM, Stefan Karlsson wrote:
> On 2020-03-25 17:40, Igor Ignatyev wrote:
>> Hi Leonid,
>>
>> I have briefly looked at the patch; a few comments so far:
>>
>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>   - at L#114, could you please call the static method using the class name
>> (as opposed to using an instance)? Or was it meant to be
>> theApp.runAppVmOpts(vmArgs)?

No, it is a plain bug.
I first wanted to use a non-static method and forgot to change it to the class name. >> >> test/lib/jdk/test/lib/apps/LingeredApp.java: >> - it seems that code indent of startApp(LingeredApp, String[]) isn't >> correct >> - I don't like startAppVmOpts name, but unfortunately don't have a >> better suggestion (yet) > > I was going to say the same. Jtreg has the concept of "java options" > and "vm options". We have had a fair share of bugs and wasted time > when tests have been using the "vm options" part (VM_OPTIONS, > test.vm.options, etc), and we've been moving away from using that way > to pass options. I recently cleaned up some of this with: > > 8237111: LingeredApp should be started with getTestJavaOpts > > Because of this, I would prefer if we used a name that doesn't include > "VmOpts", because it's too similar to the other concept. Some suggestions: > startAppJavaOptions > startAppUsingJavaOptions > startAppWithJavaOptions > startAppExactJavaOptions > startAppJvmOptions I prefer 'startAppExactJvmOptions' (and the same for runApp..) to make it clear that this method doesn't use the default test options and that the whole combination must be prepared by the user. And I would leave startApp(String... additionalJvmOpts) for cases where additional options are prepended to the standard set. Let me know what you think about this. Leonid > > Thanks, > StefanK > >> Thanks, >> -- Igor >> >>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik >>> wrote: >>> >>> Hi >>> >>> Could you please review following fix which change LingeredApp to >>> prepend vm options to java/vm.test.opts when startApp is used and >>> provide startAppVmOpts to override options completely. >>> >>> The intention is to avoid issue like in this bug where test/jtreg >>> options were ignored by tests. Also I fixed some tests where >>> intention was to append vm options rather than to override them.
>>> >>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/ >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698 >>> >>> Leonid >>> > From chris.plummer at oracle.com Wed Mar 25 18:07:48 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 25 Mar 2020 11:07:48 -0700 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> Message-ID: Hi Roman, Regarding the new assert:

    if (gdata && gdata->assertOn) {
        // Check this is not already tagged.
        jlong tag;
        error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
        if (error != JVMTI_ERROR_NONE) {
            EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
        }
        JDI_ASSERT(tag == NOT_TAGGED);
    }

I think you should remove the gdata check. gdata should never be NULL when you get to this code. If it is ever NULL then there's a bug, and the check will hide the bug. Regarding testing, after you do the submit repo testing let me know the jobID and I'll do additional testing on it. thanks, Chris On 3/25/20 6:00 AM, Roman Kennke wrote: > Hi Sergei, > >> The fix looks pretty clean now. >> I also like new name of the lock.:) > Thank you! > >> Just one comment below.
>> >> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >> >> 110 if (tag != 0l) { >> 111 return; // Already added >> 112 } >> >> It is better to use a named constant or macro instead. >> Also, it'd be nice to add a short comment about what this value is. > As I replied to Chris earlier, this whole block can be turned into an > assert. I also made a constant for the value 0, which should be pretty > much self-explaining. > > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/ > >> How do you test the fix? > I am using a manual test that is provided in this bug report: > https://bugzilla.redhat.com/show_bug.cgi?id=1751985 > > "Script to compare performance of GC with and without debugger, when > many classes are loaded and classes are being unloaded": > > https://bugzilla.redhat.com/attachment.cgi?id=1640688 > > I am also using this test and manually attach/detach jdb a couple of > times in a row to check that disconnecting and reconnecting works well > (this tended to deadlock or crash with an earlier version of the patch, > and is now looking good). > > I am also running tier1 and tier2 tests locally, and as soon as we all > agree that the fix is reasonable, I will push it to the submit repo. I > am not sure if any of those tests actually exercise that code, though. > Let me know if you want me to run any specific tests. > > Thank you, > Roman > > > >> Thanks, >> Serguei >> >> >> On 3/20/20 08:30, Roman Kennke wrote: >>> I believe I came up with a much simpler solution that also solves the >>> problems of the existing one, and the ones I proposed earlier. >>> >>> It turns out that we can take advantage of the fact that we can use >>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly >>> mentioned in the JVMTI spec).
This means we can simply stick a pointer >>> to the signature of a class into the tag, and pull it out again when we >>> get notified that the class gets unloaded. >>> >>> This means we don't need an extra data-structure to keep track of >>> classes and signatures, and it also makes the story around locking >>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all >>> classes needed (as in the current implementation) and no searching of >>> table needed (like in my previous attempts). >>> >>> Please review this new revision: >>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>> >>> (Notice that there still appears to be a performance bottleneck with >>> class-unloading when an actual debugger is attached. This doesn't seem >>> to be related to the classTrack.c implementation though, but looks like >>> a consequence of getting all those class-unload notifications over the >>> wire. My testcase generates 1000s of them, and it's clogging up the >>> buffers.) >>> >>> I am not sure why jdb needs to enable class-unload listener always. A >>> simple hack disables it, and performance is brilliant, even when jdb is >>> attached: >>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>> >>> But this is not in the scope of this bug.) >>> >>> Roman >>> >>> >>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>> Sorry, forgot to complete my comments at the end (see below). >>>> >>>> >>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>> Hi Roman, >>>>> >>>>> Thank you for the update and sorry for the latency in review. >>>>> >>>>> Some comments are below. 
>>>>> >>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>> >>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>> 88 { >>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>> 90 if (currentClassTag == -1) { >>>>> 91 // Class tracking not initialized, nobody's interested >>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>> 93 return; >>>>> 94 } >>>>> Just a question: >>>>> ? Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does >>>>> ????? the class tracking if class tracking has not been initialized? >>>>> >>>>> 70 static jlong currentClassTag; I'm thinking if the name is better to >>>>> be something like: lastClassTag or highestClassTag. >>>>> >>>>> 99 KlassNode* klass = *klass_ptr; >>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // klass not >>>>> found - ignore. >>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>> 108 return; >>>>> 109 } >>>>> ?It seems to me, something is wrong in the condition at L106 above. >>>>> ?Should it be? : >>>>> ??? if (klass == NULL || klass->klass_tag != tag) >>>>> >>>>> ?Otherwise, how can the second check ever work correctly as the return >>>>> will always happen when (klass != NULL)? >>>>> >>>>> >>>>> There are several places in this file with the the indent: >>>>> 90 if (currentClassTag == -1) { >>>>> 91 // Class tracking not initialized, nobody's interested >>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>> 93 return; >>>>> 94 } >>>>> ... >>>>> 152 if (currentClassTag == -1) { >>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>> 155 return; >>>>> 156 } >>>>> ... 
>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>> 163 } >>>>> 164 if (tag != 0l) { >>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>> 166 return; // Already added >>>>> 167 } >>>>> ... >>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>> 282 { >>>>> 283 char* sig = (char*)signatureVoid; >>>>> 284 jvmtiDeallocate(sig); >>>>> 285 return JNI_TRUE; >>>>> 286 } >>>>> ... >>>>> 291 void >>>>> 292 classTrack_reset(void) >>>>> 293 { >>>>> 294 int idx; >>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>> 296 >>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>> 298 KlassNode* node = table[idx]; >>>>> 299 while (node != NULL) { >>>>> 300 KlassNode* next = node->next; >>>>> 301 jvmtiDeallocate(node->signature); >>>>> 302 jvmtiDeallocate(node); >>>>> 303 node = next; >>>>> 304 } >>>>> 305 } >>>>> 306 jvmtiDeallocate(table); >>>>> 307 >>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>> 310 >>>>> 311 currentClassTag = -1; >>>>> 312 >>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>> 314 trackingEnv = NULL; >>>>> 315 >>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>> >>>>> Could you, please, fix several comments below? >>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads >>>>> ?The comma is not needed. >>>>> ?Would it better to replace: klass tags => klass_tag's ? >>>>> >>>>> >>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>>> consistent >>>>> ?Maybe: Lock to guard ... or lock to keep integrity of ... >>>>> >>>>> 84 * Callback when classes are freed, Finds the signature and >>>>> remembers it in deletedSignatureBag. Would be better to use words like >>>>> "store" or "record", "Find" should not start from capital letter: >>>>> Invoke the callback when classes are freed, find and record the >>>>> signature in deletedSignatureBag. 
>>>>> >>>>> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>> nobody's interested 158 /* Check this is not a duplicate */ Missed dot >>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>> comment does not start from a capital letter. 111 // At this point we >>>>> have the KlassNode corresponding to the tag >>>>> 112 // in klass, and the pointer to it in klass_node. >>>> The comment above can be better. Maybe, something like: >>>> ? " At this point, we found the KlassNode matching the klass tag(and it is >>>> linked). >>>> >>>>> 113 // Remember the unloaded signature. >>>> ?Better: Record the signature of the unloaded class and unlink it. >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>> Hello all, >>>>>> >>>>>> Can I please get reviews of this change? In the meantime, we've done >>>>>> more testing and also field-/torture-testing by a customer who is happy >>>>>> now. :-) >>>>>> >>>>>> Thanks, >>>>>> Roman >>>>>> >>>>>> >>>>>>> Hi Serguei, >>>>>>> >>>>>>> Thanks for reviewing! >>>>>>> >>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>> It also includes a fix to allow re-connecting an agent after disconnect, >>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>> >>>>>>> Let me know what you think! >>>>>>> Roman >>>>>>> >>>>>>>> Hi Roman, >>>>>>>> >>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>> >>>>>>>> I have a couple of quick comments. 
>>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>> >>>>>>>> 72 /* >>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>> 74 */ >>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be >>>>>>>> accessed under >>>>>>>> 79 * deletedTagLock, >>>>>>>> 80 */ >>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>> >>>>>>>> ? The comments contradict to each other. >>>>>>>> ? I guess, the lock name at line 79 has to be deletedSignatureLock >>>>>>>> instead of deletedTagLock. >>>>>>>> ? Also, comma at the end must be replaced with dot. >>>>>>>> >>>>>>>> >>>>>>>> 101 // Tag not found? Ignore. >>>>>>>> 102 if (klass == NULL) { >>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>> 104 return; >>>>>>>> 105 } >>>>>>>> 106 >>>>>>>> 107 // Scan linked-list. >>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>> 111 klass = *klass_ptr; >>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>> 113 } >>>>>>>> 114 >>>>>>>> 115 // Tag not found? Ignore. >>>>>>>> 116 if (found_tag != tag) { >>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>> 118 return; >>>>>>>> 119 } >>>>>>>> >>>>>>>> >>>>>>>> ?The code above can be simplified, so that the lines 101-105 are not >>>>>>>> needed anymore. >>>>>>>> ?It can be something like this: >>>>>>>> >>>>>>>> // Scan linked-list. >>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>> klass_ptr = &klass->next; >>>>>>>> klass = *klass_ptr; >>>>>>>> } >>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore. >>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>> return; >>>>>>>> } >>>>>>>> >>>>>>>> It will take more time when I get a chance to look at the rest. 
>>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>> Here comes an update that resolves some races that happen when >>>>>>>>> disconnecting an agent. In particular, we need to take the lock on >>>>>>>>> basically every operation, and also need to check whether or not >>>>>>>>> class-tracking is active and return an appropriate result (e.g. an empty >>>>>>>>> list) when we're not. >>>>>>>>> >>>>>>>>> Updated webrev: >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Roman >>>>>>>>> >>>>>>>>> >>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>> >>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we >>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>> - Prepared classes are kept in a datastructure that is a table, which >>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The table is >>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new KlassNode*. >>>>>>>>>> This is O(1) operation. >>>>>>>>>> - When we get notified of unloading a class, we look up the signature of >>>>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode* >>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation >>>>>>>>>> too, depending on the depth of the table. In my testcase which hammered >>>>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3, >>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>> - when processUnloads() gets called, we simply hand out that bag, and >>>>>>>>>> allocate a new one. >>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the >>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or >>>>>>>>>> re-attached (was missing before). 
>>>>>>>>>> - I also added locks around data-structure-manipulation (was missing >>>>>>>>>> before). >>>>>>>>>> - Also, I only activate this whole process when an actual listener gets >>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a >>>>>>>>>> jdb, not sure why jdb does that though. This may be something to improve >>>>>>>>>> in the future? >>>>>>>>>> >>>>>>>>>> In my tests, the performance of class-tracking itself looks really good. >>>>>>>>>> The bottleneck now is clearly actual synthesizing the class-unload >>>>>>>>>> events. I don't see how this can be helped when the debug agent asks for it? >>>>>>>>>> >>>>>>>>>> Updated webrev: >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>> >>>>>>>>>> Please let me know what you think of it. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Roman >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more >>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now. >>>>>>>>>>> >>>>>>>>>>> Thanks,Roman >>>>>>>>>>> >>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In >>>>>>>>>>>>> the meantime, maybe you can describe your new implementation in >>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>> Sure. >>>>>>>>>>>> >>>>>>>>>>>> The purpose of this class-tracking is to be able to determine the >>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that >>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>> >>>>>>>>>>>> The current implementation does so by maintaining a table of currently >>>>>>>>>>>> prepared classes by building that table when classTrack is initialized, >>>>>>>>>>>> and then add new classes whenever a class gets loaded. 
When unloading >>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the >>>>>>>>>>>> old table, and whatever is in the old, but not in the new table gets >>>>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many >>>>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount) >>>>>>>>>>>> complexity. >>>>>>>>>>>> >>>>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also >>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an >>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes >>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the >>>>>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned. >>>>>>>>>>>> >>>>>>>>>>>> The implementation is not perfect. In order to determine whether or not >>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is >>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that >>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be >>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>> >>>>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it >>>>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that >>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the >>>>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's >>>>>>>>>>>> worth the effort). >>>>>>>>>>>> >>>>>>>>>>>> In addition to all that, this process is only activated when there's an >>>>>>>>>>>> actual listener registered for EI_GC_FINISH. 
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Roman >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>> >>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids >>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of >>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>> >>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent >>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Eg with the testcase provided here: >>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>>>> >>>>>>>>>>>>>> I am getting those numbers: >>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>>>> patched: no debug: 85s with debug: 95s >>>>>>>>>>>>>> >>>>>>>>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>>>>>>>> >>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> From rkennke at redhat.com Wed Mar 25 18:37:23 2020 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 25 Mar 2020 19:37:23 +0100 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> Message-ID: Hi Chris, > Regarding the new assert: >
>     if (gdata && gdata->assertOn) {
>         // Check this is not already tagged.
>         jlong tag;
>         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>         if (error != JVMTI_ERROR_NONE) {
>             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>         }
>         JDI_ASSERT(tag == NOT_TAGGED);
>     }
> > I think you should remove the gdata check. gdata should never be NULL > when you get to this code. If it is ever NULL then there's a bug, and > the check will hide the bug. Ok, will remove this. > Regarding testing, after you do the submit repo testing let me know the > jobID and I'll do additional testing on it. I did the submit repo earlier today, and it came back green: mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762 Thanks, Roman > thanks, > > Chris > > On 3/25/20 6:00 AM, Roman Kennke wrote: >> Hi Sergei, >> >>> The fix looks pretty clean now. >>> I also like new name of the lock.:) >> Thank you!
>> >>> Just one comment below. >>> >>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>> >>> >>> 110 if (tag != 0l) { >>> 111 return; // Already added >>> ? 112???? } >>> >>> ??It is better to use a named constant or macro instead. >>> ??Also, it'd be nice to add a short comment about this value is. >> As I replied to Chris earlier, this whole block can be turned into an >> assert. I also made a constant for the value 0, which should be pretty >> much self-explaining. >> >> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/ >> >>> How do you test the fix? >> I am using a manual test that is provided in this bug report: >> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >> >> "Script to compare performance of GC with and without debugger, when >> many classes are loaded and classes are being unloaded": >> >> https://bugzilla.redhat.com/attachment.cgi?id=1640688 >> >> I am also using this test and manually attach/detach jdb a couple of >> times in a row to check that disconnecting and reconnecting works well >> (this tended to deadlock or crash with an earlier version of the patch, >> and is now looking good). >> >> I am also running tier1 and tier2 tests locally, and as soon as we all >> agree that the fix is reasonable, I will push it to the submit repo. I >> am not sure if any of those tests actually exercise that code, though. >> Let me know if you want me to run any specific tests. >> >> Thank you, >> Roman >> >> >> >>> Thanks, >>> Serguei >>> >>> >>> On 3/20/20 08:30, Roman Kennke wrote: >>>> I believe I came up with a much simpler solution that also solves the >>>> problems of the existing one, and the ones I proposed earlier. >>>> >>>> It turns out that we can take advantage of the fact that we can use >>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>> explicitely >>>> mentioned in the JVMTI spec). 
This means we can simply stick a pointer >>>> to the signature of a class into the tag, and pull it out again when we >>>> get notified that the class gets unloaded. >>>> >>>> This means we don't need an extra data-structure to keep track of >>>> classes and signatures, and it also makes the story around locking >>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all >>>> classes needed (as in the current implementation) and no searching of >>>> table needed (like in my previous attempts). >>>> >>>> Please review this new revision: >>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>> >>>> (Notice that there still appears to be a performance bottleneck with >>>> class-unloading when an actual debugger is attached. This doesn't seem >>>> to be related to the classTrack.c implementation though, but looks like >>>> a consequence of getting all those class-unload notifications over the >>>> wire. My testcase generates 1000s of them, and it's clogging up the >>>> buffers.) >>>> >>>> I am not sure why jdb needs to enable class-unload listener always. A >>>> simple hack disables it, and performance is brilliant, even when jdb is >>>> attached: >>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>> >>>> But this is not in the scope of this bug.) >>>> >>>> Roman >>>> >>>> >>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>> Sorry, forgot to complete my comments at the end (see below). >>>>> >>>>> >>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Roman, >>>>>> >>>>>> Thank you for the update and sorry for the latency in review. >>>>>> >>>>>> Some comments are below. >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>> >>>>>> >>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>> ?? 
88 { >>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>> 90 if (currentClassTag == -1) { >>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>> 93 return; >>>>>> ?? 94???? } >>>>>> Just a question: >>>>>> ?? Q1: Should the ObjectFree events be disabled for the jvmtiEnv >>>>>> that does >>>>>> ?????? the class tracking if class tracking has not been initialized? >>>>>> >>>>>> 70 static jlong currentClassTag; I'm thinking if the name is >>>>>> better to >>>>>> be something like: lastClassTag or highestClassTag. >>>>>> >>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // klass >>>>>> not >>>>>> found - ignore. >>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>> 108 return; >>>>>> ? 109???? } >>>>>> ??It seems to me, something is wrong in the condition at L106 above. >>>>>> ??Should it be? : >>>>>> ???? if (klass == NULL || klass->klass_tag != tag) >>>>>> >>>>>> ??Otherwise, how can the second check ever work correctly as the >>>>>> return >>>>>> will always happen when (klass != NULL)? >>>>>> >>>>>> ? There are several places in this file with the the indent: >>>>>> 90 if (currentClassTag == -1) { >>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>> 93 return; >>>>>> ?? 94???? } >>>>>> ? ... >>>>>> 152 if (currentClassTag == -1) { >>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>> 155 return; >>>>>> ? 156???? } >>>>>> ? ... >>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>>> ? 163???? 
} >>>>>> 164 if (tag != 0l) { >>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>> 166 return; // Already added >>>>>> ? 167???? } >>>>>> ? ... >>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>> 282 { >>>>>> 283 char* sig = (char*)signatureVoid; >>>>>> 284 jvmtiDeallocate(sig); >>>>>> 285 return JNI_TRUE; >>>>>> ? 286 } >>>>>> ? ... >>>>>> ? 291 void >>>>>> ? 292 classTrack_reset(void) >>>>>> ? 293 { >>>>>> 294 int idx; >>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>> 296 >>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>> 298 KlassNode* node = table[idx]; >>>>>> 299 while (node != NULL) { >>>>>> 300 KlassNode* next = node->next; >>>>>> 301 jvmtiDeallocate(node->signature); >>>>>> 302 jvmtiDeallocate(node); >>>>>> 303 node = next; >>>>>> 304 } >>>>>> 305 } >>>>>> 306 jvmtiDeallocate(table); >>>>>> 307 >>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>> 310 >>>>>> 311 currentClassTag = -1; >>>>>> 312 >>>>>> 313 >>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>> 314 trackingEnv = NULL; >>>>>> 315 >>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>> >>>>>> Could you, please, fix several comments below? >>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>> class-unloads >>>>>> ??The comma is not needed. >>>>>> ??Would it better to replace: klass tags => klass_tag's ? >>>>>> >>>>>> >>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>>>> consistent >>>>>> ??Maybe: Lock to guard ... or lock to keep integrity of ... >>>>>> >>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>> remembers it in deletedSignatureBag. Would be better to use words >>>>>> like >>>>>> "store" or "record", "Find" should not start from capital letter: >>>>>> Invoke the callback when classes are freed, find and record the >>>>>> signature in deletedSignatureBag. 
>>>>>> >>>>>> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>> nobody's interested 158 /* Check this is not a duplicate */ Missed >>>>>> dot >>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>> comment does not start from a capital letter. 111 // At this point we >>>>>> have the KlassNode corresponding to the tag >>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>> ? The comment above can be better. Maybe, something like: >>>>> ? ? " At this point, we found the KlassNode matching the klass >>>>> tag(and it is >>>>> linked). >>>>> >>>>>> 113 // Remember the unloaded signature. >>>>> ??Better: Record the signature of the unloaded class and unlink it. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>> Hello all, >>>>>>> >>>>>>> Can I please get reviews of this change? In the meantime, we've done >>>>>>> more testing and also field-/torture-testing by a customer who is >>>>>>> happy >>>>>>> now. :-) >>>>>>> >>>>>>> Thanks, >>>>>>> Roman >>>>>>> >>>>>>> >>>>>>>> Hi Serguei, >>>>>>>> >>>>>>>> Thanks for reviewing! >>>>>>>> >>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>> disconnect, >>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>> >>>>>>>> Let me know what you think! >>>>>>>> Roman >>>>>>>> >>>>>>>>> Hi Roman, >>>>>>>>> >>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>> >>>>>>>>> I have a couple of quick comments. 
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 72 /*
>>>>>>>>> 73  * Lock to protect deletedSignatureBag
>>>>>>>>> 74  */
>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>> 76
>>>>>>>>> 77 /*
>>>>>>>>> 78  * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>> 79  * deletedTagLock,
>>>>>>>>> 80  */
>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>
>>>>>>>>> The comments contradict each other.
>>>>>>>>> I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>> instead of deletedTagLock.
>>>>>>>>> Also, the comma at the end must be replaced with a dot.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 101     // Tag not found? Ignore.
>>>>>>>>> 102     if (klass == NULL) {
>>>>>>>>> 103         debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 104         return;
>>>>>>>>> 105     }
>>>>>>>>> 106
>>>>>>>>> 107     // Scan linked-list.
>>>>>>>>> 108     jlong found_tag = klass->klass_tag;
>>>>>>>>> 109     while (klass != NULL && found_tag != tag) {
>>>>>>>>> 110         klass_ptr = &klass->next;
>>>>>>>>> 111         klass = *klass_ptr;
>>>>>>>>> 112         found_tag = klass->klass_tag;
>>>>>>>>> 113     }
>>>>>>>>> 114
>>>>>>>>> 115     // Tag not found? Ignore.
>>>>>>>>> 116     if (found_tag != tag) {
>>>>>>>>> 117         debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 118         return;
>>>>>>>>> 119     }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The code above can be simplified, so that lines 101-105 are not
>>>>>>>>> needed anymore.
>>>>>>>>> It can be something like this:
>>>>>>>>>
>>>>>>>>>     // Scan linked-list.
>>>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>         klass_ptr = &klass->next;
>>>>>>>>>         klass = *klass_ptr;
>>>>>>>>>     }
>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>         return;
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> It will take more time before I get a chance to look at the rest.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>> lock on basically every operation, and also need to check whether
>>>>>>>>>> or not class-tracking is active and return an appropriate result
>>>>>>>>>> (e.g. an empty list) when it is not.
>>>>>>>>>>
>>>>>>>>>> Updated webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>
>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>>>> set up a listener to get notified when it is unloaded.
>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>>>>>>>>> indexed by tag % slot-count, and we then simply prepend the new
>>>>>>>>>>> KlassNode*. This is an O(1) operation.
>>>>>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>>>>>> is then unlinked from the table and deallocated. This is a ~O(1)
>>>>>>>>>>> operation too, depending on the depth of the table. In my testcase,
>>>>>>>>>>> which hammered the code with class-loads and unloads, I usually see
>>>>>>>>>>> depths of about 2-3, but not usually more. It should be ok.
>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that bag, and
>>>>>>>>>>> allocate a new one.
>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>> leaking the >>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>>>> and/or >>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>>>> missing >>>>>>>>>>> before). >>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>> listener gets >>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>>> attaching a >>>>>>>>>>> jdb, not sure why jdb does that though. This may be something >>>>>>>>>>> to improve >>>>>>>>>>> in the future? >>>>>>>>>>> >>>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>>> really good. >>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>> class-unload >>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>> agent asks for it? >>>>>>>>>>> >>>>>>>>>>> Updated webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>> >>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>> the even more >>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing >>>>>>>>>>>> for now. >>>>>>>>>>>> >>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>> >>>>>>>>>>>> ? Hi Chris, >>>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>>> Sure. 
>>>>>>>>>>>>> >>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>> determine the >>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>> happened, so that >>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>> >>>>>>>>>>>>> The current implementation does so by maintaining a table >>>>>>>>>>>>> of currently >>>>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>>>> initialized, >>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When >>>>>>>>>>>>> unloading >>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>> compared with the >>>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>>> table gets >>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>> and/or many >>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>> complexity. >>>>>>>>>>>>> >>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>> classes, and also >>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>> Whenever an >>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>>>> and classes >>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>> maintaining the >>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>> >>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>> whether or not >>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>> That process is >>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>>> is that >>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>> seems to be >>>>>>>>>>>>> true, and also reasonable to expect. 
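For readers following the thread, the list-plus-bag scheme described in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from any of the webrevs: the KlassNode layout, the function names track_prepare() and collect_unloaded(), and the plain array standing in for deletedTagBag are all hypothetical, and int64_t stands in for jlong.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the agent's node type (not the real classTrack.c layout). */
typedef struct KlassNode {
    int64_t tag;              /* stand-in for a jlong JVMTI tag */
    char *signature;          /* heap-allocated class signature */
    struct KlassNode *next;
} KlassNode;

/* Prepend a node for a newly prepared class. Returns 0 on success, -1 on allocation failure. */
static int track_prepare(KlassNode **head, int64_t tag, const char *sig) {
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) return -1;
    size_t len = strlen(sig) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) { free(node); return -1; }
    memcpy(node->signature, sig, len);
    node->tag = tag;
    node->next = *head;
    *head = node;
    return 0;
}

/* Linear scan of the deleted tags: this is the O(unloadedClassCount) step. */
static int tag_deleted(const int64_t *deleted, size_t n, int64_t tag) {
    for (size_t i = 0; i < n; i++) {
        if (deleted[i] == tag) return 1;
    }
    return 0;
}

/* Unlink every node whose tag is in deleted[] and hand its signature to out[]
 * (the caller then owns and frees the strings). Returns the number collected. */
static size_t collect_unloaded(KlassNode **head, const int64_t *deleted, size_t n,
                               char **out, size_t out_cap) {
    size_t count = 0;
    KlassNode **link = head;          /* points at the pointer that refers to the current node */
    while (*link != NULL) {
        KlassNode *node = *link;
        if (count < out_cap && tag_deleted(deleted, n, node->tag)) {
            *link = node->next;       /* unlink without needing a "previous" pointer */
            out[count++] = node->signature;
            free(node);
        } else {
            link = &node->next;
        }
    }
    return count;
}
```

The pointer-to-pointer walk keeps the prepared-classes list intact while removing entries, which is the "thus maintaining the prepared-classes-list" part of the description.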
>>>>>>>>>>>>> >>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>>> and build the >>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>> that it's >>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>> >>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>> when there's an >>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. >>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an >>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>> timing. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Eg with the testcase provided here: >>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am getting those numbers: >>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>>>>> patched:?? 
no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>

From leonid.mesnik at oracle.com Wed Mar 25 19:01:30 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 12:01:30 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To: <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
Message-ID: <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>

Added Ioi, who also proposed a new version of startAppVmOpts.

Please find the new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/

I renamed startAppVmOpts/runAppVmOpts to startAppExactJvmOpts/runAppExactJvmOpts. It should make very clear that this method doesn't use any of test.java.opts, test.vm.opts.

Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java mentioned by Igor, and removed the null pointer check in the startApp method, as Ioi suggested.

+ public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
+     startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
+ }

Leonid

On 3/25/20 10:14 AM, Stefan Karlsson wrote:
> On 2020-03-25 17:40, Igor Ignatyev wrote:
>> Hi Leonid,
>>
>> I have briefly looked at the patch, a few comments so far:
>>
>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>   - at L#114, could you please call static method using class name
>> (as the opposite of using instance)?
or was it meant to be >> theApp.runAppVmOpts(vmArgs) ? >> >> test/lib/jdk/test/lib/apps/LingeredApp.java: >> - it seems that code indent of startApp(LingeredApp, String[]) isn't >> correct >> - I don't like startAppVmOpts name, but unfortunately don't have a >> better suggestion (yet) > > I was going to say the same. Jtreg has the concept of "java options" > and "vm options". We have had a fair share of bugs and wasted time > when tests have been using the "vm options" part (VM_OPTIONS, > test.vm.options, etc), and we've been moving away from using that way > to pass options. I recently cleaned up some of this with: > > 8237111: LingeredApp should be started with getTestJavaOpts > > Because of this, I would prefer if we used a name that doesn't include > "VmOpts", because it's too alike the other concept. Some suggestions: > ?startAppJavaOptions > ?startAppUsingJavaOptions > ?startAppWithJavaOptions > ?startAppExactJavaOptions > ?startAppJvmOptions > > Thanks, > StefanK > >> Thanks, >> -- Igor >> >>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik >>> wrote: >>> >>> Hi >>> >>> Could you please review following fix which change LingeredApp to >>> prepend vm options to java/vm.test.opts when startApp is used and >>> provide startAppVmOpts to override options completely. >>> >>> The intention is to avoid issue like in this bug where test/jtreg >>> options were ignored by tests. Also I fixed some tests where >>> intention was to append vm options rather than to override them. >>> >>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/ >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698 >>> >>> Leonid >>> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan.karlsson at oracle.com Wed Mar 25 19:06:57 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 25 Mar 2020 20:06:57 +0100 Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified In-Reply-To: <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> Message-ID: <375a722e-2397-450f-51d0-781b8bdd9ee8@oracle.com> Thanks for changing the name. Sounds good to me. I leave the full review to others. StefanK On 2020-03-25 20:01, Leonid Mesnik wrote: > > Added Ioi, who also proposed new version of startAppVmOpts. > > Please find new webrev: > http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ > > Renamed startAppVmOpts/runAppVmOpts to > "startAppExactJvmOpts/runAppExactJvmOpts" is used. It should make very > clear that this method doesn't use any of test.java.opts, test.vm.opts. > > Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java > metnioned by Igor, and removed null pointer check as Ioi suggested in > startApp method. > > + public static void startApp(LingeredApp theApp, String... > additionalJvmOpts) throws IOException { > + startAppExactJvmOpts(theApp, > Utils.appendTestJavaOpts(additionalJvmOpts)); > + } > > Leonid > > On 3/25/20 10:14 AM, Stefan Karlsson wrote: >> On 2020-03-25 17:40, Igor Ignatyev wrote: >>> Hi Leonid, >>> >>> I have briefly looked at the patch, a few comments so far: >>> >>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>> ? - at L#114, could you please call static method using class name >>> (as the opposite of using instance)? or was it meant to be >>> theApp.runAppVmOpts(vmArgs) ? 
>>> >>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>> - it seems that code indent of startApp(LingeredApp, String[]) isn't >>> correct >>> - I don't like startAppVmOpts name, but unfortunately don't have a >>> better suggestion (yet) >> >> I was going to say the same. Jtreg has the concept of "java options" >> and "vm options". We have had a fair share of bugs and wasted time >> when tests have been using the "vm options" part (VM_OPTIONS, >> test.vm.options, etc), and we've been moving away from using that way >> to pass options. I recently cleaned up some of this with: >> >> 8237111: LingeredApp should be started with getTestJavaOpts >> >> Because of this, I would prefer if we used a name that doesn't >> include "VmOpts", because it's too alike the other concept. Some >> suggestions: >> ?startAppJavaOptions >> ?startAppUsingJavaOptions >> ?startAppWithJavaOptions >> ?startAppExactJavaOptions >> ?startAppJvmOptions >> >> Thanks, >> StefanK >> >>> Thanks, >>> -- Igor >>> >>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik >>>> wrote: >>>> >>>> Hi >>>> >>>> Could you please review following fix which change LingeredApp to >>>> prepend vm options to java/vm.test.opts when startApp is used and >>>> provide startAppVmOpts to override options completely. >>>> >>>> The intention is to avoid issue like in this bug where test/jtreg >>>> options were ignored by tests. Also I fixed some tests where >>>> intention was to append vm options rather than to override them. 
>>>>
>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>
>>>> Leonid
>>>>
>>

From magnus.ihse.bursie at oracle.com Wed Mar 25 19:29:53 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Wed, 25 Mar 2020 20:29:53 +0100
Subject: RFR: JDK-8241618 Fix unchecked warning for jdk.hotspot.agent
Message-ID: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com>

With the recent fixes in JDK-8241310, JDK-8237746 and JDK-8241073, and the upcoming fixes to remove the deprecated nashorn and jdk.rmi, the JDK build is very close to producing no warnings when compiling the Java classes.

The one remaining sinner is jdk.hotspot.agent. Most of the warnings here are turned off, but unchecked and deprecation cannot be completely silenced.

Since the poor agent does not seem to receive much love nowadays, I took it upon myself to fix these warnings, so we can finally get a quiet build.

I started to address the unchecked warnings. Unfortunately, this was a much bigger task than I anticipated. I had to generify most of the module. On the plus side, the code is so much better now. And most of the changes were trivial, just tedious.

There are a few places where I'm not entirely happy with the current solution, and that at least merits some discussion.

I have resorted to @SuppressWarnings in four classes: ciMethodData, MethodData, TableModelComparator and VirtualBaseConstructor. All of them have in common that they are doing slightly fishy things with classes in collections. I'm not entirely sure they are bug-free, but this patch leaves the behavior untouched. I made some effort to sort out the logic, but it turned out to be too hairy for me to fix, and it will probably require more substantial changes to the workings of the code.

To make the code valid, I have moved ConstMethod to extend Metadata instead of VMObject.
My understanding is that this is benign (and likely intended), but I really need someone who knows the code to confirm this. I have also added a FIXME to signal this. I'll remove the FIXME as soon as I get confirmation that this is OK.
(The reason for this is the following piece of code from Metadata.java: metadataConstructor.addMapping("ConstMethod", ConstMethod.class))

In ObjectListPanel, there is some code that screams "dead" with this change. I added a FIXME to point this out:
    for (Iterator iter = elements.iterator(); iter.hasNext(); ) {
      if (iter.next() instanceof Array) {
        // FIXME: Does not seem possible to happen
        hasArrays = true;
        return;
      }
It seems that if you start pulling this thread, even more dead code will unravel, so I'm not so eager to touch this in the current patch. But I can remove the FIXME if you want.

My first iteration of this patch tried to generify the IntervalTree and related class hierarchy. However, this turned out to be impossible due to some weird usage in AnnotatedMemoryPanel, where there seemed to be confusion as to whether the tree stored Annotations or Addresses. I'm not entirely convinced the code is correct; it certainly looked and smelled very fishy. However, I reverted these changes since I could not get them to work due to this, and they were not needed for the goal of just getting rid of the warning.

Finally, I have done no testing apart from verifying that it builds. Please advise on suitable tests to run.
Bug: https://bugs.openjdk.java.net/browse/JDK-8241618 WebRev: http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.01 /Magnus From chris.plummer at oracle.com Wed Mar 25 19:36:34 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 25 Mar 2020 12:36:34 -0700 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> Message-ID: <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com> An HTML attachment was scrubbed... 
URL:
From igor.ignatyev at oracle.com Wed Mar 25 19:46:15 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 25 Mar 2020 12:46:15 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To: <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
Message-ID: <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>

Hi Leonid,

this is not directly related to your patch (but is made somewhat more obvious by it): it seems all (or at least almost all) the tests which use LingeredApp should be run in "driver" mode, as they just orchestrate execution of other JVMs, so running them w/ main (let alone main/othervm) just wastes time. test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for example, will now be executed w/ Xcomp, which will make it very slow for no reason. since you already got your hands dirty w/ these tests, could you please file an RFE to sort this out and list all the affected tests there?

re: the patch, could you please update the ClhsdbJstack.java test not to be run w/ Xcomp and follow the same pattern you used in other tests (e.g. ClhsdbScanOops)? other than that it looks fine to me. I however wouldn't be able to tell if all svc tests continue to do what they were supposed to, so I'd prefer for someone from the svc team to chime in.

Thanks,
-- Igor

> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik wrote:
>
> Added Ioi, who also proposed new version of startAppVmOpts.
>
> Please find new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
> Renamed startAppVmOpts/runAppVmOpts to "startAppExactJvmOpts/runAppExactJvmOpts" is used. It should make very clear that this method doesn't use any of test.java.opts, test.vm.opts.
> > Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java metnioned by Igor, and removed null pointer check as Ioi suggested in startApp method. > > + public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException { > + startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts)); > + } > > Leonid > > On 3/25/20 10:14 AM, Stefan Karlsson wrote: >> On 2020-03-25 17:40, Igor Ignatyev wrote: >>> Hi Leonid, >>> >>> I have briefly looked at the patch, a few comments so far: >>> >>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>> - at L#114, could you please call static method using class name (as the opposite of using instance)? or was it meant to be theApp.runAppVmOpts(vmArgs) ? >>> >>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>> - it seems that code indent of startApp(LingeredApp, String[]) isn't correct >>> - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet) >> >> I was going to say the same. Jtreg has the concept of "java options" and "vm options". We have had a fair share of bugs and wasted time when tests have been using the "vm options" part (VM_OPTIONS, test.vm.options, etc), and we've been moving away from using that way to pass options. I recently cleaned up some of this with: >> >> 8237111: LingeredApp should be started with getTestJavaOpts >> >> Because of this, I would prefer if we used a name that doesn't include "VmOpts", because it's too alike the other concept. Some suggestions: >> startAppJavaOptions >> startAppUsingJavaOptions >> startAppWithJavaOptions >> startAppExactJavaOptions >> startAppJvmOptions >> >> Thanks, >> StefanK >> >>> Thanks, >>> -- Igor >>> >>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik wrote: >>>> >>>> Hi >>>> >>>> Could you please review following fix which change LingeredApp to prepend vm options to java/vm.test.opts when startApp is used and provide startAppVmOpts to override options completely. 
>>>>
>>>> The intention is to avoid issue like in this bug where test/jtreg options were ignored by tests. Also I fixed some tests where intention was to append vm options rather than to override them.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>
>>>> Leonid
>>>>
>>

From chris.plummer at oracle.com Wed Mar 25 19:52:02 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Mar 2020 12:52:02 -0700
Subject: RFR: JDK-8241618 Fix unchecked warning for jdk.hotspot.agent
In-Reply-To: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com>
References: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com>
Message-ID: <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com>

Hi Magnus,

I haven't looked at the changes yet, other than to see that there are many files touched, but after reading below (and only partly understanding, since I don't know this area well), I was wondering if this issue wouldn't be better served with multiple passes made to fix the warnings. Start with a straightforward one where you are maybe only making one or two types of changes, but that affect a large number of files and don't cascade into other more complicated changes. This will get a lot of the noise out of the way, and then we can focus on some of the harder issues you bring up below.
As for testing, I think the following list will capture all of them, but can't say for sure:

open/test/hotspot/jtreg/serviceability/sa
open/test/hotspot/jtreg/resourcehogs/serviceability/sa
open/test/jdk/sun/tools/jhsdb
open/test/jdk/sun/tools/jstack
open/test/jdk/sun/tools/jmap
open/test/hotspot/jtreg/gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java
open/test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java
open/test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java

Chris

On 3/25/20 12:29 PM, Magnus Ihse Bursie wrote:
> With the recent fixes in JDK-8241310, JDK-8237746 and JDK-8241073, and
> the upcoming fixes to remove the deprecated nashorn and jdk.rmi, the
> JDK build is very close to producing no warnings when compiling the
> Java classes.
>
> The one remaining sinner is jdk.hotspot.agent. Most of the warnings
> here are turned off, but unchecked and deprecation cannot be
> completely silenced.
>
> Since the poor agent does not seem to receive much love nowadays, I
> took it upon myself to fix these warnings, so we can finally get a
> quiet build.
>
> I started to address the unchecked warnings. Unfortunately, this was a
> much bigger task than I anticipated. I had to generify most of the
> module. On the plus side, the code is so much better now. And most of
> the changes were trivial, just tedious.
>
> There are a few places were I'm not entirely happy with the current
> solution, and that at least merits some discussion.
>
> I have resorted to @SuppressWarnings in four classes: ciMethodData,
> MethodData, TableModelComparator and VirtualBaseConstructor. All of
> them has in common that they are doing slightly fishy things with
> classes in collections. I'm not entirely sure they are bug-free, but
> this patch leaves the behavior untouched. I did some efforts to sort
> out the logic, but it turned out to be too hairy for me to fix, and it
> will probably require more substantial changes to the workings of the
> code.
> > To make the code valid, I have moved ConstMethod to extend Metadata > instead of VMObject. My understanding is that this is benign (and > likely intended), but I really need for someone who knows the code to > confirm this. I have also added a FIXME to signal this. I'll remove > the FIXME as soon as I get confirmation that this is OK. > (The reason for this is the following piece of code from > Metadata.java: metadataConstructor.addMapping("ConstMethod", > ConstMethod.class)) > > In ObjectListPanel, there is some code that screams "dead" with this > change. I added a FIXME to point this out: > ??? for (Iterator iter = elements.iterator(); iter.hasNext(); ) { > ????? if (iter.next() instanceof Array) { > ??????? // FIXME: Does not seem possible to happen > ??????? hasArrays = true; > ??????? return; > ????? } > It seems that if you start pulling this thread, even more dead code > will unravel, so I'm not so eager to touch this in the current patch. > But I can remove the FIXME if you want. > > My first iteration of this patch tried to generify the IntervalTree > and related class hierarchy. However, this turned out to be impossible > due to some weird usage in AnnotatedMemoryPanel, where there seemed to > be confusion as to whether the tree stored Annotations or Addresses. > I'm not entirely convinced the code is correct, it certainly looked > and smelled very fishy. However, I reverted these changes since I > could not get them to work due to this, and it was not needed for the > goal of just getting rid of the warning. > > Finally, I have done no testing apart from verifying that it builds. > Please advice on suitable tests to run. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8241618 > WebRev: > http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.01 > > /Magnus From rkennke at redhat.com Wed Mar 25 19:59:10 2020 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 25 Mar 2020 20:59:10 +0100 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com> Message-ID: <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com> Hi Chris, Apparently we can get into classTrack_reset() before calling activate(), and we're seeing a null deletedSignatureBag. A simple NULL-check around the cleaning routine fixes the problem for me. http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/ Should I post another submit-repo job with that fix? 
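The shape of that fix, a reset routine that tolerates being called before activation, can be sketched as below. This is a minimal illustration under stated assumptions, not the webrev.08 code: SigBag, tracker_activate(), tracker_record() and tracker_reset() are hypothetical stand-ins for the agent's bag and for the activate/reset entry points in classTrack.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the agent's bag of deleted signatures. */
typedef struct SigBag {
    char **items;
    size_t count;
} SigBag;

/* NULL until tracker_activate() has run, mirroring the situation in the report. */
static SigBag *deleted_signatures = NULL;

/* Create the bag on first activation. Returns 0 on success, -1 on failure. */
static int tracker_activate(void) {
    if (deleted_signatures != NULL) return 0;
    deleted_signatures = calloc(1, sizeof *deleted_signatures);
    return deleted_signatures != NULL ? 0 : -1;
}

/* Remember a copy of an unloaded class signature. Returns -1 if tracking is inactive. */
static int tracker_record(const char *sig) {
    if (deleted_signatures == NULL) return -1;
    size_t len = strlen(sig) + 1;
    char *copy = malloc(len);
    if (copy == NULL) return -1;
    memcpy(copy, sig, len);
    char **grown = realloc(deleted_signatures->items,
                           (deleted_signatures->count + 1) * sizeof(char *));
    if (grown == NULL) { free(copy); return -1; }
    grown[deleted_signatures->count++] = copy;
    deleted_signatures->items = grown;
    return 0;
}

/* Release everything. The NULL check makes a reset before activation a no-op
 * instead of a crash. Returns the number of signatures freed. */
static size_t tracker_reset(void) {
    size_t freed = 0;
    if (deleted_signatures != NULL) {
        for (size_t i = 0; i < deleted_signatures->count; i++) {
            free(deleted_signatures->items[i]);
            freed++;
        }
        free(deleted_signatures->items);
        free(deleted_signatures);
        deleted_signatures = NULL;
    }
    return freed;
}
```

Setting the pointer back to NULL after cleanup also keeps a second reset, or a reset followed by re-activation, well defined.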
Thanks,
Roman

> Hi Roman,
>
> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>
> Stack: [0x00007fbb790f9000,0x00007fbb791fa000], sp=0x00007fbb791f8af0, free space=1022k
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C [libjdwp.so+0xdb71] bagEnumerateOver+0x11
> C [libjdwp.so+0xe365] classTrack_reset+0x25
> C [libjdwp.so+0xfca1] debugInit_reset+0x71
> C [libjdwp.so+0x12e0d] debugLoop_run+0x38d
> C [libjdwp.so+0x25700] acceptThread+0x80
> V [libjvm.so+0xf4b5a7] JvmtiAgentThread::call_start_function()+0x1c7
> V [libjvm.so+0x15215c6] JavaThread::thread_main_inner()+0x226
> V [libjvm.so+0x1527736] Thread::call_run()+0xf6
> V [libjvm.so+0x1250ade] thread_native_entry(Thread*)+0x10e
>
>
> This happened during a test task run of open/test/jdk/:jdk_jdi. There
> doesn't seem to be anything magic on the command line that might be
> triggering it. I see it with pretty much all the various VM configs we test.
>
> I'm also seeing crashes in the following tests, but not as often:
>
> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>
> thanks,
>
> Chris
>
>
> On 3/25/20 11:37 AM, Roman Kennke wrote:
>> Hi Chris,
>>
>>> Regarding the new assert:
>>>
>>>  105     if (gdata && gdata->assertOn) {
>>>  106         // Check this is not already tagged.
>>>  107         jlong tag;
>>>  108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>  109         if (error != JVMTI_ERROR_NONE) {
>>>  110             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>  111         }
>>>  112         JDI_ASSERT(tag == NOT_TAGGED);
>>>  113     }
>>>
>>> I think you should remove the gdata check. gdata should never be NULL
>>> when you get to this code. If it is ever NULL then there's a bug, and
>>> the check will hide the bug.
>> Ok, will remove this.
>>
>>> Regarding testing, after you do the submit repo testing let me know the
>>> jobID and I'll do additional testing on it.
>> I did the submit repo earlier today, and it came back green:
>>
>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>
>> Thanks,
>> Roman
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>> Hi Sergei,
>>>>
>>>>> The fix looks pretty clean now.
>>>>> I also like the new name of the lock. :)
>>>> Thank you!
>>>>
>>>>> Just one comment below.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>
>>>>>
>>>>> 110     if (tag != 0l) {
>>>>> 111         return; // Already added
>>>>> 112     }
>>>>>
>>>>> It is better to use a named constant or macro instead.
>>>>> Also, it'd be nice to add a short comment about what this value is.
>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>> assert. I also made a constant for the value 0, which should be pretty
>>>> much self-explanatory.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>
>>>>> How do you test the fix?
>>>> I am using a manual test that is provided in this bug report:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>
>>>> "Script to compare performance of GC with and without debugger, when
>>>> many classes are loaded and classes are being unloaded":
>>>>
>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>
>>>> I am also using this test and manually attaching/detaching jdb a couple of
>>>> times in a row to check that disconnecting and reconnecting works well
>>>> (this tended to deadlock or crash with an earlier version of the patch,
>>>> and is now looking good).
>>>> >>>> I am also running tier1 and tier2 tests locally, and as soon as we all >>>> agree that the fix is reasonable, I will push it to the submit repo. I >>>> am not sure if any of those tests actually exercise that code, though. >>>> Let me know if you want me to run any specific tests. >>>> >>>> Thank you, >>>> Roman >>>> >>>> >>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 3/20/20 08:30, Roman Kennke wrote: >>>>>> I believe I came up with a much simpler solution that also solves the >>>>>> problems of the existing one, and the ones I proposed earlier. >>>>>> >>>>>> It turns out that we can take advantage of the fact that we can use >>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>>> explicitely >>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer >>>>>> to the signature of a class into the tag, and pull it out again when we >>>>>> get notified that the class gets unloaded. >>>>>> >>>>>> This means we don't need an extra data-structure to keep track of >>>>>> classes and signatures, and it also makes the story around locking >>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all >>>>>> classes needed (as in the current implementation) and no searching of >>>>>> table needed (like in my previous attempts). >>>>>> >>>>>> Please review this new revision: >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>> >>>>>> (Notice that there still appears to be a performance bottleneck with >>>>>> class-unloading when an actual debugger is attached. This doesn't seem >>>>>> to be related to the classTrack.c implementation though, but looks like >>>>>> a consequence of getting all those class-unload notifications over the >>>>>> wire. My testcase generates 1000s of them, and it's clogging up the >>>>>> buffers.) >>>>>> >>>>>> I am not sure why jdb needs to enable class-unload listener always. 
A simple hack disables it, and performance is brilliant, even when jdb is attached:
>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>
>>>>>> But this is not in the scope of this bug.)
>>>>>>
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>
>>>>>>>
>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>
>>>>>>>> Some comments are below.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>
>>>>>>>>
>>>>>>>>   87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>   88 {
>>>>>>>>   89     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>   90     if (currentClassTag == -1) {
>>>>>>>>   91         // Class tracking not initialized, nobody's interested
>>>>>>>>   92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>   93         return;
>>>>>>>>   94     }
>>>>>>>> Just a question:
>>>>>>>>    Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>>>>        the class tracking if class tracking has not been initialized?
>>>>>>>>
>>>>>>>>   70 static jlong currentClassTag;
>>>>>>>> I'm thinking if the name is better to be something like: lastClassTag or highestClassTag.
>>>>>>>>
>>>>>>>>   99     KlassNode* klass = *klass_ptr;
>>>>>>>>  100
>>>>>>>>  102     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>  103         klass_ptr = &klass->next;
>>>>>>>>  104         klass = *klass_ptr;
>>>>>>>>  105     }
>>>>>>>>  106     if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>  107         debugMonitorExit(deletedSignatureLock);
>>>>>>>>  108         return;
>>>>>>>>  109     }
>>>>>>>>   It seems to me, something is wrong in the condition at L106 above.
>>>>>>>>   Should it be? :
>>>>>>>>      if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>
>>>>>>>>   Otherwise, how can the second check ever work correctly as the return
>>>>>>>>   will always happen when (klass != NULL)?
>>>>>>>>
>>>>>>>>   There are several places in this file with the indent:
>>>>>>>>   90     if (currentClassTag == -1) {
>>>>>>>>   91         // Class tracking not initialized, nobody's interested
>>>>>>>>   92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>   93         return;
>>>>>>>>   94     }
>>>>>>>>   ...
>>>>>>>>  152     if (currentClassTag == -1) {
>>>>>>>>  153         // Class tracking not initialized yet, nobody's interested
>>>>>>>>  154         debugMonitorExit(deletedSignatureLock);
>>>>>>>>  155         return;
>>>>>>>>  156     }
>>>>>>>>   ...
>>>>>>>>  161     if (error != JVMTI_ERROR_NONE) {
>>>>>>>>  162         EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>  163     }
>>>>>>>>  164     if (tag != 0l) {
>>>>>>>>  165         debugMonitorExit(deletedSignatureLock);
>>>>>>>>  166         return; // Already added
>>>>>>>>  167     }
>>>>>>>>   ...
>>>>>>>>  281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>  282 {
>>>>>>>>  283     char* sig = (char*)signatureVoid;
>>>>>>>>  284     jvmtiDeallocate(sig);
>>>>>>>>  285     return JNI_TRUE;
>>>>>>>>  286 }
>>>>>>>>   ...
>>>>>>>>  291 void
>>>>>>>>  292 classTrack_reset(void)
>>>>>>>>  293 {
>>>>>>>>  294     int idx;
>>>>>>>>  295     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>  296
>>>>>>>>  297     for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>  298         KlassNode* node = table[idx];
>>>>>>>>  299         while (node != NULL) {
>>>>>>>>  300             KlassNode* next = node->next;
>>>>>>>>  301             jvmtiDeallocate(node->signature);
>>>>>>>>  302             jvmtiDeallocate(node);
>>>>>>>>  303             node = next;
>>>>>>>>  304         }
>>>>>>>>  305     }
>>>>>>>>  306     jvmtiDeallocate(table);
>>>>>>>>  307
>>>>>>>>  308     bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>  309     bagDestroyBag(deletedSignatureBag);
>>>>>>>>  310
>>>>>>>>  311     currentClassTag = -1;
>>>>>>>>  312
>>>>>>>>  313     (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>  314     trackingEnv = NULL;
>>>>>>>>  315
>>>>>>>>  316     debugMonitorExit(deletedSignatureLock);
>>>>>>>>
>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>   63  * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>   The comma is not needed.
>>>>>>>>   Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>
>>>>>>>>
>>>>>>>>   73  * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>
>>>>>>>>   84  * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
>>>>>>>>   Would be better to use words like "store" or "record", "Find" should not start from capital letter:
>>>>>>>>   Invoke the callback when classes are freed, find and record the signature in deletedSignatureBag.
>>>>>>>>
>>>>>>>>   96     // Find deleted KlassNode
>>>>>>>>  133     // Class tracking not initialized, nobody's interested
>>>>>>>>  153     // Class tracking not initialized yet, nobody's interested
>>>>>>>>  158     /* Check this is not a duplicate */
>>>>>>>>   Missing dot at the end.
>>>>>>>>  106     if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>   Conversely, the dot is not needed, as the comment does not start with a capital letter.
>>>>>>>>  111     // At this point we have the KlassNode corresponding to the tag
>>>>>>>>  112     // in klass, and the pointer to it in klass_node.
>>>>>>>   The comment above can be better. Maybe, something like:
>>>>>>>     "At this point, we found the KlassNode matching the klass tag (and it is linked)."
>>>>>>>
>>>>>>>>  113     // Remember the unloaded signature.
>>>>>>>   Better: Record the signature of the unloaded class and unlink it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>>>> more testing and also field-/torture-testing by a customer who is happy
>>>>>>>>> now. :-)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi Serguei,
>>>>>>>>>>
>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>
>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>
>>>>>>>>>> Let me know what you think!
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>>>>
>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 72 /* >>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>> 74 */ >>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be >>>>>>>>>>> accessed under >>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>> ?? 80? */ >>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>> >>>>>>>>>>> ?? The comments contradict to each other. >>>>>>>>>>> ?? I guess, the lock name at line 79 has to be >>>>>>>>>>> deletedSignatureLock >>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>> ?? Also, comma at the end must be replaced with dot. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 104 return; >>>>>>>>>>> 105 } >>>>>>>>>>> ? 106 >>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>> ? 113???? } >>>>>>>>>>> 114 >>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 118 return; >>>>>>>>>>> ? 119???? } >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> ??The code above can be simplified, so that the lines 101-105 >>>>>>>>>>> are not >>>>>>>>>>> needed anymore. >>>>>>>>>>> ??It can be something like this: >>>>>>>>>>> >>>>>>>>>>> // Scan linked-list. >>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>> ????? 
} >>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>> found - ignore. >>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> return; >>>>>>>>>>> ????? } >>>>>>>>>>> >>>>>>>>>>> It will take more time when I get a chance to look at the rest. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>> Here comes an update that resolves some races that happen when >>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>> lock on >>>>>>>>>>>> basically every operation, and also need to check whether or not >>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>> (e.g. an empty >>>>>>>>>>>> list) when we're not. >>>>>>>>>>>> >>>>>>>>>>>> Updated webrev: >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Roman >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>> >>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a >>>>>>>>>>>>> tag, and we >>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>> table, which >>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The >>>>>>>>>>>>> table is >>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>>>>>> signature of >>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The >>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>> is then unlinked from the table and deallocated. This is >>>>>>>>>>>>> ~O(1) operation >>>>>>>>>>>>> too, depending on the depth of the table. 
In my testcase >>>>>>>>>>>>> which hammered >>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths >>>>>>>>>>>>> of like 2-3, >>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that >>>>>>>>>>>>> bag, and >>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>> leaking the >>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>>>>>> and/or >>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>>>>>> missing >>>>>>>>>>>>> before). >>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>> listener gets >>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>>>>> attaching a >>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something >>>>>>>>>>>>> to improve >>>>>>>>>>>>> in the future? >>>>>>>>>>>>> >>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>>>>> really good. >>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>> class-unload >>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>> >>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>> >>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>>>> the even more >>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing >>>>>>>>>>>>>> for now. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>> ? Hi Chris, >>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>>>>> few days. 
In >>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The current implementation does so by maintaining a table >>>>>>>>>>>>>>> of currently >>>>>>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When >>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>> compared with the >>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. 
>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>>>> when there's an >>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. >>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an >>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>>>> timing. 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL:

From serguei.spitsyn at oracle.com Wed Mar 25 20:01:23 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Mar 2020 13:01:23 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port
In-Reply-To: <202C5C21-BA69-4ACF-9421-A9B5D6704C8C@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com> <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com> <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com> <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com> <202C5C21-BA69-4ACF-9421-A9B5D6704C8C@oracle.com>
Message-ID: <3fe390e9-39d1-c547-9480-fa1962cef0d8@oracle.com>

Hi Daniil,

On 3/24/20 10:00, Daniil Titov wrote:
> Hi Serguei,
>
>> It looks like you removed the last call site of DebugServer.main.
> Yes. It is correct.
>
>> Do we need to remove the DebugServer.java as well?
> I was considering this but since it is a public class I think it needs to be deprecated first. I also think that it would be better to do in a separate issue
> since a CSR for deprecation needs to be filed for that. If you agree I will create a new issue for that.

I'm okay to separate this.

Thanks, Serguei > > Thanks, > Daniil > > > ?On 3/23/20, 11:56 PM, "serguei.spitsyn at oracle.com" wrote: > > Hi Daniil, > > It looks pretty good in general. > > It looks like you removed the last call site of DebugServer.main. > Do we need to remove the DebugServer.java as well? > > Thanks, > Serguei > > > On 3/22/20 15:29, Daniil Titov wrote: > > Hi Yasumasa, Serguei and Alex, > > > > Please review a new version of the webrev that merges SADebugDTest.java with changes done in [2]. > > > > Also the CRS [3] and the help message for debug server in SALauncher.java were updated to specify that '--hostname' > > option could be a hostname or an IPv4/IPv6 address. > > > > > Ok, but I think it might be more simply with TestLibrary. > > > For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java . > > > > TestLibrary:: getUnusedRandomPort() doesn't allow to specify what ports are reserved and it uses some hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be directly used in test/hotspot/jtreg/serviceability/* tests (it doesn't compile). > > > > Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int .. reservedPorts) from SADebugTest.java to jdk.test.lib.Utils in /test/lib. > > > > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/ > > [2] https://bugs.openjdk.java.net/browse/JDK-8238268 > > [3] https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > Thank you, > > Daniil > > > > ?On 3/13/20, 7:23 PM, "Yasumasa Suenaga" wrote: > > > > Hi Daniil, > > > > On 2020/03/14 7:05, Daniil Titov wrote: > > > Hi Yasumasa, Serguei and Alex, > > > > > > Please review a new version of the webrev that includes the changes Yasumasa suggested. > > > > > >> Shutdown hook is already registered in c'tor of HotSpotAgent. 
> > >> It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > > > > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need a > > > the shutdown hook for remote server being added in SALauncher. I changed it to use the lambda expression. > > > > > > 101 public HotSpotAgent() { > > > 102 // for non-server add shutdown hook to clean-up debugger in case > > > 103 // of forced exit. For remote server, shutdown hook is added by > > > 104 // DebugServer. > > > 105 Runtime.getRuntime().addShutdownHook(new java.lang.Thread( > > > 106 new Runnable() { > > > 107 public void run() { > > > 108 synchronized (HotSpotAgent.this) { > > > 109 if (!isServer) { > > > 110 detach(); > > > 111 } > > > 112 } > > > 113 } > > > 114 })); > > > 115 } > > > > I missed it, thanks! > > > > > > >>> Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains > > >>> `exclusiveAccess.dirs=.` to avoid concurrent execution > > > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests. Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays. > > > > Ok, but I think it might be more simply with TestLibrary. > > For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java . > > > > > > Thanks, > > > > Yasumasa > > > > > > > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/ > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > Thank you, > > > Daniil > > > > > > ?On 3/6/20, 6:15 PM, "Yasumasa Suenaga" wrote: > > > > > > Hi Daniil, > > > > > > On 2020/03/07 3:38, Daniil Titov wrote: > > > > Hi Yasumasa, > > > > > > > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect ( it throws an exception if parameters are incorrect) and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name . > > > > > > Ok, but I prefer to leave comment it. > > > > > > > > > > > SADebugDTest > > > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. > > > > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. > > > > > > Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. > > > If you do not think this error check, test code is more simply. > > > > > > > > > > I will include your other suggestion in the new version of the webrev. > > > > > > Sorry, I have one more comment: > > > > > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > > > Shutdown hook is already registered in c'tor of HotSpotAgent. > > > It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. 
> > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > > Thanks! > > > > Daniil > > > > > > > > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > > > > > > > Hi Daniil, > > > > > > > > > > > > - SALauncher.java > > > > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > > > > > - SADebugDTest.java > > > > - Please add bug ID to @bug. > > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > > > > > > > > > > Thanks, > > > > > > > > Yasumasa > > > > > > > > > > > > On 2020/03/06 10:15, Daniil Titov wrote: > > > > > Hi Yasumasa, Serguei and Alex, > > > > > > > > > > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector > > > > > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these > > > > > last two settings could be specified using the system properties but the system properties have the following disadvantages > > > > > comparing to the command line options: > > > > > - It's hard to know about them: they are not listed in tool's help. > > > > > - They have long names that hard to remember > > > > > - It is easy to mistype them in the command line and you will not get any warning about it. > > > > > > > > > > The CSR [2] was also updated and needs to be reviewed. > > > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> > > > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/ > > > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > > > > > Thank you, > > > > > Daniil > > > > > > > > > > ?On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote: > > > > > > > > > > Hi Daniil, > > > > > > > > > > - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments. > > > > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply. > > > > > > > > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used. > > > > > But you can use same port number as RMI registry (1099). > > > > > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Yasumasa > > > > > > > > > > > > > > > On 2020/02/24 13:21, Daniil Titov wrote: > > > > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > > > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > > > > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > > > > > > > // delegate to the actual SA debug server. > > > > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. 
> > > > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > > > > but I would prefer to address it in a separate issue. > > > > > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > > > container and connecting to it with the GUI debugger. > > > > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > > > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > > > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > > > > > > > > Thank you, > > > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From ioi.lam at oracle.com Wed Mar 25 20:11:52 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 25 Mar 2020 13:11:52 -0700 Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified In-Reply-To: <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> Message-ID: <29f2ce73-2f53-e34a-15c0-a6723437c4fb@oracle.com> This new versions looks good to me. Thanks - Ioi On 3/25/20 12:01 PM, Leonid Mesnik wrote: > > Added Ioi, who also proposed new version of startAppVmOpts. 
> > Please find new webrev: > http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ > > startAppVmOpts/runAppVmOpts are renamed to > "startAppExactJvmOpts/runAppExactJvmOpts". It should make very > clear that this method doesn't use any of test.java.opts, test.vm.opts. > > Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java > mentioned by Igor, and removed the null pointer check as Ioi suggested in > the startApp method. > > + public static void startApp(LingeredApp theApp, String... > additionalJvmOpts) throws IOException { > + startAppExactJvmOpts(theApp, > Utils.appendTestJavaOpts(additionalJvmOpts)); > + } > > Leonid > > On 3/25/20 10:14 AM, Stefan Karlsson wrote: >> On 2020-03-25 17:40, Igor Ignatyev wrote: >>> Hi Leonid, >>> >>> I have briefly looked at the patch, a few comments so far: >>> >>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>> - at L#114, could you please call the static method using the class name >>> (as opposed to using an instance)? or was it meant to be >>> theApp.runAppVmOpts(vmArgs) ? >>> >>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>> - it seems that the code indent of startApp(LingeredApp, String[]) isn't >>> correct >>> - I don't like the startAppVmOpts name, but unfortunately don't have a >>> better suggestion (yet) >> >> I was going to say the same. Jtreg has the concept of "java options" >> and "vm options". We have had a fair share of bugs and wasted time >> when tests have been using the "vm options" part (VM_OPTIONS, >> test.vm.options, etc), and we've been moving away from using that way >> to pass options. I recently cleaned up some of this with: >> >> 8237111: LingeredApp should be started with getTestJavaOpts >> >> Because of this, I would prefer if we used a name that doesn't >> include "VmOpts", because it's too similar to the other concept.
Some >> suggestions: >> ?startAppJavaOptions >> ?startAppUsingJavaOptions >> ?startAppWithJavaOptions >> ?startAppExactJavaOptions >> ?startAppJvmOptions >> >> Thanks, >> StefanK >> >>> Thanks, >>> -- Igor >>> >>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik >>>> wrote: >>>> >>>> Hi >>>> >>>> Could you please review following fix which change LingeredApp to >>>> prepend vm options to java/vm.test.opts when startApp is used and >>>> provide startAppVmOpts to override options completely. >>>> >>>> The intention is to avoid issue like in this bug where test/jtreg >>>> options were ignored by tests. Also I fixed some tests where >>>> intention was to append vm options rather than to override them. >>>> >>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/ >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698 >>>> >>>> Leonid >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Wed Mar 25 20:24:43 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 25 Mar 2020 13:24:43 -0700 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com> <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com> Message-ID: <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com> 
Yes, please submit a new job. I'll start my testing once I see that the builds are done. Chris On 3/25/20 12:59 PM, Roman Kennke wrote: > Hi Chris, > > Apparently we can get into classTrack_reset() before calling activate(), > and we're seeing a null deletedSignatureBag. A simple NULL-check around > the cleaning routine fixes the problem for me. > > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/ > > Should I post another submit-repo job with that fix? > > Thanks, > Roman > > >> Hi Roman, >> >> com/sun/jdi/JdwpAllowTest.java crashed on many runs: >> >> Stack: [0x00007fbb790f9000,0x00007fbb791fa000], sp=0x00007fbb791f8af0, free space=1022k >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [libjdwp.so+0xdb71] bagEnumerateOver+0x11 >> C [libjdwp.so+0xe365] classTrack_reset+0x25 >> C [libjdwp.so+0xfca1] debugInit_reset+0x71 >> C [libjdwp.so+0x12e0d] debugLoop_run+0x38d >> C [libjdwp.so+0x25700] acceptThread+0x80 >> V [libjvm.so+0xf4b5a7] JvmtiAgentThread::call_start_function()+0x1c7 >> V [libjvm.so+0x15215c6] JavaThread::thread_main_inner()+0x226 >> V [libjvm.so+0x1527736] Thread::call_run()+0xf6 >> V [libjvm.so+0x1250ade] thread_native_entry(Thread*)+0x10e >> >> >> This happened during a test task run of open/test/jdk/:jdk_jdi. There >> doesn't seem to be anything magic on the command line that might be >> triggering. Pretty much I see it with all the various VM configs we test. 
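The crash in the trace above comes from a reset that runs before activation, so the bag it tries to clean was never created. A minimal sketch of the NULL-check idea follows, using hypothetical stand-ins (`SignatureBag`, `tracker_activate`, `tracker_record`, `tracker_reset`) rather than the real libjdwp types and functions; it is an illustration under those assumptions, not the code in webrev.08.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the agent's bag of deleted signatures. */
typedef struct {
    char **items;
    size_t count;
    size_t cap;
} SignatureBag;

/* NULL until tracker_activate() has been called. */
static SignatureBag *deleted_signatures = NULL;

/* Create the bag; calling this more than once is harmless. */
void tracker_activate(void) {
    if (deleted_signatures == NULL) {
        deleted_signatures = malloc(sizeof *deleted_signatures);
        assert(deleted_signatures != NULL);
        deleted_signatures->items = NULL;
        deleted_signatures->count = 0;
        deleted_signatures->cap = 0;
    }
}

/* Store an owned copy of a signature; requires prior activation. */
void tracker_record(const char *signature) {
    SignatureBag *bag = deleted_signatures;
    assert(bag != NULL);
    if (bag->count == bag->cap) {
        size_t cap = bag->cap ? bag->cap * 2 : 8;
        char **items = realloc(bag->items, cap * sizeof *items);
        assert(items != NULL);
        bag->items = items;
        bag->cap = cap;
    }
    size_t n = strlen(signature) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, signature, n);
    bag->items[bag->count++] = copy;
}

/* Free everything and return how many signatures were released.
 * The NULL check makes a reset before activation a no-op. */
size_t tracker_reset(void) {
    if (deleted_signatures == NULL) {
        return 0;
    }
    size_t freed = deleted_signatures->count;
    for (size_t i = 0; i < freed; i++) {
        free(deleted_signatures->items[i]);
    }
    free(deleted_signatures->items);
    free(deleted_signatures);
    deleted_signatures = NULL;
    return freed;
}
```

With the guard in place, calling the reset first, or twice in a row, simply does nothing instead of enumerating a bag that does not exist.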
>> >> I'm also seeing crashes in the following tests, but not as often: >> >> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java >> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java >> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java >> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java >> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java >> >> thanks, >> >> Chris >> >> >> On 3/25/20 11:37 AM, Roman Kennke wrote: >>> Hi Chris, >>> >>>> Regarding the new assert: >>>> >>>> 105     if (gdata && gdata->assertOn) { >>>> 106         // Check this is not already tagged. >>>> 107         jlong tag; >>>> 108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag); >>>> 109         if (error != JVMTI_ERROR_NONE) { >>>> 110             EXIT_ERROR(error, "Unable to GetTag with class >>>> trackingEnv"); >>>> 111         } >>>> 112         JDI_ASSERT(tag == NOT_TAGGED); >>>> 113     } >>>> >>>> I think you should remove the gdata check. gdata should never be NULL >>>> when you get to this code. If it is ever NULL then there's a bug, and >>>> the check will hide the bug. >>> Ok, will remove this. >>> >>>> Regarding testing, after you do the submit repo testing let me know the >>>> jobID and I'll do additional testing on it. >>> I did the submit repo earlier today, and it came back green: >>> >>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762 >>> >>> Thanks, >>> Roman >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 3/25/20 6:00 AM, Roman Kennke wrote: >>>>> Hi Serguei, >>>>> >>>>>> The fix looks pretty clean now. >>>>>> I also like the new name of the lock. :) >>>>> Thank you! >>>>> >>>>>> Just one comment below.
>>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>> >>>>>> >>>>>> 110 if (tag != 0l) { >>>>>> 111 return; // Already added >>>>>> ? 112???? } >>>>>> >>>>>> ??It is better to use a named constant or macro instead. >>>>>> ??Also, it'd be nice to add a short comment about this value is. >>>>> As I replied to Chris earlier, this whole block can be turned into an >>>>> assert. I also made a constant for the value 0, which should be pretty >>>>> much self-explaining. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/ >>>>> >>>>>> How do you test the fix? >>>>> I am using a manual test that is provided in this bug report: >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>> >>>>> "Script to compare performance of GC with and without debugger, when >>>>> many classes are loaded and classes are being unloaded": >>>>> >>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688 >>>>> >>>>> I am also using this test and manually attach/detach jdb a couple of >>>>> times in a row to check that disconnecting and reconnecting works well >>>>> (this tended to deadlock or crash with an earlier version of the patch, >>>>> and is now looking good). >>>>> >>>>> I am also running tier1 and tier2 tests locally, and as soon as we all >>>>> agree that the fix is reasonable, I will push it to the submit repo. I >>>>> am not sure if any of those tests actually exercise that code, though. >>>>> Let me know if you want me to run any specific tests. >>>>> >>>>> Thank you, >>>>> Roman >>>>> >>>>> >>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 3/20/20 08:30, Roman Kennke wrote: >>>>>>> I believe I came up with a much simpler solution that also solves the >>>>>>> problems of the existing one, and the ones I proposed earlier. 
>>>>>>> >>>>>>> It turns out that we can take advantage of the fact that we can use >>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>>>> explicitely >>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer >>>>>>> to the signature of a class into the tag, and pull it out again when we >>>>>>> get notified that the class gets unloaded. >>>>>>> >>>>>>> This means we don't need an extra data-structure to keep track of >>>>>>> classes and signatures, and it also makes the story around locking >>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all >>>>>>> classes needed (as in the current implementation) and no searching of >>>>>>> table needed (like in my previous attempts). >>>>>>> >>>>>>> Please review this new revision: >>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>>> >>>>>>> (Notice that there still appears to be a performance bottleneck with >>>>>>> class-unloading when an actual debugger is attached. This doesn't seem >>>>>>> to be related to the classTrack.c implementation though, but looks like >>>>>>> a consequence of getting all those class-unload notifications over the >>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the >>>>>>> buffers.) >>>>>>> >>>>>>> I am not sure why jdb needs to enable class-unload listener always. A >>>>>>> simple hack disables it, and performance is brilliant, even when jdb is >>>>>>> attached: >>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>>>> >>>>>>> But this is not in the scope of this bug.) >>>>>>> >>>>>>> Roman >>>>>>> >>>>>>> >>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>>> >>>>>>>> >>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi Roman, >>>>>>>>> >>>>>>>>> Thank you for the update and sorry for the latency in review. 
>>>>>>>>> >>>>>>>>> Some comments are below. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>> >>>>>>>>> >>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>>>> ?? 88 { >>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>> 93 return; >>>>>>>>> ?? 94???? } >>>>>>>>> Just a question: >>>>>>>>> ?? Q1: Should the ObjectFree events be disabled for the jvmtiEnv >>>>>>>>> that does >>>>>>>>> ?????? the class tracking if class tracking has not been initialized? >>>>>>>>> >>>>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is >>>>>>>>> better to >>>>>>>>> be something like: lastClassTag or highestClassTag. >>>>>>>>> >>>>>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // klass >>>>>>>>> not >>>>>>>>> found - ignore. >>>>>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>>>>> 108 return; >>>>>>>>> ? 109???? } >>>>>>>>> ??It seems to me, something is wrong in the condition at L106 above. >>>>>>>>> ??Should it be? : >>>>>>>>> ???? if (klass == NULL || klass->klass_tag != tag) >>>>>>>>> >>>>>>>>> ??Otherwise, how can the second check ever work correctly as the >>>>>>>>> return >>>>>>>>> will always happen when (klass != NULL)? >>>>>>>>> >>>>>>>>> ? There are several places in this file with the the indent: >>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>> 93 return; >>>>>>>>> ?? 94???? } >>>>>>>>> ? ... 
>>>>>>>>> 152 if (currentClassTag == -1) { >>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>>>>> 155 return; >>>>>>>>> ? 156???? } >>>>>>>>> ? ... >>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>>>>>> ? 163???? } >>>>>>>>> 164 if (tag != 0l) { >>>>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>>>> 166 return; // Already added >>>>>>>>> ? 167???? } >>>>>>>>> ? ... >>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>>>> 282 { >>>>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>>>> 284 jvmtiDeallocate(sig); >>>>>>>>> 285 return JNI_TRUE; >>>>>>>>> ? 286 } >>>>>>>>> ? ... >>>>>>>>> ? 291 void >>>>>>>>> ? 292 classTrack_reset(void) >>>>>>>>> ? 293 { >>>>>>>>> 294 int idx; >>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>>>> 296 >>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>>>> 298 KlassNode* node = table[idx]; >>>>>>>>> 299 while (node != NULL) { >>>>>>>>> 300 KlassNode* next = node->next; >>>>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>>>> 302 jvmtiDeallocate(node); >>>>>>>>> 303 node = next; >>>>>>>>> 304 } >>>>>>>>> 305 } >>>>>>>>> 306 jvmtiDeallocate(table); >>>>>>>>> 307 >>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>>>> 310 >>>>>>>>> 311 currentClassTag = -1; >>>>>>>>> 312 >>>>>>>>> 313 >>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>>>> 314 trackingEnv = NULL; >>>>>>>>> 315 >>>>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>>>> >>>>>>>>> Could you, please, fix several comments below? >>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>>>> class-unloads >>>>>>>>> ??The comma is not needed. >>>>>>>>> ??Would it better to replace: klass tags => klass_tag's ? 
>>>>>>>>> >>>>>>>>> >>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>>>>>>> consistent >>>>>>>>> ??Maybe: Lock to guard ... or lock to keep integrity of ... >>>>>>>>> >>>>>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>>>>> remembers it in deletedSignatureBag. Would be better to use words >>>>>>>>> like >>>>>>>>> "store" or "record", "Find" should not start from capital letter: >>>>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>>>> signature in deletedSignatureBag. >>>>>>>>> >>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not initialized, >>>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ Missed >>>>>>>>> dot >>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>>>> comment does not start from a capital letter. 111 // At this point we >>>>>>>>> have the KlassNode corresponding to the tag >>>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>>> ? The comment above can be better. Maybe, something like: >>>>>>>> ? ? " At this point, we found the KlassNode matching the klass >>>>>>>> tag(and it is >>>>>>>> linked). >>>>>>>> >>>>>>>>> 113 // Remember the unloaded signature. >>>>>>>> ??Better: Record the signature of the unloaded class and unlink it. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>>> Hello all, >>>>>>>>>> >>>>>>>>>> Can I please get reviews of this change? In the meantime, we've done >>>>>>>>>> more testing and also field-/torture-testing by a customer who is >>>>>>>>>> happy >>>>>>>>>> now. :-) >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Roman >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Hi Serguei, >>>>>>>>>>> >>>>>>>>>>> Thanks for reviewing! 
>>>>>>>>>>> >>>>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>> disconnect, >>>>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>> >>>>>>>>>>> Let me know what you think! >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>>> Hi Roman, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>> >>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 72 /* >>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>> 74 */ >>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be >>>>>>>>>>>> accessed under >>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>> ?? 80? */ >>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>> >>>>>>>>>>>> ?? The comments contradict to each other. >>>>>>>>>>>> ?? I guess, the lock name at line 79 has to be >>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>> ?? Also, comma at the end must be replaced with dot. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 104 return; >>>>>>>>>>>> 105 } >>>>>>>>>>>> ? 106 >>>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>> ? 
113???? } >>>>>>>>>>>> 114 >>>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 118 return; >>>>>>>>>>>> ? 119???? } >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> ??The code above can be simplified, so that the lines 101-105 >>>>>>>>>>>> are not >>>>>>>>>>>> needed anymore. >>>>>>>>>>>> ??It can be something like this: >>>>>>>>>>>> >>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>> ????? } >>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>> found - ignore. >>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> return; >>>>>>>>>>>> ????? } >>>>>>>>>>>> >>>>>>>>>>>> It will take more time when I get a chance to look at the rest. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>> Here comes an update that resolves some races that happen when >>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>>> lock on >>>>>>>>>>>>> basically every operation, and also need to check whether or not >>>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>>> (e.g. an empty >>>>>>>>>>>>> list) when we're not. >>>>>>>>>>>>> >>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>> >>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a >>>>>>>>>>>>>> tag, and we >>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. 
>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>> table, which >>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The >>>>>>>>>>>>>> table is >>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>>>>>>> signature of >>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The >>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is >>>>>>>>>>>>>> ~O(1) operation >>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase >>>>>>>>>>>>>> which hammered >>>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths >>>>>>>>>>>>>> of like 2-3, >>>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that >>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>>>>>>> and/or >>>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>>>>>>> missing >>>>>>>>>>>>>> before). >>>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something >>>>>>>>>>>>>> to improve >>>>>>>>>>>>>> in the future? >>>>>>>>>>>>>> >>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>>>>>> really good. >>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>> events. 
I don't see how this can be helped when the debug >>>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing >>>>>>>>>>>>>>> for now. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ? Hi Chris, >>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The current implementation does so by maintaining a table >>>>>>>>>>>>>>>> of currently >>>>>>>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When >>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>> compared with the >>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>> returned. 
The problem is that when GCs happen frequently >>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>>>>> when there's an >>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. 
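The ~O(1) idea mentioned in the message above (a table mapping tags to KlassNode*, unlinking directly on unload and collecting the unloaded signatures) can be sketched roughly as follows. This is only an illustration under stated assumptions: `SLOT_COUNT`, `SigList`, `track_class` and `class_unloaded` are hypothetical names, `long long` stands in for the JVMTI `jlong` tag, and locking and the JVMTI callbacks themselves are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263  /* hypothetical table size */

/* One tracked class: its tag and an owned copy of its signature. */
typedef struct KlassNode {
    long long tag;
    char *signature;
    struct KlassNode *next;
} KlassNode;

/* Growable list of signatures of unloaded classes (hypothetical). */
typedef struct {
    char **items;
    size_t count;
    size_t cap;
} SigList;

/* Each slot is the head of a singly linked list of KlassNode. */
static KlassNode *table[SLOT_COUNT];

static size_t slot_for(long long tag) {
    return (size_t)((unsigned long long)tag % SLOT_COUNT);
}

/* Register a prepared class: prepend to its slot, an O(1) operation. */
void track_class(long long tag, const char *signature) {
    assert(signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    assert(node != NULL);
    size_t n = strlen(signature) + 1;
    node->signature = malloc(n);
    assert(node->signature != NULL);
    memcpy(node->signature, signature, n);
    node->tag = tag;
    size_t slot = slot_for(tag);
    node->next = table[slot];
    table[slot] = node;
}

/* Handle an unload notification for `tag`: unlink the node and move its
 * signature into `unloaded`. Returns 1 if the tag was tracked, else 0. */
int class_unloaded(long long tag, SigList *unloaded) {
    KlassNode **link = &table[slot_for(tag)];
    while (*link != NULL && (*link)->tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return 0;  /* tag not found, ignore */
    }
    KlassNode *node = *link;
    *link = node->next;  /* unlink from the slot's list */
    if (unloaded->count == unloaded->cap) {
        size_t cap = unloaded->cap ? unloaded->cap * 2 : 8;
        char **items = realloc(unloaded->items, cap * sizeof *items);
        assert(items != NULL);
        unloaded->items = items;
        unloaded->cap = cap;
    }
    unloaded->items[unloaded->count++] = node->signature;  /* ownership moves */
    free(node);
    return 1;
}
```

Walking the slot through a pointer-to-pointer means the head and interior nodes are unlinked by the same statement, so the lookup cost depends only on the length of one slot's list rather than on the total number of classes.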
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. >>>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an >>>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>>>>> timing. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> E.g. with the testcase provided here: >>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I am getting these numbers: >>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>> From magnus.ihse.bursie at oracle.com Wed Mar 25 20:45:16 2020 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Wed, 25 Mar 2020 21:45:16 +0100 Subject: RFR: JDK-8241618 Fix unchecked warning for jdk.hotspot.agent In-Reply-To: <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com> References: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com> <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com> Message-ID: <007988a3-50d6-6a54-7af6-90af623129b1@oracle.com> On 2020-03-25 20:52, Chris Plummer wrote: > Hi Magnus, > > I haven't looked at the changes yet, other than to see that there are many > files touched, but after reading below (and only partly understanding > since I don't know this area well), I was wondering if this issue > wouldn't be better served with multiple passes made to fix the > warnings. Start with a straightforward one where you are maybe only > making one or two types of changes, but that affect a large number of > files and don't cascade into other more complicated changes. Unfortunately, many changes tend to cling together -- for instance, class Foo has a List fooList of say Integer. If I change that to List<Integer>, then also the constructor needs to change, and the getFooList() method, and that in turn propagates to users of getFooList() etc. I tried to do this piecewise but for every line that I fixed I just ended up getting more and more places that needed fixing. On the other hand, the patch I present *is* indeed mostly trivial. Apart from the places I mentioned below, the fixes are straightforward. And I opted out of fixing the tricky ones by disabling the warnings. My intention is to file a follow-up bug for these @SuppressWarnings to be fixed properly. However, doing that is unfortunately beyond the scope of what I'm able to do, since I do not have enough domain knowledge.
The fixes in this patch is more or less "stupid" applications of adding generics with the correct type. (Basically, what I've done is to locate a problematic type, like fooList, and check the type of elements inserted and extracted of it, and created it as a generic of that type. Boring, but not really difficult.) I realize the webrev can look daunting. Perhaps start by looking at the patch file, that will quickly show what kind of changes this is about. Also, 1/3 of the patch is just about updating those darned copyright years. :-( > This will get a lot of the noise out of the way, and then we can focus > on some of the harder issues you bring up below. > > As for testing, I think the following list will capture all of them, > but can't say for sure: > > open/test/hotspot/jtreg/serviceability/sa > open/test/hotspot/jtreg/resourcehogs/serviceability/sa > open/test/jdk/sun/tools/jhsdb > open/test/jdk/sun/tools/jstack > open/test/jdk/sun/tools/jmap > open/test/hotspot/jtreg/gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java > > open/test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java > open/test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java Thank you! I'll run these through our test system. /Magnus > > Chris > > On 3/25/20 12:29 PM, Magnus Ihse Bursie wrote: >> With the recent fixes in JDK-8241310, JDK-8237746 and JDK-8241073, >> and the upcoming fixes to remove the deprecated nashorn and jdk.rmi, >> the JDK build is very close to producing no warnings when compiling >> the Java classes. >> >> The one remaining sinner is jdk.hotspot.agent. Most of the warnings >> here are turned off, but unchecked and deprecation cannot be >> completely silenced. >> >> Since the poor agent does not seem to receive much love nowadays, I >> took it upon myself to fix these warnings, so we can finally get a >> quiet build. >> >> I started to address the unchecked warnings. Unfortunately, this was >> a much bigger task than I anticipated. I had to generify most of the >> module. 
>> On the plus side, the code is so much better now. And most of
>> the changes were trivial, just tedious.
>>
>> There are a few places where I'm not entirely happy with the current
>> solution, and that at least merits some discussion.
>>
>> I have resorted to @SuppressWarnings in four classes: ciMethodData,
>> MethodData, TableModelComparator and VirtualBaseConstructor. All of
>> them have in common that they are doing slightly fishy things with
>> classes in collections. I'm not entirely sure they are bug-free, but
>> this patch leaves the behavior untouched. I made some effort to sort
>> out the logic, but it turned out to be too hairy for me to fix, and
>> it will probably require more substantial changes to the workings of
>> the code.
>>
>> To make the code valid, I have moved ConstMethod to extend Metadata
>> instead of VMObject. My understanding is that this is benign (and
>> likely intended), but I really need someone who knows the code to
>> confirm this. I have also added a FIXME to signal this. I'll remove
>> the FIXME as soon as I get confirmation that this is OK.
>> (The reason for this is the following piece of code from
>> Metadata.java: metadataConstructor.addMapping("ConstMethod",
>> ConstMethod.class))
>>
>> In ObjectListPanel, there is some code that screams "dead" with this
>> change. I added a FIXME to point this out:
>>     for (Iterator iter = elements.iterator(); iter.hasNext(); ) {
>>       if (iter.next() instanceof Array) {
>>         // FIXME: Does not seem possible to happen
>>         hasArrays = true;
>>         return;
>>       }
>> It seems that if you start pulling this thread, even more dead code
>> will unravel, so I'm not so eager to touch this in the current patch.
>> But I can remove the FIXME if you want.
>>
>> My first iteration of this patch tried to generify the IntervalTree
>> and related class hierarchy.
>> However, this turned out to be
>> impossible due to some weird usage in AnnotatedMemoryPanel, where
>> there seemed to be confusion as to whether the tree stored
>> Annotations or Addresses. I'm not entirely convinced the code is
>> correct; it certainly looked and smelled very fishy. However, I
>> reverted these changes since I could not get them to work due to
>> this, and it was not needed for the goal of just getting rid of the
>> warning.
>>
>> Finally, I have done no testing apart from verifying that it builds.
>> Please advise on suitable tests to run.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8241618
>> WebRev:
>> http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.01
>>
>> /Magnus
>
>

From leonid.mesnik at oracle.com Wed Mar 25 21:31:01 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 14:31:01 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To: <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
Message-ID: <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>

Igor, Stefan, Ioi

Thank you for your feedback.

Filed https://bugs.openjdk.java.net/browse/JDK-8241624 to change @run main... to @run driver.

Test ClhsdbJstack.java is updated.

Still waiting for review from SVC team.
webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/

Leonid

On 3/25/20 12:46 PM, Igor Ignatyev wrote:
> Hi Leonid,
>
> not related to your patch (but yet somewhat made more obvious
> by it), it seems all (or at least almost all) the tests which
> use LingeredApp should be run in "driver" mode as they just
> orchestrate execution of other JVMs, so running them w/ main (let
> alone main/othervm) just wastes time,
> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for
> example, will now be executed w/ Xcomp which will make it very slow for
> no reason. since you already got your hands dirty w/ these tests,
> could you please file an RFE to sort this out and list all the
> affected tests there?
>
> re: the patch, could you please update ClhsdbJstack.java test not to
> be run w/ Xcomp and follow the same pattern you used in other tests
> (e.g. ClhsdbScanOops)? other than that it looks fine to me, I however
> wouldn't be able to tell if all svc tests continue to do what they
> were supposed to, so I'd prefer for someone from svc team to chime in.
>
> Thanks,
> -- Igor
>
>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik wrote:
>>
>> Added Ioi, who also proposed new version of startAppVmOpts.
>>
>> Please find new webrev:
>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>
>> startAppVmOpts/runAppVmOpts are renamed to
>> startAppExactJvmOpts/runAppExactJvmOpts. It should make
>> very clear that this method doesn't use any of test.java.opts,
>> test.vm.opts.
>>
>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java
>> mentioned by Igor, and removed null pointer check as Ioi suggested in
>> startApp method.
>>
>> + public static void startApp(LingeredApp theApp, String...
>> additionalJvmOpts) throws IOException {
>> + startAppExactJvmOpts(theApp,
>> Utils.appendTestJavaOpts(additionalJvmOpts));
>> + }
>>
>> Leonid
>>
>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>> Hi Leonid,
>>>>
>>>> I have briefly looked at the patch, a few comments so far:
>>>>
>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>   - at L#114, could you please call static method using class name
>>>> (as opposed to using an instance)? or was it meant to be
>>>> theApp.runAppVmOpts(vmArgs) ?
>>>>
>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>> - it seems that code indent of startApp(LingeredApp, String[])
>>>> isn't correct
>>>> - I don't like startAppVmOpts name, but unfortunately don't have a
>>>> better suggestion (yet)
>>>
>>> I was going to say the same. Jtreg has the concept of "java options"
>>> and "vm options". We have had a fair share of bugs and wasted time
>>> when tests have been using the "vm options" part (VM_OPTIONS,
>>> test.vm.options, etc), and we've been moving away from using that
>>> way to pass options. I recently cleaned up some of this with:
>>>
>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>
>>> Because of this, I would prefer if we used a name that doesn't
>>> include "VmOpts", because it's too similar to the other concept. Some
>>> suggestions:
>>>  startAppJavaOptions
>>>  startAppUsingJavaOptions
>>>  startAppWithJavaOptions
>>>  startAppExactJavaOptions
>>>  startAppJvmOptions
>>>
>>> Thanks,
>>> StefanK
>>>
>>>> Thanks,
>>>> -- Igor
>>>>
>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> Could you please review the following fix, which changes LingeredApp to
>>>>> prepend vm options to java/vm.test.opts when startApp is used and
>>>>> provides startAppVmOpts to override options completely.
>>>>>
>>>>> The intention is to avoid issues like the one in this bug, where test/jtreg
>>>>> options were ignored by tests.
>>>>> Also I fixed some tests where the
>>>>> intention was to append vm options rather than to override them.
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>
>>>>> Leonid
>>>>>
>>>
>

From igor.ignatyev at oracle.com Wed Mar 25 21:58:50 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 25 Mar 2020 14:58:50 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To: <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com> <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
Message-ID:

> Test ClhsdbJstack.java is updated.
>

now you reduced coverage provided by this test, I actually meant to create a separate jtreg test description in this test and pass "Xcomp" or "true" (or anything) as an argument to ClhsdbJstack, and use the value of this argument to decide if -Xcomp should be added to LingeredApp.startApp or not.

Thanks,
-- Igor

> On Mar 25, 2020, at 2:31 PM, Leonid Mesnik wrote:
>
> Igor, Stefan, Ioi
>
> Thank you for your feedback.
>
> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 to change @run main... to @run driver.
>
> Test ClhsdbJstack.java is updated.
>
> Still waiting for review from SVC team.
> > webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/ > Leonid > > On 3/25/20 12:46 PM, Igor Ignatyev wrote: >> Hi Leonid, >> >> not related related to your patch (but yet somewhat made more obvious by it), it seems all (or at least almost all) the tests which use LingeredApp should be run in "driver" mode as they just orchestrate execution of other JVMs, so running them w/ main (let alone main/othervm) just wastes time, test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for example, will now executed w/ Xcomp which will make it very slow for no reasons. since you already got your hands dirty w/ these tests, could you please file an RFE to sort this out and list all the affected tests there? >> >> re: the patch, could you please update ClhsdbJstack.java test not to be run w/ Xcomp and follow the same pattern you used in other tests (e.g. ClhsdbScanOops) ? other than that it looks fine to me, I however wouldn't be able to tell if all svc tests continue to do that they were supposed to, so I'd prefer for someone from svc team to chime in. >> >> Thanks, >> -- Igor >> >>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik > wrote: >>> >>> Added Ioi, who also proposed new version of startAppVmOpts. >>> >>> Please find new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ >>> Renamed startAppVmOpts/runAppVmOpts to "startAppExactJvmOpts/runAppExactJvmOpts" is used. It should make very clear that this method doesn't use any of test.java.opts, test.vm.opts. >>> >>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java metnioned by Igor, and removed null pointer check as Ioi suggested in startApp method. >>> >>> + public static void startApp(LingeredApp theApp, String... 
additionalJvmOpts) throws IOException { >>> + startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts)); >>> + } >>> >>> Leonid >>> >>> On 3/25/20 10:14 AM, Stefan Karlsson wrote: >>>> On 2020-03-25 17:40, Igor Ignatyev wrote: >>>>> Hi Leonid, >>>>> >>>>> I have briefly looked at the patch, a few comments so far: >>>>> >>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>>>> - at L#114, could you please call static method using class name (as the opposite of using instance)? or was it meant to be theApp.runAppVmOpts(vmArgs) ? >>>>> >>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>>>> - it seems that code indent of startApp(LingeredApp, String[]) isn't correct >>>>> - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet) >>>> >>>> I was going to say the same. Jtreg has the concept of "java options" and "vm options". We have had a fair share of bugs and wasted time when tests have been using the "vm options" part (VM_OPTIONS, test.vm.options, etc), and we've been moving away from using that way to pass options. I recently cleaned up some of this with: >>>> >>>> 8237111: LingeredApp should be started with getTestJavaOpts >>>> >>>> Because of this, I would prefer if we used a name that doesn't include "VmOpts", because it's too alike the other concept. Some suggestions: >>>> startAppJavaOptions >>>> startAppUsingJavaOptions >>>> startAppWithJavaOptions >>>> startAppExactJavaOptions >>>> startAppJvmOptions >>>> >>>> Thanks, >>>> StefanK >>>> >>>>> Thanks, >>>>> -- Igor >>>>> >>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik wrote: >>>>>> >>>>>> Hi >>>>>> >>>>>> Could you please review following fix which change LingeredApp to prepend vm options to java/vm.test.opts when startApp is used and provide startAppVmOpts to override options completely. >>>>>> >>>>>> The intention is to avoid issue like in this bug where test/jtreg options were ignored by tests. 
>>>>>> Also I fixed some tests where the intention was to append vm options rather than to override them.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>
>>>>>> Leonid
>>>>>>
>>>>
>>

From rkennke at redhat.com Wed Mar 25 22:22:31 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 25 Mar 2020 23:22:31 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com> <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com> <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
Message-ID:

The new job finished, its ID is:
mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289

Thank you,
Roman

> Yes, please submit a new job. I'll start my testing once I see that the
> builds are done.
>
> Chris
>
> On 3/25/20 12:59 PM, Roman Kennke wrote:
>> Hi Chris,
>>
>> Apparently we can get into classTrack_reset() before calling activate(),
>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>> the cleaning routine fixes the problem for me.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>
>> Should I post another submit-repo job with that fix?
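
For readers following the crash discussion, the NULL check mentioned in the quoted message can be pictured with a small stand-alone sketch. This is not the libjdwp code and does not use its bag API: `struct sig_bag`, `tracker_activate`, `tracker_record` and `tracker_reset` are hypothetical names invented for illustration. The only point is that a reset which may run before activation has to tolerate a bag that was never allocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the agent's bag of deleted class signatures. */
struct sig_bag {
    char **items;
    size_t count;
    size_t capacity;
};

/* Stays NULL until tracker_activate() runs (assumed lifecycle). */
static struct sig_bag *deleted_signatures = NULL;

/* Allocate the bag if needed; returns 1 on success, 0 on failure. */
static int tracker_activate(void) {
    if (deleted_signatures == NULL) {
        deleted_signatures = calloc(1, sizeof(*deleted_signatures));
    }
    return deleted_signatures != NULL;
}

/* Store a private copy of a signature; returns 0 if inactive or out of memory. */
static int tracker_record(const char *signature) {
    struct sig_bag *bag = deleted_signatures;
    if (bag == NULL) {
        return 0;
    }
    if (bag->count == bag->capacity) {
        size_t new_capacity = bag->capacity ? bag->capacity * 2 : 4;
        char **grown = realloc(bag->items, new_capacity * sizeof(char *));
        if (grown == NULL) {
            return 0;
        }
        bag->items = grown;
        bag->capacity = new_capacity;
    }
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    bag->items[bag->count++] = copy;
    return 1;
}

/* Release all signatures and the bag; returns how many signatures were freed.
 * The NULL guard makes a reset before activation a harmless no-op. */
static size_t tracker_reset(void) {
    size_t released = 0;
    struct sig_bag *bag = deleted_signatures;
    if (bag != NULL) {
        for (size_t i = 0; i < bag->count; i++) {
            free(bag->items[i]);
            released++;
        }
        free(bag->items);
        free(bag);
        deleted_signatures = NULL;
    }
    return released;
}
```

Calling `tracker_reset()` first, then `tracker_activate()`, a few `tracker_record(...)` calls and another `tracker_reset()` exercises both the guarded and the normal path.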
>>
>> Thanks,
>> Roman
>>
>>
>>> Hi Roman,
>>>
>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>
>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],  sp=0x00007fbb791f8af0,  free space=1022k
>>> Native frames: (J=compiled Java code, A=aot compiled Java code,
>>> j=interpreted, Vv=VM code, C=native code)
>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>
>>>
>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>> doesn't seem to be anything magic on the command line that might be
>>> triggering it. I see it with pretty much all the various VM configs we
>>> test.
>>>
>>> I'm also seeing crashes in the following tests, but not as often:
>>>
>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>>
>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>> Hi Chris,
>>>>
>>>>> Regarding the new assert:
>>>>>
>>>>>   105     if (gdata && gdata->assertOn) {
>>>>>   106         // Check this is not already tagged.
>>>>>   107         jlong tag;
>>>>>   108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>   109         if (error != JVMTI_ERROR_NONE) {
>>>>>   110             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>   111         }
>>>>>   112         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>   113     }
>>>>>
>>>>> I think you should remove the gdata check. gdata should never be NULL
>>>>> when you get to this code. If it is ever NULL then there's a bug, and
>>>>> the check will hide the bug.
>>>> Ok, will remove this.
>>>>
>>>>> Regarding testing, after you do the submit repo testing let me know
>>>>> the jobID and I'll do additional testing on it.
>>>> I did the submit repo earlier today, and it came back green:
>>>>
>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>> Hi Serguei,
>>>>>>
>>>>>>> The fix looks pretty clean now.
>>>>>>> I also like the new name of the lock. :)
>>>>>> Thank you!
>>>>>>
>>>>>>> Just one comment below.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>   110     if (tag != 0l) {
>>>>>>>   111         return; // Already added
>>>>>>>   112     }
>>>>>>>
>>>>>>>    It is better to use a named constant or macro instead.
>>>>>>>    Also, it'd be nice to add a short comment about what this value is.
>>>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>>>> assert. I also made a constant for the value 0, which should be
>>>>>> pretty much self-explaining.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>
>>>>>>> How do you test the fix?
>>>>>> I am using a manual test that is provided in this bug report:
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>
>>>>>> "Script to compare performance of GC with and without debugger, when
>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>
>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>
>>>>>> I am also using this test and manually attach/detach jdb a couple of
>>>>>> times in a row to check that disconnecting and reconnecting works
>>>>>> well (this tended to deadlock or crash with an earlier version of the
>>>>>> patch, and is now looking good).
>>>>>>
>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we
>>>>>> all agree that the fix is reasonable, I will push it to the submit
>>>>>> repo. I am not sure if any of those tests actually exercise that code,
>>>>>> though. Let me know if you want me to run any specific tests.
>>>>>>
>>>>>> Thank you,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>> I believe I came up with a much simpler solution that also
>>>>>>>> solves the problems of the existing one, and the ones I proposed
>>>>>>>> earlier.
>>>>>>>>
>>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>>>>> explicitly mentioned in the JVMTI spec). This means we can simply
>>>>>>>> stick a pointer to the signature of a class into the tag, and pull
>>>>>>>> it out again when we get notified that the class gets unloaded.
>>>>>>>>
>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e.
no scanning >>>>>>>> of all >>>>>>>> classes needed (as in the current implementation) and no >>>>>>>> searching of >>>>>>>> table needed (like in my previous attempts). >>>>>>>> >>>>>>>> Please review this new revision: >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>>>> >>>>>>>> (Notice that there still appears to be a performance bottleneck >>>>>>>> with >>>>>>>> class-unloading when an actual debugger is attached. This >>>>>>>> doesn't seem >>>>>>>> to be related to the classTrack.c implementation though, but >>>>>>>> looks like >>>>>>>> a consequence of getting all those class-unload notifications >>>>>>>> over the >>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the >>>>>>>> buffers.) >>>>>>>> >>>>>>>> I am not sure why jdb needs to enable class-unload listener >>>>>>>> always. A >>>>>>>> simple hack disables it, and performance is brilliant, even when >>>>>>>> jdb is >>>>>>>> attached: >>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>>>>> >>>>>>>> But this is not in the scope of this bug.) >>>>>>>> >>>>>>>> Roman >>>>>>>> >>>>>>>> >>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Hi Roman, >>>>>>>>>> >>>>>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>>>>> >>>>>>>>>> Some comments are below. >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>>>>> ??? 
88 { >>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>> 93 return; >>>>>>>>>> ??? 94???? } >>>>>>>>>> Just a question: >>>>>>>>>> ??? Q1: Should the ObjectFree events be disabled for the jvmtiEnv >>>>>>>>>> that does >>>>>>>>>> ??????? the class tracking if class tracking has not been >>>>>>>>>> initialized? >>>>>>>>>> >>>>>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is >>>>>>>>>> better to >>>>>>>>>> be something like: lastClassTag or highestClassTag. >>>>>>>>>> >>>>>>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>>>>> klass >>>>>>>>>> not >>>>>>>>>> found - ignore. >>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>>>>>> 108 return; >>>>>>>>>> ?? 109???? } >>>>>>>>>> ???It seems to me, something is wrong in the condition at L106 >>>>>>>>>> above. >>>>>>>>>> ???Should it be? : >>>>>>>>>> ????? if (klass == NULL || klass->klass_tag != tag) >>>>>>>>>> >>>>>>>>>> ???Otherwise, how can the second check ever work correctly as the >>>>>>>>>> return >>>>>>>>>> will always happen when (klass != NULL)? >>>>>>>>>> >>>>>>>>>> ?? There are several places in this file with the the indent: >>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>> 93 return; >>>>>>>>>> ??? 94???? } >>>>>>>>>> ?? ... >>>>>>>>>> 152 if (currentClassTag == -1) { >>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>>>>>> 155 return; >>>>>>>>>> ?? 156???? } >>>>>>>>>> ?? ... 
>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>>>>>>> ?? 163???? } >>>>>>>>>> 164 if (tag != 0l) { >>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>>>>> 166 return; // Already added >>>>>>>>>> ?? 167???? } >>>>>>>>>> ?? ... >>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>>>>> 282 { >>>>>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>>>>> 284 jvmtiDeallocate(sig); >>>>>>>>>> 285 return JNI_TRUE; >>>>>>>>>> ?? 286 } >>>>>>>>>> ?? ... >>>>>>>>>> ?? 291 void >>>>>>>>>> ?? 292 classTrack_reset(void) >>>>>>>>>> ?? 293 { >>>>>>>>>> 294 int idx; >>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>> 296 >>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>>>>> 298 KlassNode* node = table[idx]; >>>>>>>>>> 299 while (node != NULL) { >>>>>>>>>> 300 KlassNode* next = node->next; >>>>>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>>>>> 302 jvmtiDeallocate(node); >>>>>>>>>> 303 node = next; >>>>>>>>>> 304 } >>>>>>>>>> 305 } >>>>>>>>>> 306 jvmtiDeallocate(table); >>>>>>>>>> 307 >>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>>>>> 310 >>>>>>>>>> 311 currentClassTag = -1; >>>>>>>>>> 312 >>>>>>>>>> 313 >>>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>>>>> >>>>>>>>>> 314 trackingEnv = NULL; >>>>>>>>>> 315 >>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>>>>> >>>>>>>>>> Could you, please, fix several comments below? >>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>>>>> class-unloads >>>>>>>>>> ???The comma is not needed. >>>>>>>>>> ???Would it better to replace: klass tags => klass_tag's ? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>>>>>>>> consistent >>>>>>>>>> ???Maybe: Lock to guard ... or lock to keep integrity of ... 
>>>>>>>>>> >>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>>>>>> remembers it in deletedSignatureBag. Would be better to use words >>>>>>>>>> like >>>>>>>>>> "store" or "record", "Find" should not start from capital letter: >>>>>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>>>>> signature in deletedSignatureBag. >>>>>>>>>> >>>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not >>>>>>>>>> initialized, >>>>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>>>>> Missed >>>>>>>>>> dot >>>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) >>>>>>>>>> { // >>>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>>>>> point we >>>>>>>>>> have the KlassNode corresponding to the tag >>>>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>>>> ?? The comment above can be better. Maybe, something like: >>>>>>>>> ?? ? " At this point, we found the KlassNode matching the klass >>>>>>>>> tag(and it is >>>>>>>>> linked). >>>>>>>>> >>>>>>>>>> 113 // Remember the unloaded signature. >>>>>>>>> ???Better: Record the signature of the unloaded class and >>>>>>>>> unlink it. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>>>> Hello all, >>>>>>>>>>> >>>>>>>>>>> Can I please get reviews of this change? In the meantime, >>>>>>>>>>> we've done >>>>>>>>>>> more testing and also field-/torture-testing by a customer >>>>>>>>>>> who is >>>>>>>>>>> happy >>>>>>>>>>> now. :-) >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for reviewing! 
>>>>>>>>>>>> >>>>>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>>> disconnect, >>>>>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>>> >>>>>>>>>>>> Let me know what you think! >>>>>>>>>>>> Roman >>>>>>>>>>>> >>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>>> >>>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 72 /* >>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>>> 74 */ >>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>>>> Must be >>>>>>>>>>>>> accessed under >>>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>>> ??? 80? */ >>>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>>> >>>>>>>>>>>>> ??? The comments contradict to each other. >>>>>>>>>>>>> ??? I guess, the lock name at line 79 has to be >>>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>>> ??? Also, comma at the end must be replaced with dot. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>> 104 return; >>>>>>>>>>>>> 105 } >>>>>>>>>>>>> ?? 106 >>>>>>>>>>>>> 107 // Scan linked-list. 
>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>>> ?? 113???? } >>>>>>>>>>>>> 114 >>>>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>> 118 return; >>>>>>>>>>>>> ?? 119???? } >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> ???The code above can be simplified, so that the lines 101-105 >>>>>>>>>>>>> are not >>>>>>>>>>>>> needed anymore. >>>>>>>>>>>>> ???It can be something like this: >>>>>>>>>>>>> >>>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>>> ?????? } >>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>>> found - ignore. >>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>> return; >>>>>>>>>>>>> ?????? } >>>>>>>>>>>>> >>>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>>> rest. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>>> Here comes an update that resolves some races that happen >>>>>>>>>>>>>> when >>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>>>> lock on >>>>>>>>>>>>>> basically every operation, and also need to check whether >>>>>>>>>>>>>> or not >>>>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>>>> (e.g. an empty >>>>>>>>>>>>>> list) when we're not. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a >>>>>>>>>>>>>>> tag, and we >>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>>> table, which >>>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. >>>>>>>>>>>>>>> The >>>>>>>>>>>>>>> table is >>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>>>>>>>> signature of >>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. >>>>>>>>>>>>>>> The >>>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is >>>>>>>>>>>>>>> ~O(1) operation >>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase >>>>>>>>>>>>>>> which hammered >>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths >>>>>>>>>>>>>>> of like 2-3, >>>>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that >>>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>>>>>>>> and/or >>>>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>>>>>>>> missing >>>>>>>>>>>>>>> before). 
>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be >>>>>>>>>>>>>>> something >>>>>>>>>>>>>>> to improve >>>>>>>>>>>>>>> in the future? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>>>>>>> really good. >>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing >>>>>>>>>>>>>>>> for now. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ?? Hi Chris, >>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. 
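The table-based bookkeeping described in this thread (a table of list heads indexed by tag % slot-count, with new nodes prepended, and a lookup-and-unlink on unload) might look roughly like the sketch below. The names KlassNode, table and CT_SLOT_COUNT appear in the quoted review; the slot count value, the helper names table_add and table_remove, and the use of malloc/free in place of the agent's allocator are assumptions made for illustration. Locking is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CT_SLOT_COUNT 263          /* assumed value, for illustration only */

typedef struct KlassNode {
    long long klass_tag;           /* stand-in for jlong */
    char *signature;               /* heap copy owned by the node */
    struct KlassNode *next;
} KlassNode;

static KlassNode *table[CT_SLOT_COUNT];

/* Register a prepared class: O(1) prepend to its slot. Returns 0, or -1 on OOM. */
static int table_add(long long tag, const char *signature) {
    assert(tag >= 0 && signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }
    size_t len = strlen(signature) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return -1;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = tag;

    size_t slot = (size_t)(tag % CT_SLOT_COUNT);
    node->next = table[slot];
    table[slot] = node;
    return 0;
}

/*
 * On unload: find the node for `tag`, unlink and free it, and hand its
 * signature to the caller (who must free it). Returns NULL if unknown.
 * Cost is proportional to the length of one slot's chain.
 */
static char *table_remove(long long tag) {
    assert(tag >= 0);
    KlassNode **link = &table[tag % CT_SLOT_COUNT];
    while (*link != NULL && (*link)->klass_tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return NULL;
    }
    KlassNode *node = *link;
    char *sig = node->signature;
    *link = node->next;
    free(node);
    return sig;
}
```

In this sketch the returned signature would be what gets remembered in the bag of unloaded signatures; the bag itself is not modeled here.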
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The current implementation does so by maintaining a table >>>>>>>>>>>>>>>>> of currently >>>>>>>>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. >>>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>>> compared with the >>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>>> true, and also reasonable to expect. 
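The linked-list variant described above (sweep the prepared-classes list and unlink every class whose tag was reported as freed) can be sketched as follows. This is a simplified illustration, not the code from the webrev: the deleted tags are modeled as a plain array rather than a bag, the function returns the unlinked nodes rather than a list of signatures, and the names tag_is_deleted and sweep_unloaded are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct KlassNode {
    long long klass_tag;           /* stand-in for jlong */
    const char *signature;
    struct KlassNode *next;
} KlassNode;

/* Linear membership test: this is the O(unloadedClassCount) step. */
static int tag_is_deleted(const long long *deleted, size_t n, long long tag) {
    for (size_t i = 0; i < n; i++) {
        if (deleted[i] == tag) {
            return 1;
        }
    }
    return 0;
}

/*
 * Sweep the prepared-classes list once. Every node whose tag is in
 * `deleted` is unlinked and pushed onto a result list (reusing `next`);
 * all other nodes stay in place. Returns the head of the result list.
 */
static KlassNode *sweep_unloaded(KlassNode **head,
                                 const long long *deleted, size_t n) {
    assert(head != NULL);
    KlassNode *unloaded = NULL;
    KlassNode **link = head;

    while (*link != NULL) {
        KlassNode *node = *link;
        if (tag_is_deleted(deleted, n, node->klass_tag)) {
            *link = node->next;        /* unlink from prepared list */
            node->next = unloaded;     /* push onto result list */
            unloaded = node;
        } else {
            link = &node->next;
        }
    }
    return unloaded;
}
```

Each sweep costs roughly classCount times unloadedClassCount comparisons, which is why the approach relies on the number of unloaded classes per GC being small.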
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>>>>>> when there's an >>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of >>>>>>>>>>>>>>>>>>> classTrack.c. >>>>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an >>>>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>>>>>> timing. 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>
>
>

From magnus.ihse.bursie at oracle.com Wed Mar 25 22:34:18 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Wed, 25 Mar 2020 23:34:18 +0100
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
Message-ID:

Hi everyone,

As a follow-up to the ongoing review for JDK-8241618, I have also looked
at fixing the deprecation warnings in jdk.hotspot.agent. These fall in
three broad categories:

* Deprecation of the boxing type constructors (e.g. "new Integer(42)").
* Deprecation of java.util.Observer and Observable.
* The rest (mostly Class.newInstance(), and a small number of other odd
  deprecations)

The first category is trivial to fix. The last category needs some
special discussion. But the overwhelming majority of deprecation
warnings come from the use of Observer and Observable. This really
dwarfs anything else, and needs to be handled first, otherwise it's hard
to even spot the other issues.

My analysis of the situation is that the deprecation of Observer and
Observable seems a bit harsh, from the PoV of jdk.hotspot.agent. Sure,
it might be limited, but I think it does exactly what is needed here.
So the migration suggested in Observable (java.beans or
java.util.concurrent) seems overkill. If there are genuine threading
issues at play here, this assumption might be wrong, and then maybe
going the j.u.c. route is correct. But if that's not the case, the main
goal should be to stay with the current implementation.

One way to do this is to sprinkle the code with @SuppressWarnings. But I
think a better way would be to just implement our own Observer and
Observable. After all, the classes are trivial.

I've made a mock-up of this solution, where I just copied
java.util.Observer and Observable, and removed the deprecation
annotations. The only thing needed for the rest of the code is to make
sure we import these; I've done this for three arbitrarily selected
classes just to show what the change would typically look like.

Here's the mock-up:
http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01

Let me know what you think.

/Magnus

From leonid.mesnik at oracle.com Wed Mar 25 22:36:04 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 15:36:04 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To:
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com> <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
Message-ID: <63ad8e28-fe30-53cb-93b3-85a4c09f919d@oracle.com>

On 3/25/20 2:58 PM, Igor Ignatyev wrote:
>>
>> Test ClhsdbJstack.java is updated.
>> > now you reduced coverage provided by this test, I actually meant to > create a separate jtreg test description in this test and pass "Xcomp" > or "true" (or anything) as an argument to ClhsdbJstack, and use the > value of this argument to decide if -Xcomp should be added > to?LingeredApp.startApp or not. Seems I misinterpret you words. Do you mean to change it to this? Basically the same as my original but faster and better prepared for "@run driver". http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java.udiff.html Leonid > > Thanks, > -- Igor > > >> On Mar 25, 2020, at 2:31 PM, Leonid Mesnik > > wrote: >> >> Igor, Stefan, Ioi >> >> Thank you for your feedback. >> >> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change @run >> main... to @run driver. >> >> Test ClhsdbJstack.java is updated. >> >> Still waiting for review from SVC team. >> >> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/ >> >> Leonid >> >> On 3/25/20 12:46 PM, Igor Ignatyev wrote: >>> Hi Leonid, >>> >>> not related related to your patch (but yet somewhat made more >>> obvious by it), it seems all (or at least almost all) the tests >>> which use?LingeredApp should be run in "driver" mode as they just >>> orchestrate execution of other JVMs, so running them w/ main (let >>> alone main/othervm) just wastes time, >>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for >>> example, will now executed w/ Xcomp which will make it very slow for >>> no reasons. since you already got your hands dirty w/ these tests, >>> could you please file an RFE to sort this out and list all the >>> affected tests there? >>> >>> re: the patch, could you please update ClhsdbJstack.java test not to >>> be run w/ Xcomp and follow the same pattern you used in other tests >>> (e.g.?ClhsdbScanOops) ? 
other than that it looks fine to me, I >>> however wouldn't be able to tell if all svc tests continue to do >>> that they were supposed to, so I'd prefer for someone from svc team >>> to?chime in. >>> >>> Thanks, >>> -- Igor >>> >>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik >>>> > wrote: >>>> >>>> Added Ioi, who also proposed new version of startAppVmOpts. >>>> >>>> Please find new webrev: >>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ >>>> >>>> Renamed startAppVmOpts/runAppVmOpts to >>>> "startAppExactJvmOpts/runAppExactJvmOpts" is used. It should make >>>> very clear that this method doesn't use any of test.java.opts, >>>> test.vm.opts. >>>> >>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java >>>> metnioned by Igor, and removed null pointer check as Ioi suggested >>>> in startApp method. >>>> >>>> + public static void startApp(LingeredApp theApp, String... >>>> additionalJvmOpts) throws IOException { >>>> + startAppExactJvmOpts(theApp, >>>> Utils.appendTestJavaOpts(additionalJvmOpts)); >>>> + } >>>> >>>> Leonid >>>> >>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote: >>>>> On 2020-03-25 17:40, Igor Ignatyev wrote: >>>>>> Hi Leonid, >>>>>> >>>>>> I have briefly looked at the patch, a few comments so far: >>>>>> >>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>>>>> ? - at L#114, could you please call static method using class >>>>>> name (as the opposite of using instance)? or was it meant to be >>>>>> theApp.runAppVmOpts(vmArgs) ? >>>>>> >>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>>>>> - it seems that code indent of startApp(LingeredApp, String[]) >>>>>> isn't correct >>>>>> - I don't like startAppVmOpts name, but unfortunately don't have >>>>>> a better suggestion (yet) >>>>> >>>>> I was going to say the same. Jtreg has the concept of "java >>>>> options" and "vm options". 
We have had a fair share of bugs and >>>>> wasted time when tests have been using the "vm options" part >>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away >>>>> from using that way to pass options. I recently cleaned up some of >>>>> this with: >>>>> >>>>> 8237111: LingeredApp should be started with getTestJavaOpts >>>>> >>>>> Because of this, I would prefer if we used a name that doesn't >>>>> include "VmOpts", because it's too alike the other concept. Some >>>>> suggestions: >>>>> ?startAppJavaOptions >>>>> ?startAppUsingJavaOptions >>>>> ?startAppWithJavaOptions >>>>> ?startAppExactJavaOptions >>>>> ?startAppJvmOptions >>>>> >>>>> Thanks, >>>>> StefanK >>>>> >>>>>> Thanks, >>>>>> -- Igor >>>>>> >>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik >>>>>>> wrote: >>>>>>> >>>>>>> Hi >>>>>>> >>>>>>> Could you please review following fix which change LingeredApp >>>>>>> to prepend vm options to java/vm.test.opts when startApp is used >>>>>>> and provide startAppVmOpts to override options completely. >>>>>>> >>>>>>> The intention is to avoid issue like in this bug where >>>>>>> test/jtreg options were ignored by tests. Also I fixed some >>>>>>> tests where intention was to append vm options rather than to >>>>>>> override them. >>>>>>> >>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/ >>>>>>> >>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698 >>>>>>> >>>>>>> Leonid >>>>>>> >>>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From igor.ignatyev at oracle.com Wed Mar 25 22:42:41 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 25 Mar 2020 15:42:41 -0700 Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified In-Reply-To: <63ad8e28-fe30-53cb-93b3-85a4c09f919d@oracle.com> References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com> <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com> <63ad8e28-fe30-53cb-93b3-85a4c09f919d@oracle.com> Message-ID: <4F12C34D-4D61-4DAA-962C-2759EA83A6F9@oracle.com> > On Mar 25, 2020, at 3:36 PM, Leonid Mesnik wrote: > > > > On 3/25/20 2:58 PM, Igor Ignatyev wrote: >>> Test ClhsdbJstack.java is updated. >>> >> now you reduced coverage provided by this test, I actually meant to create a separate jtreg test description in this test and pass "Xcomp" or "true" (or anything) as an argument to ClhsdbJstack, and use the value of this argument to decide if -Xcomp should be added to LingeredApp.startApp or not. > Seems I misinterpret you words. Do you mean to change it to this? Basically the same as my original but faster and better prepared for "@run driver". > > http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java.udiff.html > yeap. > > Leonid > >> >> Thanks, >> -- Igor >> >> >>> On Mar 25, 2020, at 2:31 PM, Leonid Mesnik > wrote: >>> >>> Igor, Stefan, Ioi >>> >>> Thank you for your feedback. >>> >>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change @run main... to @run driver. >>> >>> Test ClhsdbJstack.java is updated. >>> >>> Still waiting for review from SVC team. 
>>> >>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/ >>> Leonid >>> >>> On 3/25/20 12:46 PM, Igor Ignatyev wrote: >>>> Hi Leonid, >>>> >>>> not related related to your patch (but yet somewhat made more obvious by it), it seems all (or at least almost all) the tests which use LingeredApp should be run in "driver" mode as they just orchestrate execution of other JVMs, so running them w/ main (let alone main/othervm) just wastes time, test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for example, will now executed w/ Xcomp which will make it very slow for no reasons. since you already got your hands dirty w/ these tests, could you please file an RFE to sort this out and list all the affected tests there? >>>> >>>> re: the patch, could you please update ClhsdbJstack.java test not to be run w/ Xcomp and follow the same pattern you used in other tests (e.g. ClhsdbScanOops) ? other than that it looks fine to me, I however wouldn't be able to tell if all svc tests continue to do that they were supposed to, so I'd prefer for someone from svc team to chime in. >>>> >>>> Thanks, >>>> -- Igor >>>> >>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik > wrote: >>>>> >>>>> Added Ioi, who also proposed new version of startAppVmOpts. >>>>> >>>>> Please find new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ >>>>> Renamed startAppVmOpts/runAppVmOpts to "startAppExactJvmOpts/runAppExactJvmOpts" is used. It should make very clear that this method doesn't use any of test.java.opts, test.vm.opts. >>>>> >>>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java metnioned by Igor, and removed null pointer check as Ioi suggested in startApp method. >>>>> >>>>> + public static void startApp(LingeredApp theApp, String... 
additionalJvmOpts) throws IOException { >>>>> + startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts)); >>>>> + } >>>>> >>>>> Leonid >>>>> >>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote: >>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote: >>>>>>> Hi Leonid, >>>>>>> >>>>>>> I have briefly looked at the patch, a few comments so far: >>>>>>> >>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>>>>>> - at L#114, could you please call static method using class name (as the opposite of using instance)? or was it meant to be theApp.runAppVmOpts(vmArgs) ? >>>>>>> >>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) isn't correct >>>>>>> - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet) >>>>>> >>>>>> I was going to say the same. Jtreg has the concept of "java options" and "vm options". We have had a fair share of bugs and wasted time when tests have been using the "vm options" part (VM_OPTIONS, test.vm.options, etc), and we've been moving away from using that way to pass options. I recently cleaned up some of this with: >>>>>> >>>>>> 8237111: LingeredApp should be started with getTestJavaOpts >>>>>> >>>>>> Because of this, I would prefer if we used a name that doesn't include "VmOpts", because it's too alike the other concept. Some suggestions: >>>>>> startAppJavaOptions >>>>>> startAppUsingJavaOptions >>>>>> startAppWithJavaOptions >>>>>> startAppExactJavaOptions >>>>>> startAppJvmOptions >>>>>> >>>>>> Thanks, >>>>>> StefanK >>>>>> >>>>>>> Thanks, >>>>>>> -- Igor >>>>>>> >>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik wrote: >>>>>>>> >>>>>>>> Hi >>>>>>>> >>>>>>>> Could you please review following fix which change LingeredApp to prepend vm options to java/vm.test.opts when startApp is used and provide startAppVmOpts to override options completely. 
>>>>>>>>
>>>>>>>> The intention is to avoid issue like in this bug where test/jtreg options were ignored by tests. Also I fixed some tests where intention was to append vm options rather than to override them.
>>>>>>>>
>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>>
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>>
>>>>>>>> Leonid
>>>>>>>>
>>>>>>
>>>>
>>

From chris.plummer at oracle.com Thu Mar 26 00:55:57 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Mar 2020 17:55:57 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To:
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com> <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com> <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
Message-ID: <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com>

Hi Roman,

It passed all my testing. I think before you push Serguei has a question
regarding an issue you brought up a while back. You mentioned that you
weren't getting some events, and suddenly started seeing them. We were
discussing it today and it was unclear if this was an issue you were
seeing before your changes, and your changes resolved it, or it was
initially caused by an earlier version of your changes, and you later
fixed it.
We just want to better understand what this issue was and how it was
fixed.

thanks,

Chris

On 3/25/20 3:22 PM, Roman Kennke wrote:
> The new job finished, its ID is:
>
> mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>
> Thank you,
> Roman
>
>
>> Yes, please submit a new job. I'll start my testing once I see that the
>> builds are done.
>>
>> Chris
>>
>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>> Hi Chris,
>>>
>>> Apparently we can get into classTrack_reset() before calling activate(),
>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>>> the cleaning routine fixes the problem for me.
>>>
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>
>>> Should I post another submit-repo job with that fix?
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>>> Hi Roman,
>>>>
>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>
>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],
>>>> sp=0x00007fbb791f8af0,  free space=1022k
>>>> Native frames: (J=compiled Java code, A=aot compiled Java code,
>>>> j=interpreted, Vv=VM code, C=native code)
>>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>>
>>>>
>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>>> doesn't seem to be anything magic on the command line that might be
>>>> triggering. Pretty much I see it with all the various VM configs we
>>>> test.
>>>>
>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>
>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>>
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>>
>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>> Hi Chris,
>>>>>
>>>>>> Regarding the new assert:
>>>>>>
>>>>>>   105     if (gdata && gdata->assertOn) {
>>>>>>   106         // Check this is not already tagged.
>>>>>>   107         jlong tag;
>>>>>>   108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>>   109         if (error != JVMTI_ERROR_NONE) {
>>>>>>   110             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>   111         }
>>>>>>   112         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>   113     }
>>>>>>
>>>>>> I think you should remove the gdata check. gdata should never be NULL
>>>>>> when you get to this code. If it is ever NULL then there's a bug, and
>>>>>> the check will hide the bug.
>>>>> Ok, will remove this.
>>>>>
>>>>>> Regarding testing, after you do the submit repo testing let me know the
>>>>>> jobID and I'll do additional testing on it.
>>>>> I did the submit repo earlier today, and it came back green:
>>>>>
>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>> Hi Sergei,
>>>>>>>
>>>>>>>> The fix looks pretty clean now.
>>>>>>>> I also like new name of the lock.:)
>>>>>>> Thank you!
>>>>>>>
>>>>>>>> Just one comment below.
>>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 110 if (tag != 0l) { >>>>>>>> 111 return; // Already added >>>>>>>> ?? 112???? } >>>>>>>> >>>>>>>> ???It is better to use a named constant or macro instead. >>>>>>>> ???Also, it'd be nice to add a short comment about this value is. >>>>>>> As I replied to Chris earlier, this whole block can be turned into an >>>>>>> assert. I also made a constant for the value 0, which should be >>>>>>> pretty >>>>>>> much self-explaining. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/ >>>>>>> >>>>>>>> How do you test the fix? >>>>>>> I am using a manual test that is provided in this bug report: >>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>> >>>>>>> "Script to compare performance of GC with and without debugger, when >>>>>>> many classes are loaded and classes are being unloaded": >>>>>>> >>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688 >>>>>>> >>>>>>> I am also using this test and manually attach/detach jdb a couple of >>>>>>> times in a row to check that disconnecting and reconnecting works >>>>>>> well >>>>>>> (this tended to deadlock or crash with an earlier version of the >>>>>>> patch, >>>>>>> and is now looking good). >>>>>>> >>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we >>>>>>> all >>>>>>> agree that the fix is reasonable, I will push it to the submit >>>>>>> repo. I >>>>>>> am not sure if any of those tests actually exercise that code, >>>>>>> though. >>>>>>> Let me know if you want me to run any specific tests. 
>>>>>>> >>>>>>> Thank you, >>>>>>> Roman >>>>>>> >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 3/20/20 08:30, Roman Kennke wrote: >>>>>>>>> I believe I came up with a much simpler solution that also >>>>>>>>> solves the >>>>>>>>> problems of the existing one, and the ones I proposed earlier. >>>>>>>>> >>>>>>>>> It turns out that we can take advantage of the fact that we can use >>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>>>>>> explicitely >>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a >>>>>>>>> pointer >>>>>>>>> to the signature of a class into the tag, and pull it out again >>>>>>>>> when we >>>>>>>>> get notified that the class gets unloaded. >>>>>>>>> >>>>>>>>> This means we don't need an extra data-structure to keep track of >>>>>>>>> classes and signatures, and it also makes the story around locking >>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning >>>>>>>>> of all >>>>>>>>> classes needed (as in the current implementation) and no >>>>>>>>> searching of >>>>>>>>> table needed (like in my previous attempts). >>>>>>>>> >>>>>>>>> Please review this new revision: >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>>>>> >>>>>>>>> (Notice that there still appears to be a performance bottleneck >>>>>>>>> with >>>>>>>>> class-unloading when an actual debugger is attached. This >>>>>>>>> doesn't seem >>>>>>>>> to be related to the classTrack.c implementation though, but >>>>>>>>> looks like >>>>>>>>> a consequence of getting all those class-unload notifications >>>>>>>>> over the >>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the >>>>>>>>> buffers.) >>>>>>>>> >>>>>>>>> I am not sure why jdb needs to enable class-unload listener >>>>>>>>> always. 
A >>>>>>>>> simple hack disables it, and performance is brilliant, even when >>>>>>>>> jdb is >>>>>>>>> attached: >>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>>>>>> >>>>>>>>> But this is not in the scope of this bug.) >>>>>>>>> >>>>>>>>> Roman >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Hi Roman, >>>>>>>>>>> >>>>>>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>>>>>> >>>>>>>>>>> Some comments are below. >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>>>>>> ??? 88 { >>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 93 return; >>>>>>>>>>> ??? 94???? } >>>>>>>>>>> Just a question: >>>>>>>>>>> ??? Q1: Should the ObjectFree events be disabled for the jvmtiEnv >>>>>>>>>>> that does >>>>>>>>>>> ??????? the class tracking if class tracking has not been >>>>>>>>>>> initialized? >>>>>>>>>>> >>>>>>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is >>>>>>>>>>> better to >>>>>>>>>>> be something like: lastClassTag or highestClassTag. >>>>>>>>>>> >>>>>>>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>>>>>> klass >>>>>>>>>>> not >>>>>>>>>>> found - ignore. 
>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 108 return; >>>>>>>>>>> ?? 109???? } >>>>>>>>>>> ???It seems to me, something is wrong in the condition at L106 >>>>>>>>>>> above. >>>>>>>>>>> ???Should it be? : >>>>>>>>>>> ????? if (klass == NULL || klass->klass_tag != tag) >>>>>>>>>>> >>>>>>>>>>> ???Otherwise, how can the second check ever work correctly as the >>>>>>>>>>> return >>>>>>>>>>> will always happen when (klass != NULL)? >>>>>>>>>>> >>>>>>>>>>> ?? There are several places in this file with the the indent: >>>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 93 return; >>>>>>>>>>> ??? 94???? } >>>>>>>>>>> ?? ... >>>>>>>>>>> 152 if (currentClassTag == -1) { >>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 155 return; >>>>>>>>>>> ?? 156???? } >>>>>>>>>>> ?? ... >>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv"); >>>>>>>>>>> ?? 163???? } >>>>>>>>>>> 164 if (tag != 0l) { >>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> 166 return; // Already added >>>>>>>>>>> ?? 167???? } >>>>>>>>>>> ?? ... >>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>>>>>> 282 { >>>>>>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>>>>>> 284 jvmtiDeallocate(sig); >>>>>>>>>>> 285 return JNI_TRUE; >>>>>>>>>>> ?? 286 } >>>>>>>>>>> ?? ... >>>>>>>>>>> ?? 291 void >>>>>>>>>>> ?? 292 classTrack_reset(void) >>>>>>>>>>> ?? 
293 { >>>>>>>>>>> 294 int idx; >>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>> 296 >>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>>>>>> 298 KlassNode* node = table[idx]; >>>>>>>>>>> 299 while (node != NULL) { >>>>>>>>>>> 300 KlassNode* next = node->next; >>>>>>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>>>>>> 302 jvmtiDeallocate(node); >>>>>>>>>>> 303 node = next; >>>>>>>>>>> 304 } >>>>>>>>>>> 305 } >>>>>>>>>>> 306 jvmtiDeallocate(table); >>>>>>>>>>> 307 >>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>>>>>> 310 >>>>>>>>>>> 311 currentClassTag = -1; >>>>>>>>>>> 312 >>>>>>>>>>> 313 >>>>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>>>>>> >>>>>>>>>>> 314 trackingEnv = NULL; >>>>>>>>>>> 315 >>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>> >>>>>>>>>>> Could you, please, fix several comments below? >>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>>>>>> class-unloads >>>>>>>>>>> ???The comma is not needed. >>>>>>>>>>> ???Would it better to replace: klass tags => klass_tag's ? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag >>>>>>>>>>> consistent >>>>>>>>>>> ???Maybe: Lock to guard ... or lock to keep integrity of ... >>>>>>>>>>> >>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>>>>>>> remembers it in deletedSignatureBag. Would be better to use words >>>>>>>>>>> like >>>>>>>>>>> "store" or "record", "Find" should not start from capital letter: >>>>>>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>>>>>> signature in deletedSignatureBag. 
>>>>>>>>>>> >>>>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not >>>>>>>>>>> initialized, >>>>>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>>>>>> Missed >>>>>>>>>>> dot >>>>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) >>>>>>>>>>> { // >>>>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>>>>>> point we >>>>>>>>>>> have the KlassNode corresponding to the tag >>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>>>>> ?? The comment above can be better. Maybe, something like: >>>>>>>>>> ?? ? " At this point, we found the KlassNode matching the klass >>>>>>>>>> tag(and it is >>>>>>>>>> linked). >>>>>>>>>> >>>>>>>>>>> 113 // Remember the unloaded signature. >>>>>>>>>> ???Better: Record the signature of the unloaded class and >>>>>>>>>> unlink it. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>>>>> Hello all, >>>>>>>>>>>> >>>>>>>>>>>> Can I please get reviews of this change? In the meantime, >>>>>>>>>>>> we've done >>>>>>>>>>>> more testing and also field-/torture-testing by a customer >>>>>>>>>>>> who is >>>>>>>>>>>> happy >>>>>>>>>>>> now. :-) >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Roman >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for reviewing! >>>>>>>>>>>>> >>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>>>> disconnect, >>>>>>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to >>>>>>>>>>>>> _activate() to ensure have those structures after re-connect. 
>>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>>>> >>>>>>>>>>>>> Let me know what you think! >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 72 /* >>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>>>> 74 */ >>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>>>>> Must be >>>>>>>>>>>>>> accessed under >>>>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>>>> ??? 80? */ >>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>>>> >>>>>>>>>>>>>> ??? The comments contradict to each other. >>>>>>>>>>>>>> ??? I guess, the lock name at line 79 has to be >>>>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>>>> ??? Also, comma at the end must be replaced with dot. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>> 104 return; >>>>>>>>>>>>>> 105 } >>>>>>>>>>>>>> ?? 106 >>>>>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>>>> ?? 113???? } >>>>>>>>>>>>>> 114 >>>>>>>>>>>>>> 115 // Tag not found? Ignore. 
>>>>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>> 118 return; >>>>>>>>>>>>>> ?? 119???? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> ???The code above can be simplified, so that the lines 101-105 >>>>>>>>>>>>>> are not >>>>>>>>>>>>>> needed anymore. >>>>>>>>>>>>>> ???It can be something like this: >>>>>>>>>>>>>> >>>>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>>>> ?????? } >>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>>>> found - ignore. >>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>> return; >>>>>>>>>>>>>> ?????? } >>>>>>>>>>>>>> >>>>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>>>> rest. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>>>> Here comes an update that resolves some races that happen >>>>>>>>>>>>>>> when >>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>>>>> lock on >>>>>>>>>>>>>>> basically every operation, and also need to check whether >>>>>>>>>>>>>>> or not >>>>>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>>>>> (e.g. an empty >>>>>>>>>>>>>>> list) when we're not. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a >>>>>>>>>>>>>>>> tag, and we >>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. 
>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>>>> table, which >>>>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. >>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>> table is >>>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend the new >>>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the >>>>>>>>>>>>>>>> signature of >>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. >>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is >>>>>>>>>>>>>>>> ~O(1) operation >>>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase >>>>>>>>>>>>>>>> which hammered >>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths >>>>>>>>>>>>>>>> of like 2-3, >>>>>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that >>>>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached >>>>>>>>>>>>>>>> and/or >>>>>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was >>>>>>>>>>>>>>>> missing >>>>>>>>>>>>>>>> before). >>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when >>>>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be >>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>> to improve >>>>>>>>>>>>>>>> in the future? 
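The table described in the list above can be sketched in C as follows. This is only an illustration under assumptions, not the code from the webrev: `SLOT_COUNT`, `slot_of`, `track_add` and `track_remove` are hypothetical names, `KlassNode` is simplified, plain `malloc`/`free` stand in for the agent's allocation helpers, and the locking that the real change adds is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263  /* hypothetical slot count, chosen arbitrarily */

typedef struct KlassNode {
    long long tag;            /* stands in for the jlong class tag */
    char *signature;          /* heap-allocated copy of the class signature */
    struct KlassNode *next;   /* next node in the same slot */
} KlassNode;

static KlassNode *table[SLOT_COUNT];

/* Map a tag to its slot; the unsigned cast keeps the index non-negative. */
static size_t slot_of(long long tag)
{
    return (size_t)((unsigned long long)tag % SLOT_COUNT);
}

/* Register a prepared class by prepending to its slot: O(1).
 * Returns 1 on success, 0 if allocation failed. */
int track_add(long long tag, const char *signature)
{
    assert(signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    size_t len = strlen(signature) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->tag = tag;
    node->next = table[slot_of(tag)];
    table[slot_of(tag)] = node;
    return 1;
}

/* Handle an unload notification: unlink the node for this tag and
 * return its signature (the caller frees it), or NULL if unknown. */
char *track_remove(long long tag)
{
    KlassNode **link = &table[slot_of(tag)];
    KlassNode *node = *link;
    while (node != NULL && node->tag != tag) {
        link = &node->next;
        node = *link;
    }
    if (node == NULL) {       /* klass not found - ignore */
        return NULL;
    }
    *link = node->next;       /* unlink from the chain */
    char *signature = node->signature;
    free(node);
    return signature;
}
```

Removal walks only the chain of a single slot, so its cost depends on chain depth rather than on the total number of tracked classes.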
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks >>>>>>>>>>>>>>>> really good. >>>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing >>>>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing >>>>>>>>>>>>>>>>> for now. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ?? Hi Chris, >>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a >>>>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes. >>>>>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a table >>>>>>>>>>>>>>>>>> of currently >>>>>>>>>>>>>>>>>> prepared classes by building that table when classTrack is >>>>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. 
>>>>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>>>> compared with the >>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new >>>>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, >>>>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here >>>>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>>>> true, and also reasonable to expect. 
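The sweep described above can be sketched roughly as follows. The names (`sweep_unloaded`, `tag_is_deleted`) and the plain array standing in for deletedTagBag are assumptions made for illustration; the real code uses the agent's own bag utility and runs under a lock. The inner linear search is what makes each per-class check O(unloadedClassCount).

```c
#include <assert.h>
#include <stddef.h>

typedef struct KlassNode {
    long long tag;             /* stands in for the jlong class tag */
    const char *signature;     /* class signature, owned elsewhere in this sketch */
    struct KlassNode *next;
} KlassNode;

/* Linear membership test over the deleted tags: O(unloadedClassCount). */
int tag_is_deleted(const long long *deleted, size_t n, long long tag)
{
    for (size_t i = 0; i < n; i++) {
        if (deleted[i] == tag) {
            return 1;
        }
    }
    return 0;
}

/* Scan the prepared-classes list once. Every node whose tag was reported
 * as deleted is unlinked and its signature is written to out[].
 * Returns the number of signatures written (at most out_cap); nodes that
 * do not fit into out[] stay linked. */
size_t sweep_unloaded(KlassNode **head,
                      const long long *deleted, size_t n,
                      const char **out, size_t out_cap)
{
    assert(head != NULL && out != NULL);
    size_t found = 0;
    KlassNode **link = head;
    while (*link != NULL) {
        KlassNode *node = *link;
        if (found < out_cap && tag_is_deleted(deleted, n, node->tag)) {
            out[found++] = node->signature;
            *link = node->next;    /* unlink; node storage is caller-owned here */
        } else {
            link = &node->next;
        }
    }
    return found;
}
```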
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, >>>>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see >>>>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>>>>>>> when there's an >>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of >>>>>>>>>>>>>>>>>>>> classTrack.c. >>>>>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an >>>>>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>>>>>>> timing. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> E.g. with the testcase provided here: >>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I am getting those numbers: >>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>>>>>>>>>> patched: no debug: 85s with debug: 95s >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Can I please get a review? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>>> >> From rkennke at redhat.com Thu Mar 26 08:44:38 2020 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 26 Mar 2020 09:44:38 +0100 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com> References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com> <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com> <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com> <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com> Message-ID: That was in the previous implementation: I got a condition wrong in the table lookup (as noted by Serguei), and this prevented any class-unload events from getting out. I have fixed this, but found other problems in that implementation (deadlocks and a crash). 
The current implementation has none of these problems: we don't need table lookups, we simply pass the signatures through, and locking is much simpler. In particular, we don't need a lock around the JVMTI call (SetTag), which was the cause of the deadlock. Does that answer your questions? Thanks, Roman > Hi Roman, > > It passed all my testing. I think before you push Serguei has a question > regarding an issue you brought up a while back. You mentioned that you > weren't getting some events, and suddenly started seeing them. We were > discussing it today and it was unclear if this was an issue you were > seeing before your changes, and your changes resolved it, or it was > initially caused by an earlier version of your changes, and you later > fixed it. We just want to better understand what this issue was and how > it was fixed. > > thanks, > > Chris > > On 3/25/20 3:22 PM, Roman Kennke wrote: >> The new job finished, its ID is: >> >> mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289 >> >> Thank you, >> Roman >> >> >>> Yes, please submit a new job. I'll start my testing once I see that the >>> builds are done. >>> >>> Chris >>> >>> On 3/25/20 12:59 PM, Roman Kennke wrote: >>>> Hi Chris, >>>> >>>> Apparently we can get into classTrack_reset() before calling >>>> activate(), >>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around >>>> the cleaning routine fixes the problem for me. >>>> >>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/ >>>> >>>> Should I post another submit-repo job with that fix? >>>> >>>> Thanks, >>>> Roman >>>> >>>> >>>>> Hi Roman, >>>>> >>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs: >>>>> >>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000], >>>>> sp=0x00007fbb791f8af0, free space=1022k >>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, >>>>> j=interpreted, Vv=VM code, C=native code) >>>>> C [libjdwp.so+0xdb71] bagEnumerateOver+0x11 >>>>> C [libjdwp.so+0xe365] 
classTrack_reset+0x25 >>>>> C [libjdwp.so+0xfca1] debugInit_reset+0x71 >>>>> C [libjdwp.so+0x12e0d] debugLoop_run+0x38d >>>>> C [libjdwp.so+0x25700] acceptThread+0x80 >>>>> V [libjvm.so+0xf4b5a7] JvmtiAgentThread::call_start_function()+0x1c7 >>>>> V [libjvm.so+0x15215c6] JavaThread::thread_main_inner()+0x226 >>>>> V [libjvm.so+0x1527736] Thread::call_run()+0xf6 >>>>> V [libjvm.so+0x1250ade] thread_native_entry(Thread*)+0x10e >>>>> >>>>> >>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There >>>>> doesn't seem to be anything magic on the command line that might be >>>>> triggering. Pretty much I see it with all the various VM configs we >>>>> test. >>>>> >>>>> I'm also seeing crashes in the following tests, but not as often: >>>>> >>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java >>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java >>>>> >>>>> >>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java >>>>> >>>>> >>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java >>>>> >>>>> >>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java >>>>> >>>>> >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> >>>>> On 3/25/20 11:37 AM, Roman Kennke wrote: >>>>>> Hi Chris, >>>>>> >>>>>>> Regarding the new assert: >>>>>>> >>>>>>> 105 if (gdata && gdata->assertOn) { >>>>>>> 106 // Check this is not already tagged. >>>>>>> 107 jlong tag; >>>>>>> 108 error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, >>>>>>> klass, &tag); >>>>>>> 109 if (error != JVMTI_ERROR_NONE) { >>>>>>> 110 EXIT_ERROR(error, "Unable to GetTag with class >>>>>>> trackingEnv"); >>>>>>> 111 } >>>>>>> 112 JDI_ASSERT(tag == NOT_TAGGED); >>>>>>> 113 } >>>>>>> >>>>>>> I think you should remove the gdata check. 
gdata should never be >>>>>>> NULL >>>>>>> when you get to this code. If it is ever NULL then there's a bug, >>>>>>> and >>>>>>> the check will hide the bug. >>>>>> Ok, will remove this. >>>>>> >>>>>>> Regarding testing, after you do the submit repo testing let me know >>>>>>> the >>>>>>> jobID and I'll do additional testing on it. >>>>>> I did the submit repo earlier today, and it came back green: >>>>>> >>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762 >>>>>> >>>>>> Thanks, >>>>>> Roman >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote: >>>>>>>> Hi Sergei, >>>>>>>> >>>>>>>>> The fix looks pretty clean now. >>>>>>>>> I also like new name of the lock.:) >>>>>>>> Thank you! >>>>>>>> >>>>>>>>> Just one comment below. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 110 if (tag != 0l) { >>>>>>>>> 111 return; // Already added >>>>>>>>> ??? 112???? } >>>>>>>>> >>>>>>>>> ????It is better to use a named constant or macro instead. >>>>>>>>> ????Also, it'd be nice to add a short comment about this value is. >>>>>>>> As I replied to Chris earlier, this whole block can be turned >>>>>>>> into an >>>>>>>> assert. I also made a constant for the value 0, which should be >>>>>>>> pretty >>>>>>>> much self-explaining. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/ >>>>>>>> >>>>>>>>> How do you test the fix? 
>>>>>>>> I am using a manual test that is provided in this bug report: >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>> >>>>>>>> "Script to compare performance of GC with and without debugger, >>>>>>>> when >>>>>>>> many classes are loaded and classes are being unloaded": >>>>>>>> >>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688 >>>>>>>> >>>>>>>> I am also using this test and manually attach/detach jdb a >>>>>>>> couple of >>>>>>>> times in a row to check that disconnecting and reconnecting works >>>>>>>> well >>>>>>>> (this tended to deadlock or crash with an earlier version of the >>>>>>>> patch, >>>>>>>> and is now looking good). >>>>>>>> >>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we >>>>>>>> all >>>>>>>> agree that the fix is reasonable, I will push it to the submit >>>>>>>> repo. I >>>>>>>> am not sure if any of those tests actually exercise that code, >>>>>>>> though. >>>>>>>> Let me know if you want me to run any specific tests. >>>>>>>> >>>>>>>> Thank you, >>>>>>>> Roman >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote: >>>>>>>>>> I believe I came up with a much simpler solution that also >>>>>>>>>> solves the >>>>>>>>>> problems of the existing one, and the ones I proposed earlier. >>>>>>>>>> >>>>>>>>>> It turns out that we can take advantage of the fact that we >>>>>>>>>> can use >>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>>>>>>> explicitely >>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a >>>>>>>>>> pointer >>>>>>>>>> to the signature of a class into the tag, and pull it out again >>>>>>>>>> when we >>>>>>>>>> get notified that the class gets unloaded. >>>>>>>>>> >>>>>>>>>> This means we don't need an extra data-structure to keep track of >>>>>>>>>> classes and signatures, and it also makes the story around >>>>>>>>>> locking >>>>>>>>>> *much* simpler. 
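A minimal sketch of the pointer-as-tag idea described above, assuming a platform where a pointer fits in 64 bits. It does not call JVMTI: `tag_t` stands in for `jlong`, `on_object_free` stands in for an ObjectFree-style callback, and the fixed-size `deleted` array stands in for the signature bag. All of these names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int64_t tag_t;          /* same width as JVMTI's jlong */

#define MAX_DELETED 16          /* arbitrary capacity for this sketch */

char *deleted[MAX_DELETED];     /* stand-in for the deleted-signature bag */
size_t deleted_count;

/* Copy the signature and encode the address of the copy as a tag.
 * Returns 0 (the "not tagged" value) if allocation fails. */
tag_t signature_to_tag(const char *signature)
{
    assert(signature != NULL);
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return (tag_t)(intptr_t)copy;
}

/* Decode a tag back into the signature pointer it was built from. */
char *tag_to_signature(tag_t tag)
{
    return (char *)(intptr_t)tag;
}

/* Stand-in for the unload notification: only the tag is available,
 * and the tag alone is enough to recover the signature, so no lookup
 * table is needed. */
void on_object_free(tag_t tag)
{
    if (tag == 0) {
        return;                           /* object was never tagged */
    }
    if (deleted_count < MAX_DELETED) {
        deleted[deleted_count++] = tag_to_signature(tag);
    } else {
        free(tag_to_signature(tag));      /* no room: drop it rather than leak */
    }
}
```

In the agent itself the tag value would be handed to the JVMTI SetTag function when a class is prepared; the sketch leaves that step out and only shows the round trip between pointer and tag.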
Performance-wise this is O(1), i.e. no scanning >>>>>>>>>> of all >>>>>>>>>> classes needed (as in the current implementation) and no >>>>>>>>>> searching of >>>>>>>>>> table needed (like in my previous attempts). >>>>>>>>>> >>>>>>>>>> Please review this new revision: >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>>>>>> >>>>>>>>>> (Notice that there still appears to be a performance bottleneck >>>>>>>>>> with >>>>>>>>>> class-unloading when an actual debugger is attached. This >>>>>>>>>> doesn't seem >>>>>>>>>> to be related to the classTrack.c implementation though, but >>>>>>>>>> looks like >>>>>>>>>> a consequence of getting all those class-unload notifications >>>>>>>>>> over the >>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging >>>>>>>>>> up the >>>>>>>>>> buffers.) >>>>>>>>>> >>>>>>>>>> I am not sure why jdb needs to enable class-unload listener >>>>>>>>>> always. A >>>>>>>>>> simple hack disables it, and performance is brilliant, even when >>>>>>>>>> jdb is >>>>>>>>>> attached: >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> But this is not in the scope of this bug.) >>>>>>>>>> >>>>>>>>>> Roman >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Hi Roman, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>>>>>>> >>>>>>>>>>>> Some comments are below. >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>>>>>>> ???? 
88 { >>>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 93 return; >>>>>>>>>>>> ???? 94???? } >>>>>>>>>>>> Just a question: >>>>>>>>>>>> ???? Q1: Should the ObjectFree events be disabled for the >>>>>>>>>>>> jvmtiEnv >>>>>>>>>>>> that does >>>>>>>>>>>> ???????? the class tracking if class tracking has not been >>>>>>>>>>>> initialized? >>>>>>>>>>>> >>>>>>>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is >>>>>>>>>>>> better to >>>>>>>>>>>> be something like: lastClassTag or highestClassTag. >>>>>>>>>>>> >>>>>>>>>>>> 99 KlassNode* klass = *klass_ptr; >>>>>>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103 >>>>>>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr; >>>>>>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // >>>>>>>>>>>> klass >>>>>>>>>>>> not >>>>>>>>>>>> found - ignore. >>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 108 return; >>>>>>>>>>>> ??? 109???? } >>>>>>>>>>>> ????It seems to me, something is wrong in the condition at L106 >>>>>>>>>>>> above. >>>>>>>>>>>> ????Should it be? : >>>>>>>>>>>> ?????? if (klass == NULL || klass->klass_tag != tag) >>>>>>>>>>>> >>>>>>>>>>>> ????Otherwise, how can the second check ever work correctly >>>>>>>>>>>> as the >>>>>>>>>>>> return >>>>>>>>>>>> will always happen when (klass != NULL)? >>>>>>>>>>>> >>>>>>>>>>>> ??? There are several places in this file with the the indent: >>>>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 93 return; >>>>>>>>>>>> ???? 94???? } >>>>>>>>>>>> ??? ... 
>>>>>>>>>>>> 152 if (currentClassTag == -1) { >>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested >>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 155 return; >>>>>>>>>>>> ??? 156???? } >>>>>>>>>>>> ??? ... >>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) { >>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class >>>>>>>>>>>> trackingEnv"); >>>>>>>>>>>> ??? 163???? } >>>>>>>>>>>> 164 if (tag != 0l) { >>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> 166 return; // Already added >>>>>>>>>>>> ??? 167???? } >>>>>>>>>>>> ??? ... >>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>>>>>>> 282 { >>>>>>>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>>>>>>> 284 jvmtiDeallocate(sig); >>>>>>>>>>>> 285 return JNI_TRUE; >>>>>>>>>>>> ??? 286 } >>>>>>>>>>>> ??? ... >>>>>>>>>>>> ??? 291 void >>>>>>>>>>>> ??? 292 classTrack_reset(void) >>>>>>>>>>>> ??? 293 { >>>>>>>>>>>> 294 int idx; >>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>>> 296 >>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>>>>>>> 298 KlassNode* node = table[idx]; >>>>>>>>>>>> 299 while (node != NULL) { >>>>>>>>>>>> 300 KlassNode* next = node->next; >>>>>>>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>>>>>>> 302 jvmtiDeallocate(node); >>>>>>>>>>>> 303 node = next; >>>>>>>>>>>> 304 } >>>>>>>>>>>> 305 } >>>>>>>>>>>> 306 jvmtiDeallocate(table); >>>>>>>>>>>> 307 >>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>>>>>>> 310 >>>>>>>>>>>> 311 currentClassTag = -1; >>>>>>>>>>>> 312 >>>>>>>>>>>> 313 >>>>>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 314 trackingEnv = NULL; >>>>>>>>>>>> 315 >>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>> >>>>>>>>>>>> Could you, please, fix several comments below? 
>>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>>>>>>> class-unloads >>>>>>>>>>>> ????The comma is not needed. >>>>>>>>>>>> ????Would it better to replace: klass tags => klass_tag's ? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and >>>>>>>>>>>> deletedSignatureBag >>>>>>>>>>>> consistent >>>>>>>>>>>> ????Maybe: Lock to guard ... or lock to keep integrity of ... >>>>>>>>>>>> >>>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and >>>>>>>>>>>> remembers it in deletedSignatureBag. Would be better to use >>>>>>>>>>>> words >>>>>>>>>>>> like >>>>>>>>>>>> "store" or "record", "Find" should not start from capital >>>>>>>>>>>> letter: >>>>>>>>>>>> Invoke the callback when classes are freed, find and record the >>>>>>>>>>>> signature in deletedSignatureBag. >>>>>>>>>>>> >>>>>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not >>>>>>>>>>>> initialized, >>>>>>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>>>>>>> Missed >>>>>>>>>>>> dot >>>>>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) >>>>>>>>>>>> { // >>>>>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>>>>>>> point we >>>>>>>>>>>> have the KlassNode corresponding to the tag >>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>>>>>> ??? The comment above can be better. Maybe, something like: >>>>>>>>>>> ??? ? " At this point, we found the KlassNode matching the klass >>>>>>>>>>> tag(and it is >>>>>>>>>>> linked). >>>>>>>>>>> >>>>>>>>>>>> 113 // Remember the unloaded signature. >>>>>>>>>>> ????Better: Record the signature of the unloaded class and >>>>>>>>>>> unlink it. 
>>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>>>>>> Hello all, >>>>>>>>>>>>> >>>>>>>>>>>>> Can I please get reviews of this change? In the meantime, >>>>>>>>>>>>> we've done >>>>>>>>>>>>> more testing and also field-/torture-testing by a customer >>>>>>>>>>>>> who is >>>>>>>>>>>>> happy >>>>>>>>>>>>> now. :-) >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Roman >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for reviewing! >>>>>>>>>>>>>> >>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>>>>> disconnect, >>>>>>>>>>>>>> namely move setup of the trackingEnv and >>>>>>>>>>>>>> deletedSignatureBag to >>>>>>>>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Let me know what you think! >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 72 /* >>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>>>>> 74 */ >>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>>>>>> Must be >>>>>>>>>>>>>>> accessed under >>>>>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>>>>> ???? 80? 
*/ >>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ???? The comments contradict to each other. >>>>>>>>>>>>>>> ???? I guess, the lock name at line 79 has to be >>>>>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>>>>> ???? Also, comma at the end must be replaced with dot. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>> 104 return; >>>>>>>>>>>>>>> 105 } >>>>>>>>>>>>>>> ??? 106 >>>>>>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>>>>> ??? 113???? } >>>>>>>>>>>>>>> 114 >>>>>>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>> 118 return; >>>>>>>>>>>>>>> ??? 119???? } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ????The code above can be simplified, so that the lines >>>>>>>>>>>>>>> 101-105 >>>>>>>>>>>>>>> are not >>>>>>>>>>>>>>> needed anymore. >>>>>>>>>>>>>>> ????It can be something like this: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>>>>> found - ignore. >>>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>> return; >>>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>>>>> rest. 
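The simplified scan suggested above can be sketched as a self-contained C fragment. It is only an illustration, not the code from the webrev: the table size `SLOT_COUNT`, the helpers `table_add()` and `table_remove()`, and the use of `int64_t` in place of `jlong` are assumptions made for this sketch. Once the loop exits, `klass` is either NULL or the matching node, so a single NULL check is sufficient.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOT_COUNT 263  /* hypothetical table size, not taken from the webrev */

typedef struct KlassNode {
    int64_t klass_tag;          /* stand-in for the jlong tag */
    const char *signature;      /* class signature recorded for this tag */
    struct KlassNode *next;     /* next node in the same bucket */
} KlassNode;

/* Each slot is the head of a singly linked list; zero-initialized. */
static KlassNode *table[SLOT_COUNT];

/* Prepend a node to the bucket selected by tag % SLOT_COUNT: O(1).
 * Returns 0 on success, -1 if allocation fails. */
static int table_add(int64_t tag, const char *signature) {
    assert(signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }
    size_t slot = (size_t)((uint64_t)tag % SLOT_COUNT);
    node->klass_tag = tag;
    node->signature = signature;
    node->next = table[slot];
    table[slot] = node;
    return 0;
}

/* Unlink the node carrying the given tag and return its signature,
 * or NULL if no node with that tag is present. */
static const char *table_remove(int64_t tag) {
    KlassNode **klass_ptr = &table[(uint64_t)tag % SLOT_COUNT];
    KlassNode *klass = *klass_ptr;

    /* Scan the bucket's linked list. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {
        return NULL;  /* klass not found - ignore */
    }

    /* klass_ptr points at the link that refers to klass, so unlinking
     * works the same for the bucket head and for interior nodes. */
    *klass_ptr = klass->next;
    const char *signature = klass->signature;
    free(klass);
    return signature;
}
```

Keeping a pointer to the link (`klass_ptr`) rather than a pointer to the previous node avoids a special case for the bucket head.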
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen >>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>>>>>> lock on >>>>>>>>>>>>>>>> basically every operation, and also need to check whether >>>>>>>>>>>>>>>> or not >>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>>>>>> (e.g. an empty >>>>>>>>>>>>>>>> list) when we're not. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a >>>>>>>>>>>>>>>>> tag, and we >>>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>>>>> table, which >>>>>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. >>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>> table is >>>>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend >>>>>>>>>>>>>>>>> the new >>>>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up >>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>> signature of >>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. >>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is >>>>>>>>>>>>>>>>> ~O(1) operation >>>>>>>>>>>>>>>>> too, depending on the depth of the table. 
In my testcase >>>>>>>>>>>>>>>>> which hammered >>>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see >>>>>>>>>>>>>>>>> depths >>>>>>>>>>>>>>>>> of like 2-3, >>>>>>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out >>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets >>>>>>>>>>>>>>>>> detached >>>>>>>>>>>>>>>>> and/or >>>>>>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation >>>>>>>>>>>>>>>>> (was >>>>>>>>>>>>>>>>> missing >>>>>>>>>>>>>>>>> before). >>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right >>>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be >>>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>>> to improve >>>>>>>>>>>>>>>>> in the future? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself >>>>>>>>>>>>>>>>> looks >>>>>>>>>>>>>>>>> really good. >>>>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. 
I am >>>>>>>>>>>>>>>>>> implementing >>>>>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing >>>>>>>>>>>>>>>>>> for now. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> ??? Hi Chris, >>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be >>>>>>>>>>>>>>>>>>>> for a >>>>>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the >>>>>>>>>>>>>>>>>>>> changes. >>>>>>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a >>>>>>>>>>>>>>>>>>> table >>>>>>>>>>>>>>>>>>> of currently >>>>>>>>>>>>>>>>>>> prepared classes by building that table when >>>>>>>>>>>>>>>>>>> classTrack is >>>>>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. >>>>>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>>>>> compared with the >>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the >>>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>>>>> complexity. 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). >>>>>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is >>>>>>>>>>>>>>>>>>> scanned, >>>>>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption >>>>>>>>>>>>>>>>>>> here >>>>>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon >>>>>>>>>>>>>>>>>>> unload, >>>>>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently >>>>>>>>>>>>>>>>>>> see >>>>>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>>>>>>>> when there's an >>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. 
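The list-plus-bag scheme described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from webrev.01: the deleted tags are held in a plain array instead of the agent's bag type, `int64_t` stands in for `jlong`, and `list_prepend()` and `sweep_unloaded()` are hypothetical helper names. Each sweep costs on the order of classCount * unloadedClassCount comparisons, which is why the approach relies on unloadedClassCount being small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct KlassNode {
    int64_t klass_tag;          /* stand-in for the jlong tag */
    const char *signature;      /* signature of the prepared class */
    struct KlassNode *next;
} KlassNode;

/* Record a newly prepared class at the head of the list.
 * Returns 0 on success, -1 if allocation fails. */
static int list_prepend(KlassNode **head, int64_t tag, const char *signature) {
    assert(head != NULL && signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }
    node->klass_tag = tag;
    node->signature = signature;
    node->next = *head;
    *head = node;
    return 0;
}

/* Linear membership test over the deleted tags: O(deleted_count). */
static int tag_is_deleted(const int64_t *deleted, size_t deleted_count, int64_t tag) {
    for (size_t i = 0; i < deleted_count; i++) {
        if (deleted[i] == tag) {
            return 1;
        }
    }
    return 0;
}

/* Walk the prepared-classes list once, unlink every node whose tag was
 * reported as freed, and store its signature in out[]. Stops early if
 * out[] is full. Returns the number of signatures written. */
static size_t sweep_unloaded(KlassNode **head,
                             const int64_t *deleted, size_t deleted_count,
                             const char **out, size_t out_capacity) {
    size_t found = 0;
    KlassNode **link = head;
    while (*link != NULL && found < out_capacity) {
        KlassNode *node = *link;
        if (tag_is_deleted(deleted, deleted_count, node->klass_tag)) {
            out[found++] = node->signature;
            *link = node->next;   /* unlink; link itself stays in place */
            free(node);
        } else {
            link = &node->next;   /* keep the node, advance */
        }
    }
    return found;
}
```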
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c.
>>>>>>>>>>>>>>>>>>>>> It avoids throwing away the class cache on GC, and instead
>>>>>>>>>>>>>>>>>>>>> keeps track of loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>>>>>>>> agent registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> E.g. with the testcase provided here:
>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I am getting these numbers:
>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>
>>>
>
>

From daniil.x.titov at oracle.com Thu Mar 26 13:56:22 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 26 Mar 2020 06:56:22 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port
In-Reply-To: <3fe390e9-39d1-c547-9480-fa1962cef0d8@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
 <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com>
 <202C5C21-BA69-4ACF-9421-A9B5D6704C8C@oracle.com>
 <3fe390e9-39d1-c547-9480-fa1962cef0d8@oracle.com>
Message-ID:

Hi Yasumasa and Serguei,

Thank you for reviewing this change.

Best regards,
--Daniil

On 3/25/20, 1:01 PM, "serguei.spitsyn at oracle.com" wrote:

Hi Daniil,

On 3/24/20 10:00, Daniil Titov wrote:
> Hi Serguei,
>
>> It looks like you removed the last call site of DebugServer.main.
> Yes. It is correct.
>
>> Do we need to remove the DebugServer.java as well?
> I was considering this, but since it is a public class I think it needs to be deprecated first. I also think that it would be better to do that in a separate issue
> since a CSR for deprecation needs to be filed for that. If you agree I will create a new issue for that.

I'm okay to separate this.

Thanks,
Serguei

>
> Thanks,
> Daniil
>
>
> On 3/23/20, 11:56 PM, "serguei.spitsyn at oracle.com" wrote:
>
> Hi Daniil,
>
> It looks pretty good in general.
>
> It looks like you removed the last call site of DebugServer.main.
> Do we need to remove the DebugServer.java as well?
>
> Thanks,
> Serguei
>
>
> On 3/22/20 15:29, Daniil Titov wrote:
> > Hi Yasumasa, Serguei and Alex,
> >
> > Please review a new version of the webrev that merges SADebugDTest.java with changes done in [2].
> >
> > Also the CSR [3] and the help message for the debug server in SALauncher.java were updated to specify that the '--hostname'
> > option could be a hostname or an IPv4/IPv6 address.
> >
> > > Ok, but I think it might be more simply with TestLibrary.
> > > For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
> >
> > TestLibrary::getUnusedRandomPort() doesn't allow specifying which ports are reserved, and it uses a hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, the test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be used directly in test/hotspot/jtreg/serviceability/* tests (it doesn't compile).
> >
> > Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int... reservedPorts) from SADebugDTest.java to jdk.test.lib.Utils in /test/lib.
> >
> > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> >
> > [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/
> > [2] https://bugs.openjdk.java.net/browse/JDK-8238268
> > [3] https://bugs.openjdk.java.net/browse/JDK-8239831
> >
> > Thank you,
> > Daniil
> >
> > On 3/13/20, 7:23 PM, "Yasumasa Suenaga" wrote:
> >
> > Hi Daniil,
> >
> > On 2020/03/14 7:05, Daniil Titov wrote:
> > > Hi Yasumasa, Serguei and Alex,
> > >
> > > Please review a new version of the webrev that includes the changes Yasumasa suggested.
> > >
> > >> Shutdown hook is already registered in c'tor of HotSpotAgent.
> > >> It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
> > >
> > > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
> > > the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression.
> > >
> > > 101 public HotSpotAgent() {
> > > 102 // for non-server add shutdown hook to clean-up debugger in case
> > > 103 // of forced exit.
For remote server, shutdown hook is added by > > > 104 // DebugServer. > > > 105 Runtime.getRuntime().addShutdownHook(new java.lang.Thread( > > > 106 new Runnable() { > > > 107 public void run() { > > > 108 synchronized (HotSpotAgent.this) { > > > 109 if (!isServer) { > > > 110 detach(); > > > 111 } > > > 112 } > > > 113 } > > > 114 })); > > > 115 } > > > > I missed it, thanks! > > > > > > >>> Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains > > >>> `exclusiveAccess.dirs=.` to avoid concurrent execution > > > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests. Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays. > > > > Ok, but I think it might be more simply with TestLibrary. > > For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java . > > > > > > Thanks, > > > > Yasumasa > > > > > > > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/ > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > Thank you, > > > Daniil > > > > > > ?On 3/6/20, 6:15 PM, "Yasumasa Suenaga" wrote: > > > > > > Hi Daniil, > > > > > > On 2020/03/07 3:38, Daniil Titov wrote: > > > > Hi Yasumasa, > > > > > > > > -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side-effect ( it throws an exception if parameters are incorrect) and ignores its return might look confusing. 
Thus, I found it more appropriate to wrap it inside a method with more relevant name . > > > > > > Ok, but I prefer to leave comment it. > > > > > > > > > > > SADebugDTest > > > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. > > > > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final. > > > > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array. > > > > > > Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution. > > > If you do not think this error check, test code is more simply. > > > > > > > > > > I will include your other suggestion in the new version of the webrev. > > > > > > Sorry, I have one more comment: > > > > > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > > > Shutdown hook is already registered in c'tor of HotSpotAgent. > > > It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed. > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > > Thanks! > > > > Daniil > > > > > > > > ?On 3/6/20, 12:30 AM, "Yasumasa Suenaga" wrote: > > > > > > > > Hi Daniil, > > > > > > > > > > > > - SALauncher.java > > > > - checkBasicOptions() is needed? I think you can remove this method and embed it in caller. > > > > - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex. > > > > - Shutdown hook is very good idea. You can implement more simply if you use lambda expression. > > > > > > > > - SADebugDTest.java > > > > - Please add bug ID to @bug. > > > > - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array. 
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Yasumasa
> > > >
> > > >
> > > > On 2020/03/06 10:15, Daniil Titov wrote:
> > > > > Hi Yasumasa, Serguei and Alex,
> > > > >
> > > > > Please review a new version of the fix [1] that addresses your comments. In addition to the RMI connector
> > > > > port option, the new version introduces two more options to specify the RMI registry port and the RMI connector host name. Currently, these
> > > > > last two settings can be specified using system properties, but the system properties have the following disadvantages
> > > > > compared to the command line options:
> > > > > - It's hard to know about them: they are not listed in the tool's help.
> > > > > - They have long names that are hard to remember.
> > > > > - It is easy to mistype them on the command line, and you will not get any warning about it.
> > > > >
> > > > > The CSR [2] was also updated and needs to be reviewed.
> > > > >
> > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
> > > > > container and connecting to it with the GUI debugger. Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> > > > >
> > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
> > > > > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
> > > > > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
> > > > >
> > > > > Thank you,
> > > > > Daniil
> > > > >
> > > > > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" wrote:
> > > > >
> > > > > Hi Daniil,
> > > > >
> > > > > - SALauncher::buildAttachArgs not only builds arguments but also checks the consistency of arguments.
> > > > > Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
> > > > >
> > > > > - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
> > > > > But you can use same port number as RMI registry (1099).
> > > > > It is same as relation between jmxremote.port and jmxremote.rmi.port. > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Yasumasa > > > > > > > > > > > > > > > On 2020/02/24 13:21, Daniil Titov wrote: > > > > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > > > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > > > > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > > > > > > > // delegate to the actual SA debug server. > > > > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > > > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > > > > but I would prefer to address it in a separate issue. > > > > > > > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > > > > container and connecting to it with the GUI debugger. > > > > > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > > > > > > > > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > > > > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > > > > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > > > > > > > > > Thank you, > > > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From chris.plummer at oracle.com Thu Mar 26 14:59:09 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 26 Mar 2020 07:59:09 -0700 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com> <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com> <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com> <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com> Message-ID: <966eb7f4-ff8f-50ba-dabf-c1c29b1999ef@oracle.com> Hi Roman, Yes. Thank you. Chris On 3/26/20 1:44 AM, Roman Kennke wrote: > That was in the previous implementation: I got a condition wrong in the > table lookup (as noted by Serguei), and this prevented any > class-unload-events from getting out. I have fixed this, but found other > problems in that implementation (deadlocks and a crash). 
>
> The current implementation has none of these problems: we don't need
> table-lookups - we simply pass-through the signatures, and locking is
> much simpler and in particular we don't need a lock around the JVMTI
> call (SetTag) which was the cause of the deadlock.
>
> Does that answer your questions?
>
> Thanks,
> Roman
>
>> Hi Roman,
>>
>> It passed all my testing. I think before you push Serguei has a question
>> regarding an issue you brought up a while back. You mentioned that you
>> weren't getting some events, and suddenly started seeing them. We were
>> discussing it today and it was unclear if this was an issue you were
>> seeing before your changes, and your changes resolved it, or it was
>> initially caused by an earlier version of your changes, and you later
>> fixed it. We just want to better understand what this issue was and how
>> it was fixed.
>>
>> thanks,
>>
>> Chris
>>
>> On 3/25/20 3:22 PM, Roman Kennke wrote:
>>> The new job finished, its ID is:
>>>
>>>   mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>>>
>>> Thank you,
>>> Roman
>>>
>>>
>>>> Yes, please submit a new job. I'll start my testing once I see that the
>>>> builds are done.
>>>>
>>>> Chris
>>>>
>>>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Apparently we can get into classTrack_reset() before calling activate(),
>>>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>>>>> the cleaning routine fixes the problem for me.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>>>
>>>>> Should I post another submit-repo job with that fix?
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>
>>>>>> Hi Roman,
>>>>>>
>>>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>>>
>>>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000], sp=0x00007fbb791f8af0,  free space=1022k
>>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>>>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>>>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>>>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>>>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>>>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>>>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>>>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>>>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>>>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>>>>
>>>>>>
>>>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>>>>> doesn't seem to be anything magic on the command line that might be
>>>>>> triggering it. I see it with pretty much all the various VM configs we
>>>>>> test.
>>>>>>
>>>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>>>
>>>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>>>>
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>>> Regarding the new assert:
>>>>>>>>
>>>>>>>>    105     if (gdata && gdata->assertOn) {
>>>>>>>>    106         // Check this is not already tagged.
>>>>>>>>    107         jlong tag;
>>>>>>>>    108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>>>>    109         if (error != JVMTI_ERROR_NONE) {
>>>>>>>>    110             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>    111         }
>>>>>>>>    112         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>>>    113     }
>>>>>>>>
>>>>>>>> I think you should remove the gdata check. gdata should never be NULL
>>>>>>>> when you get to this code. If it is ever NULL then there's a bug, and
>>>>>>>> the check will hide the bug.
>>>>>>> Ok, will remove this.
>>>>>>>
>>>>>>>> Regarding testing, after you do the submit repo testing let me know the
>>>>>>>> jobID and I'll do additional testing on it.
>>>>>>> I did the submit repo earlier today, and it came back green:
>>>>>>>
>>>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>>>> Hi Serguei,
>>>>>>>>>
>>>>>>>>>> The fix looks pretty clean now.
>>>>>>>>>> I also like the new name of the lock. :)
>>>>>>>>> Thank you!
>>>>>>>>>
>>>>>>>>>> Just one comment below.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 110     if (tag != 0l) {
>>>>>>>>>> 111         return; // Already added
>>>>>>>>>> 112     }
>>>>>>>>>>
>>>>>>>>>>     It is better to use a named constant or macro instead.
>>>>>>>>>>     Also, it'd be nice to add a short comment about what this value is.
>>>>>>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>>>>>>> assert. I also made a constant for the value 0, which should be pretty
>>>>>>>>> much self-explaining.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>>>>
>>>>>>>>>> How do you test the fix?
>>>>>>>>> I am using a manual test that is provided in this bug report:
>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>
>>>>>>>>> "Script to compare performance of GC with and without debugger, when
>>>>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>>>>
>>>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>>>>
>>>>>>>>> I am also using this test and manually attach/detach jdb a couple of
>>>>>>>>> times in a row to check that disconnecting and reconnecting works well
>>>>>>>>> (this tended to deadlock or crash with an earlier version of the patch,
>>>>>>>>> and is now looking good).
>>>>>>>>>
>>>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we all
>>>>>>>>> agree that the fix is reasonable, I will push it to the submit repo. I
>>>>>>>>> am not sure if any of those tests actually exercise that code, though.
>>>>>>>>> Let me know if you want me to run any specific tests.
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>>>>> I believe I came up with a much simpler solution that also solves the
>>>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>>>
>>>>>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>>>>>>> to the signature of a class into the tag, and pull it out again when we
>>>>>>>>>>> get notified that the class gets unloaded.
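The pointer-in-tag idea described above can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the code from webrev.06: `tag_t` is a hypothetical stand-in for JVMTI's `jlong`, and `tag_from_signature()` and `signature_from_tag()` are made-up helper names. In the real agent the encoded value would be handed to JVMTI's SetTag and would come back as the `tag` argument of the ObjectFree callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JVMTI's jlong tag type (hypothetical typedef for this sketch). */
typedef int64_t tag_t;

/* Tag value meaning "no tag"; a heap pointer is never 0. */
#define NOT_TAGGED ((tag_t)0)

/* Copy the signature and encode the address of the copy as the tag.
 * Returns NOT_TAGGED if the copy cannot be allocated. */
static tag_t tag_from_signature(const char *signature) {
    assert(signature != NULL);
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return NOT_TAGGED;
    }
    memcpy(copy, signature, len);
    return (tag_t)(intptr_t)copy;
}

/* Decode the tag back into the signature pointer: O(1), no table lookup.
 * The caller owns the returned string and releases it with free(). */
static char *signature_from_tag(tag_t tag) {
    return (char *)(intptr_t)tag;
}
```

Because the tag itself carries the signature, the unload callback needs no shared lookup structure; only the collection that gathers the unloaded signatures still has to be protected.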
>>>>>>>>>>> >>>>>>>>>>> This means we don't need an extra data-structure to keep track of >>>>>>>>>>> classes and signatures, and it also makes the story around >>>>>>>>>>> locking >>>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning >>>>>>>>>>> of all >>>>>>>>>>> classes needed (as in the current implementation) and no >>>>>>>>>>> searching of >>>>>>>>>>> table needed (like in my previous attempts). >>>>>>>>>>> >>>>>>>>>>> Please review this new revision: >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>>>>>>> >>>>>>>>>>> (Notice that there still appears to be a performance bottleneck >>>>>>>>>>> with >>>>>>>>>>> class-unloading when an actual debugger is attached. This >>>>>>>>>>> doesn't seem >>>>>>>>>>> to be related to the classTrack.c implementation though, but >>>>>>>>>>> looks like >>>>>>>>>>> a consequence of getting all those class-unload notifications >>>>>>>>>>> over the >>>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging >>>>>>>>>>> up the >>>>>>>>>>> buffers.) >>>>>>>>>>> >>>>>>>>>>> I am not sure why jdb needs to enable class-unload listener >>>>>>>>>>> always. A >>>>>>>>>>> simple hack disables it, and performance is brilliant, even when >>>>>>>>>>> jdb is >>>>>>>>>>> attached: >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> But this is not in the scope of this bug.) >>>>>>>>>>> >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>>>>>>>> >>>>>>>>>>>>> Some comments are below. 
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>  87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>>>>  88 {
>>>>>>>>>>>>>  89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>>  90 if (currentClassTag == -1) {
>>>>>>>>>>>>>  91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>  92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>  93 return;
>>>>>>>>>>>>>  94     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> Just a question:
>>>>>>>>>>>>>      Q1: Should the ObjectFree events be disabled for the jvmtiEnv that
>>>>>>>>>>>>>          does the class tracking if class tracking has not been initialized?
>>>>>>>>>>>>>
>>>>>>>>>>>>>  70 static jlong currentClassTag;
>>>>>>>>>>>>>
>>>>>>>>>>>>> I wonder whether a better name would be something like lastClassTag or
>>>>>>>>>>>>> highestClassTag.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  99 KlassNode* klass = *klass_ptr;
>>>>>>>>>>>>> 100
>>>>>>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>>>>>>> 105 }
>>>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 108 return;
>>>>>>>>>>>>> 109     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>     It seems to me that something is wrong in the condition at L106 above.
>>>>>>>>>>>>>     Should it be:
>>>>>>>>>>>>>        if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>>>
>>>>>>>>>>>>>     Otherwise, how can the second check ever work correctly, as the return
>>>>>>>>>>>>>     will always happen when (klass != NULL)?
>>>>>>>>>>>>>
>>>>>>>>>>>>>     There are several places in this file with the wrong indent:
>>>>>>>>>>>>>  90 if (currentClassTag == -1) {
>>>>>>>>>>>>>  91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>  92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>  93 return;
>>>>>>>>>>>>>  94     }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 155 return;
>>>>>>>>>>>>> 156     }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>>>> 163     }
>>>>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>>>> 167     }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>>>> 282 {
>>>>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>>>> 286 }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>> 291 void
>>>>>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>>>>>> 293 {
>>>>>>>>>>>>> 294 int idx;
>>>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>> 296
>>>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>>>>> 303 node = next;
>>>>>>>>>>>>> 304 }
>>>>>>>>>>>>> 305 }
>>>>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>>>>> 307
>>>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>>>> 310
>>>>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>>>>> 312
>>>>>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>>>>> 315
>>>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could you please fix several comments below?
>>>>>>>>>>>>>
>>>>>>>>>>>>>  63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>>>>>     The comma is not needed.
>>>>>>>>>>>>>     Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>>>
>>>>>>>>>>>>>  73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>>>>>     Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>>>>
>>>>>>>>>>>>>  84 * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
>>>>>>>>>>>>>     It would be better to use words like "store" or "record", and "Find"
>>>>>>>>>>>>>     should not start with a capital letter:
>>>>>>>>>>>>>     Invoke the callback when classes are freed, find and record the
>>>>>>>>>>>>>     signature in deletedSignatureBag.
>>>>>>>>>>>>> >>>>>>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not >>>>>>>>>>>>> initialized, >>>>>>>>>>>>> nobody's interested 153 // Class tracking not initialized yet, >>>>>>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */ >>>>>>>>>>>>> Missed >>>>>>>>>>>>> dot >>>>>>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) >>>>>>>>>>>>> { // >>>>>>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the >>>>>>>>>>>>> comment does not start from a capital letter. 111 // At this >>>>>>>>>>>>> point we >>>>>>>>>>>>> have the KlassNode corresponding to the tag >>>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node. >>>>>>>>>>>> ??? The comment above can be better. Maybe, something like: >>>>>>>>>>>> ??? ? " At this point, we found the KlassNode matching the klass >>>>>>>>>>>> tag(and it is >>>>>>>>>>>> linked). >>>>>>>>>>>> >>>>>>>>>>>>> 113 // Remember the unloaded signature. >>>>>>>>>>>> ????Better: Record the signature of the unloaded class and >>>>>>>>>>>> unlink it. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote: >>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Can I please get reviews of this change? In the meantime, >>>>>>>>>>>>>> we've done >>>>>>>>>>>>>> more testing and also field-/torture-testing by a customer >>>>>>>>>>>>>> who is >>>>>>>>>>>>>> happy >>>>>>>>>>>>>> now. :-) >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for reviewing! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good! 
>>>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>>>>>> disconnect, >>>>>>>>>>>>>>> namely move setup of the trackingEnv and >>>>>>>>>>>>>>> deletedSignatureBag to >>>>>>>>>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Let me know what you think! >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 72 /* >>>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>>>>>> 74 */ >>>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>>>>>>> Must be >>>>>>>>>>>>>>>> accessed under >>>>>>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>>>>>> ???? 80? */ >>>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ???? The comments contradict to each other. >>>>>>>>>>>>>>>> ???? I guess, the lock name at line 79 has to be >>>>>>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>>>>>> ???? Also, comma at the end must be replaced with dot. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>>> 104 return; >>>>>>>>>>>>>>>> 105 } >>>>>>>>>>>>>>>> ??? 106 >>>>>>>>>>>>>>>> 107 // Scan linked-list. 
>>>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>>>>>> ??? 113???? } >>>>>>>>>>>>>>>> 114 >>>>>>>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>>> 118 return; >>>>>>>>>>>>>>>> ??? 119???? } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ????The code above can be simplified, so that the lines >>>>>>>>>>>>>>>> 101-105 >>>>>>>>>>>>>>>> are not >>>>>>>>>>>>>>>> needed anymore. >>>>>>>>>>>>>>>> ????It can be something like this: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>>>>>> found - ignore. >>>>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>>> return; >>>>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>>>>>> rest. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen >>>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the >>>>>>>>>>>>>>>>> lock on >>>>>>>>>>>>>>>>> basically every operation, and also need to check whether >>>>>>>>>>>>>>>>> or not >>>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>>>>>>> (e.g. an empty >>>>>>>>>>>>>>>>> list) when we're not. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a >>>>>>>>>>>>>>>>>> tag, and we >>>>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>>>>>> table, which >>>>>>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. >>>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>> table is >>>>>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend >>>>>>>>>>>>>>>>>> the new >>>>>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> signature of >>>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. >>>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is >>>>>>>>>>>>>>>>>> ~O(1) operation >>>>>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase >>>>>>>>>>>>>>>>>> which hammered >>>>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see >>>>>>>>>>>>>>>>>> depths >>>>>>>>>>>>>>>>>> of like 2-3, >>>>>>>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out >>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>>>>>> allocate a new one. 
>>>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets >>>>>>>>>>>>>>>>>> detached >>>>>>>>>>>>>>>>>> and/or >>>>>>>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation >>>>>>>>>>>>>>>>>> (was >>>>>>>>>>>>>>>>>> missing >>>>>>>>>>>>>>>>>> before). >>>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right >>>>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be >>>>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>>>> to improve >>>>>>>>>>>>>>>>>> in the future? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself >>>>>>>>>>>>>>>>>> looks >>>>>>>>>>>>>>>>>> really good. >>>>>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am >>>>>>>>>>>>>>>>>>> implementing >>>>>>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing >>>>>>>>>>>>>>>>>>> for now. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ??? 
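[Editorial sketch, not part of the original mails.] The table design listed above for webrev.03 (index by tag % slot-count, prepend on class prepare, unlink on unload) can be sketched as follows. The names KlassNode, klass_tag, signature, next and table come from the review; SLOT_COUNT and its value, slot_for, track_class and untrack_class are hypothetical, since the thread only names CT_SLOT_COUNT without giving its value. Locking, which the patch does with debugMonitorEnter/debugMonitorExit, is omitted here, and malloc/free stand in for the agent's own allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263  /* hypothetical value; the thread does not state CT_SLOT_COUNT */

typedef struct KlassNode {
    int64_t klass_tag;        /* tag assigned when the class was prepared */
    char *signature;          /* heap copy of the class signature */
    struct KlassNode *next;   /* next node in the same slot */
} KlassNode;

static KlassNode *table[SLOT_COUNT];  /* each entry heads a linked list */

static size_t slot_for(int64_t tag) {
    return (size_t)((uint64_t)tag % SLOT_COUNT);
}

/* Register a prepared class: prepend to its slot, an O(1) operation.
 * Returns 1 on success, 0 if allocation failed. */
static int track_class(int64_t tag, const char *signature) {
    assert(signature != NULL);
    size_t len = strlen(signature) + 1;
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = tag;
    size_t slot = slot_for(tag);
    node->next = table[slot];
    table[slot] = node;
    return 1;
}

/* Handle an unload: scan only the tag's slot, unlink the node and hand back
 * its signature (caller frees). Returns NULL when the tag is not tracked. */
static char *untrack_class(int64_t tag) {
    KlassNode **klass_ptr = &table[slot_for(tag)];
    KlassNode *klass = *klass_ptr;
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {      /* klass not found - ignore */
        return NULL;
    }
    *klass_ptr = klass->next; /* unlink from the slot's list */
    char *signature = klass->signature;
    free(klass);
    return signature;
}
```

The scan loop uses the `klass == NULL` test that the review of webrev.05 asks for; once the loop exits with a non-NULL node, its tag is known to match, so no second comparison is needed. The cost of an unload is proportional to the depth of one slot, which is what the "~O(1)" in the list above refers to.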
Hi Chris, >>>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be >>>>>>>>>>>>>>>>>>>>> for a >>>>>>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the >>>>>>>>>>>>>>>>>>>>> changes. >>>>>>>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a >>>>>>>>>>>>>>>>>>>> table >>>>>>>>>>>>>>>>>>>> of currently >>>>>>>>>>>>>>>>>>>> prepared classes by building that table when >>>>>>>>>>>>>>>>>>>> classTrack is >>>>>>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. >>>>>>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>>>>>> compared with the >>>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the >>>>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). 
>>>>>>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is >>>>>>>>>>>>>>>>>>>> scanned, >>>>>>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption >>>>>>>>>>>>>>>>>>>> here >>>>>>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon >>>>>>>>>>>>>>>>>>>> unload, >>>>>>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently >>>>>>>>>>>>>>>>>>>> see >>>>>>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>>>>>>>>> when there's an >>>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of >>>>>>>>>>>>>>>>>>>>>> classTrack.c. >>>>>>>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance >>>>>>>>>>>>>>>>>>>>>> until an >>>>>>>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>>>>>>>>> timing. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here: >>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I am getting those numbers: >>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>>>>>>>>>>>> patched:?? no debug: 85s with debug: 95s >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Can I please get a review? 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>>
>>

From kevin.walls at oracle.com Thu Mar 26 17:40:51 2020 From: kevin.walls at oracle.com (Kevin Walls) Date: Thu, 26 Mar 2020 17:40:51 +0000 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> Message-ID: <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>

Hi Yasumasa,

Oops, didn't catch this - I also had done some manual testing and in mach5 but clearly not enough.

Generally I think this looks good.

"lastFrame" can mean last as in final, or last as in previous. "last" is one of those annoying English words. Here it means final: if we get an Exception during processDwarf, use this to flag that we should return null from sender(). "finalFrame" would be clearer to me, anything else probably gets more verbose than you wanted.

Yes I like having the limit on the while loop in process_dwarf(), always worried how sane the information is that we are parsing through.

Thanks!
Kevin

On 24/03/2020 23:47, Yasumasa Suenaga wrote:
> Thanks Serguei!
>
> I will push it when I get a second reviewer.
>
>
> Yasumasa
>
>
> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> I'm okay with this update.
>> My mach5 test run for this patch has passed.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 3/23/20 17:08, Yasumasa Suenaga wrote:
>>> Hi Serguei,
>>>
>>> Thanks for your comment!
>>> I uploaded new webrev:
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>>>
>>> Also I pushed it to submit repo:
>>>
>>>   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>>>
>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>>>> Hi Yasumasa,
>>>>
>>>> The mach5 tier5 testing looks good.
>>>> The serviceability/sa/ClhsdbPstack.java test fails without the fix and
>>>> does not fail with it.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> I looked at your changes.
>>>>> It is hard to understand if this fully solves the issue.
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>
>>>>> @@ -34,10 +34,11 @@
>>>>>
>>>>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>>>>         Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>         DwarfParser dwarf = null;
>>>>> +       boolean unsupportedDwarf = false;
>>>>>
>>>>>         if (libptr != null) { // Native frame
>>>>>           try {
>>>>>             dwarf = new DwarfParser(libptr);
>>>>>             dwarf.processDwarf(rip);
>>>>> ----------
>>>>>
>>>>> @@ -45,24 +46,33 @@
>>>>>                    !dwarf.isBPOffsetAvailable())
>>>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>                                .addOffsetTo(dwarf.getCFAOffset());
>>>>>           } catch (DebuggerException e) {
>>>>> -           // Bail out to Java frame case
>>>>> +           if (dwarf != null) {
>>>>> +             // DWARF processing should succeed when the frame is native
>>>>> +             // but it might fail if CIE has language personality routine
>>>>> +             // and/or LSDA.
>>>>> +             dwarf = null;
>>>>> +             unsupportedDwarf = true;
>>>>> +           } else {
>>>>> +             throw e;
>>>>> +           }
>>>>>           }
>>>>>         }
>>>>>
>>>>>         return (cfa == null) ? null
>>>>> -                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>>> +                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>>      }
>>>>>
>>>>> @@ -121,13 +131,25 @@
>>>>>        }
>>>>>
>>>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>>      }
>>>>>
>>>>> -    private DwarfParser getNextDwarf(Address nextPC) {
>>>>> -      DwarfParser nextDwarf = null;
>>>>> +    @Override
>>>>> +    public CFrame sender(ThreadProxy thread) {
>>>>> +      if (!possibleNext) {
>>>>> +        return null;
>>>>> +      }
>>>>> +
>>>>> +      ThreadContext context = thread.getContext();
>>>>> +
>>>>> +      Address nextPC = getNextPC(dwarf != null);
>>>>> +      if (nextPC == null) {
>>>>> +        return null;
>>>>> +      }
>>>>>
>>>>> +      DwarfParser nextDwarf = null;
>>>>> +      boolean unsupportedDwarf = false;
>>>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>>          nextDwarf = dwarf;
>>>>>        } else {
>>>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>          if (libptr != null) {
>>>>> ----------
>>>>>
>>>>> @@ -138,33 +160,29 @@
>>>>>            }
>>>>>          }
>>>>>        }
>>>>>
>>>>>        if (nextDwarf != null) {
>>>>> +        try {
>>>>>          nextDwarf.processDwarf(nextPC);
>>>>> +        } catch (DebuggerException e) {
>>>>> +          // DWARF processing should succeed when the frame is native
>>>>> +          // but it might fail if CIE has language personality routine
>>>>> +          // and/or LSDA.
>>>>> +          nextDwarf = null;
>>>>> +          unsupportedDwarf = true;
>>>>>        }
>>>>>
>>>>> This fix looks like a hack.
>>>>> Should we just propagate the DebuggerException instead of trying
>>>>> to maintain the unsupportedDwarf flag?
>>>
>>> DwarfParser::processDwarf would throw DebuggerException if it cannot
>>> find DWARF which relates to PC.
>>> PC at this point is for the next frame. So the current frame (`this` object)
>>> is valid, and it should be processed.
>>>
>>>
>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE,
>>>>> FDE, LSDA, etc.) are used without any comments explaining them.
>>>>> The code has to be generally readable without looking into the
>>>>> DWARF spec each time.
>>>
>>> I added comments for them in this webrev.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>>> I'm submitting mach5 jobs to make sure the issue has been resolved
>>>>> with your fix.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>>>> Thanks Chris!
>>>>>> I'm waiting for reviewers for this change.
>>>>>>
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> The failure is due to JDK-8231634, so not something you need to
>>>>>>> worry about.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> I uploaded new webrev which includes reverting change for
>>>>>>>> ProblemList:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>>>
>>>>>>>> I tested it on submit repo
>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>>>> However I think it is not caused by this change because >>>>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed >>>>>>>> mode, it would not parse DWARF. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> The test has been problem listed so please add undoing this to >>>>>>>>> your webrev. Here's the diff that problem listed it: >>>>>>>>> >>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt >>>>>>>>> b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt >>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>> @@ -115,7 +115,7 @@ >>>>>>>>> ??serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all >>>>>>>>> ??serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all >>>>>>>>> ??serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all >>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all >>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 >>>>>>>>> solaris-all,linux-all >>>>>>>>> ??serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java >>>>>>>>> 8193639 solaris-all >>>>>>>>> ??serviceability/sa/ClhsdbScanOops.java >>>>>>>>> 8193639,8235220,8230731 >>>>>>>>> solaris-all,linux-x64,macosx-x64,windows-x64 >>>>>>>>> ??serviceability/sa/ClhsdbSource.java 8193639 solaris-all >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> This webrev has passed submit repo >>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and >>>>>>>>>> additional tests. >>>>>>>>>> So please review it: >>>>>>>>>> >>>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>> ? 
webrev: >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>>>>>>>>>> Thank you so much, David! >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote: >>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to >>>>>>>>>>>>>> submit repo. >>>>>>>>>>>>>> Could you try again? >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>>>>>>>>> >>>>>>>>>>>>>> webrev is here: >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>>>> >>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes >>>>>>>>>>>>> before I go to bed :) >>>>>>>>>>>> >>>>>>>>>>>> Seems to have passed okay. >>>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks a lot! >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>>>>>>>>> Sorry it is still crashing. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # >>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime >>>>>>>>>>>>>>> Environment: >>>>>>>>>>>>>>> # >>>>>>>>>>>>>>> #? 
SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, >>>>>>>>>>>>>>> tid=13704 >>>>>>>>>>>>>>> # >>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) >>>>>>>>>>>>>>> (fastdebug build >>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed >>>>>>>>>>>>>>> mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] >>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>> # >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Same as before. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then >>>>>>>>>>>>>>>>>> go and run additional internal tests (and even more >>>>>>>>>>>>>>>>>> builds) using that job. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for that tip Chris! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not >>>>>>>>>>>>>>>>> yet received the result. >>>>>>>>>>>>>>>>> I will share you when I get job ID. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to >>>>>>>>>>>>>>>> complete before submitting the additional tests. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thank you for testing it. 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when >>>>>>>>>>>>>>>>>>> DWARF has language personality routine or LSDA. >>>>>>>>>>>>>>>>>>> Could you try it? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>>>>>>>> Correction ... >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the >>>>>>>>>>>>>>>>>>>>>> code, but I'm putting the patch through our >>>>>>>>>>>>>>>>>>>>>> internal testing. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java >>>>>>>>>>>>>>>>>>>>> Runtime Environment: >>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>> #? 
SIGSEGV (0xb) at pc=0x00007fcdd403894e, >>>>>>>>>>>>>>>>>>>>> pid=16948, tid=16949 >>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment >>>>>>>>>>>>>>>>>>>>> (15.0) (fastdebug build >>>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM >>>>>>>>>>>>>>>>>>>>> (fastdebug >>>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, >>>>>>>>>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 >>>>>>>>>>>>>>>>>>>>> gc, linux-amd64) >>>>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] >>>>>>>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to >>>>>>>>>>>>>>>>>>>>> always crash now. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs >>>>>>>>>>>>>>>>>>>> of the test in linux-x64. I don't see a pattern as >>>>>>>>>>>>>>>>>>>> to where it fails versus passes. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> ?? JBS: >>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>>>>>>>>> ?? webrev: >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for >>>>>>>>>>>>>>>>>>>>>>> unwinding native frames in jstack mixed mode. 
>>>>>>>>>>>>>>>>>>>>>>> However some error has seen intermittently after >>>>>>>>>>>>>>>>>>>>>>> that. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I investigated the cause of this, I found two >>>>>>>>>>>>>>>>>>>>>>> concerns: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> ?? A: lack of buffer (.eh_frame section data) >>>>>>>>>>>>>>>>>>>>>>> range check >>>>>>>>>>>>>>>>>>>>>>> ?? B: Language personality routine and Language >>>>>>>>>>>>>>>>>>>>>>> Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I addd range check for .eh_frame processing, and >>>>>>>>>>>>>>>>>>>>>>> ignore personality routine and LSDA in this webrev. >>>>>>>>>>>>>>>>>>>>>>> Also I added bailout code if DWARF processing is >>>>>>>>>>>>>>>>>>>>>>> failed due to these concerns. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo >>>>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle >>>>>>>>>>>>>>>>>>>>>>> Linux 7.7 container. 
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>
>>

From serguei.spitsyn at oracle.com Thu Mar 26 17:53:50 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Mar 2020 10:53:50 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>
Message-ID:

Hi Kevin,

Nice catch with the name "lastFrame". I was also confused when I reviewed this but did not come up with something better.

Thanks,
Serguei

On 3/26/20 10:40, Kevin Walls wrote:
> Hi Yasumasa,
>
> Oops, didn't catch this - I also had done some manual testing and in
> mach5 but clearly not enough.
>
> Generally I think this looks good.
>
> "lastFrame" can mean last as in final, or last as in previous. "last"
> is one of those annoying English words. Here it means final, if we
> get an Exception during processDwarf, use this to flag that we should
> return null from sender(). "finalFrame" would be clearer to me,
> anything else probably gets more verbose than you wanted.
>
> Yes I like having the limit on the while loop in process_dwarf(),
> always worried how sane the information is that we are parsing through.
>
> Thanks!
> Kevin > > > On 24/03/2020 23:47, Yasumasa Suenaga wrote: >> Thanks Serguei! >> >> I will push it when I get second reviewer. >> >> >> Yasumasa >> >> >> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote: >>> Hi Yasumasa, >>> >>> I'm okay with this update. >>> My mach5 test run for this patch is passed. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 3/23/20 17:08, Yasumasa Suenaga wrote: >>>> Hi Serguei, >>>> >>>> Thanks for your comment! >>>> I uploaded new webrev: >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/ >>>> >>>> Also I pushed it to submit repo: >>>> >>>> ? http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1 >>>> >>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote: >>>>> Hi Yasumasa, >>>>> >>>>> The mach5 tier5 testing looks good. >>>>> The serviceability/sa/ClhsdbPstack.java is failed without fix and >>>>> is not failed with it. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> I looked at you changes. >>>>>> It is hard to understand if this fully solves the issue. >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>> >>>>>> >>>>>> @@ -34,10 +34,11 @@ >>>>>> ? ???? public static LinuxAMD64CFrame getTopFrame(LinuxDebugger >>>>>> dbg, Address rip, ThreadContext context) { >>>>>> ??????? Address libptr = dbg.findLibPtrByAddress(rip); >>>>>> ??????? Address cfa = >>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); >>>>>> ??????? DwarfParser dwarf = null; >>>>>> + boolean unsupportedDwarf = false; >>>>>> ? ??????? if (libptr != null) { // Native frame >>>>>> ????????? try { >>>>>> ??????????? dwarf = new DwarfParser(libptr); >>>>>> ??????????? 
dwarf.processDwarf(rip); >>>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ >>>>>> >>>>>> @@ -45,24 +46,33 @@ >>>>>> ?????????????????? !dwarf.isBPOffsetAvailable()) >>>>>> ????????????????????? ? >>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>> ????????????????????? : >>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>> ????????? } catch (DebuggerException e) { >>>>>> - // Bail out to Java frame case >>>>>> + if (dwarf != null) { >>>>>> + // DWARF processing should succeed when the frame is native >>>>>> + // but it might fail if CIE has language personality routine >>>>>> + // and/or LSDA. >>>>>> + dwarf = null; >>>>>> + unsupportedDwarf = true; >>>>>> + } else { >>>>>> + throw e; >>>>>> + } >>>>>> ????????? } >>>>>> ??????? } >>>>>> ? ??????? return (cfa == null) ? null >>>>>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf); >>>>>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf); >>>>>> ???? } >>>>>> >>>>>> @@ -121,13 +131,25 @@ >>>>>> ?????? 
} >>>>>> ? ?????? return isValidFrame(nextCFA, context) ? nextCFA : null; >>>>>> ???? } >>>>>> ? - private DwarfParser getNextDwarf(Address nextPC) { >>>>>> - DwarfParser nextDwarf = null; >>>>>> + @Override >>>>>> + public CFrame sender(ThreadProxy thread) { >>>>>> + if (!possibleNext) { >>>>>> + return null; >>>>>> + } >>>>>> + >>>>>> + ThreadContext context = thread.getContext(); >>>>>> + >>>>>> + Address nextPC = getNextPC(dwarf != null); >>>>>> + if (nextPC == null) { >>>>>> + return null; >>>>>> + } >>>>>> ? + DwarfParser nextDwarf = null; >>>>>> + boolean unsupportedDwarf = false; >>>>>> ?????? if ((dwarf != null) && dwarf.isIn(nextPC)) { >>>>>> ???????? nextDwarf = dwarf; >>>>>> ?????? } else { >>>>>> ???????? Address libptr = dbg.findLibPtrByAddress(nextPC); >>>>>> ???????? if (libptr != null) { >>>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ >>>>>> >>>>>> @@ -138,33 +160,29 @@ >>>>>> ?????????? } >>>>>> ???????? } >>>>>> ?????? } >>>>>> ? ?????? if (nextDwarf != null) { >>>>>> + try { >>>>>> ???????? 
nextDwarf.processDwarf(nextPC); >>>>>> + } catch (DebuggerException e) { >>>>>> + // DWARF processing should succeed when the frame is native >>>>>> + // but it might fail if CIE has language personality routine >>>>>> + // and/or LSDA. >>>>>> + nextDwarf = null; >>>>>> + unsupportedDwarf = true; >>>>>> ?????? } >>>>>> >>>>>> This fix looks like a hack. >>>>>> Should we just propagate the Debugging exception instead of >>>>>> trying to maintain unsupportedDwarf flag? >>>> >>>> DwarfParser::processDwarf would throw DebuggerException if it >>>> cannot find DWARF which relates to PC. >>>> PC at this point is for next frame. So current frame (`this` >>>> object) is valid, and it should be processed. >>>> >>>> >>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, >>>>>> IDE,LSDA, etc.) are used without any comments explaining them. >>>>>> The code has to be generally readable without looking into the >>>>>> DWARF spec each time. >>>> >>>> I added comments for them in this webrev. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>>>> I'm submitting mach5 jobs to make sure the issue has been >>>>>> resolved with your fix. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote: >>>>>>> Thanks Chris! >>>>>>> I'm waiting for reviewers for this change. >>>>>>> >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2020/03/21 4:23, Chris Plummer wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> The failure is due to JDK-8231634, so not something you need to >>>>>>>> worry about. 
>>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> I uploaded new webrev which includes reverting change for >>>>>>>>> ProblemList: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/ >>>>>>>>> >>>>>>>>> I tested it on submit repo >>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301), >>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java. >>>>>>>>> However I think it is not caused by this change because >>>>>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed >>>>>>>>> mode, it would not parse DWARF. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> The test has been problem listed so please add undoing this >>>>>>>>>> to your webrev. Here's the diff that problem listed it: >>>>>>>>>> >>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>> b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>> @@ -115,7 +115,7 @@ >>>>>>>>>> ??serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all >>>>>>>>>> ??serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all >>>>>>>>>> ??serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all >>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all >>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 >>>>>>>>>> solaris-all,linux-all >>>>>>>>>> ??serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java >>>>>>>>>> 8193639 solaris-all >>>>>>>>>> ??serviceability/sa/ClhsdbScanOops.java >>>>>>>>>> 8193639,8235220,8230731 >>>>>>>>>> solaris-all,linux-x64,macosx-x64,windows-x64 >>>>>>>>>> ??serviceability/sa/ClhsdbSource.java 8193639 solaris-all >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> 
>>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> This webrev has passed submit repo >>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and >>>>>>>>>>> additional tests. >>>>>>>>>>> So please review it: >>>>>>>>>>> >>>>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>> ? webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>>>>>>>>>>> Thank you so much, David! >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote: >>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to >>>>>>>>>>>>>>> submit repo. >>>>>>>>>>>>>>> Could you try again? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> webrev is here: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes >>>>>>>>>>>>>> before I go to bed :) >>>>>>>>>>>>> >>>>>>>>>>>>> Seems to have passed okay. >>>>>>>>>>>>> >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks a lot! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>>>>>>>>>> Sorry it is still crashing. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime >>>>>>>>>>>>>>>> Environment: >>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>> #? 
SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, >>>>>>>>>>>>>>>> tid=13704 >>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) >>>>>>>>>>>>>>>> (fastdebug build >>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed >>>>>>>>>>>>>>>> mode, sharing, tiered, compressed oops, g1 gc, >>>>>>>>>>>>>>>> linux-amd64) >>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] >>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Same as before. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can >>>>>>>>>>>>>>>>>>> then go and run additional internal tests (and even >>>>>>>>>>>>>>>>>>> more builds) using that job. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for that tip Chris! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not >>>>>>>>>>>>>>>>>> yet received the result. >>>>>>>>>>>>>>>>>> I will share you when I get job ID. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to >>>>>>>>>>>>>>>>> complete before submitting the additional tests. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thank you for testing it. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame >>>>>>>>>>>>>>>>>>>> when DWARF has language personality routine or LSDA. >>>>>>>>>>>>>>>>>>>> Could you try it? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>> Correction ... >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the >>>>>>>>>>>>>>>>>>>>>>> code, but I'm putting the patch through our >>>>>>>>>>>>>>>>>>>>>>> internal testing. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java >>>>>>>>>>>>>>>>>>>>>> Runtime Environment: >>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>> #? 
SIGSEGV (0xb) at pc=0x00007fcdd403894e, >>>>>>>>>>>>>>>>>>>>>> pid=16948, tid=16949 >>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment >>>>>>>>>>>>>>>>>>>>>> (15.0) (fastdebug build >>>>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM >>>>>>>>>>>>>>>>>>>>>> (fastdebug >>>>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, >>>>>>>>>>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 >>>>>>>>>>>>>>>>>>>>>> gc, linux-amd64) >>>>>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] >>>>>>>>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to >>>>>>>>>>>>>>>>>>>>>> always crash now. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 >>>>>>>>>>>>>>>>>>>>> runs of the test in linux-x64. I don't see a >>>>>>>>>>>>>>>>>>>>> pattern as to where it fails versus passes. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> ?? JBS: >>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>>>>>>>>>> ?? 
webrev: >>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for >>>>>>>>>>>>>>>>>>>>>>>> unwinding native frames in jstack mixed mode. >>>>>>>>>>>>>>>>>>>>>>>> However some error has seen intermittently >>>>>>>>>>>>>>>>>>>>>>>> after that. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause of this, I found two >>>>>>>>>>>>>>>>>>>>>>>> concerns: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> ?? A: lack of buffer (.eh_frame section data) >>>>>>>>>>>>>>>>>>>>>>>> range check >>>>>>>>>>>>>>>>>>>>>>>> ?? B: Language personality routine and Language >>>>>>>>>>>>>>>>>>>>>>>> Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I addd range check for .eh_frame processing, >>>>>>>>>>>>>>>>>>>>>>>> and ignore personality routine and LSDA in this >>>>>>>>>>>>>>>>>>>>>>>> webrev. >>>>>>>>>>>>>>>>>>>>>>>> Also I added bailout code if DWARF processing >>>>>>>>>>>>>>>>>>>>>>>> is failed due to these concerns. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo >>>>>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle >>>>>>>>>>>>>>>>>>>>>>>> Linux 7.7 container. 
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>>>
>

From chris.plummer at oracle.com Thu Mar 26 20:27:00 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 26 Mar 2020 13:27:00 -0700
Subject: RFR(XS) 8241696: ProblemList gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java due to JDK-8241293
Message-ID:

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8241696

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -85,7 +85,7 @@
 gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
 gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
 gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
-gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639 solaris-all
+gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639,8241293 solaris-all,macosx-x64

thanks,

Chris

From christian.tornqvist at oracle.com Thu Mar 26 20:41:55 2020
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Thu, 26 Mar 2020 13:41:55 -0700
Subject: RFR(XS) 8241696: ProblemList gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java due to JDK-8241293
In-Reply-To:
References:
Message-ID: <5FAF6E40-AAC8-42D6-9678-13250BDCDF29@oracle.com>

Hi Chris,

Looks good, thanks for fixing this.

Thanks,
Christian

> On Mar 26, 2020, at 1:27 PM, Chris Plummer wrote:
>
> Hello,
>
> Please review the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8241696
>
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -85,7 +85,7 @@
> gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
> gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
> gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
> -gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639 solaris-all
> +gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639,8241293 solaris-all,macosx-x64
>
> thanks,
>
> Chris
>

From daniel.daugherty at oracle.com Thu Mar 26 20:55:43 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 26 Mar 2020 16:55:43 -0400
Subject: RFR(XS) 8241696: ProblemList gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java due to JDK-8241293
In-Reply-To: <5FAF6E40-AAC8-42D6-9678-13250BDCDF29@oracle.com>
References: <5FAF6E40-AAC8-42D6-9678-13250BDCDF29@oracle.com>
Message-ID: <4a41868c-0fb9-046f-8911-b1b602de8004@oracle.com>

Thumbs up. This is a trivial review, but you didn't qualify it as such so now you have a second review.

Dan

On 3/26/20 4:41 PM, Christian Tornqvist wrote:
> Hi Chris,
>
> Looks good, thanks for fixing this.
>
> Thanks,
> Christian
>
>> On Mar 26, 2020, at 1:27 PM, Chris Plummer wrote:
>>
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8241696
>>
>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt
>> +++ b/test/hotspot/jtreg/ProblemList.txt
>> @@ -85,7 +85,7 @@
>> gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
>> gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
>> gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
>> -gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639 solaris-all
>> +gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639,8241293 solaris-all,macosx-x64
>>
>> thanks,
>>
>> Chris
>>

From leonid.mesnik at oracle.com Thu Mar 26 21:39:15 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Thu, 26 Mar 2020 14:39:15 -0700
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads starting synchronization
In-Reply-To:
References:
Message-ID: <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com>

Replying with correct summary.

Leonid

On 3/23/20 8:55 PM, Leonid Mesnik wrote:
> Hi
>
> Could you please review following fix which update ThreadsRunner to use AtomicInteger/spinOnWait instead of Wicket to synchronize starting of stress test threads.
>
> Failing tests allocated all memory by earlier started threads before Lock.unlock is called in the latest threads. So thread might get an OOME exception while trying to release lock and/or get into inconsistent state.
>
> The bug was introduced by https://bugs.openjdk.java.net/browse/JDK-8241123
> The Atomic works fine for stress test finishing sync. I just didn't expect that tests might OOME while releasing start lock.
> Verified that tests now don't fail with -Xcomp -server -XX:-TieredCompilation -XX:-UseCompressedOops.
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8241456
>
> Leonid

From chris.plummer at oracle.com Thu Mar 26 22:15:11 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 26 Mar 2020 15:15:11 -0700
Subject: RFR(XS) 8241696: ProblemList gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java due to JDK-8241293
In-Reply-To: <4a41868c-0fb9-046f-8911-b1b602de8004@oracle.com>
References: <5FAF6E40-AAC8-42D6-9678-13250BDCDF29@oracle.com> <4a41868c-0fb9-046f-8911-b1b602de8004@oracle.com>
Message-ID:

Thanks!

On 3/26/20 1:55 PM, Daniel D. Daugherty wrote:
> Thumbs up. This is a trivial review, but you didn't qualify it as such
> so now you have a second review.
>
> Dan
>
>
> On 3/26/20 4:41 PM, Christian Tornqvist wrote:
>> Hi Chris,
>>
>> Looks good, thanks for fixing this.
>>
>> Thanks,
>> Christian
>>
>>> On Mar 26, 2020, at 1:27 PM, Chris Plummer wrote:
>>>
>>> Hello,
>>>
>>> Please review the following:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8241696
>>>
>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>> @@ -85,7 +85,7 @@
>>>  gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
>>>  gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
>>>  gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
>>> -gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639 solaris-all
>>> +gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639,8241293 solaris-all,macosx-x64
>>>
>>> thanks,
>>>
>>> Chris
>>>
>

From david.holmes at oracle.com Thu Mar 26 23:06:53 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 27 Mar 2020 09:06:53 +1000
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads starting synchronization
In-Reply-To: <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com>
References: <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com>
Message-ID: <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com>

Hi Leonid,

On 27/03/2020 7:39 am, Leonid Mesnik wrote:
> Replying with correct summary.
>
> Leonid
>
> On 3/23/20 8:55 PM, Leonid Mesnik wrote:
>> Hi
>>
>> Could you please review following fix which update ThreadsRunner to
>> use AtomicInteger/spinOnWait instead of Wicket to synchronize starting
>> of stress test threads.
>>
>> Failing tests allocated all memory by earlier started threads before
>> Lock.unlock is called in the latest threads. So thread might get an
>> OOME exception while trying to release lock and/or get into
>> inconsistent state.

You have a bug in Wicket:

+        try {
+            lock.lock();
...
+        } finally {
+            lock.unlock();

The lock() has to go outside the try block. That is why you were
getting IllegalMonitorStateExceptions when the lock() threw OOME.

But the OOME itself is still a problem as it means you can't use any
proper synchronizer. I don't like seeing the spin-loops but in this
code you may have no choice if memory may already be exhausted.

David
-----

>>
>> The bug was introduced by
>> https://bugs.openjdk.java.net/browse/JDK-8241123
>>
>> The Atomic works fine for stress test finishing sync. I just didn't
>> expect that tests might OOME while releasing start lock.
>> Verified that tests now don't fail with -Xcomp -server
>> -XX:-TieredCompilation -XX:-UseCompressedOops.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8241456
>>
>>
>> Leonid

From leonid.mesnik at oracle.com Thu Mar 26 23:16:39 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Thu, 26 Mar 2020 16:16:39 -0700
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads starting synchronization
In-Reply-To: <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com>
References: <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com> <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com>
Message-ID:

On 3/26/20 4:06 PM, David Holmes wrote:
> Hi Leonid,
>
> On 27/03/2020 7:39 am, Leonid Mesnik wrote:
>> Replying with correct summary.
>>
>> Leonid
>>
>> On 3/23/20 8:55 PM, Leonid Mesnik wrote:
>>> Hi
>>>
>>> Could you please review following fix which update ThreadsRunner to
>>> use AtomicInteger/spinOnWait instead of Wicket to synchronize
>>> starting of stress test threads.
>>>
>>> Failing tests allocated all memory by earlier started threads before
>>> Lock.unlock is called in the latest threads. So thread might get an
>>> OOME exception while trying to release lock and/or get into
>>> inconsistent state.
>
> You have a bug in Wicket:
>
> +        try {
> +            lock.lock();
> ...
> +        } finally {
> +            lock.unlock();
>
> The lock() has to go outside the try block. That is why you were
> getting IllegalMonitorStateExceptions when the lock() threw OOME.

Thanks for explanation. But anyway, as I understand locks use memory and
might be inconsistent if OOME happened.

>
> But the OOME itself is still a problem as it means you can't use any
> proper synchronizer. I don't like seeing the spin-loops but in this
> code you may have no choice if memory may already be exhausted.

It should be really short spin-loop, test only start thread during this
loop and don't do anything more. Also, it is done only once for all
stress test. The goal is to start thread completely before heap is
exhausted.

Leonid

>
> David
> -----
>
>
>>>
>>> The bug was introduced by
>>> https://bugs.openjdk.java.net/browse/JDK-8241123
>>>
>>> The Atomic works fine for stress test finishing sync. I just didn't
>>> expect that tests might OOME while releasing start lock.
>>> Verified that tests now don't fail with -Xcomp -server
>>> -XX:-TieredCompilation -XX:-UseCompressedOops.
>>>
>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241456
>>>
>>>
>>> Leonid

From david.holmes at oracle.com Thu Mar 26 23:29:18 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 27 Mar 2020 09:29:18 +1000
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads starting synchronization
In-Reply-To:
References: <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com> <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com>
Message-ID: <12701240-fd7d-560d-8974-ff0be9cafa7e@oracle.com>

On 27/03/2020 9:16 am, Leonid Mesnik wrote:
>
> On 3/26/20 4:06 PM, David Holmes wrote:
>> Hi Leonid,
>>
>> On 27/03/2020 7:39 am, Leonid Mesnik wrote:
>>> Replying with correct summary.
>>>
>>> Leonid
>>>
>>> On 3/23/20 8:55 PM, Leonid Mesnik wrote:
>>>> Hi
>>>>
>>>> Could you please review following fix which update ThreadsRunner to
>>>> use AtomicInteger/spinOnWait instead of Wicket to synchronize
>>>> starting of stress test threads.
>>>>
>>>> Failing tests allocated all memory by earlier started threads before
>>>> Lock.unlock is called in the latest threads. So thread might get an
>>>> OOME exception while trying to release lock and/or get into
>>>> inconsistent state.
>>
>> You have a bug in Wicket:
>>
>> +        try {
>> +            lock.lock();
>> ...
>> +        } finally {
>> +            lock.unlock();
>>
>> The lock() has to go outside the try block.
That is why you were >> getting IllegalMonitorStateExceptions when the lock() threw OOME. > Thanks for explanation. But anyway, as I understand locks use memory and > might be inconsistent if OOME happened. They use memory and so lock() can throw OOME, but they are never inconsistent. >> >> But the OOME itself is still a problem as it means you can't use any >> proper synchronizer. I don't like seeing the spin-loops but in this >> code you may have no choice if memory may already be exhausted. > > It should be really short spin-loop, test only start thread during this > loop and don't do anything more. Also, it is done only once for all > stress test. The goal is to start thread completely before heap is > exhausted. Okay. I'm somewhat dubious about making these changes in mainline now just to support loom. I don't see why we need to care about pinning threads in this kind of situation. David > Leonid > >> >> David >> ----- >> >> >>>> >>>> The bug was introduced by >>>> https://bugs.openjdk.java.net/browse/JDK-8241123 >>>> >>>> The Atomic works fine for stress test finishing sync. I just didn't >>>> expect that tests might OOME while releasing start lock. >>>> Verified that tests now don't fail with -Xcomp -server >>>> -XX:-TieredCompilation -XX:-UseCompressedOops. >>>> >>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/ >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241456 >>>> >>>> >>>> Leonid From david.holmes at oracle.com Thu Mar 26 23:36:35 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 27 Mar 2020 09:36:35 +1000 Subject: RFR: 8241585: Remove unused _recursion_counter facility from PerfTraceTime In-Reply-To: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com> References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com> Message-ID: <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com> Hi Claes, Adding serviceability as they are the consumers of this IIUC. 
On 27/03/2020 3:40 am, Claes Redestad wrote:
> Hi,
>
> PerfTraceTime::_recursion_counter is unused, and removing it
> gets rid of some branchy (but well-predicted) code in paths that is
> somewhat startup sensitive.

Okay.

> http://cr.openjdk.java.net/~redestad/8241585/open.00/
>
> Also added some trace logging to determine the number of perf
> data counter or each type along with a tune-up to exactly match
> the defaults.

Okay, so can you change the bug synopsis and description to cover this
more general cleanup and tune-up, please?

I'm never very clear on the uses of these PerfCounters. It seems SUN_NS
is unused after this change. The references to jvmstat seem no longer
correct - these are read via jstat?

> Testing: tier1+2

I think serviceability testing is mainly in tier3.

Thanks,
David
-----

>
> Thanks!
>
> /Claes

From leonid.mesnik at oracle.com Thu Mar 26 23:41:36 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Thu, 26 Mar 2020 16:41:36 -0700
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads starting synchronization
In-Reply-To: <12701240-fd7d-560d-8974-ff0be9cafa7e@oracle.com>
References: <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com> <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com> <12701240-fd7d-560d-8974-ff0be9cafa7e@oracle.com>
Message-ID: <70175e02-2c50-50e7-0646-4fb82be6c768@oracle.com>

On 3/26/20 4:29 PM, David Holmes wrote:
> On 27/03/2020 9:16 am, Leonid Mesnik wrote:
>>
>> On 3/26/20 4:06 PM, David Holmes wrote:
>>> Hi Leonid,
>>>
>>> On 27/03/2020 7:39 am, Leonid Mesnik wrote:
>>>> Replying with correct summary.
>>>>
>>>> Leonid
>>>>
>>>> On 3/23/20 8:55 PM, Leonid Mesnik wrote:
>>>>> Hi
>>>>>
>>>>> Could you please review following fix which update ThreadsRunner
>>>>> to use AtomicInteger/spinOnWait instead of Wicket to synchronize
>>>>> starting of stress test threads.
>>>>>
>>>>> Failing tests allocated all memory by earlier started threads
>>>>> before Lock.unlock is called in the latest threads. So thread
>>>>> might get an OOME exception while trying to release lock and/or
>>>>> get into inconsistent state.
>>>
>>> You have a bug in Wicket:
>>>
>>> +        try {
>>> +            lock.lock();
>>> ...
>>> +        } finally {
>>> +            lock.unlock();
>>>
>>> The lock() has to go outside the try block. That is why you were
>>> getting IllegalMonitorStateExceptions when the lock() threw OOME.
>> Thanks for explanation. But anyway, as I understand locks use memory
>> and might be inconsistent if OOME happened.
>
> They use memory and so lock() can throw OOME, but they are never
> inconsistent.

OK, I will move lock.lock() outside of try {}. Thanks for the explanation.

>
>>>
>>> But the OOME itself is still a problem as it means you can't use any
>>> proper synchronizer. I don't like seeing the spin-loops but in this
>>> code you may have no choice if memory may already be exhausted.
>>
>> It should be really short spin-loop, test only start thread during
>> this loop and don't do anything more. Also, it is done only once for
>> all stress test. The goal is to start thread completely before heap
>> is exhausted.
>
> Okay. I'm somewhat dubious about making these changes in mainline now
> just to support loom. I don't see why we need to care about pinning
> threads in this kind of situation.

The idea is to add some nsk/share stress tests for virtual threads.
Basically, these are the same tests as the existing ones (gc, sysdict)
but running in virtual threads. These tests are going to be executed
after loom is integrated, and I want to keep the difference between
mainline and loom as small as possible.

Leonid

>
> David
>
>> Leonid
>>
>>>
>>> David
>>> -----
>>>
>>>
>>>>>
>>>>> The bug was introduced by
>>>>> https://bugs.openjdk.java.net/browse/JDK-8241123
>>>>>
>>>>> The Atomic works fine for stress test finishing sync. I just
>>>>> didn't expect that tests might OOME while releasing start lock.
>>>>> Verified that tests now don't fail with -Xcomp -server >>>>> -XX:-TieredCompilation -XX:-UseCompressedOops. >>>>> >>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/ >>>>> >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241456 >>>>> >>>>> >>>>> Leonid From chris.plummer at oracle.com Thu Mar 26 23:46:12 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 26 Mar 2020 16:46:12 -0700 Subject: RFR: 8241585: Remove unused _recursion_counter facility from PerfTraceTime In-Reply-To: <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com> References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com> <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com> Message-ID: On 3/26/20 4:36 PM, David Holmes wrote: > Hi Claes, > > Adding serviceability as they are the consumers of this IIUC. > > On 27/03/2020 3:40 am, Claes Redestad wrote: >> Hi, >> >> PerfTraceTime::_recursion_counter is unused, and removing it >> gets rid of some branchy (but well-predicted) code in paths that is >> somewhat startup sensitive. > > Okay. > >> http://cr.openjdk.java.net/~redestad/8241585/open.00/ >> >> Also added some trace logging to determine the number of perf >> data counter or each type along with a tune-up to exactly match >> the defaults. > > Okay so can you change the bug synopsis and description to cover this > more general cleanup and tuneup please. > > I'm never very clear on the uses of these PerfCounters. It seems > SUN_NS is unused after this change. The references to jvmstat seem no > longer correct - these are read via jstat ? jstat uses jvmstat. Chris > >> Testing: tier1+2 > > I think serviceability testing is mainly in tier3. > > Thanks, > David > ----- > >> >> Thanks! 
>> >> /Claes From david.holmes at oracle.com Thu Mar 26 23:49:16 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 27 Mar 2020 09:49:16 +1000 Subject: RFR: 8241585: Remove unused _recursion_counter facility from PerfTraceTime In-Reply-To: References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com> <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com> Message-ID: <3a4ac731-c79c-8094-3dd6-b6f94a8f6df4@oracle.com> On 27/03/2020 9:46 am, Chris Plummer wrote: > On 3/26/20 4:36 PM, David Holmes wrote: >> Hi Claes, >> >> Adding serviceability as they are the consumers of this IIUC. >> >> On 27/03/2020 3:40 am, Claes Redestad wrote: >>> Hi, >>> >>> PerfTraceTime::_recursion_counter is unused, and removing it >>> gets rid of some branchy (but well-predicted) code in paths that is >>> somewhat startup sensitive. >> >> Okay. >> >>> http://cr.openjdk.java.net/~redestad/8241585/open.00/ >>> >>> Also added some trace logging to determine the number of perf >>> data counter or each type along with a tune-up to exactly match >>> the defaults. >> >> Okay so can you change the bug synopsis and description to cover this >> more general cleanup and tuneup please. >> >> I'm never very clear on the uses of these PerfCounters. It seems >> SUN_NS is unused after this change. The references to jvmstat seem no >> longer correct - these are read via jstat ? > jstat uses jvmstat. Thanks Chris, I was grepping C++ code not realizing jvmstat is a Java API. David > Chris >> >>> Testing: tier1+2 >> >> I think serviceability testing is mainly in tier3. >> >> Thanks, >> David >> ----- >> >>> >>> Thanks! >>> >>> /Claes > From mandy.chung at oracle.com Thu Mar 26 23:57:39 2020 From: mandy.chung at oracle.com (Mandy Chung) Date: Thu, 26 Mar 2020 16:57:39 -0700 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes Message-ID: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> Please review the implementation of JEP 371: Hidden Classes. 
The main changes are in core-libs and hotspot runtime area. Small changes
are made in javac, VM compiler (intrinsification of Class::isHiddenClass),
JFR, JDI, and jcmd. CSR [1] has been reviewed and is in the finalized
state (see specdiff and javadoc below for reference).

Webrev:
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03

A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
point of view, a hidden class is a normal class except for the following:

- A hidden class has no initiating class loader and is not registered in
  any dictionary.
- A hidden class has a name containing an illegal character:
  `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
  returns "Lp/Foo.0x1234;".
- A hidden class is not modifiable, i.e. it cannot be redefined or
  retransformed. JVM TI IsModifiableClass returns false on a hidden class.
- Final fields in a hidden class are "final". The value of final fields
  cannot be overridden via reflection. setAccessible(true) can still be
  called on reflected objects representing final fields in a hidden class
  and its access check will be suppressed, but they only have read access
  (i.e. can do Field::getXXX but not setXXX).

Brief summary of this patch:

1. A new Lookup::defineHiddenClass method is the API to create a hidden
   class.
2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG
   options that can be specified when creating a hidden class.
3. A new Class::isHiddenClass method tests if a class is a hidden class.
4. Field::setXXX method will throw IAE on a final field of a hidden class
   regardless of the value of the accessible flag.
5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass
   and defineHiddenClass to create a class from the given bytes.
6. ClassLoaderData implementation is not changed. There is one primary CLD
   that holds the classes strongly referenced by its defining loader. There
   can be zero or more additional CLDs - one per weak class.
7. Nest host determination is updated per revised JVMS 5.4.4. Access control
   check no longer throws LinkageError but instead it will throw IAE with
   a clear message if a class fails to resolve/validate the nest host
   declared in the NestHost/NestMembers attribute.
8. JFR, jcmd, JDI are updated to support hidden classes.
9. Update javac LambdaToMethod, as the lambda proxy starts using nestmates,
   and generate a bridge method to desugar a method reference to a protected
   method in its supertype in a different package.

This patch also updates StringConcatFactory, LambdaMetaFactory, and
LambdaForms to use hidden classes. The webrev includes changes in nashorn
to use hidden classes and I will update the webrev if JEP 372 removes it
any time soon.

We uncovered a bug: the Lookup::defineClass spec throws LinkageError and
intends to have the newly created class linked. However, the implementation
in 14 does not link the class. A separate CSR [2] proposes to update the
implementation to match the spec. This patch fixes the implementation.

The spec update on JVM TI, JDI and Instrumentation will be done as a
separate RFE [3]. This patch includes new tests for JVM TI and
java.instrument that validate how the existing APIs work for hidden
classes.

javadoc/specdiff
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/

JVMS 5.4.4 change:
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf

CSR:
https://bugs.openjdk.java.net/browse/JDK-8238359

Thanks
Mandy

[1] https://bugs.openjdk.java.net/browse/JDK-8238359
[2] https://bugs.openjdk.java.net/browse/JDK-8240338
[3] https://bugs.openjdk.java.net/browse/JDK-8230502

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From suenaga at oss.nttdata.com Fri Mar 27 00:07:15 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 27 Mar 2020 09:07:15 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To:
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>
Message-ID: <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com>

Thanks Kevin and Serguei! and sorry for my English...

I uploaded new webrev:

http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.05/

Diff from webrev.04 is here:

http://hg.openjdk.java.net/jdk/submit/rev/d5f400d70e94

Thanks,

Yasumasa

On 2020/03/27 2:53, serguei.spitsyn at oracle.com wrote:
> Hi Kevin,
>
> Nice catch with the name "lastFrame".
> I was also confused when reviewed this but did not come up with something better.
>
> Thanks,
> Serguei
>
> On 3/26/20 10:40, Kevin Walls wrote:
>> Hi Yasumasa,
>>
>> Oops, didn't catch this - I also had done some manual testing and in mach5 but clearly not enough.
>>
>> Generally I think this looks good.
>>
>> "lastFrame" can mean last as in final, or last as in previous. "last" is one of those annoying English words. Here it means final, if we get an Exception during processDwarf, use this to flag that we should return null from sender(). "finalFrame" would be clearer to me, anything else probably gets more verbose than you wanted.
>>
>> Yes I like having the limit on the while loop in process_dwarf(), always worried how sane the information is that we are parsing through.
>>
>> Thanks!
>> Kevin
>>
>>
>> On 24/03/2020 23:47, Yasumasa Suenaga wrote:
>>> Thanks Serguei!
>>>
>>> I will push it when I get second reviewer.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
>>>> Hi Yasumasa,
>>>>
>>>> I'm okay with this update.
>>>> My mach5 test run for this patch is passed.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 3/23/20 17:08, Yasumasa Suenaga wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> Thanks for your comment!
>>>>> I uploaded new webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>>>>>
>>>>> Also I pushed it to submit repo:
>>>>>
>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>>>>>
>>>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> The mach5 tier5 testing looks good.
>>>>>> The serviceability/sa/ClhsdbPstack.java is failed without fix and is not failed with it.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> I looked at you changes.
>>>>>>> It is hard to understand if this fully solves the issue.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>
>>>>>>> @@ -34,10 +34,11 @@
>>>>>>>
>>>>>>>     public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>>>>>        Address libptr = dbg.findLibPtrByAddress(rip);
>>>>>>>        Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>        DwarfParser dwarf = null;
>>>>>>> +      boolean unsupportedDwarf = false;
>>>>>>>
>>>>>>>        if (libptr != null) { // Native frame
>>>>>>>          try {
>>>>>>>            dwarf = new DwarfParser(libptr);
>>>>>>>            dwarf.processDwarf(rip);
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> @@ -45,24 +46,33 @@
>>>>>>>                   !dwarf.isBPOffsetAvailable())
>>>>>>>                      ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>                      : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>                               .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>          } catch (DebuggerException e) {
>>>>>>> -          // Bail out to Java frame case
>>>>>>> +          if (dwarf != null) {
>>>>>>> +            // DWARF processing should succeed when the frame is native
>>>>>>> +            // but it might fail if CIE has language personality routine
>>>>>>> +            // and/or LSDA.
>>>>>>> +            dwarf = null;
>>>>>>> +            unsupportedDwarf = true;
>>>>>>> +          } else {
>>>>>>> +            throw e;
>>>>>>> +          }
>>>>>>>          }
>>>>>>>        }
>>>>>>>
>>>>>>>        return (cfa == null) ? null
>>>>>>> -                           : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>>>>> +                           : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>>>>     }
>>>>>>>
>>>>>>> @@ -121,13 +131,25 @@
>>>>>>>       }
>>>>>>>
>>>>>>>       return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>>>>     }
>>>>>>>
>>>>>>> -   private DwarfParser getNextDwarf(Address nextPC) {
>>>>>>> -     DwarfParser nextDwarf = null;
>>>>>>> +   @Override
>>>>>>> +   public CFrame sender(ThreadProxy thread) {
>>>>>>> +     if (!possibleNext) {
>>>>>>> +       return null;
>>>>>>> +     }
>>>>>>> +
>>>>>>> +     ThreadContext context = thread.getContext();
>>>>>>> +
>>>>>>> +     Address nextPC = getNextPC(dwarf != null);
>>>>>>> +     if (nextPC == null) {
>>>>>>> +       return null;
>>>>>>> +     }
>>>>>>>
>>>>>>> +     DwarfParser nextDwarf = null;
>>>>>>> +     boolean unsupportedDwarf = false;
>>>>>>>       if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>>>>         nextDwarf = dwarf;
>>>>>>>       } else {
>>>>>>>         Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>         if (libptr != null) {
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> @@ -138,33 +160,29 @@
>>>>>>>           }
>>>>>>>         }
>>>>>>>       }
>>>>>>>
>>>>>>>       if (nextDwarf != null) {
>>>>>>> +       try {
>>>>>>>         nextDwarf.processDwarf(nextPC);
>>>>>>> +       } catch (DebuggerException e) {
>>>>>>> +         // DWARF processing should succeed when the frame is native
>>>>>>> +         // but it might fail if CIE has language personality routine
>>>>>>> +         // and/or LSDA.
>>>>>>> +         nextDwarf = null;
>>>>>>> +         unsupportedDwarf = true;
>>>>>>>       }
>>>>>>>
>>>>>>> This fix looks like a hack.
>>>>>>> Should we just propagate the Debugging exception instead of trying to maintain unsupportedDwarf flag?
>>>>>
>>>>> DwarfParser::processDwarf would throw DebuggerException if it cannot find DWARF which relates to PC.
>>>>> PC at this point is for next frame. So current frame (`this` object) is valid, and it should be processed.
>>>>>
>>>>>
>>>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, IDE,LSDA, etc.) are used without any comments explaining them.
>>>>>>> The code has to be generally readable without looking into the DWARF spec each time.
>>>>>
>>>>> I added comments for them in this webrev.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>>> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>>>>>> Thanks Chris!
>>>>>>>> I'm waiting for reviewers for this change.
>>>>>>>>
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> The failure is due to JDK-8231634, so not something you need to worry about.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> I uploaded new webrev which includes reverting change for ProblemList:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>>>>>
>>>>>>>>>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>>>>>> However I think it is not caused by this change because ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed mode, it would not parse DWARF.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>
>>>>>>>>>>> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it:
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>>>>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>>>>>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>>>>>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>>>>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
>>>>>>>>>>>> So please review it:
>>>>>>>>>>>>
>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Thank you so much, David!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit repo.
>>>>>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result.
>>>>>>>>>>>>>>>>>>> I will share you when I get job ID.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>> Correction ... >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now. 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>   A: lack of a range check on the buffer (.eh_frame section data)
>>>>>>>>>>>>>>>>>>>>>>>>>   B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and this webrev ignores the personality routine and LSDA.
>>>>>>>>>>>>>>>>>>>>>>>>> I also added bailout code for when DWARF processing fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>>>>>>>>>>>> and I also tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>
>
>

From claes.redestad at oracle.com Fri Mar 27 00:11:34 2020
From: claes.redestad at oracle.com (Claes Redestad)
Date: Fri, 27 Mar 2020 01:11:34 +0100
Subject: RFR: 8241585: Remove unused _recursion_counter facility from PerfTraceTime
In-Reply-To: <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com> <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
Message-ID:

On 2020-03-27 00:36, David Holmes wrote:
>>
>
> Okay so can you change the bug synopsis and description to cover this
> more general cleanup and tuneup please.

I filed an addendum RFE and will add this RFE bug id to the single
changeset push: https://bugs.openjdk.java.net/browse/JDK-8241705

>
> I'm never very clear on the uses of these PerfCounters. It seems SUN_NS
> is unused after this change. The references to jvmstat seem no longer
> correct - these are read via jstat ?

The general confusion about PerfData/-Counters and what they're for is
why I'm trying to untangle this. Generally I think we should pull the
plug on it, but the perfdata shared file is tangled up with
functionality to detect running JVMs used by jcmd etc, so it might take
a few iterations to get there.
/Claes From serguei.spitsyn at oracle.com Fri Mar 27 00:15:19 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 26 Mar 2020 17:15:19 -0700 Subject: RFR: 8227269: Slow class loading when running JVM in debug mode In-Reply-To: References: <8870829e-c558-c956-2184-00204632abb6@redhat.com> <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com> <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com> <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com> <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com> <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com> <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com> <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com> <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com> <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com> <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com> <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com> <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com> <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com> Message-ID: <99649661-67cf-9583-673a-8c53d038aed9@oracle.com> Hi Roman, Yes. Thank you for the explanation. Thanks, Serguei On 3/26/20 01:44, Roman Kennke wrote: > That was in the previous implementation: I got a condition wrong in the > table lookup (as noted by Serguei), and this prevented any > class-unload-events from getting out. I have fixed this, but found other > problems in that implementation (deadlocks and a crash). > > The current implementation has none of these problems: we don't need > table-lookups - we simply pass-through the signatures, and locking is > much simpler and in particular we don't need a lock around the JVMTI > call (SetTag) which was the cause of the deadlock. > > Does that answer your questions? > > Thanks, > Roman > >> Hi Roman, >> >> It passed all my testing. I think before you push Serguei has a question >> regarding an issue you brought up a while back. You mentioned that you >> weren't getting some events, and suddenly started seeing them. 
We were >> discussing it today and it was unclear if this was an issue you were >> seeing before your changes, and your changes resolved it, or it was >> initially caused by an earlier version of your changes, and you later >> fixed it. We just want to better understand what this issue was and how >> it was fixed. >> >> thanks, >> >> Chris >> >> On 3/25/20 3:22 PM, Roman Kennke wrote: >>> The new job finished, its ID is: >>> >>> ? mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289 >>> >>> Thank you, >>> Roman >>> >>> >>>> Yes, please submit a new job. I'll start my testing once I see that the >>>> builds are done. >>>> >>>> Chris >>>> >>>> On 3/25/20 12:59 PM, Roman Kennke wrote: >>>>> Hi Chris, >>>>> >>>>> Apparently we can get into classTrack_reset() before calling >>>>> activate(), >>>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around >>>>> the cleaning routine fixes the problem for me. >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/ >>>>> >>>>> Should I post another submit-repo job with that fix? >>>>> >>>>> Thanks, >>>>> Roman >>>>> >>>>> >>>>>> Hi Roman, >>>>>> >>>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs: >>>>>> >>>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000], >>>>>> sp=0x00007fbb791f8af0,? free space=1022k >>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, >>>>>> j=interpreted, Vv=VM code, C=native code) >>>>>> C? [libjdwp.so+0xdb71]? bagEnumerateOver+0x11 >>>>>> C? [libjdwp.so+0xe365]? classTrack_reset+0x25 >>>>>> C? [libjdwp.so+0xfca1]? debugInit_reset+0x71 >>>>>> C? [libjdwp.so+0x12e0d]? debugLoop_run+0x38d >>>>>> C? [libjdwp.so+0x25700]? acceptThread+0x80 >>>>>> V? [libjvm.so+0xf4b5a7]? JvmtiAgentThread::call_start_function()+0x1c7 >>>>>> V? [libjvm.so+0x15215c6]? JavaThread::thread_main_inner()+0x226 >>>>>> V? [libjvm.so+0x1527736]? Thread::call_run()+0xf6 >>>>>> V? [libjvm.so+0x1250ade]? 
thread_native_entry(Thread*)+0x10e >>>>>> >>>>>> >>>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There >>>>>> doesn't seem to be anything magic on the command line that might be >>>>>> triggering. Pretty much I see it with all the various VM configs we >>>>>> test. >>>>>> >>>>>> I'm also seeing crashes in the following tests, but not as often: >>>>>> >>>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java >>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java >>>>>> >>>>>> >>>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java >>>>>> >>>>>> >>>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java >>>>>> >>>>>> >>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java >>>>>> >>>>>> >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> >>>>>> On 3/25/20 11:37 AM, Roman Kennke wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>>> Regarding the new assert: >>>>>>>> >>>>>>>> ???105???? if (gdata && gdata->assertOn) { >>>>>>>> ???106???????? // Check this is not already tagged. >>>>>>>> ???107???????? jlong tag; >>>>>>>> ???108???????? error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, >>>>>>>> klass, &tag); >>>>>>>> ???109???????? if (error != JVMTI_ERROR_NONE) { >>>>>>>> ???110???????????? EXIT_ERROR(error, "Unable to GetTag with class >>>>>>>> trackingEnv"); >>>>>>>> ???111???????? } >>>>>>>> ???112???????? JDI_ASSERT(tag == NOT_TAGGED); >>>>>>>> ???113???? } >>>>>>>> >>>>>>>> I think you should remove the gdata check. gdata should never be >>>>>>>> NULL >>>>>>>> when you get to this code. If it is ever NULL then there's a bug, >>>>>>>> and >>>>>>>> the check will hide the bug. >>>>>>> Ok, will remove this. >>>>>>> >>>>>>>> Regarding testing, after you do the submit repo testing let me know >>>>>>>> the >>>>>>>> jobID and I'll do additional testing on it. 
>>>>>>> I did the submit repo earlier today, and it came back green: >>>>>>> >>>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762 >>>>>>> >>>>>>> Thanks, >>>>>>> Roman >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote: >>>>>>>>> Hi Sergei, >>>>>>>>> >>>>>>>>>> The fix looks pretty clean now. >>>>>>>>>> I also like new name of the lock.:) >>>>>>>>> Thank you! >>>>>>>>> >>>>>>>>>> Just one comment below. >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 110 if (tag != 0l) { >>>>>>>>>> 111 return; // Already added >>>>>>>>>> ??? 112???? } >>>>>>>>>> >>>>>>>>>> ????It is better to use a named constant or macro instead. >>>>>>>>>> ????Also, it'd be nice to add a short comment about this value is. >>>>>>>>> As I replied to Chris earlier, this whole block can be turned >>>>>>>>> into an >>>>>>>>> assert. I also made a constant for the value 0, which should be >>>>>>>>> pretty >>>>>>>>> much self-explaining. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/ >>>>>>>>> >>>>>>>>>> How do you test the fix? >>>>>>>>> I am using a manual test that is provided in this bug report: >>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>> >>>>>>>>> "Script to compare performance of GC with and without debugger, >>>>>>>>> when >>>>>>>>> many classes are loaded and classes are being unloaded": >>>>>>>>> >>>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688 >>>>>>>>> >>>>>>>>> I am also using this test and manually attach/detach jdb a >>>>>>>>> couple of >>>>>>>>> times in a row to check that disconnecting and reconnecting works >>>>>>>>> well >>>>>>>>> (this tended to deadlock or crash with an earlier version of the >>>>>>>>> patch, >>>>>>>>> and is now looking good). 
>>>>>>>>> >>>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we >>>>>>>>> all >>>>>>>>> agree that the fix is reasonable, I will push it to the submit >>>>>>>>> repo. I >>>>>>>>> am not sure if any of those tests actually exercise that code, >>>>>>>>> though. >>>>>>>>> Let me know if you want me to run any specific tests. >>>>>>>>> >>>>>>>>> Thank you, >>>>>>>>> Roman >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote: >>>>>>>>>>> I believe I came up with a much simpler solution that also >>>>>>>>>>> solves the >>>>>>>>>>> problems of the existing one, and the ones I proposed earlier. >>>>>>>>>>> >>>>>>>>>>> It turns out that we can take advantage of the fact that we >>>>>>>>>>> can use >>>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is >>>>>>>>>>> explicitely >>>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a >>>>>>>>>>> pointer >>>>>>>>>>> to the signature of a class into the tag, and pull it out again >>>>>>>>>>> when we >>>>>>>>>>> get notified that the class gets unloaded. >>>>>>>>>>> >>>>>>>>>>> This means we don't need an extra data-structure to keep track of >>>>>>>>>>> classes and signatures, and it also makes the story around >>>>>>>>>>> locking >>>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning >>>>>>>>>>> of all >>>>>>>>>>> classes needed (as in the current implementation) and no >>>>>>>>>>> searching of >>>>>>>>>>> table needed (like in my previous attempts). >>>>>>>>>>> >>>>>>>>>>> Please review this new revision: >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/ >>>>>>>>>>> >>>>>>>>>>> (Notice that there still appears to be a performance bottleneck >>>>>>>>>>> with >>>>>>>>>>> class-unloading when an actual debugger is attached. 
This >>>>>>>>>>> doesn't seem >>>>>>>>>>> to be related to the classTrack.c implementation though, but >>>>>>>>>>> looks like >>>>>>>>>>> a consequence of getting all those class-unload notifications >>>>>>>>>>> over the >>>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging >>>>>>>>>>> up the >>>>>>>>>>> buffers.) >>>>>>>>>>> >>>>>>>>>>> I am not sure why jdb needs to enable class-unload listener >>>>>>>>>>> always. A >>>>>>>>>>> simple hack disables it, and performance is brilliant, even when >>>>>>>>>>> jdb is >>>>>>>>>>> attached: >>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> But this is not in the scope of this bug.) >>>>>>>>>>> >>>>>>>>>>> Roman >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for the update and sorry for the latency in review. >>>>>>>>>>>>> >>>>>>>>>>>>> Some comments are below. >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag) >>>>>>>>>>>>> ???? 88 { >>>>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>>>> 90 if (currentClassTag == -1) { >>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested >>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>> 93 return; >>>>>>>>>>>>> ???? 94???? } >>>>>>>>>>>>> Just a question: >>>>>>>>>>>>> ???? Q1: Should the ObjectFree events be disabled for the >>>>>>>>>>>>> jvmtiEnv >>>>>>>>>>>>> that does >>>>>>>>>>>>> ???????? 
the class tracking if class tracking has not been initialized?
>>>>>>>>>>>>>
>>>>>>>>>>>>>      70 static jlong currentClassTag;
>>>>>>>>>>>>>     I wonder whether a better name would be something like
>>>>>>>>>>>>>     lastClassTag or highestClassTag.
>>>>>>>>>>>>>
>>>>>>>>>>>>>      99     KlassNode* klass = *klass_ptr;
>>>>>>>>>>>>>     100
>>>>>>>>>>>>>     102     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>     103         klass_ptr = &klass->next;
>>>>>>>>>>>>>     104         klass = *klass_ptr;
>>>>>>>>>>>>>     105     }
>>>>>>>>>>>>>     106     if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>     107         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>     108         return;
>>>>>>>>>>>>>     109     }
>>>>>>>>>>>>>     It seems to me that something is wrong in the condition at L106 above.
>>>>>>>>>>>>>     Should it be:
>>>>>>>>>>>>>       if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>>>
>>>>>>>>>>>>>     Otherwise, how can the second check ever work correctly, as the
>>>>>>>>>>>>>     return will always happen when (klass != NULL)?
>>>>>>>>>>>>>
>>>>>>>>>>>>>     There are several places in this file with inconsistent indentation:
>>>>>>>>>>>>>      90     if (currentClassTag == -1) {
>>>>>>>>>>>>>      91         // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>      92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>      93         return;
>>>>>>>>>>>>>      94     }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>     152     if (currentClassTag == -1) {
>>>>>>>>>>>>>     153         // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>>     154         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>     155         return;
>>>>>>>>>>>>>     156     }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>     161     if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>>>>     162         EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>>>>     163     
} >>>>>>>>>>>>> 164 if (tag != 0l) { >>>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>> 166 return; // Already added >>>>>>>>>>>>> ??? 167???? } >>>>>>>>>>>>> ??? ... >>>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg) >>>>>>>>>>>>> 282 { >>>>>>>>>>>>> 283 char* sig = (char*)signatureVoid; >>>>>>>>>>>>> 284 jvmtiDeallocate(sig); >>>>>>>>>>>>> 285 return JNI_TRUE; >>>>>>>>>>>>> ??? 286 } >>>>>>>>>>>>> ??? ... >>>>>>>>>>>>> ??? 291 void >>>>>>>>>>>>> ??? 292 classTrack_reset(void) >>>>>>>>>>>>> ??? 293 { >>>>>>>>>>>>> 294 int idx; >>>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock); >>>>>>>>>>>>> 296 >>>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) { >>>>>>>>>>>>> 298 KlassNode* node = table[idx]; >>>>>>>>>>>>> 299 while (node != NULL) { >>>>>>>>>>>>> 300 KlassNode* next = node->next; >>>>>>>>>>>>> 301 jvmtiDeallocate(node->signature); >>>>>>>>>>>>> 302 jvmtiDeallocate(node); >>>>>>>>>>>>> 303 node = next; >>>>>>>>>>>>> 304 } >>>>>>>>>>>>> 305 } >>>>>>>>>>>>> 306 jvmtiDeallocate(table); >>>>>>>>>>>>> 307 >>>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL); >>>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag); >>>>>>>>>>>>> 310 >>>>>>>>>>>>> 311 currentClassTag = -1; >>>>>>>>>>>>> 312 >>>>>>>>>>>>> 313 >>>>>>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 314 trackingEnv = NULL; >>>>>>>>>>>>> 315 >>>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>> >>>>>>>>>>>>> Could you, please, fix several comments below? >>>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for >>>>>>>>>>>>> class-unloads >>>>>>>>>>>>> ????The comma is not needed. >>>>>>>>>>>>> ????Would it better to replace: klass tags => klass_tag's ? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and >>>>>>>>>>>>> deletedSignatureBag >>>>>>>>>>>>> consistent >>>>>>>>>>>>> ????Maybe: Lock to guard ... 
or lock to keep integrity of ...
>>>>>>>>>>>>>
>>>>>>>>>>>>>      84  * Callback when classes are freed, Finds the signature and
>>>>>>>>>>>>>          remembers it in deletedSignatureBag.
>>>>>>>>>>>>>     It would be better to use words like "store" or "record", and
>>>>>>>>>>>>>     "Find" should not start with a capital letter:
>>>>>>>>>>>>>       Invoke the callback when classes are freed, find and record the
>>>>>>>>>>>>>       signature in deletedSignatureBag.
>>>>>>>>>>>>>
>>>>>>>>>>>>>      96 // Find deleted KlassNode
>>>>>>>>>>>>>     133 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>     153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>>     158 /* Check this is not a duplicate */
>>>>>>>>>>>>>     These comments are missing a period at the end.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>     Conversely, the period is not needed here, as the comment does
>>>>>>>>>>>>>     not start with a capital letter.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>>>>>     112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>>>>     The comment above could be better. Maybe something like:
>>>>>>>>>>>>       "At this point, we found the KlassNode matching the klass
>>>>>>>>>>>>       tag (and it is linked)."
>>>>>>>>>>>>
>>>>>>>>>>>>>     113 // Remember the unloaded signature.
>>>>>>>>>>>>     Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>>>>>>>>> more testing and also field-/torture-testing by a customer who is happy
>>>>>>>>>>>>>> now. 
:-) >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Roman >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for reviewing! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good! >>>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after >>>>>>>>>>>>>>> disconnect, >>>>>>>>>>>>>>> namely move setup of the trackingEnv and >>>>>>>>>>>>>>> deletedSignatureBag to >>>>>>>>>>>>>>> _activate() to ensure have those structures after re-connect. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Let me know what you think! >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 72 /* >>>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>>>>>> 74 */ >>>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. >>>>>>>>>>>>>>>> Must be >>>>>>>>>>>>>>>> accessed under >>>>>>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>>>>>> ???? 80? */ >>>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ???? The comments contradict to each other. >>>>>>>>>>>>>>>> ???? I guess, the lock name at line 79 has to be >>>>>>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>>>>>> ???? Also, comma at the end must be replaced with dot. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 101 // Tag not found? Ignore. 
>>>>>>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>>> 104 return; >>>>>>>>>>>>>>>> 105 } >>>>>>>>>>>>>>>> ??? 106 >>>>>>>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>>>>>> ??? 113???? } >>>>>>>>>>>>>>>> 114 >>>>>>>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>>> 118 return; >>>>>>>>>>>>>>>> ??? 119???? } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ????The code above can be simplified, so that the lines >>>>>>>>>>>>>>>> 101-105 >>>>>>>>>>>>>>>> are not >>>>>>>>>>>>>>>> needed anymore. >>>>>>>>>>>>>>>> ????It can be something like this: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not >>>>>>>>>>>>>>>> found - ignore. >>>>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>>> return; >>>>>>>>>>>>>>>> ??????? } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>>>>>> rest. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen >>>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>>> disconnecting an agent. 
In particular, we need to take the >>>>>>>>>>>>>>>>> lock on >>>>>>>>>>>>>>>>> basically every operation, and also need to check whether >>>>>>>>>>>>>>>>> or not >>>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result >>>>>>>>>>>>>>>>> (e.g. an empty >>>>>>>>>>>>>>>>> list) when we're not. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a >>>>>>>>>>>>>>>>>> tag, and we >>>>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a >>>>>>>>>>>>>>>>>> table, which >>>>>>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. >>>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>> table is >>>>>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend >>>>>>>>>>>>>>>>>> the new >>>>>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> signature of >>>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. >>>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is >>>>>>>>>>>>>>>>>> ~O(1) operation >>>>>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase >>>>>>>>>>>>>>>>>> which hammered >>>>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see >>>>>>>>>>>>>>>>>> depths >>>>>>>>>>>>>>>>>> of like 2-3, >>>>>>>>>>>>>>>>>> but not usually more. It should be ok. 
>>>>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out >>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid >>>>>>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets >>>>>>>>>>>>>>>>>> detached >>>>>>>>>>>>>>>>>> and/or >>>>>>>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation >>>>>>>>>>>>>>>>>> (was >>>>>>>>>>>>>>>>>> missing >>>>>>>>>>>>>>>>>> before). >>>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual >>>>>>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right >>>>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be >>>>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>>>> to improve >>>>>>>>>>>>>>>>>> in the future? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself >>>>>>>>>>>>>>>>>> looks >>>>>>>>>>>>>>>>>> really good. >>>>>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug >>>>>>>>>>>>>>>>>> agent asks for it? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am >>>>>>>>>>>>>>>>>>> implementing >>>>>>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing >>>>>>>>>>>>>>>>>>> for now. 
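The table described in the bullet list above (slots indexed by tag % slot-count, new nodes prepended to the slot's chain) might look roughly like this. It is only a sketch: SLOT_COUNT, slot_of, table_add and table_remove are names assumed for illustration, tags are assumed non-negative, and signatures are borrowed pointers rather than JVMTI-allocated strings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical slot count; the value used in the webrev is not shown here. */
#define SLOT_COUNT 263

typedef struct KlassNode {
    int64_t klass_tag;        /* tag assigned when the class was prepared */
    const char *signature;    /* borrowed pointer in this sketch */
    struct KlassNode *next;   /* next node in the same slot */
} KlassNode;

/* Each entry is the head of a linked list of KlassNode. */
static KlassNode *table[SLOT_COUNT];

/* Map a (non-negative) tag to its slot. */
static size_t slot_of(int64_t tag)
{
    assert(tag >= 0);
    return (size_t)(tag % SLOT_COUNT);
}

/* Prepend a new node to its slot: O(1). */
static void table_add(int64_t tag, const char *signature)
{
    KlassNode *node = malloc(sizeof *node);
    assert(node != NULL);
    size_t slot = slot_of(tag);
    node->klass_tag = tag;
    node->signature = signature;
    node->next = table[slot];
    table[slot] = node;
}

/* Unlink the node with the given tag and return its signature,
 * or NULL if the tag is not in the table. Cost is the chain depth. */
static const char *table_remove(int64_t tag)
{
    KlassNode **klass_ptr = &table[slot_of(tag)];
    KlassNode *klass = *klass_ptr;
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {
        return NULL;
    }
    const char *signature = klass->signature;
    *klass_ptr = klass->next;
    free(klass);
    return signature;
}
```

Removal only walks one slot's chain, which is why the cost stays close to O(1) as long as the chains remain short.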
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ??? Hi Chris, >>>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be >>>>>>>>>>>>>>>>>>>>> for a >>>>>>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the >>>>>>>>>>>>>>>>>>>>> changes. >>>>>>>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading >>>>>>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a >>>>>>>>>>>>>>>>>>>> table >>>>>>>>>>>>>>>>>>>> of currently >>>>>>>>>>>>>>>>>>>> prepared classes by building that table when >>>>>>>>>>>>>>>>>>>> classTrack is >>>>>>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. >>>>>>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>>>>>> compared with the >>>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the >>>>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently >>>>>>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared >>>>>>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). 
>>>>>>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is >>>>>>>>>>>>>>>>>>>> scanned, >>>>>>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list >>>>>>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine >>>>>>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. >>>>>>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption >>>>>>>>>>>>>>>>>>>> here >>>>>>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this >>>>>>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>>>>>> true, and also reasonable to expect. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to >>>>>>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a >>>>>>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon >>>>>>>>>>>>>>>>>>>> unload, >>>>>>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently >>>>>>>>>>>>>>>>>>>> see >>>>>>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated >>>>>>>>>>>>>>>>>>>> when there's an >>>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of >>>>>>>>>>>>>>>>>>>>>> classTrack.c. >>>>>>>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps >>>>>>>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance >>>>>>>>>>>>>>>>>>>>>> until an >>>>>>>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and >>>>>>>>>>>>>>>>>>>>>> timing. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here: >>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985 >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I am getting those numbers: >>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s >>>>>>>>>>>>>>>>>>>>>> patched:?? no debug: 85s with debug: 95s >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Can I please get a review? 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>>
>>

From serguei.spitsyn at oracle.com Fri Mar 27 00:22:43 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Mar 2020 17:22:43 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <99649661-67cf-9583-673a-8c53d038aed9@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
 <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>
 <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
 <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com>
 <99649661-67cf-9583-673a-8c53d038aed9@oracle.com>
Message-ID: <33703a4c-3fcc-33ee-4774-e158f5980218@oracle.com>

Hi Roman,

I'm okay with fix.

Thanks,
Serguei

On 3/26/20 17:15, serguei.spitsyn at oracle.com wrote:
> Hi Roman,
>
> Yes. Thank you for the explanation.
>
> Thanks,
> Serguei
>
> On 3/26/20 01:44, Roman Kennke wrote:
>> That was in the previous implementation: I got a condition wrong in the
>> table lookup (as noted by Serguei), and this prevented any
>> class-unload-events from getting out. I have fixed this, but found other
>> problems in that implementation (deadlocks and a crash).
>>
>> The current implementation has none of these problems: we don't need
>> table-lookups - we simply pass-through the signatures, and locking is
>> much simpler and in particular we don't need a lock around the JVMTI
>> call (SetTag) which was the cause of the deadlock.
>>
>> Does that answer your questions?
>>
>> Thanks,
>> Roman
>>
>>> Hi Roman,
>>>
>>> It passed all my testing. I think before you push Serguei has a question
>>> regarding an issue you brought up a while back. You mentioned that you
>>> weren't getting some events, and suddenly started seeing them. We were
>>> discussing it today and it was unclear if this was an issue you were
>>> seeing before your changes, and your changes resolved it, or it was
>>> initially caused by an earlier version of your changes, and you later
>>> fixed it. We just want to better understand what this issue was and how
>>> it was fixed.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/25/20 3:22 PM, Roman Kennke wrote:
>>>> The new job finished, its ID is:
>>>>
>>>>   mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>>>>
>>>> Thank you,
>>>> Roman
>>>>
>>>>> Yes, please submit a new job. I'll start my testing once I see that the
>>>>> builds are done.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Apparently we can get into classTrack_reset() before calling activate(),
>>>>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>>>>>> the cleaning routine fixes the problem for me.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>>>>
>>>>>> Should I post another submit-repo job with that fix?
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>> Hi Roman,
>>>>>>>
>>>>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>>>>
>>>>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000], sp=0x00007fbb791f8af0,  free space=1022k
>>>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>>>>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>>>>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>>>>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>>>>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>>>>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>>>>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>>>>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>>>>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>>>>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>>>>>
>>>>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>>>>>> doesn't seem to be anything magic on the command line that might be
>>>>>>> triggering. Pretty much I see it with all the various VM configs we test.
>>>>>>>
>>>>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>>>>
>>>>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>>> Regarding the new assert:
>>>>>>>>>
>>>>>>>>>     if (gdata && gdata->assertOn) {
>>>>>>>>>         // Check this is not already tagged.
>>>>>>>>>         jlong tag;
>>>>>>>>>         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>>>>>         if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>         }
>>>>>>>>>         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> I think you should remove the gdata check.
>>>>>>>>> gdata should never be NULL when you get to this code. If it is ever
>>>>>>>>> NULL then there's a bug, and the check will hide the bug.
>>>>>>>> Ok, will remove this.
>>>>>>>>
>>>>>>>>> Regarding testing, after you do the submit repo testing let me know the
>>>>>>>>> jobID and I'll do additional testing on it.
>>>>>>>> I did the submit repo earlier today, and it came back green:
>>>>>>>>
>>>>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>>>>> Hi Sergei,
>>>>>>>>>>
>>>>>>>>>>> The fix looks pretty clean now.
>>>>>>>>>>> I also like new name of the lock.:)
>>>>>>>>>> Thank you!
>>>>>>>>>>
>>>>>>>>>>> Just one comment below.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>
>>>>>>>>>>> 110 if (tag != 0l) {
>>>>>>>>>>> 111 return; // Already added
>>>>>>>>>>> 112 }
>>>>>>>>>>>
>>>>>>>>>>> It is better to use a named constant or macro instead.
>>>>>>>>>>> Also, it'd be nice to add a short comment about this value is.
>>>>>>>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>>>>>>>> assert. I also made a constant for the value 0, which should be pretty
>>>>>>>>>> much self-explaining.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>>>>>
>>>>>>>>>>> How do you test the fix?
>>>>>>>>>> I am using a manual test that is provided in this bug report:
>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>
>>>>>>>>>> "Script to compare performance of GC with and without debugger, when
>>>>>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>>>>>
>>>>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>>>>>
>>>>>>>>>> I am also using this test and manually attach/detach jdb a couple of
>>>>>>>>>> times in a row to check that disconnecting and reconnecting works well
>>>>>>>>>> (this tended to deadlock or crash with an earlier version of the patch,
>>>>>>>>>> and is now looking good).
>>>>>>>>>>
>>>>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we all
>>>>>>>>>> agree that the fix is reasonable, I will push it to the submit repo. I
>>>>>>>>>> am not sure if any of those tests actually exercise that code, though.
>>>>>>>>>> Let me know if you want me to run any specific tests.
>>>>>>>>>>
>>>>>>>>>> Thank you,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>>>>>> I believe I came up with a much simpler solution that also solves the
>>>>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>>>>
>>>>>>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>>>>>>>> to the signature of a class into the tag, and pull it out again when we
>>>>>>>>>>>> get notified that the class gets unloaded.
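A minimal sketch of the pointer-as-tag idea described above, in plain C with no JVMTI dependency. The class_tag typedef stands in for JVMTI's jlong, and NOT_TAGGED, signature_to_tag and tag_to_signature are hypothetical names used only for illustration, not functions taken from the webrev. In the agent, the encoded value would be handed to SetTag and would come back as the tag argument of the ObjectFree callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JVMTI's jlong (a 64-bit signed integer). */
typedef int64_t class_tag;

/* Hypothetical name for "no tag set". */
#define NOT_TAGGED ((class_tag)0)

/* Copy the signature and encode the copy's address as a tag.
 * Returns NOT_TAGGED if the allocation fails. */
static class_tag signature_to_tag(const char *signature)
{
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return NOT_TAGGED;
    }
    memcpy(copy, signature, len);
    return (class_tag)(intptr_t)copy;
}

/* Decode a tag back into the signature it points to.
 * The caller owns the string and must free() it. */
static char *tag_to_signature(class_tag tag)
{
    assert(tag != NOT_TAGGED);
    return (char *)(intptr_t)tag;
}
```

Because the tag itself carries the signature, the unload path needs no table lookup: it only decodes the pointer and stores the string for later reporting.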
>>>>>>>>>>>>
>>>>>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>>>>>>>>>> classes needed (as in the current implementation) and no searching of
>>>>>>>>>>>> table needed (like in my previous attempts).
>>>>>>>>>>>>
>>>>>>>>>>>> Please review this new revision:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>>>>>>
>>>>>>>>>>>> (Notice that there still appears to be a performance bottleneck with
>>>>>>>>>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>>>>>>>>>> to be related to the classTrack.c implementation though, but looks like
>>>>>>>>>>>> a consequence of getting all those class-unload notifications over the
>>>>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>>>>>>>> buffers.)
>>>>>>>>>>>>
>>>>>>>>>>>> I am not sure why jdb needs to enable class-unload listener always. A
>>>>>>>>>>>> simple hack disables it, and performance is brilliant, even when jdb is
>>>>>>>>>>>> attached:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>>>>>>>
>>>>>>>>>>>> But this is not in the scope of this bug.)
>>>>>>>>>>>>
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some comments are below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>>>>> 88 {
>>>>>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 93 return;
>>>>>>>>>>>>>> 94 }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just a question:
>>>>>>>>>>>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>>>>>>>>>>       the class tracking if class tracking has not been initialized?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm thinking if the name is better to be something like: lastClassTag
>>>>>>>>>>>>>> or highestClassTag.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>>>>>>> 100
>>>>>>>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>>>>>>>> 105 }
>>>>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 108 return;
>>>>>>>>>>>>>> 109 }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It seems to me, something is wrong in the condition at L106 above.
>>>>>>>>>>>>>> Should it be? :
>>>>>>>>>>>>>>   if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Otherwise, how can the second check ever work correctly as the return
>>>>>>>>>>>>>> will always happen when (klass != NULL)?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There are several places in this file with the indent:
>>>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 93 return;
>>>>>>>>>>>>>> 94 }
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 155 return;
>>>>>>>>>>>>>> 156 }
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>>>>> 163 }
>>>>>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>>>>> 167 }
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>>>>> 282 {
>>>>>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>>>>> 286 }
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 291 void
>>>>>>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>>>>>>> 293 {
>>>>>>>>>>>>>> 294 int idx;
>>>>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>>> 296
>>>>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>>>>>> 303 node = next;
>>>>>>>>>>>>>> 304 }
>>>>>>>>>>>>>> 305 }
>>>>>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>>>>>> 307
>>>>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>>>>> 310
>>>>>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>>>>>> 312
>>>>>>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>>>>>> 315
>>>>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The comma is not needed.
>>>>>>>>>>>>>> Would it better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Would be better to use words like "store" or "record", "Find" should
>>>>>>>>>>>>>> not start from capital letter:
>>>>>>>>>>>>>>   Invoke the callback when classes are freed, find and record the
>>>>>>>>>>>>>>   signature in deletedSignatureBag.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Missed dot at the end.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In opposite, dot is not needed as the comment does not start from a
>>>>>>>>>>>>>> capital letter.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The comment above can be better. Maybe, something like:
>>>>>>>>>>>>>   "At this point, we found the KlassNode matching the klass tag (and it
>>>>>>>>>>>>>   is linked)."
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>>>>>>>>>> more testing and also field-/torture-testing by a customer who is
>>>>>>>>>>>>>>> happy now.
:-) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for reviewing! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very >>>>>>>>>>>>>>>> good! >>>>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent >>>>>>>>>>>>>>>> after >>>>>>>>>>>>>>>> disconnect, >>>>>>>>>>>>>>>> namely move setup of the trackingEnv and >>>>>>>>>>>>>>>> deletedSignatureBag to >>>>>>>>>>>>>>>> _activate() to ensure have those structures after >>>>>>>>>>>>>>>> re-connect. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Let me know what you think! >>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Roman, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thank you for taking care about this scalability issue! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I have a couple of quick comments. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 72 /* >>>>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag >>>>>>>>>>>>>>>>> 74 */ >>>>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /* >>>>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' >>>>>>>>>>>>>>>>> signatures. >>>>>>>>>>>>>>>>> Must be >>>>>>>>>>>>>>>>> accessed under >>>>>>>>>>>>>>>>> 79 * deletedTagLock, >>>>>>>>>>>>>>>>> ????? 80? */ >>>>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ????? The comments contradict to each other. >>>>>>>>>>>>>>>>> ????? I guess, the lock name at line 79 has to be >>>>>>>>>>>>>>>>> deletedSignatureLock >>>>>>>>>>>>>>>>> instead of deletedTagLock. >>>>>>>>>>>>>>>>> ????? 
Also, comma at the end must be replaced with dot. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 101 // Tag not found? Ignore. >>>>>>>>>>>>>>>>> 102 if (klass == NULL) { >>>>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>>>> 104 return; >>>>>>>>>>>>>>>>> 105 } >>>>>>>>>>>>>>>>> ???? 106 >>>>>>>>>>>>>>>>> 107 // Scan linked-list. >>>>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag; >>>>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) { >>>>>>>>>>>>>>>>> 110 klass_ptr = &klass->next; >>>>>>>>>>>>>>>>> 111 klass = *klass_ptr; >>>>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag; >>>>>>>>>>>>>>>>> ???? 113???? } >>>>>>>>>>>>>>>>> 114 >>>>>>>>>>>>>>>>> 115 // Tag not found? Ignore. >>>>>>>>>>>>>>>>> 116 if (found_tag != tag) { >>>>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>>>> 118 return; >>>>>>>>>>>>>>>>> ???? 119???? } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ?????The code above can be simplified, so that the lines >>>>>>>>>>>>>>>>> 101-105 >>>>>>>>>>>>>>>>> are not >>>>>>>>>>>>>>>>> needed anymore. >>>>>>>>>>>>>>>>> ?????It can be something like this: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> // Scan linked-list. >>>>>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) { >>>>>>>>>>>>>>>>> klass_ptr = &klass->next; >>>>>>>>>>>>>>>>> klass = *klass_ptr; >>>>>>>>>>>>>>>>> ???????? } >>>>>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // >>>>>>>>>>>>>>>>> klass not >>>>>>>>>>>>>>>>> found - ignore. >>>>>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock); >>>>>>>>>>>>>>>>> return; >>>>>>>>>>>>>>>>> ???????? } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> It will take more time when I get a chance to look at the >>>>>>>>>>>>>>>>> rest. 
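The simplified scan suggested above can be sketched as a self-contained function. KlassNode here is a cut-down stand-in with only the fields the excerpt uses, and push_node and unlink_by_tag are hypothetical names, not functions from the webrev. Once the loop exits, klass is either NULL or the matching node, so this sketch tests only for NULL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct KlassNode {
    int64_t klass_tag;        /* tag the class was registered with */
    const char *signature;    /* borrowed pointer in this sketch */
    struct KlassNode *next;
} KlassNode;

/* Prepend a node to the list and return the new head. */
static KlassNode *push_node(KlassNode *head, int64_t tag, const char *signature)
{
    KlassNode *node = malloc(sizeof *node);
    assert(node != NULL);
    node->klass_tag = tag;
    node->signature = signature;
    node->next = head;
    return node;
}

/* Find the node with the given tag, unlink and free it, and return its
 * signature. Returns NULL when the tag is not in the list. */
static const char *unlink_by_tag(KlassNode **klass_ptr, int64_t tag)
{
    KlassNode *klass = *klass_ptr;

    /* Scan linked-list, keeping klass_ptr on the link that points at klass. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) { /* klass not found - ignore. */
        return NULL;
    }

    const char *signature = klass->signature;
    *klass_ptr = klass->next;   /* unlink */
    free(klass);
    return signature;
}
```

Tracking the address of the incoming link (klass_ptr) rather than a "previous" node means the head of the list needs no special case.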
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Serguei >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote: >>>>>>>>>>>>>>>>>> Here comes an update that resolves some races that >>>>>>>>>>>>>>>>>> happen >>>>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to >>>>>>>>>>>>>>>>>> take the >>>>>>>>>>>>>>>>>> lock on >>>>>>>>>>>>>>>>>> basically every operation, and also need to check >>>>>>>>>>>>>>>>>> whether >>>>>>>>>>>>>>>>>> or not >>>>>>>>>>>>>>>>>> class-tracking is active and return an appropriate >>>>>>>>>>>>>>>>>> result >>>>>>>>>>>>>>>>>> (e.g. an empty >>>>>>>>>>>>>>>>>> list) when we're not. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> So, here comes the O(1) implementation: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered >>>>>>>>>>>>>>>>>>> with a >>>>>>>>>>>>>>>>>>> tag, and we >>>>>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded. >>>>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that >>>>>>>>>>>>>>>>>>> is a >>>>>>>>>>>>>>>>>>> table, which >>>>>>>>>>>>>>>>>>> each entry being the head of a linked-list of >>>>>>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>>> table is >>>>>>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend >>>>>>>>>>>>>>>>>>> the new >>>>>>>>>>>>>>>>>>> KlassNode*. >>>>>>>>>>>>>>>>>>> This is O(1) operation. >>>>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> signature of >>>>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a >>>>>>>>>>>>>>>>>>> bag. 
>>>>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>>> KlassNode* >>>>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. >>>>>>>>>>>>>>>>>>> This is >>>>>>>>>>>>>>>>>>> ~O(1) operation >>>>>>>>>>>>>>>>>>> too, depending on the depth of the table. In my >>>>>>>>>>>>>>>>>>> testcase >>>>>>>>>>>>>>>>>>> which hammered >>>>>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see >>>>>>>>>>>>>>>>>>> depths >>>>>>>>>>>>>>>>>>> of like 2-3, >>>>>>>>>>>>>>>>>>> but not usually more. It should be ok. >>>>>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out >>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>> bag, and >>>>>>>>>>>>>>>>>>> allocate a new one. >>>>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to >>>>>>>>>>>>>>>>>>> avoid >>>>>>>>>>>>>>>>>>> leaking the >>>>>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets >>>>>>>>>>>>>>>>>>> detached >>>>>>>>>>>>>>>>>>> and/or >>>>>>>>>>>>>>>>>>> re-attached (was missing before). >>>>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation >>>>>>>>>>>>>>>>>>> (was >>>>>>>>>>>>>>>>>>> missing >>>>>>>>>>>>>>>>>>> before). >>>>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an >>>>>>>>>>>>>>>>>>> actual >>>>>>>>>>>>>>>>>>> listener gets >>>>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right >>>>>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>>>>> attaching a >>>>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be >>>>>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>>>>> to improve >>>>>>>>>>>>>>>>>>> in the future? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself >>>>>>>>>>>>>>>>>>> looks >>>>>>>>>>>>>>>>>>> really good. >>>>>>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the >>>>>>>>>>>>>>>>>>> class-unload >>>>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the >>>>>>>>>>>>>>>>>>> debug >>>>>>>>>>>>>>>>>>> agent asks for it? 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Updated webrev: >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Please let me know what you think of it. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am >>>>>>>>>>>>>>>>>>>> implementing >>>>>>>>>>>>>>>>>>>> the even more >>>>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off >>>>>>>>>>>>>>>>>>>> reviewing >>>>>>>>>>>>>>>>>>>> for now. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks,Roman >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> ???? Hi Chris, >>>>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be >>>>>>>>>>>>>>>>>>>>>> for a >>>>>>>>>>>>>>>>>>>>>> few days. In >>>>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new >>>>>>>>>>>>>>>>>>>>>> implementation in >>>>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the >>>>>>>>>>>>>>>>>>>>>> changes. >>>>>>>>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to >>>>>>>>>>>>>>>>>>>>> determine the >>>>>>>>>>>>>>>>>>>>> signatures of unloaded classes when >>>>>>>>>>>>>>>>>>>>> GC/class-unloading >>>>>>>>>>>>>>>>>>>>> happened, so that >>>>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a >>>>>>>>>>>>>>>>>>>>> table >>>>>>>>>>>>>>>>>>>>> of currently >>>>>>>>>>>>>>>>>>>>> prepared classes by building that table when >>>>>>>>>>>>>>>>>>>>> classTrack is >>>>>>>>>>>>>>>>>>>>> initialized, >>>>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets >>>>>>>>>>>>>>>>>>>>> loaded. 
>>>>>>>>>>>>>>>>>>>>> When >>>>>>>>>>>>>>>>>>>>> unloading >>>>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and >>>>>>>>>>>>>>>>>>>>> compared with the >>>>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the >>>>>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>>> table gets >>>>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen >>>>>>>>>>>>>>>>>>>>> frequently >>>>>>>>>>>>>>>>>>>>> and/or many >>>>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to >>>>>>>>>>>>>>>>>>>>> O(classCount*gcCount) >>>>>>>>>>>>>>>>>>>>> complexity. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of >>>>>>>>>>>>>>>>>>>>> prepared >>>>>>>>>>>>>>>>>>>>> classes, and also >>>>>>>>>>>>>>>>>>>>> tracks unloads via the listener >>>>>>>>>>>>>>>>>>>>> cbTrackingObjectFree(). >>>>>>>>>>>>>>>>>>>>> Whenever an >>>>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is >>>>>>>>>>>>>>>>>>>>> scanned, >>>>>>>>>>>>>>>>>>>>> and classes >>>>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus >>>>>>>>>>>>>>>>>>>>> maintaining the >>>>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in >>>>>>>>>>>>>>>>>>>>> the list >>>>>>>>>>>>>>>>>>>>> that gets returned. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to >>>>>>>>>>>>>>>>>>>>> determine >>>>>>>>>>>>>>>>>>>>> whether or not >>>>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the >>>>>>>>>>>>>>>>>>>>> deletedTagBag. >>>>>>>>>>>>>>>>>>>>> That process is >>>>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption >>>>>>>>>>>>>>>>>>>>> here >>>>>>>>>>>>>>>>>>>>> is that >>>>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my >>>>>>>>>>>>>>>>>>>>> experiments this >>>>>>>>>>>>>>>>>>>>> seems to be >>>>>>>>>>>>>>>>>>>>> true, and also reasonable to expect. 
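For comparison, the linked-list variant described above might look roughly like this. It is a Java sketch rather than the C of classTrack.c, and the names (LinkedClassTrackSketch, objectFreed, trackedCount) are hypothetical; only cbTrackingObjectFree() and deletedTagBag come from the mail, and their real signatures are not shown here:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

/** Illustrative only: prepared-classes list scanned against a bag of freed tags. */
final class LinkedClassTrackSketch {
    /** Stand-in for KlassNode (hypothetical layout). */
    private static final class KlassNode {
        final long tag;
        final String signature;

        KlassNode(long tag, String signature) {
            this.tag = tag;
            this.signature = signature;
        }
    }

    private final List<KlassNode> prepared = new LinkedList<>();
    private final List<Long> deletedTagBag = new ArrayList<>();

    /** Class prepared: prepend a node to the list. */
    synchronized void classPrepared(long tag, String signature) {
        prepared.add(0, new KlassNode(tag, signature));
    }

    /** Plays the role of the object-free listener: just remember the tag. */
    synchronized void objectFreed(long tag) {
        deletedTagBag.add(tag);
    }

    /**
     * Scan every prepared class; each membership test walks the bag, so one
     * pass costs roughly classCount * unloadedClassCount comparisons.
     */
    synchronized List<String> processUnloads() {
        List<String> unloaded = new ArrayList<>();
        for (Iterator<KlassNode> it = prepared.iterator(); it.hasNext();) {
            KlassNode node = it.next();
            if (deletedTagBag.contains(node.tag)) {
                it.remove();                    // keep the prepared list current
                unloaded.add(node.signature);
            }
        }
        deletedTagBag.clear();
        return unloaded;
    }

    synchronized int trackedCount() {
        return prepared.size();
    }

    public static void main(String[] args) {
        LinkedClassTrackSketch tracker = new LinkedClassTrackSketch();
        tracker.classPrepared(1L, "LA;");
        tracker.classPrepared(2L, "LB;");
        tracker.objectFreed(1L);
        System.out.println(tracker.processUnloads());
    }
}
```

The scan stays cheap only while the bag is small, which is the unloadedClassCount << classCount assumption stated in the mail.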
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the >>>>>>>>>>>>>>>>>>>>> implementation to >>>>>>>>>>>>>>>>>>>>> ~O(1) but it >>>>>>>>>>>>>>>>>>>>> would be considerably more complex: have to >>>>>>>>>>>>>>>>>>>>> maintain a >>>>>>>>>>>>>>>>>>>>> (hash)table that >>>>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon >>>>>>>>>>>>>>>>>>>>> unload, >>>>>>>>>>>>>>>>>>>>> and build the >>>>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently >>>>>>>>>>>>>>>>>>>>> see >>>>>>>>>>>>>>>>>>>>> that it's >>>>>>>>>>>>>>>>>>>>> worth the effort). >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> In addition to all that, this process is only >>>>>>>>>>>>>>>>>>>>> activated >>>>>>>>>>>>>>>>>>>>> when there's an >>>>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> Roman >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote: >>>>>>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Issue: >>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269 >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of >>>>>>>>>>>>>>>>>>>>>>> classTrack.c. >>>>>>>>>>>>>>>>>>>>>>> It avoids >>>>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead >>>>>>>>>>>>>>>>>>>>>>> keeps >>>>>>>>>>>>>>>>>>>>>>> track of >>>>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance >>>>>>>>>>>>>>>>>>>>>>> until an >>>>>>>>>>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH. 
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>>>
>>>
>

From david.holmes at oracle.com Fri Mar 27 02:18:28 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 27 Mar 2020 12:18:28 +1000
Subject: RFR: 8241585: Remove unused _recursion_counter facility from PerfTraceTime
In-Reply-To:
References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com> <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
Message-ID: <880cc6fc-de02-0891-9f66-46bb6185b04f@oracle.com>

Hi Claes,

On 27/03/2020 10:11 am, Claes Redestad wrote:
>
>
> On 2020-03-27 00:36, David Holmes wrote:
>>>
>>
>> Okay so can you change the bug synopsis and description to cover this
>> more general cleanup and tuneup please.
>
> I filed an addendum RFE and will add this RFE bug id to the single
> changeset push:
> https://bugs.openjdk.java.net/browse/JDK-8241705

That works too :) Thanks.

>>
>> I'm never very clear on the uses of these PerfCounters. It seems
>> SUN_NS is unused after this change. The references to jvmstat seem no
>> longer correct - these are read via jstat?
>
> The general confusion about PerfData/-Counters and what they're for is
> why I'm trying to untangle this. Generally I think we should pull the
> plug on it, but the perfdata shared file is tangled up with
> functionality to detect running JVMs used by jcmd etc, so it might
> take a few iterations to get there.

Yeah they confuse me. Which makes it hard to see what impact your
changes may have. Hopefully serviceability folk are more familiar with
how things hook together.

Thanks,
David

> /Claes

From suenaga at oss.nttdata.com Fri Mar 27 02:49:38 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 27 Mar 2020 11:49:38 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com> <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com>
Message-ID: <676255e6-e5b2-dfcb-7dc9-4dd8646032ec@oss.nttdata.com>

All tests on submit repo has been passed.
(mach5-one-ysuenaga-JDK-8240956-3-20200327-0003-9753265)

Yasumasa

On 2020/03/27 9:07, Yasumasa Suenaga wrote:
> Thanks Kevin and Serguei! and sorry for my English...
>
> I uploaded new webrev:
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.05/
>
> Diff from webrev.04 is here:
>
http://hg.openjdk.java.net/jdk/submit/rev/d5f400d70e94 > > > Thanks, > > Yasumasa > > > On 2020/03/27 2:53, serguei.spitsyn at oracle.com wrote: >> Hi Kevin, >> >> Nice catch with the name "lastFrame". >> I was also confused when reviewed this but did not come up with something better. >> >> Thanks, >> Serguei >> >> On 3/26/20 10:40, Kevin Walls wrote: >>> Hi Yasumasa, >>> >>> Oops, didn't catch this - I also had done some manual testing and in mach5 but clearly not enough. >>> >>> Generally I think this looks good. >>> >>> "lastFrame" can mean last as in final, or last as in previous. "last" is one of those annoying English words.? Here it means final, if we get an Exception during processDwarf, use this to flag that we should return null from sender().? "finalFrame" would be clearer to me, anything else probably gets more verbose than you wanted. >>> >>> Yes I like having the limit on the while loop in process_dwarf(), always worried how sane the information is that we are parsing through. >>> >>> Thanks! >>> Kevin >>> >>> >>> On 24/03/2020 23:47, Yasumasa Suenaga wrote: >>>> Thanks Serguei! >>>> >>>> I will push it when I get second reviewer. >>>> >>>> >>>> Yasumasa >>>> >>>> >>>> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote: >>>>> Hi Yasumasa, >>>>> >>>>> I'm okay with this update. >>>>> My mach5 test run for this patch is passed. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 3/23/20 17:08, Yasumasa Suenaga wrote: >>>>>> Hi Serguei, >>>>>> >>>>>> Thanks for your comment! >>>>>> I uploaded new webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/ >>>>>> >>>>>> Also I pushed it to submit repo: >>>>>> >>>>>> ? http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1 >>>>>> >>>>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> The mach5 tier5 testing looks good. >>>>>>> The serviceability/sa/ClhsdbPstack.java is failed without fix and is not failed with it. 
>>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> I looked at you changes. >>>>>>>> It is hard to understand if this fully solves the issue. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>> >>>>>>>> @@ -34,10 +34,11 @@ >>>>>>>> ? ???? public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) { >>>>>>>> ??????? Address libptr = dbg.findLibPtrByAddress(rip); >>>>>>>> ??????? Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); >>>>>>>> ??????? DwarfParser dwarf = null; >>>>>>>> + boolean unsupportedDwarf = false; >>>>>>>> ? ??????? if (libptr != null) { // Native frame >>>>>>>> ????????? try { >>>>>>>> ??????????? dwarf = new DwarfParser(libptr); >>>>>>>> ??????????? dwarf.processDwarf(rip); >>>>>>>> 
------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> @@ -45,24 +46,33 @@
>>>>>>>>                    !dwarf.isBPOffsetAvailable())
>>>>>>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>                                .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>           } catch (DebuggerException e) {
>>>>>>>> -           // Bail out to Java frame case
>>>>>>>> +           if (dwarf != null) {
>>>>>>>> +             // DWARF processing should succeed when the frame is native
>>>>>>>> +             // but it might fail if CIE has language personality routine
>>>>>>>> +             // and/or LSDA.
>>>>>>>> +             dwarf = null;
>>>>>>>> +             unsupportedDwarf = true;
>>>>>>>> +           } else {
>>>>>>>> +             throw e;
>>>>>>>> +           }
>>>>>>>>           }
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         return (cfa == null) ? null
>>>>>>>> -                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>>>>>> +                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>>>>>      }
>>>>>>>>
>>>>>>>> @@ -121,13 +131,25 @@
>>>>>>>>        }
>>>>>>>>
>>>>>>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>>>>>      }
>>>>>>>>
>>>>>>>> -    private DwarfParser getNextDwarf(Address nextPC) {
>>>>>>>> -      DwarfParser nextDwarf = null;
>>>>>>>> +    @Override
>>>>>>>> +    public CFrame sender(ThreadProxy thread) {
>>>>>>>> +      if (!possibleNext) {
>>>>>>>> +        return null;
>>>>>>>> +      }
>>>>>>>> +
>>>>>>>> +      ThreadContext context = thread.getContext();
>>>>>>>> +
>>>>>>>> +      Address nextPC = getNextPC(dwarf != null);
>>>>>>>> +      if (nextPC == null) {
>>>>>>>> +        return null;
>>>>>>>> +      }
>>>>>>>>
>>>>>>>> +      DwarfParser nextDwarf = null;
>>>>>>>> +      boolean unsupportedDwarf = false;
>>>>>>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>>>>>          nextDwarf = dwarf;
>>>>>>>>        } else {
>>>>>>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>          if (libptr != null) {
>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> @@ -138,33 +160,29 @@
>>>>>>>>            }
>>>>>>>>          
} >>>>>>>> ?????? } >>>>>>>> ? ?????? if (nextDwarf != null) { >>>>>>>> + try { >>>>>>>> ???????? nextDwarf.processDwarf(nextPC); >>>>>>>> + } catch (DebuggerException e) { >>>>>>>> + // DWARF processing should succeed when the frame is native >>>>>>>> + // but it might fail if CIE has language personality routine >>>>>>>> + // and/or LSDA. >>>>>>>> + nextDwarf = null; >>>>>>>> + unsupportedDwarf = true; >>>>>>>> ?????? } >>>>>>>> >>>>>>>> This fix looks like a hack. >>>>>>>> Should we just propagate the Debugging exception instead of trying to maintain unsupportedDwarf flag? >>>>>> >>>>>> DwarfParser::processDwarf would throw DebuggerException if it cannot find DWARF which relates to PC. >>>>>> PC at this point is for next frame. So current frame (`this` object) is valid, and it should be processed. >>>>>> >>>>>> >>>>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, IDE,LSDA, etc.) are used without any comments explaining them. >>>>>>>> The code has to be generally readable without looking into the DWARF spec each time. >>>>>> >>>>>> I added comments for them in this webrev. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>>>> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote: >>>>>>>>> Thanks Chris! >>>>>>>>> I'm waiting for reviewers for this change. >>>>>>>>> >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/03/21 4:23, Chris Plummer wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> The failure is due to JDK-8231634, so not something you need to worry about. 
>>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> I uploaded new webrev which includes reverting change for ProblemList: >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/ >>>>>>>>>>> >>>>>>>>>>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301), >>>>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java. >>>>>>>>>>> However I think it is not caused by this change because ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed mode, it would not parse DWARF. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote: >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it: >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>> @@ -115,7 +115,7 @@ >>>>>>>>>>>> ??serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all >>>>>>>>>>>> ??serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all >>>>>>>>>>>> ??serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all >>>>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all >>>>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all >>>>>>>>>>>> ??serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all >>>>>>>>>>>> ??serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64 >>>>>>>>>>>> ??serviceability/sa/ClhsdbSource.java 8193639 solaris-all >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 3/16/20 
5:07 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests. >>>>>>>>>>>>> So please review it: >>>>>>>>>>>>> >>>>>>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Thank you so much, David! >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote: >>>>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit repo. >>>>>>>>>>>>>>>>> Could you try again? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> webrev is here: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Seems to have passed okay. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks a lot! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>>>>>>>>>>>> Sorry it is still crashing. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>> #? 
SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704 >>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Same as before. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks for that tip Chris! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result. >>>>>>>>>>>>>>>>>>>> I will share you when I get job ID. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thank you for testing it. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA. >>>>>>>>>>>>>>>>>>>>>> Could you try it? 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>> Correction ... >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>>> #? 
SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >>>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>>>>>>>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames in jstack mixed mode. >>>>>>>>>>>>>>>>>>>>>>>>>> However some error has seen intermittently after that. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause of this, I found two concerns: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> ?? 
A: lack of buffer (.eh_frame section data) range check >>>>>>>>>>>>>>>>>>>>>>>>>> ?? B: Language personality routine and Language Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I addd range check for .eh_frame processing, and ignore personality routine and LSDA in this webrev. >>>>>>>>>>>>>>>>>>>>>>>>>> Also I added bailout code if DWARF processing is failed due to these concerns. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> >>> >> >> From kevin.walls at oracle.com Fri Mar 27 07:42:16 2020 From: kevin.walls at oracle.com (Kevin Walls) Date: Fri, 27 Mar 2020 07:42:16 +0000 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: <676255e6-e5b2-dfcb-7dc9-4dd8646032ec@oss.nttdata.com> References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com> <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com> <676255e6-e5b2-dfcb-7dc9-4dd8646032ec@oss.nttdata.com> Message-ID: Great, thanks Yasumasa.? Don't worry, the language is not just you - it's often unclear in other places. 8-)? 
Sorry maybe I should have said you didn't need to resubmit the webrev
for that, but a retest is nice.

Thanks
Kevin

On 27/03/2020 02:49, Yasumasa Suenaga wrote:
> All tests on submit repo has been passed.
> (mach5-one-ysuenaga-JDK-8240956-3-20200327-0003-9753265)
>
> Yasumasa
>
> On 2020/03/27 9:07, Yasumasa Suenaga wrote:
>> Thanks Kevin and Serguei! and sorry for my English...
>>
>> I uploaded new webrev:
>>
>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.05/
>>
>> Diff from webrev.04 is here:
>>
>>    http://hg.openjdk.java.net/jdk/submit/rev/d5f400d70e94
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/27 2:53, serguei.spitsyn at oracle.com wrote:
>>> Hi Kevin,
>>>
>>> Nice catch with the name "lastFrame".
>>> I was also confused when reviewed this but did not come up with
>>> something better.
>>>
>>> Thanks,
>>> Serguei
>>>
>>> On 3/26/20 10:40, Kevin Walls wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Oops, didn't catch this - I also had done some manual testing and
>>>> in mach5 but clearly not enough.
>>>>
>>>> Generally I think this looks good.
>>>>
>>>> "lastFrame" can mean last as in final, or last as in previous.
>>>> "last" is one of those annoying English words. Here it means final,
>>>> if we get an Exception during processDwarf, use this to flag that
>>>> we should return null from sender().  "finalFrame" would be clearer
>>>> to me, anything else probably gets more verbose than you wanted.
>>>>
>>>> Yes I like having the limit on the while loop in process_dwarf(),
>>>> always worried how sane the information is that we are parsing
>>>> through.
>>>>
>>>> Thanks!
>>>> Kevin
>>>>
>>>>
>>>> On 24/03/2020 23:47, Yasumasa Suenaga wrote:
>>>>> Thanks Serguei!
>>>>>
>>>>> I will push it when I get second reviewer.
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> I'm okay with this update.
>>>>>> My mach5 test run for this patch is passed.
>>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 3/23/20 17:08, Yasumasa Suenaga wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> Thanks for your comment! >>>>>>> I uploaded new webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/ >>>>>>> >>>>>>> Also I pushed it to submit repo: >>>>>>> >>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1 >>>>>>> >>>>>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> The mach5 tier5 testing looks good. >>>>>>>> The serviceability/sa/ClhsdbPstack.java is failed without fix >>>>>>>> and is not failed with it. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> I looked at you changes. >>>>>>>>> It is hard to understand if this fully solves the issue. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>>> >>>>>>>>> >>>>>>>>> @@ -34,10 +34,11 @@ >>>>>>>>> ? ???? public static LinuxAMD64CFrame >>>>>>>>> getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext >>>>>>>>> context) { >>>>>>>>> ??????? Address libptr = dbg.findLibPtrByAddress(rip); >>>>>>>>> ??????? Address cfa = >>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); >>>>>>>>> ??????? DwarfParser dwarf = null; >>>>>>>>> + boolean unsupportedDwarf = false; >>>>>>>>> ? ??????? if (libptr != null) { // Native frame >>>>>>>>> ????????? try { >>>>>>>>> ??????????? dwarf = new DwarfParser(libptr); >>>>>>>>> ??????????? 
dwarf.processDwarf(rip); >>>>>>>>> ---------- >>>>>>>>> >>>>>>>>> @@ -45,24 +46,33 @@ >>>>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>>>> ? >>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>>>> : >>>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>>>> } catch (DebuggerException e) { >>>>>>>>> - // Bail out to Java frame case >>>>>>>>> + if (dwarf != null) { >>>>>>>>> + // DWARF processing should succeed when the frame is native >>>>>>>>> + // but it might fail if CIE has language personality routine >>>>>>>>> + // and/or LSDA. >>>>>>>>> + dwarf = null; >>>>>>>>> + unsupportedDwarf = true; >>>>>>>>> + } else { >>>>>>>>> + throw e; >>>>>>>>> + } >>>>>>>>> } >>>>>>>>> } >>>>>>>>> return (cfa == null) ?
null >>>>>>>>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf); >>>>>>>>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, >>>>>>>>> !unsupportedDwarf); >>>>>>>>> } >>>>>>>>> >>>>>>>>> @@ -121,13 +131,25 @@ >>>>>>>>> } >>>>>>>>> return isValidFrame(nextCFA, context) ? nextCFA : null; >>>>>>>>> } >>>>>>>>> - private DwarfParser getNextDwarf(Address nextPC) { >>>>>>>>> - DwarfParser nextDwarf = null; >>>>>>>>> + @Override >>>>>>>>> + public CFrame sender(ThreadProxy thread) { >>>>>>>>> + if (!possibleNext) { >>>>>>>>> + return null; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + ThreadContext context = thread.getContext(); >>>>>>>>> + >>>>>>>>> + Address nextPC = getNextPC(dwarf != null); >>>>>>>>> + if (nextPC == null) { >>>>>>>>> + return null; >>>>>>>>> + } >>>>>>>>> + DwarfParser nextDwarf = null; >>>>>>>>> + boolean unsupportedDwarf = false; >>>>>>>>> if ((dwarf != null) && dwarf.isIn(nextPC)) { >>>>>>>>> nextDwarf = dwarf; >>>>>>>>> } else { >>>>>>>>> Address libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>
if (libptr != null) { >>>>>>>>> ---------- >>>>>>>>> >>>>>>>>> @@ -138,33 +160,29 @@ >>>>>>>>> } >>>>>>>>> } >>>>>>>>> } >>>>>>>>> if (nextDwarf != null) { >>>>>>>>> + try { >>>>>>>>> nextDwarf.processDwarf(nextPC); >>>>>>>>> + } catch (DebuggerException e) { >>>>>>>>> + // DWARF processing should succeed when the frame is native >>>>>>>>> + // but it might fail if CIE has language personality routine >>>>>>>>> + // and/or LSDA. >>>>>>>>> + nextDwarf = null; >>>>>>>>> + unsupportedDwarf = true; >>>>>>>>> } >>>>>>>>> >>>>>>>>> This fix looks like a hack. >>>>>>>>> Should we just propagate the Debugging exception instead of >>>>>>>>> trying to maintain unsupportedDwarf flag? >>>>>>> >>>>>>> DwarfParser::processDwarf would throw DebuggerException if it >>>>>>> cannot find DWARF which relates to PC. >>>>>>> PC at this point is for next frame. So current frame (`this` >>>>>>> object) is valid, and it should be processed.
>>>>>>> >>>>>>> >>>>>>>>> Also, I don't like that DWARF-specific abbreviations (like >>>>>>>>> CIE, IDE,LSDA, etc.) are used without any comments explaining >>>>>>>>> them. >>>>>>>>> The code has to be generally readable without looking into the >>>>>>>>> DWARF spec each time. >>>>>>> >>>>>>> I added comments for them in this webrev. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>>>> I'm submitting mach5 jobs to make sure the issue has been >>>>>>>>> resolved with your fix. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote: >>>>>>>>>> Thanks Chris! >>>>>>>>>> I'm waiting for reviewers for this change. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2020/03/21 4:23, Chris Plummer wrote: >>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>> >>>>>>>>>>> The failure is due to JDK-8231634, so not something you need >>>>>>>>>>> to worry about. >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi Chris, >>>>>>>>>>>> >>>>>>>>>>>> I uploaded new webrev which includes reverting change for >>>>>>>>>>>> ProblemList: >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/ >>>>>>>>>>>> >>>>>>>>>>>> I tested it on submit repo >>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301), >>>>>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java. >>>>>>>>>>>> However I think it is not caused by this change because >>>>>>>>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not >>>>>>>>>>>> mixed mode, it would not parse DWARF. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote: >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> The test has been problem listed so please add undoing >>>>>>>>>>>>> this to your webrev. 
Here's the diff that problem listed it: >>>>>>>>>>>>> >>>>>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>>> b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>>> @@ -115,7 +115,7 @@ >>>>>>>>>>>>> ??serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all >>>>>>>>>>>>> ??serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all >>>>>>>>>>>>> ??serviceability/sa/ClhsdbPrintStatics.java 8193639 >>>>>>>>>>>>> solaris-all >>>>>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all >>>>>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 >>>>>>>>>>>>> solaris-all,linux-all >>>>>>>>>>>>> ??serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java >>>>>>>>>>>>> 8193639 solaris-all >>>>>>>>>>>>> ??serviceability/sa/ClhsdbScanOops.java >>>>>>>>>>>>> 8193639,8235220,8230731 >>>>>>>>>>>>> solaris-all,linux-x64,macosx-x64,windows-x64 >>>>>>>>>>>>> ??serviceability/sa/ClhsdbSource.java 8193639 solaris-all >>>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> This webrev has passed submit repo >>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) >>>>>>>>>>>>>> and additional tests. >>>>>>>>>>>>>> So please review it: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>> ? webrev: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Thank you so much, David! 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote: >>>>>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to >>>>>>>>>>>>>>>>>> submit repo. >>>>>>>>>>>>>>>>>> Could you try again? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> webrev is here: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Test job resubmitted. Will advise results if it >>>>>>>>>>>>>>>>> completes before I go to bed :) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Seems to have passed okay. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks a lot! >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>>>>>>>>>>>>> Sorry it is still crashing. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java >>>>>>>>>>>>>>>>>>> Runtime Environment: >>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>> #? 
SIGSEGV (0xb) at pc=0x00007f98ec01e94e, >>>>>>>>>>>>>>>>>>> pid=13702, tid=13704 >>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment >>>>>>>>>>>>>>>>>>> (15.0) (fastdebug build >>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM >>>>>>>>>>>>>>>>>>> (fastdebug >>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, >>>>>>>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 gc, >>>>>>>>>>>>>>>>>>> linux-amd64) >>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>> # C? [libsaproc.so+0x494e] >>>>>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Same as before. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can >>>>>>>>>>>>>>>>>>>>>> then go and run additional internal tests (and >>>>>>>>>>>>>>>>>>>>>> even more builds) using that job. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks for that tip Chris! >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've >>>>>>>>>>>>>>>>>>>>> not yet received the result. >>>>>>>>>>>>>>>>>>>>> I will share you when I get job ID. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds >>>>>>>>>>>>>>>>>>>> to complete before submitting the additional tests. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thank you for testing it. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame >>>>>>>>>>>>>>>>>>>>>>> when DWARF has language personality routine or >>>>>>>>>>>>>>>>>>>>>>> LSDA. >>>>>>>>>>>>>>>>>>>>>>> Could you try it? >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux >>>>>>>>>>>>>>>>>>>>>>> 7.7 . >>>>>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>>> Correction ... >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about >>>>>>>>>>>>>>>>>>>>>>>>>> the code, but I'm putting the patch through >>>>>>>>>>>>>>>>>>>>>>>>>> our internal testing. 
>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java >>>>>>>>>>>>>>>>>>>>>>>>> Runtime Environment: >>>>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, >>>>>>>>>>>>>>>>>>>>>>>>> pid=16948, tid=16949 >>>>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment >>>>>>>>>>>>>>>>>>>>>>>>> (15.0) (fastdebug build >>>>>>>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM >>>>>>>>>>>>>>>>>>>>>>>>> (fastdebug >>>>>>>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, >>>>>>>>>>>>>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, >>>>>>>>>>>>>>>>>>>>>>>>> g1 gc, linux-amd64) >>>>>>>>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] >>>>>>>>>>>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to >>>>>>>>>>>>>>>>>>>>>>>>> always crash now. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 >>>>>>>>>>>>>>>>>>>>>>>> runs of the test in linux-x64. I don't see a >>>>>>>>>>>>>>>>>>>>>>>> pattern as to where it fails versus passes. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally. 
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>>>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode. >>>>>>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> A: lack of a range check on the buffer (.eh_frame section data) >>>>>>>>>>>>>>>>>>>>>>>>>>> B: Language personality routine and Language Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev. >>>>>>>>>>>>>>>>>>>>>>>>>>> Also, I added bailout code for when DWARF processing fails due to these concerns.
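Concern A above (reading past the end of the .eh_frame data) can be illustrated with a small sketch. The actual check lives in the native DwarfParser code in libsaproc and is not reproduced here; the class and method names below are hypothetical, the record layout is simplified to a 4-byte little-endian length followed by a body, and the 0xffffffff extended-length form is simply rejected as out of range.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the libsaproc implementation.
class EhFrameScanSketch {
    // Counts length-prefixed records in data, refusing to read past the end of the buffer.
    static int countRecords(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int count = 0;
        // Every iteration consumes at least 4 bytes, so the loop is bounded by the buffer size.
        while (buf.remaining() >= 4) {
            long length = buf.getInt() & 0xFFFFFFFFL; // unsigned 32-bit length field
            if (length == 0) {
                break; // a zero length terminates the section
            }
            if (length > buf.remaining()) {
                // Range check: the record claims more bytes than the buffer holds.
                throw new IllegalStateException(
                    "record length " + length + " exceeds remaining " + buf.remaining() + " bytes");
            }
            buf.position(buf.position() + (int) length); // skip the record body
            count++;
        }
        return count;
    }
}
```

The point of the sketch is that a corrupt or unexpected length turns into an exception the caller can handle, rather than a read outside the mapped section.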
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit >>>>>>>>>>>>>>>>>>>>>>>>>>> repo >>>>>>>>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and >>>>>>>>>>>>>>>>>>>>>>>>>>> Oracle Linux 7.7 container. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>> >>> From suenaga at oss.nttdata.com Fri Mar 27 07:54:08 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Fri, 27 Mar 2020 16:54:08 +0900 Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624 In-Reply-To: References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com> <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com> <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com> <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com> <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com> <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com> <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com> <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com> <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com> <676255e6-e5b2-dfcb-7dc9-4dd8646032ec@oss.nttdata.com> Message-ID: Thanks Kevin! I will push it. Yasumasa On 2020/03/27 16:42, Kevin Walls wrote: > Great, thanks Yasumasa.? Don't worry, the language is not just you - it's often unclear in other places. 8-)? Sorry maybe I should have said you didn't need to resubmit the webrev for that, but a retest is nice. > > Thanks > Kevin > > > On 27/03/2020 02:49, Yasumasa Suenaga wrote: >> All tests on submit repo has been passed. 
(mach5-one-ysuenaga-JDK-8240956-3-20200327-0003-9753265) >> >> Yasumasa >> >> On 2020/03/27 9:07, Yasumasa Suenaga wrote: >>> Thanks Kevin and Serguei! and sorry for my English... >>> >>> I uploaded new webrev: >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.05/ >>> >>> Diff from webrev.04 is here: >>> >>> http://hg.openjdk.java.net/jdk/submit/rev/d5f400d70e94 >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/03/27 2:53, serguei.spitsyn at oracle.com wrote: >>>> Hi Kevin, >>>> >>>> Nice catch with the name "lastFrame". >>>> I was also confused when reviewed this but did not come up with something better. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> On 3/26/20 10:40, Kevin Walls wrote: >>>>> Hi Yasumasa, >>>>> >>>>> Oops, didn't catch this - I also had done some manual testing and in mach5 but clearly not enough. >>>>> >>>>> Generally I think this looks good. >>>>> >>>>> "lastFrame" can mean last as in final, or last as in previous. "last" is one of those annoying English words. Here it means final, if we get an Exception during processDwarf, use this to flag that we should return null from sender(). "finalFrame" would be clearer to me, anything else probably gets more verbose than you wanted. >>>>> >>>>> Yes I like having the limit on the while loop in process_dwarf(), always worried how sane the information is that we are parsing through. >>>>> >>>>> Thanks! >>>>> Kevin >>>>> >>>>> >>>>> On 24/03/2020 23:47, Yasumasa Suenaga wrote: >>>>>> Thanks Serguei! >>>>>> >>>>>> I will push it when I get second reviewer. >>>>>> >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> I'm okay with this update. >>>>>>> My mach5 test run for this patch is passed. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 3/23/20 17:08, Yasumasa Suenaga wrote: >>>>>>>> Hi Serguei, >>>>>>>> >>>>>>>> Thanks for your comment!
>>>>>>>> I uploaded new webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/ >>>>>>>> >>>>>>>> Also I pushed it to submit repo: >>>>>>>> >>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1 >>>>>>>> >>>>>>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> The mach5 tier5 testing looks good. >>>>>>>>> The serviceability/sa/ClhsdbPstack.java test fails without the fix and does not fail with it. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> I looked at your changes. >>>>>>>>>> It is hard to understand if this fully solves the issue. >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>>>>>>> >>>>>>>>>> @@ -34,10 +34,11 @@ >>>>>>>>>> public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) { >>>>>>>>>> Address libptr = dbg.findLibPtrByAddress(rip); >>>>>>>>>> Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); >>>>>>>>>> DwarfParser dwarf = null; >>>>>>>>>> + boolean unsupportedDwarf = false; >>>>>>>>>> if (libptr != null) { // Native frame >>>>>>>>>> try { >>>>>>>>>> dwarf = new DwarfParser(libptr); >>>>>>>>>>
dwarf.processDwarf(rip); >>>>>>>>>> ---------- >>>>>>>>>> >>>>>>>>>> @@ -45,24 +46,33 @@ >>>>>>>>>> !dwarf.isBPOffsetAvailable()) >>>>>>>>>> ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>>>>>>> : context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>>>>>>> } catch (DebuggerException e) { >>>>>>>>>> - // Bail out to Java frame case >>>>>>>>>> + if (dwarf != null) { >>>>>>>>>> + // DWARF processing should succeed when the frame is native >>>>>>>>>> + // but it might fail if CIE has language personality routine >>>>>>>>>> + // and/or LSDA. >>>>>>>>>> + dwarf = null; >>>>>>>>>> + unsupportedDwarf = true; >>>>>>>>>> + } else { >>>>>>>>>> + throw e; >>>>>>>>>> + } >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> return (cfa == null) ?
null >>>>>>>>>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf); >>>>>>>>>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> @@ -121,13 +131,25 @@ >>>>>>>>>> } >>>>>>>>>> return isValidFrame(nextCFA, context) ? nextCFA : null; >>>>>>>>>> } >>>>>>>>>> - private DwarfParser getNextDwarf(Address nextPC) { >>>>>>>>>> - DwarfParser nextDwarf = null; >>>>>>>>>> + @Override >>>>>>>>>> + public CFrame sender(ThreadProxy thread) { >>>>>>>>>> + if (!possibleNext) { >>>>>>>>>> + return null; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + ThreadContext context = thread.getContext(); >>>>>>>>>> + >>>>>>>>>> + Address nextPC = getNextPC(dwarf != null); >>>>>>>>>> + if (nextPC == null) { >>>>>>>>>> + return null; >>>>>>>>>> + } >>>>>>>>>> + DwarfParser nextDwarf = null; >>>>>>>>>> + boolean unsupportedDwarf = false; >>>>>>>>>> if ((dwarf != null) && dwarf.isIn(nextPC)) { >>>>>>>>>> nextDwarf = dwarf; >>>>>>>>>> } else { >>>>>>>>>> Address libptr = dbg.findLibPtrByAddress(nextPC); >>>>>>>>>>
if (libptr != null) { >>>>>>>>>> ---------- >>>>>>>>>> >>>>>>>>>> @@ -138,33 +160,29 @@ >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> if (nextDwarf != null) { >>>>>>>>>> + try { >>>>>>>>>> nextDwarf.processDwarf(nextPC); >>>>>>>>>> + } catch (DebuggerException e) { >>>>>>>>>> + // DWARF processing should succeed when the frame is native >>>>>>>>>> + // but it might fail if CIE has language personality routine >>>>>>>>>> + // and/or LSDA. >>>>>>>>>> + nextDwarf = null; >>>>>>>>>> + unsupportedDwarf = true; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> This fix looks like a hack. >>>>>>>>>> Should we just propagate the Debugging exception instead of trying to maintain unsupportedDwarf flag? >>>>>>>> >>>>>>>> DwarfParser::processDwarf would throw DebuggerException if it cannot find DWARF which relates to PC. >>>>>>>> PC at this point is for next frame. So current frame (`this` object) is valid, and it should be processed.
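The trade-off in this exchange (propagate the exception versus keep a flag) may be easier to see in a stripped-down sketch. Everything below is hypothetical: Frame, UnwindInfo, UnwindException and FrameWalkSketch are stand-ins invented for illustration, not the SA classes, and the walk is driven by a fixed list of caller PCs rather than real register state. It only shows the shape of the approach in the quoted diff: a failure while processing unwind data for the next PC still yields a frame, flagged so that its own sender() returns null.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the SA types; none of these are the real classes.
class UnwindException extends RuntimeException {
    UnwindException(String msg) { super(msg); }
}

interface UnwindInfo {
    // Throws UnwindException when no usable unwind data exists for pc.
    void process(long pc);
}

class Frame {
    private final long pc;
    private final boolean possibleNext; // false => the walk must stop at this frame

    Frame(long pc, boolean possibleNext) {
        this.pc = pc;
        this.possibleNext = possibleNext;
    }

    long pc() { return pc; }

    // Returns the caller's frame, or null when the walk must stop here.
    Frame sender(UnwindInfo info, long nextPc) {
        if (!possibleNext) {
            return null;
        }
        boolean unsupported = false;
        try {
            info.process(nextPc);
        } catch (UnwindException e) {
            // Unwind data for nextPc is unusable. Still hand back a frame for
            // nextPc, but flag it so that the walk ends after it.
            unsupported = true;
        }
        return new Frame(nextPc, !unsupported);
    }
}

class FrameWalkSketch {
    // Walks up from top, asking for caller PCs in the given order.
    static List<Long> walk(Frame top, UnwindInfo info, long[] callerPcs) {
        List<Long> seen = new ArrayList<>();
        Frame f = top;
        int i = 0;
        while (f != null) {
            seen.add(f.pc());
            f = (i < callerPcs.length) ? f.sender(info, callerPcs[i++]) : null;
        }
        return seen;
    }
}
```

If the exception were propagated out of sender() instead, the frames already collected by the caller of the walk would depend on how that caller handles the exception; with the flag, the walk simply ends one frame later.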
>>>>>>>> >>>>>>>> >>>>>>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, IDE,LSDA, etc.) are used without any comments explaining them. >>>>>>>>>> The code has to be generally readable without looking into the DWARF spec each time. >>>>>>>> >>>>>>>> I added comments for them in this webrev. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>>>> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote: >>>>>>>>>>> Thanks Chris! >>>>>>>>>>> I'm waiting for reviewers for this change. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2020/03/21 4:23, Chris Plummer wrote: >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> The failure is due to JDK-8231634, so not something you need to worry about. >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> I uploaded new webrev which includes reverting change for ProblemList: >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/ >>>>>>>>>>>>> >>>>>>>>>>>>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301), >>>>>>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java. >>>>>>>>>>>>> However I think it is not caused by this change because ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed mode, it would not parse DWARF. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote: >>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>> >>>>>>>>>>>>>> The test has been problem listed so please add undoing this to your webrev. 
Here's the diff that problem listed it: >>>>>>>>>>>>>> >>>>>>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt >>>>>>>>>>>>>> @@ -115,7 +115,7 @@ >>>>>>>>>>>>>> ??serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all >>>>>>>>>>>>>> ??serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all >>>>>>>>>>>>>> ??serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all >>>>>>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all >>>>>>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all >>>>>>>>>>>>>> ??serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all >>>>>>>>>>>>>> ??serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64 >>>>>>>>>>>>>> ??serviceability/sa/ClhsdbSource.java 8193639 solaris-all >>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests. >>>>>>>>>>>>>>> So please review it: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ? JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>> ? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> Thank you so much, David! 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote: >>>>>>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit repo. >>>>>>>>>>>>>>>>>>> Could you try again? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> webrev is here: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Seems to have passed okay. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks a lot! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote: >>>>>>>>>>>>>>>>>>>> Sorry it is still crashing. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704 >>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source) >>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>>> # C? 
[libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Same as before. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote: >>>>>>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks for that tip Chris! >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result. >>>>>>>>>>>>>>>>>>>>>> I will share you when I get job ID. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thank you for testing it. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA. >>>>>>>>>>>>>>>>>>>>>>>> Could you try it? >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 . >>>>>>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo. 
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here: >>>>>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>>>> Correction ... >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>>>>> #? SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949 >>>>>>>>>>>>>>>>>>>>>>>>>> # >>>>>>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) >>>>>>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>>>>>>>>>>>>>>>>>>>>>>>> # Problematic frame: >>>>>>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. 
I don't see a pattern as to where it fails versus passes. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Please review this change: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8240956 >>>>>>>>>>>>>>>>>>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode. >>>>>>>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> A: missing range check on the buffer (.eh_frame section data) >>>>>>>>>>>>>>>>>>>>>>>>>>>> B: Language personality routine and Language Specific Data Area (LSDA) are not considered >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and this webrev ignores the personality routine and LSDA. >>>>>>>>>>>>>>>>>>>>>>>>>>>> I also added bailout code for when DWARF processing fails because of these concerns. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), >>>>>>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>> >>>> > From claes.redestad at oracle.com Fri Mar 27 10:26:36 2020 From: claes.redestad at oracle.com (Claes Redestad) Date: Fri, 27 Mar 2020 11:26:36 +0100 Subject: RFR: 8241585: Remove unused _recursion_counter facility from PerfTraceTime In-Reply-To: <880cc6fc-de02-0891-9f66-46bb6185b04f@oracle.com> References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com> <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com> <880cc6fc-de02-0891-9f66-46bb6185b04f@oracle.com> Message-ID: <2a064a78-09cb-4f0e-6bef-4a74ca9712f9@oracle.com> On 2020-03-27 03:18, David Holmes wrote: > > Yeah they confuse me. Which makes it hard to see what impact your > changes may have. This patch removes some internal, unused code on the JVM end that is not observable via jstat / jvmstat. I'm happy if serviceability can weigh in though. The other RFE[1] I've filed to remove StatSampler[1] might be more contentious since it changes what gets periodically stored in the perfdata shared file. I've not yet decided if it's worth the trouble to move ahead with that at this point. /Claes [1] https://bugs.openjdk.java.net/browse/JDK-8241701 From jan.lahoda at oracle.com Fri Mar 27 11:31:33 2020 From: jan.lahoda at oracle.com (Jan Lahoda) Date: Fri, 27 Mar 2020 12:31:33 +0100 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> Message-ID: Hi Mandy, Regarding the javac changes - should those be switched on/off depending the Target? Or, if one compiles with e.g. --release 14, will the newly generated output still work on JDK 14? Jan On 27. 03. 
20 0:57, Mandy Chung wrote: > Please review the implementation of JEP 371: Hidden Classes. The main > changes are in core-libs and hotspot runtime area. Small changes are > made in javac, VM compiler (intrinsification of Class::isHiddenClass), > JFR, JDI, and jcmd. CSR [1] has been reviewed and is in the finalized > state (see specdiff and javadoc below for reference). > > Webrev: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 > > > A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's point > of view, a hidden class is a normal class except for the following: > > - A hidden class has no initiating class loader and is not registered in > any dictionary. > - A hidden class has a name containing an illegal character; > `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` > returns "Lp/Foo.0x1234;". > - A hidden class is not modifiable, i.e. cannot be redefined or > retransformed. JVM TI IsModifiableClass returns false on a hidden class. > - Final fields in a hidden class are "final". The value of final fields > cannot be overridden via reflection. setAccessible(true) can still be > called on reflected objects representing final fields in a hidden class > and its access check will be suppressed, but they only have read access (i.e. > can do Field::getXXX but not setXXX). > > Brief summary of this patch: > > 1. A new Lookup::defineHiddenClass method is the API to create a hidden > class. > 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG > options that > can be specified when creating a hidden class. > 3. A new Class::isHiddenClass method tests if a class is a hidden class. > 4. Field::setXXX method will throw IAE on a final field of a hidden class > regardless of the value of the accessible flag. > 5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass > and defineHiddenClass to create a class from the given bytes. > 6. ClassLoaderData implementation is not changed.
There is one primary CLD > ?? that holds the classes strongly referenced by its defining loader. > There > ?? can be zero or more additional CLDs - one per weak class. > 7. Nest host determination is updated per revised JVMS 5.4.4. Access > control > ?? check no longer throws LinkageError but instead it will throw IAE with > ?? a clear message if a class fails to resolve/validate the nest host > declared > ?? in NestHost/NestMembers attribute. > 8. JFR, jcmd, JDI are updated to support hidden classes. > 9. update javac LambdaToMethod as lambda proxy starts using nestmates > ?? and generate a bridge method to desuger a method reference to a > protected > ?? method in its supertype in a different package > > This patch also updates StringConcatFactory, LambdaMetaFactory, and > LambdaForms > to use hidden classes.? The webrev includes changes in nashorn to hidden > class > and I will update the webrev if JEP 372 removes it any time soon. > > We uncovered a bug in Lookup::defineClass spec throws LinkageError and > intends > to have the newly created class linked.? However, the implementation in 14 > does not link the class.? A separate CSR [2] proposes to update the > implementation to match the spec.? This patch fixes the implementation. > > The spec update on JVM TI, JDI and Instrumentation will be done as > a separate RFE [3].? This patch includes new tests for JVM TI and > java.instrument that validates how the existing APIs work for hidden > classes. 
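The note above that LambdaMetaFactory now spins hidden classes can be observed from plain Java. What follows is a minimal sketch, not code from the webrev. It assumes a JDK 15 or later runtime, where the predicate was released as `Class::isHidden` (the patch under review still calls it `Class::isHiddenClass`). The class name `HiddenLambdaCheck` and the helper `lambdaProxyClass` are made up for illustration.

```java
public class HiddenLambdaCheck {

    /** Returns the runtime class that the lambda metafactory generated for a trivial lambda. */
    static Class<?> lambdaProxyClass() {
        Runnable r = () -> { };
        return r.getClass();
    }

    public static void main(String[] args) {
        Class<?> proxy = lambdaProxyClass();

        // A hidden class name carries a '/' followed by a VM-chosen suffix,
        // which is not a legal binary class name.
        System.out.println("name      = " + proxy.getName());
        System.out.println("is hidden = " + proxy.isHidden());
        System.out.println("has '/'   = " + (proxy.getName().indexOf('/') >= 0));
    }
}
```

The exact suffix after the `/` is chosen by the VM and differs from run to run, so the sketch prints the name rather than comparing it to a fixed string.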
> > javadoc/specdiff > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ > > > JVMS 5.4.4 change: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf > > > CSR: > https://bugs.openjdk.java.net/browse/JDK-8238359 > > Thanks > Mandy > [1] https://bugs.openjdk.java.net/browse/JDK-8238359 > [2] https://bugs.openjdk.java.net/browse/JDK-8240338 > [3] https://bugs.openjdk.java.net/browse/JDK-8230502 From forax at univ-mlv.fr Fri Mar 27 12:00:06 2020 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 27 Mar 2020 13:00:06 +0100 (CET) Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> Message-ID: <1271059704.1214239.1585310406005.JavaMail.zimbra@u-pem.fr> Hi Mandy, in ReflectionFactory, why is the check for the anonymous class removed in the case of a constructor? In BytecodeGenerator, the comment "// bootstrapping issue if using condy" could be moved to the top of clinit, because I asked myself the same question when I saw that a static block was generated. In AbstractValidatingLambdaMetafactory.java, is the field caller unused after all? Regards, Rémi ----- Original Message ----- > From: "mandy chung" > To: "valhalla-dev" , "core-libs-dev" , > "serviceability-dev" , "hotspot-dev" > Sent: Friday, 27 March 2020 00:57:39 > Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes > Please review the implementation of JEP 371: Hidden Classes. The main > changes are in core-libs and hotspot runtime area. Small changes are > made in javac, VM compiler (intrinsification of Class::isHiddenClass), > JFR, JDI, and jcmd. CSR [1] has been reviewed and is in the finalized > state (see specdiff and javadoc below for reference).
> > Webrev: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 > > Hidden class is created via `Lookup::defineHiddenClass`. From JVM's point > of view, a hidden class is a normal class except the following: > > - A hidden class has no initiating class loader and is not registered in > any dictionary. > - A hidden class has a name containing an illegal character > `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` > returns "Lp/Foo.0x1234;". > - A hidden class is not modifiable, i.e. cannot be redefined or > retransformed. JVM TI IsModifableClass returns false on a hidden. > - Final fields in a hidden class is "final".? The value of final fields > cannot be overriden via reflection.? setAccessible(true) can still be > called on reflected objects representing final fields in a hidden class > and its access check will be suppressed but only have read-access (i.e. > can do Field::getXXX but not setXXX). > > Brief summary of this patch: > > 1. A new Lookup::defineHiddenClass method is the API to create a hidden > class. > 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG > option that > ?? can be specified when creating a hidden class. > 3. A new Class::isHiddenClass method tests if a class is a hidden class. > 4. Field::setXXX method will throw IAE on a final field of a hidden class > ?? regardless of the value of the accessible flag. > 5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass > ?? and defineHiddenClass to create a class from the given bytes. > 6. ClassLoaderData implementation is not changed.? There is one primary CLD > ?? that holds the classes strongly referenced by its defining loader. > There > ?? can be zero or more additional CLDs - one per weak class. > 7. Nest host determination is updated per revised JVMS 5.4.4. Access control > ?? check no longer throws LinkageError but instead it will throw IAE with > ?? 
a clear message if a class fails to resolve/validate the nest host > declared > ?? in NestHost/NestMembers attribute. > 8. JFR, jcmd, JDI are updated to support hidden classes. > 9. update javac LambdaToMethod as lambda proxy starts using nestmates > ?? and generate a bridge method to desuger a method reference to a > protected > ?? method in its supertype in a different package > > This patch also updates StringConcatFactory, LambdaMetaFactory, and > LambdaForms > to use hidden classes.? The webrev includes changes in nashorn to hidden > class > and I will update the webrev if JEP 372 removes it any time soon. > > We uncovered a bug in Lookup::defineClass spec throws LinkageError and > intends > to have the newly created class linked.? However, the implementation in 14 > does not link the class.? A separate CSR [2] proposes to update the > implementation to match the spec.? This patch fixes the implementation. > > The spec update on JVM TI, JDI and Instrumentation will be done as > a separate RFE [3].? This patch includes new tests for JVM TI and > java.instrument that validates how the existing APIs work for hidden > classes. 
> > javadoc/specdiff > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ > > JVMS 5.4.4 change: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf > > CSR: > https://bugs.openjdk.java.net/browse/JDK-8238359 > > Thanks > Mandy > [1] https://bugs.openjdk.java.net/browse/JDK-8238359 > [2] https://bugs.openjdk.java.net/browse/JDK-8240338 > [3] https://bugs.openjdk.java.net/browse/JDK-8230502 From mandy.chung at oracle.com Fri Mar 27 15:50:55 2020 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 27 Mar 2020 08:50:55 -0700 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: <1271059704.1214239.1585310406005.JavaMail.zimbra@u-pem.fr> References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <1271059704.1214239.1585310406005.JavaMail.zimbra@u-pem.fr> Message-ID: On 3/27/20 5:00 AM, Remi Forax wrote: > Hi Mandy, > in ReflectionFactory, why is the check for the anonymous class removed in the case of a constructor? Good catch. Fixed. > > In BytecodeGenerator, the comment "// bootstrapping issue if using condy" > could be moved to the top of clinit, because I asked myself the same question when I saw that a static block was generated. OK, that's clearer. > > In AbstractValidatingLambdaMetafactory.java, is the field caller unused after all? Thanks. Removed. It was left behind from an early prototype. Below is the patch. I will send out a new webrev and delta webrev in the next revision. thanks Mandy
diff --git a/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java b/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java
--- a/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java
+++ b/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java
@@ -51,7 +51,6 @@
      *         System.out.printf(">>> %s\n", iii.foo(44));
      * }}
      */
-    final MethodHandles.Lookup caller;
     final Class targetClass;               // The class calling the meta-factory via invokedynamic "class X"
     final MethodType invokedType;             // The type of the invoked method "(CC)II"
     final Class samBase;                   // The type of the returned instance "interface JJ"
@@ -121,7 +120,6 @@
                     "Invalid caller: %s",
                     caller.lookupClass().getName()));
         }
-        this.caller = caller;
         this.targetClass = caller.lookupClass();
         this.invokedType = invokedType;
diff --git a/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java b/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java
--- a/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java
+++ b/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java
@@ -363,6 +363,10 @@
         clinit(cw, className(), classData);
     }
+    /*
+     * to initialize the static final fields with the live class data
+     * LambdaForms can't use condy due to bootstrapping issue.
+     */
     static void clinit(ClassWriter cw, String className, List classData) {
         if (classData.isEmpty())
             return;
@@ -375,7 +379,6 @@
         MethodVisitor mv = cw.visitMethod(Opcodes.ACC_STATIC, "<clinit>", "()V", null, null);
         mv.visitCode();
-        // bootstrapping issue if using condy
         mv.visitLdcInsn(Type.getType("L" + className + ";"));
         mv.visitMethodInsn(Opcodes.INVOKESTATIC, "java/lang/invoke/MethodHandleNatives",
                            "classData", "(Ljava/lang/Class;)Ljava/lang/Object;", false);
diff --git a/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java b/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java
--- a/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java
+++ b/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java
@@ -245,7 +245,8 @@
             return new BootstrapConstructorAccessorImpl(c);
         }
-        if (noInflation && !c.getDeclaringClass().isHiddenClass()) {
+        if (noInflation && !c.getDeclaringClass().isHiddenClass()
+                && !ReflectUtil.isVMAnonymousClass(c.getDeclaringClass())) {
             return new MethodAccessorGenerator().
                 generateConstructor(c.getDeclaringClass(),
                                     c.getParameterTypes(),
-------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Fri Mar 27 15:54:37 2020 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Fri, 27 Mar 2020 16:54:37 +0100 (CET) Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <1271059704.1214239.1585310406005.JavaMail.zimbra@u-pem.fr> Message-ID: <387396409.1414184.1585324477993.JavaMail.zimbra@u-pem.fr> > From: "mandy chung" > To: "Remi Forax" > Cc: "valhalla-dev" , "core-libs-dev" > , "serviceability-dev" > , "hotspot-dev" > > Sent: Friday, 27 March 2020 16:50:55 > Subject: Re: Review Request: 8238358: Implementation of JEP 371: Hidden Classes > On 3/27/20 5:00 AM, Remi Forax wrote: >> Hi Mandy, >> in ReflectionFactory, why is the check for the anonymous class removed in the >> case of a constructor? > Good catch. Fixed >> In BytecodeGenerator, the comment "// bootstrapping issue if using condy" >> could be moved to the top of clinit, because I asked myself the same question when I saw >> that a static block was generated. > OK, that's clearer.
>> in AbstractValidatingLambdaMetafactory.java, the field caller is not used after >> all ? > Thanks. Removed. It was left behind from an early prototype. > Below is the patch. I will send out a new webrev and delta webrev in the next > revision. Thanks Mandy, Looks good. R?mi > thanks > Mandy > diff --git > a/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java > b/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java > --- > a/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java > +++ > b/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java > @@ -51,7 +51,6 @@ > * System.out.printf(">>> %s\n", iii.foo(44)); > * }} > */ > - final MethodHandles.Lookup caller; > final Class targetClass; // The class calling the meta-factory via > invokedynamic "class X" > final MethodType invokedType; // The type of the invoked method "(CC)II" > final Class samBase; // The type of the returned instance "interface JJ" > @@ -121,7 +120,6 @@ > "Invalid caller: %s", > caller.lookupClass().getName())); > } > - this.caller = caller; > this.targetClass = caller.lookupClass(); > this.invokedType = invokedType; > diff --git > a/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java > b/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java > --- a/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java > +++ b/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java > @@ -363,6 +363,10 @@ > clinit(cw, className(), classData); > } > + /* > + * to initialize the static final fields with the live class data > + * LambdaForms can't use condy due to bootstrapping issue. 
> + */ > static void clinit(ClassWriter cw, String className, List classData) > { > if (classData.isEmpty()) > return; > @@ -375,7 +379,6 @@ > MethodVisitor mv = cw.visitMethod(Opcodes.ACC_STATIC, "", "()V", null, > null); > mv.visitCode(); > - // bootstrapping issue if using condy > mv.visitLdcInsn(Type.getType("L" + className + ";")); > mv.visitMethodInsn(Opcodes.INVOKESTATIC, "java/lang/invoke/MethodHandleNatives", > "classData", "(Ljava/lang/Class;)Ljava/lang/Object;", false); > diff --git > a/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java > b/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java > --- a/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java > +++ b/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java > @@ -245,7 +245,8 @@ > return new BootstrapConstructorAccessorImpl(c); > } > - if (noInflation && !c.getDeclaringClass().isHiddenClass()) { > + if (noInflation && !c.getDeclaringClass().isHiddenClass() > + && !ReflectUtil.isVMAnonymousClass(c.getDeclaringClass())) { > return new MethodAccessorGenerator(). > generateConstructor(c.getDeclaringClass(), > c.getParameterTypes(), -------------- next part -------------- An HTML attachment was scrubbed... URL: From mandy.chung at oracle.com Fri Mar 27 16:29:46 2020 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 27 Mar 2020 09:29:46 -0700 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> Message-ID: <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com> Hi Jan, Good point.? The javac change only applies to JDK 15 and later and the lambda proxy class is not a nestmate when running on JDK 14 or earlier. I probably need the help from langtools team to fix this.? I'll give it a try. Mandy On 3/27/20 4:31 AM, Jan Lahoda wrote: > Hi Mandy, > > Regarding the javac changes - should those be switched on/off > depending the Target? 
Or, if one compiles with e.g. --release 14, will > the newly generated output still work on JDK 14? > > Jan > > On 27. 03. 20 0:57, Mandy Chung wrote: >> Please review the implementation of JEP 371: Hidden Classes. The main >> changes are in core-libs and hotspot runtime area.? Small changes are >> made in javac, VM compiler (intrinsification of >> Class::isHiddenClass), JFR, JDI, and jcmd.? CSR [1]has been reviewed >> and is in the finalized state (see specdiff and javadoc below for >> reference). >> >> Webrev: >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 >> >> >> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's >> point >> of view, a hidden class is a normal class except the following: >> >> - A hidden class has no initiating class loader and is not registered >> in any dictionary. >> - A hidden class has a name containing an illegal character >> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` >> returns "Lp/Foo.0x1234;". >> - A hidden class is not modifiable, i.e. cannot be redefined or >> retransformed. JVM TI IsModifableClass returns false on a hidden. >> - Final fields in a hidden class is "final".? The value of final >> fields cannot be overriden via reflection.? setAccessible(true) can >> still be called on reflected objects representing final fields in a >> hidden class and its access check will be suppressed but only have >> read-access (i.e. can do Field::getXXX but not setXXX). >> >> Brief summary of this patch: >> >> 1. A new Lookup::defineHiddenClass method is the API to create a >> hidden class. >> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG >> option that >> ??? can be specified when creating a hidden class. >> 3. A new Class::isHiddenClass method tests if a class is a hidden class. >> 4. Field::setXXX method will throw IAE on a final field of a hidden >> class >> ??? regardless of the value of the accessible flag. >> 5. 
JVM_LookupDefineClass is the new JVM entry point for >> Lookup::defineClass >> ??? and defineHiddenClass to create a class from the given bytes. >> 6. ClassLoaderData implementation is not changed.? There is one >> primary CLD >> ??? that holds the classes strongly referenced by its defining >> loader. There >> ??? can be zero or more additional CLDs - one per weak class. >> 7. Nest host determination is updated per revised JVMS 5.4.4. Access >> control >> ??? check no longer throws LinkageError but instead it will throw IAE >> with >> ??? a clear message if a class fails to resolve/validate the nest >> host declared >> ??? in NestHost/NestMembers attribute. >> 8. JFR, jcmd, JDI are updated to support hidden classes. >> 9. update javac LambdaToMethod as lambda proxy starts using nestmates >> ??? and generate a bridge method to desuger a method reference to a >> protected >> ??? method in its supertype in a different package >> >> This patch also updates StringConcatFactory, LambdaMetaFactory, and >> LambdaForms >> to use hidden classes.? The webrev includes changes in nashorn to >> hidden class >> and I will update the webrev if JEP 372 removes it any time soon. >> >> We uncovered a bug in Lookup::defineClass spec throws LinkageError >> and intends >> to have the newly created class linked.? However, the implementation >> in 14 >> does not link the class.? A separate CSR [2] proposes to update the >> implementation to match the spec.? This patch fixes the implementation. >> >> The spec update on JVM TI, JDI and Instrumentation will be done as >> a separate RFE [3].? This patch includes new tests for JVM TI and >> java.instrument that validates how the existing APIs work for hidden >> classes. 
>> >> javadoc/specdiff >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ >> >> >> JVMS 5.4.4 change: >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf >> >> >> CSR: >> https://bugs.openjdk.java.net/browse/JDK-8238359 >> >> Thanks >> Mandy >> [1] https://bugs.openjdk.java.net/browse/JDK-8238359 >> [2] https://bugs.openjdk.java.net/browse/JDK-8240338 >> [3] https://bugs.openjdk.java.net/browse/JDK-8230502 -------------- next part -------------- An HTML attachment was scrubbed... URL: From shade at redhat.com Fri Mar 27 16:57:19 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 27 Mar 2020 17:57:19 +0100 Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269 Message-ID: Bug: https://bugs.openjdk.java.net/browse/JDK-8241750 Fix: diff -r fef47d126675 src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c --- a/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 15:33:24 2020 +0100 +++ b/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 17:47:31 2020 +0100 @@ -70,5 +70,5 @@ return; } - *(char**)bagAdd(deletedSignatures) = (char*)tag; + *(char**)bagAdd(deletedSignatures) = (char*)jlong_to_ptr(tag); debugMonitorExit(classTrackLock); @@ -118,5 +118,5 @@ EXIT_ERROR(error,"signature"); } - error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, (jlong)signature); + error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, ptr_to_jlong(signature)); if (error != JVMTI_ERROR_NONE) { jvmtiDeallocate(signature); Testing: Linux {x86_64, x86_32} x {builds, vmTestbase_nsk_jdwp}; jdk-submit (running) -- Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Fri Mar 27 17:03:49 2020 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 27 Mar 2020 18:03:49 +0100 Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269 In-Reply-To: References: Message-ID: Looks good to me, thanks! Roman > Bug: > https://bugs.openjdk.java.net/browse/JDK-8241750 > > Fix: > > diff -r fef47d126675 src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c > --- a/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 15:33:24 2020 +0100 > +++ b/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 17:47:31 2020 +0100 > @@ -70,5 +70,5 @@ > return; > } > - *(char**)bagAdd(deletedSignatures) = (char*)tag; > + *(char**)bagAdd(deletedSignatures) = (char*)jlong_to_ptr(tag); > > debugMonitorExit(classTrackLock); > @@ -118,5 +118,5 @@ > EXIT_ERROR(error,"signature"); > } > - error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, (jlong)signature); > + error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, ptr_to_jlong(signature)); > if (error != JVMTI_ERROR_NONE) { > jvmtiDeallocate(signature); > > Testing: Linux {x86_64, x86_32} x {builds, vmTestbase_nsk_jdwp}; jdk-submit (running) > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From chris.plummer at oracle.com Fri Mar 27 17:08:45 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 27 Mar 2020 10:08:45 -0700 Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269 In-Reply-To: References: Message-ID: +1 Chris On 3/27/20 10:03 AM, Roman Kennke wrote: > Looks good to me, thanks! 
> > Roman > >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8241750 >> >> Fix: >> >> diff -r fef47d126675 src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c >> --- a/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 15:33:24 2020 +0100 >> +++ b/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 17:47:31 2020 +0100 >> @@ -70,5 +70,5 @@ >> return; >> } >> - *(char**)bagAdd(deletedSignatures) = (char*)tag; >> + *(char**)bagAdd(deletedSignatures) = (char*)jlong_to_ptr(tag); >> >> debugMonitorExit(classTrackLock); >> @@ -118,5 +118,5 @@ >> EXIT_ERROR(error,"signature"); >> } >> - error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, (jlong)signature); >> + error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, ptr_to_jlong(signature)); >> if (error != JVMTI_ERROR_NONE) { >> jvmtiDeallocate(signature); >> >> Testing: Linux {x86_64, x86_32} x {builds, vmTestbase_nsk_jdwp}; jdk-submit (running) >> From shade at redhat.com Fri Mar 27 17:10:09 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 27 Mar 2020 18:10:09 +0100 Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269 In-Reply-To: References: Message-ID: <60fcb978-f7f0-48b9-f438-0fe0c918dd32@redhat.com> Thanks! Trivial, right? If so, I'll push as soon as jdk-submit clears. -Aleksey On 3/27/20 6:08 PM, Chris Plummer wrote: > +1 > > Chris > > On 3/27/20 10:03 AM, Roman Kennke wrote: >> Looks good to me, thanks! 
>> >> Roman >> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8241750 >>> >>> Fix: >>> >>> diff -r fef47d126675 src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c >>> --- a/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 15:33:24 2020 +0100 >>> +++ b/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 17:47:31 2020 +0100 >>> @@ -70,5 +70,5 @@ >>> return; >>> } >>> - *(char**)bagAdd(deletedSignatures) = (char*)tag; >>> + *(char**)bagAdd(deletedSignatures) = (char*)jlong_to_ptr(tag); >>> >>> debugMonitorExit(classTrackLock); >>> @@ -118,5 +118,5 @@ >>> EXIT_ERROR(error,"signature"); >>> } >>> - error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, (jlong)signature); >>> + error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, ptr_to_jlong(signature)); >>> if (error != JVMTI_ERROR_NONE) { >>> jvmtiDeallocate(signature); >>> >>> Testing: Linux {x86_64, x86_32} x {builds, vmTestbase_nsk_jdwp}; jdk-submit (running) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From chris.plummer at oracle.com Fri Mar 27 17:15:55 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 27 Mar 2020 10:15:55 -0700 Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269 In-Reply-To: <60fcb978-f7f0-48b9-f438-0fe0c918dd32@redhat.com> References: <60fcb978-f7f0-48b9-f438-0fe0c918dd32@redhat.com> Message-ID: <6d1a21fa-637a-b1fc-4d12-0032711e45a0@oracle.com> Yeah, I think given that it fixes a broken build it should be fine to push right away. Chris On 3/27/20 10:10 AM, Aleksey Shipilev wrote: > Thanks! Trivial, right? > > If so, I'll push as soon as jdk-submit clears. > > -Aleksey > > On 3/27/20 6:08 PM, Chris Plummer wrote: >> +1 >> >> Chris >> >> On 3/27/20 10:03 AM, Roman Kennke wrote: >>> Looks good to me, thanks! 
>>> >>> Roman >>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8241750 >>>> >>>> Fix: >>>> >>>> diff -r fef47d126675 src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c >>>> --- a/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 15:33:24 2020 +0100 >>>> +++ b/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c Fri Mar 27 17:47:31 2020 +0100 >>>> @@ -70,5 +70,5 @@ >>>> return; >>>> } >>>> - *(char**)bagAdd(deletedSignatures) = (char*)tag; >>>> + *(char**)bagAdd(deletedSignatures) = (char*)jlong_to_ptr(tag); >>>> >>>> debugMonitorExit(classTrackLock); >>>> @@ -118,5 +118,5 @@ >>>> EXIT_ERROR(error,"signature"); >>>> } >>>> - error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, (jlong)signature); >>>> + error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, ptr_to_jlong(signature)); >>>> if (error != JVMTI_ERROR_NONE) { >>>> jvmtiDeallocate(signature); >>>> >>>> Testing: Linux {x86_64, x86_32} x {builds, vmTestbase_nsk_jdwp}; jdk-submit (running) From shade at redhat.com Fri Mar 27 18:05:34 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 27 Mar 2020 19:05:34 +0100 Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269 In-Reply-To: <6d1a21fa-637a-b1fc-4d12-0032711e45a0@oracle.com> References: <60fcb978-f7f0-48b9-f438-0fe0c918dd32@redhat.com> <6d1a21fa-637a-b1fc-4d12-0032711e45a0@oracle.com> Message-ID: On 3/27/20 6:15 PM, Chris Plummer wrote: > Yeah, I think given that it fixes a broken build it should be fine to > push right away. jdk-submit came clean, pushed. -- Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From paul.sandoz at oracle.com Fri Mar 27 18:59:10 2020 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 27 Mar 2020 11:59:10 -0700 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> Message-ID: <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com> Hi Mandy, Very thorough, bravo! Minor suggestions below. Paul. MethodHandleNatives.java ? 142 143 /** 144 * Flags for Lookup.ClassOptions 145 */ 146 static final int 147 NESTMATE_CLASS = 0x00000001, 148 HIDDEN_CLASS = 0x00000002, 149 STRONG_LOADER_LINK = 0x00000004, 150 ACCESS_VM_ANNOTATIONS = 0x00000008; 151 } Suggest you add a comment to keep the values in sync with the VM component. MethodHandles.java ? 1786 * (Given the {@code Lookup} object returned this method, its lookup class 1787 * is a {@code Class} object for which {@link Class#getName()} returns a string 1788 * that is not a binary name.) ? (The {@code Lookup} object returned from this method has a lookup class that is a {@code Class} object whose {@link Class#getName()} returns a string that is not a binary name.) ? 1902 Set opts = options.length > 0 ? Set.of(options) : Set.of(); You can just do: Set opts = Set.of(options) And/or inline it into the subsequent method call. The implementation of Set.of checks the array length. 2001 ClassDefiner makeHiddenClassDefiner(byte[] bytes, I think you can telescope the methods for non-name and name accepting since IIUC the name is derived from the byte[]. Thereby you can remove some code duplication. i.e. pull ClassDefiner.className out from ClassDefiner and place the logic in the factory methods. Alternative push the factory methods into ClassDefiner to keep all the logic together. 
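An aside on the Set.of remark above, as a minimal sketch rather than the webrev code: the `Option` enum and both helper methods below are hypothetical stand-ins (the real type is `Lookup.ClassOption`). The sketch only illustrates that `Set.of(E...)` already returns an empty set for a zero-length array, so the explicit length check gives the same result.

```java
import java.util.Set;

public class ClassOptionSets {
    // Hypothetical stand-in for MethodHandles.Lookup.ClassOption.
    enum Option { NESTMATE, STRONG }

    // Shape of the expression quoted from the webrev: explicit length check.
    static Set<Option> withLengthCheck(Option... options) {
        return options.length > 0 ? Set.of(options) : Set.of();
    }

    // Suggested simplification: Set.of(E...) handles the empty array itself.
    static Set<Option> direct(Option... options) {
        return Set.of(options);
    }

    public static void main(String[] args) {
        // No options: both forms yield an empty immutable set.
        System.out.println(direct().isEmpty());
        System.out.println(direct().equals(withLengthCheck()));
        // Some options: both forms yield the same set.
        System.out.println(direct(Option.NESTMATE, Option.STRONG)
                .equals(withLengthCheck(Option.NESTMATE, Option.STRONG)));
    }
}
```

Both forms still reject null elements and duplicate options, since they both end up in Set.of for non-empty input.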
3797 public enum ClassOption { Shuffle up to be closer to the defineHiddenClass 3798 /** 3799 * This class option specifies the hidden class be added to 3800 * {@linkplain Class#getNestHost nest} of a lookup class as 3801 * a nestmate. Suggest: "This class option specifies the hidden class ? -> ?Specifies that a hidden class 3812 * This class option specifies the hidden class to have a strong ?Specifies that a hidden class have a ?" 3813 * relationship with the class loader marked as its defining loader, 3814 * as a normal class or interface has with its own defining loader. 3815 * This means that the hidden class may be unloaded if and only if 3816 * its defining loader is not reachable and thus may be reclaimed 3817 * by a garbage collector (JLS 12.7). StringConcatFactory.java ? 861 // use of @ForceInline no longer has any effect ? 862 mv.visitAnnotation("Ljdk/internal/vm/annotation/ForceInline;", true); 863 mv.visitCode(); > On Mar 26, 2020, at 4:57 PM, Mandy Chung wrote: > > Please review the implementation of JEP 371: Hidden Classes. The main changes are in core-libs and hotspot runtime area. Small changes are made in javac, VM compiler (intrinsification of Class::isHiddenClass), JFR, JDI, and jcmd. CSR [1]has been reviewed and is in the finalized state (see specdiff and javadoc below for reference). > > Webrev: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 > > Hidden class is created via `Lookup::defineHiddenClass`. From JVM's point > of view, a hidden class is a normal class except the following: > > - A hidden class has no initiating class loader and is not registered in any dictionary. > - A hidden class has a name containing an illegal character `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` returns "Lp/Foo.0x1234;". > - A hidden class is not modifiable, i.e. cannot be redefined or retransformed. JVM TI IsModifableClass returns false on a hidden. > - Final fields in a hidden class is "final". 
The value of final fields cannot be overriden via reflection. setAccessible(true) can still be called on reflected objects representing final fields in a hidden class and its access check will be suppressed but only have read-access (i.e. can do Field::getXXX but not setXXX). > > Brief summary of this patch: > > 1. A new Lookup::defineHiddenClass method is the API to create a hidden class. > 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG option that > can be specified when creating a hidden class. > 3. A new Class::isHiddenClass method tests if a class is a hidden class. > 4. Field::setXXX method will throw IAE on a final field of a hidden class > regardless of the value of the accessible flag. > 5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass > and defineHiddenClass to create a class from the given bytes. > 6. ClassLoaderData implementation is not changed. There is one primary CLD > that holds the classes strongly referenced by its defining loader. There > can be zero or more additional CLDs - one per weak class. > 7. Nest host determination is updated per revised JVMS 5.4.4. Access control > check no longer throws LinkageError but instead it will throw IAE with > a clear message if a class fails to resolve/validate the nest host declared > in NestHost/NestMembers attribute. > 8. JFR, jcmd, JDI are updated to support hidden classes. > 9. update javac LambdaToMethod as lambda proxy starts using nestmates > and generate a bridge method to desuger a method reference to a protected > method in its supertype in a different package > > This patch also updates StringConcatFactory, LambdaMetaFactory, and LambdaForms > to use hidden classes. The webrev includes changes in nashorn to hidden class > and I will update the webrev if JEP 372 removes it any time soon. > > We uncovered a bug in Lookup::defineClass spec throws LinkageError and intends > to have the newly created class linked. 
However, the implementation in 14 > does not link the class. A separate CSR [2] proposes to update the > implementation to match the spec. This patch fixes the implementation. > > The spec update on JVM TI, JDI and Instrumentation will be done as > a separate RFE [3]. This patch includes new tests for JVM TI and > java.instrument that validates how the existing APIs work for hidden classes. > > javadoc/specdiff > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ > > JVMS 5.4.4 change: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf > > CSR: > https://bugs.openjdk.java.net/browse/JDK-8238359 > > Thanks > Mandy > [1] https://bugs.openjdk.java.net/browse/JDK-8238359 > [2] https://bugs.openjdk.java.net/browse/JDK-8240338 > [3] https://bugs.openjdk.java.net/browse/JDK-8230502 From mandy.chung at oracle.com Fri Mar 27 20:18:07 2020 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 27 Mar 2020 13:18:07 -0700 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com> References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com> Message-ID: On 3/27/20 11:59 AM, Paul Sandoz wrote: > Hi Mandy, > > Very thorough, bravo! Thanks. > Minor suggestions below. > > Paul. > > MethodHandleNatives.java > ? > > 142 > 143 /** > 144 * Flags for Lookup.ClassOptions > 145 */ > 146 static final int > 147 NESTMATE_CLASS = 0x00000001, > 148 HIDDEN_CLASS = 0x00000002, > 149 STRONG_LOADER_LINK = 0x00000004, > 150 ACCESS_VM_ANNOTATIONS = 0x00000008; > 151 } > > > Suggest you add a comment to keep the values in sync with the VM component. Already in the class spec of this Constants class.? The values of all constants defined in this Constants class are verified in sync with VM (see verifyConstants). 
> > MethodHandles.java > ? > > 1786 * (Given the {@code Lookup} object returned this method, its lookup class > 1787 * is a {@code Class} object for which {@link Class#getName()} returns a string > 1788 * that is not a binary name.) > > ? > (The {@code Lookup} object returned from this method has a lookup class that is > a {@code Class} object whose {@link Class#getName()} returns a string > that is not a binary name.) > ? > > > 1902 Set opts = options.length > 0 ? Set.of(options) : Set.of(); > > You can just do: > > Set opts = Set.of(options) > > And/or inline it into the subsequent method call. The implementation of Set.of checks the array length. Great to know.? Thanks. > > 2001 ClassDefiner makeHiddenClassDefiner(byte[] bytes, > > I think you can telescope the methods for non-name and name accepting since IIUC the name is derived from the byte[]. Thereby you can remove some code duplication. i.e. pull ClassDefiner.className out from ClassDefiner and place the logic in the factory methods. Alternative push the factory methods into ClassDefiner to keep all the logic together. > Ok.? I will move the className out. > > 3797 public enum ClassOption { > > Shuffle up to be closer to the defineHiddenClass Moved before defineHiddenClass. > > 3798 /** > 3799 * This class option specifies the hidden class be added to > 3800 * {@linkplain Class#getNestHost nest} of a lookup class as > 3801 * a nestmate. > > Suggest: > > "This class option specifies the hidden class ? -> ?Specifies that a hidden class > > 3812 * This class option specifies the hidden class to have a strong > > ?Specifies that a hidden class have a ?" Specifies that a hidden class has a... > > 3813 * relationship with the class loader marked as its defining loader, > 3814 * as a normal class or interface has with its own defining loader. 
> 3815 * This means that the hidden class may be unloaded if and only if
> 3816 * its defining loader is not reachable and thus may be reclaimed
> 3817 * by a garbage collector (JLS 12.7).
>
>
> StringConcatFactory.java
> ?
>
> 861 // use of @ForceInline no longer has any effect
>
> ?

Right, I should have explained this [1]. This @ForceInline is used by BytecodeStringBuilderStrategy, which generates code with the same StringBuilder chain javac would emit. It uses the `@ForceInline` annotation, probably for performance. It's believed people rarely use this non-default strategy. This patch changes StringConcatFactory to use the standard defineHiddenClass method, and hence `@ForceInline` has no effect in the class generated for this non-default strategy. If it turns out to be an issue, then we will determine whether it should enable access to VM annotations (I doubt this is a supported strategy).

[1] https://bugs.openjdk.java.net/browse/JDK-8241548

From vicente.romero at oracle.com Fri Mar 27 21:15:29 2020
From: vicente.romero at oracle.com (Vicente Romero)
Date: Fri, 27 Mar 2020 17:15:29 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
Message-ID:

Hi Mandy,

The patch for nestmates [1] could be used as a reference. There, a new method named `hasNestmateAccess` was added to class `com.sun.tools.javac.jvm.Target`; it checks whether a target is ready for nestmates or not. I think that you can follow a similar approach here.

Thanks,
Vicente

[1] http://hg.openjdk.java.net/jdk/jdk/rev/2f2af62dfac7

On 3/27/20 12:29 PM, Mandy Chung wrote:
> Hi Jan,
>
> Good point.
The javac change only applies to JDK 15 and later and the > lambda proxy class is not a nestmate when running on JDK 14 or earlier. > > I probably need the help from langtools team to fix this.? I'll give > it a try. > > Mandy > > On 3/27/20 4:31 AM, Jan Lahoda wrote: >> Hi Mandy, >> >> Regarding the javac changes - should those be switched on/off >> depending the Target? Or, if one compiles with e.g. --release 14, >> will the newly generated output still work on JDK 14? >> >> Jan >> >> On 27. 03. 20 0:57, Mandy Chung wrote: >>> Please review the implementation of JEP 371: Hidden Classes. The >>> main changes are in core-libs and hotspot runtime area.? Small >>> changes are made in javac, VM compiler (intrinsification of >>> Class::isHiddenClass), JFR, JDI, and jcmd.? CSR [1]has been reviewed >>> and is in the finalized state (see specdiff and javadoc below for >>> reference). >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 >>> >>> >>> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's >>> point >>> of view, a hidden class is a normal class except the following: >>> >>> - A hidden class has no initiating class loader and is not >>> registered in any dictionary. >>> - A hidden class has a name containing an illegal character >>> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` >>> returns "Lp/Foo.0x1234;". >>> - A hidden class is not modifiable, i.e. cannot be redefined or >>> retransformed. JVM TI IsModifableClass returns false on a hidden. >>> - Final fields in a hidden class is "final".? The value of final >>> fields cannot be overriden via reflection. setAccessible(true) can >>> still be called on reflected objects representing final fields in a >>> hidden class and its access check will be suppressed but only have >>> read-access (i.e. can do Field::getXXX but not setXXX). >>> >>> Brief summary of this patch: >>> >>> 1. 
A new Lookup::defineHiddenClass method is the API to create a >>> hidden class. >>> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG >>> option that >>> ??? can be specified when creating a hidden class. >>> 3. A new Class::isHiddenClass method tests if a class is a hidden >>> class. >>> 4. Field::setXXX method will throw IAE on a final field of a hidden >>> class >>> ??? regardless of the value of the accessible flag. >>> 5. JVM_LookupDefineClass is the new JVM entry point for >>> Lookup::defineClass >>> ??? and defineHiddenClass to create a class from the given bytes. >>> 6. ClassLoaderData implementation is not changed.? There is one >>> primary CLD >>> ??? that holds the classes strongly referenced by its defining >>> loader. There >>> ??? can be zero or more additional CLDs - one per weak class. >>> 7. Nest host determination is updated per revised JVMS 5.4.4. Access >>> control >>> ??? check no longer throws LinkageError but instead it will throw >>> IAE with >>> ??? a clear message if a class fails to resolve/validate the nest >>> host declared >>> ??? in NestHost/NestMembers attribute. >>> 8. JFR, jcmd, JDI are updated to support hidden classes. >>> 9. update javac LambdaToMethod as lambda proxy starts using nestmates >>> ??? and generate a bridge method to desuger a method reference to a >>> protected >>> ??? method in its supertype in a different package >>> >>> This patch also updates StringConcatFactory, LambdaMetaFactory, and >>> LambdaForms >>> to use hidden classes.? The webrev includes changes in nashorn to >>> hidden class >>> and I will update the webrev if JEP 372 removes it any time soon. >>> >>> We uncovered a bug in Lookup::defineClass spec throws LinkageError >>> and intends >>> to have the newly created class linked.? However, the implementation >>> in 14 >>> does not link the class.? A separate CSR [2] proposes to update the >>> implementation to match the spec.? This patch fixes the implementation. 
>>> >>> The spec update on JVM TI, JDI and Instrumentation will be done as >>> a separate RFE [3].? This patch includes new tests for JVM TI and >>> java.instrument that validates how the existing APIs work for hidden >>> classes. >>> >>> javadoc/specdiff >>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ >>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ >>> >>> >>> JVMS 5.4.4 change: >>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf >>> >>> >>> CSR: >>> https://bugs.openjdk.java.net/browse/JDK-8238359 >>> >>> Thanks >>> Mandy >>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359 >>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338 >>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mandy.chung at oracle.com Fri Mar 27 22:22:19 2020 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 27 Mar 2020 15:22:19 -0700 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com> Message-ID: <41b99011-921a-9fc6-c738-98c47e9959c3@oracle.com> Hi Paul, This is the delta incorporating your comment: http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03-delta-psandoz/ This patch also took Alex's comment to make it clear that the hidden class is the lookup class of the returned Lookup object and drops the sentence you commented on: On 3/27/20 1:18 PM, Mandy Chung wrote: >> MethodHandles.java >> ? >> >> 1786????????? * (Given the {@code Lookup} object returned this >> method, its lookup class >> 1787????????? * is a {@code Class} object for which {@link >> Class#getName()} returns a string >> 1788????????? * that is not a binary name.) >> >> ? 
>> (The {@code Lookup} object returned from this method has a lookup
>> class that is
>> a {@code Class} object whose {@link Class#getName()} returns a string
>> that is not a binary name.)
>> ?

Mandy

From mandy.chung at oracle.com Fri Mar 27 22:29:03 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 27 Mar 2020 15:29:03 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To:
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
Message-ID:

Hi Vicente,

hasNestmateAccess is about whether the VM supports static nestmates, i.e. JDK release >= 11.

However, this is about javac --release 14: the compiled classes may run on JDK 14, where lambda and string concat spin classes that are not nestmates. I have a patch with Jan's help:

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html

(you can apply the above patch on the valhalla repo "nestmates" branch)

About testing, I wanted to run the BridgeMethodsForLambdaTest and TestLambdaBytecode tests with --release 14, but it turned out not to be straightforward. Any help would be appreciated.

thanks
Mandy

On 3/27/20 2:15 PM, Vicente Romero wrote:
> Hi Mandy,
>
> The patch for nestmates [1] could be used as a reference. There a new
> method was added to class `com.sun.tools.javac.jvm.Target`, named:
> `hasNestmateAccess` which checks if a target is ready for nestmates or
> not. I think that you can follow a similar approach here.
>
> Thanks,
> Vicente
>
> [1] http://hg.openjdk.java.net/jdk/jdk/rev/2f2af62dfac7
>
> On 3/27/20 12:29 PM, Mandy Chung wrote:
>> Hi Jan,
>>
>> Good point. The javac change only applies to JDK 15 and later and
>> the lambda proxy class is not a nestmate when running on JDK 14 or
>> earlier.
>>
>> I probably need the help from langtools team to fix this. I'll give
>> it a try.
>> >> Mandy >> >> On 3/27/20 4:31 AM, Jan Lahoda wrote: >>> Hi Mandy, >>> >>> Regarding the javac changes - should those be switched on/off >>> depending the Target? Or, if one compiles with e.g. --release 14, >>> will the newly generated output still work on JDK 14? >>> >>> Jan >>> >>> On 27. 03. 20 0:57, Mandy Chung wrote: >>>> Please review the implementation of JEP 371: Hidden Classes. The >>>> main changes are in core-libs and hotspot runtime area.? Small >>>> changes are made in javac, VM compiler (intrinsification of >>>> Class::isHiddenClass), JFR, JDI, and jcmd.? CSR [1]has been >>>> reviewed and is in the finalized state (see specdiff and javadoc >>>> below for reference). >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 >>>> >>>> >>>> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's >>>> point >>>> of view, a hidden class is a normal class except the following: >>>> >>>> - A hidden class has no initiating class loader and is not >>>> registered in any dictionary. >>>> - A hidden class has a name containing an illegal character >>>> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` >>>> returns "Lp/Foo.0x1234;". >>>> - A hidden class is not modifiable, i.e. cannot be redefined or >>>> retransformed. JVM TI IsModifableClass returns false on a hidden. >>>> - Final fields in a hidden class is "final".? The value of final >>>> fields cannot be overriden via reflection. setAccessible(true) can >>>> still be called on reflected objects representing final fields in a >>>> hidden class and its access check will be suppressed but only have >>>> read-access (i.e. can do Field::getXXX but not setXXX). >>>> >>>> Brief summary of this patch: >>>> >>>> 1. A new Lookup::defineHiddenClass method is the API to create a >>>> hidden class. >>>> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG >>>> option that >>>> ??? can be specified when creating a hidden class. 
>>>> 3. A new Class::isHiddenClass method tests if a class is a hidden >>>> class. >>>> 4. Field::setXXX method will throw IAE on a final field of a hidden >>>> class >>>> ??? regardless of the value of the accessible flag. >>>> 5. JVM_LookupDefineClass is the new JVM entry point for >>>> Lookup::defineClass >>>> ??? and defineHiddenClass to create a class from the given bytes. >>>> 6. ClassLoaderData implementation is not changed.? There is one >>>> primary CLD >>>> ??? that holds the classes strongly referenced by its defining >>>> loader. There >>>> ??? can be zero or more additional CLDs - one per weak class. >>>> 7. Nest host determination is updated per revised JVMS 5.4.4. >>>> Access control >>>> ??? check no longer throws LinkageError but instead it will throw >>>> IAE with >>>> ??? a clear message if a class fails to resolve/validate the nest >>>> host declared >>>> ??? in NestHost/NestMembers attribute. >>>> 8. JFR, jcmd, JDI are updated to support hidden classes. >>>> 9. update javac LambdaToMethod as lambda proxy starts using nestmates >>>> ??? and generate a bridge method to desuger a method reference to a >>>> protected >>>> ??? method in its supertype in a different package >>>> >>>> This patch also updates StringConcatFactory, LambdaMetaFactory, and >>>> LambdaForms >>>> to use hidden classes.? The webrev includes changes in nashorn to >>>> hidden class >>>> and I will update the webrev if JEP 372 removes it any time soon. >>>> >>>> We uncovered a bug in Lookup::defineClass spec throws LinkageError >>>> and intends >>>> to have the newly created class linked.? However, the >>>> implementation in 14 >>>> does not link the class.? A separate CSR [2] proposes to update the >>>> implementation to match the spec.? This patch fixes the >>>> implementation. >>>> >>>> The spec update on JVM TI, JDI and Instrumentation will be done as >>>> a separate RFE [3].? 
This patch includes new tests for JVM TI and >>>> java.instrument that validates how the existing APIs work for >>>> hidden classes. >>>> >>>> javadoc/specdiff >>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ >>>> >>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ >>>> >>>> >>>> JVMS 5.4.4 change: >>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf >>>> >>>> >>>> CSR: >>>> https://bugs.openjdk.java.net/browse/JDK-8238359 >>>> >>>> Thanks >>>> Mandy >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359 >>>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338 >>>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502 >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vicente.romero at oracle.com Fri Mar 27 22:48:52 2020 From: vicente.romero at oracle.com (Vicente Romero) Date: Fri, 27 Mar 2020 18:48:52 -0400 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com> Message-ID: <60348819-ace5-8e52-f1ff-5f9654c915e0@oracle.com> Hi Mandy, On 3/27/20 6:29 PM, Mandy Chung wrote: > Hi Vicente, > > hasNestmateAccess is about VM supports static nestmates on JDK release > >= 11. I was not suggesting the use of `hasNestmateAccess` but to follow the same approach which is adding a new method at class `Target` to check if the new goodies were in the given target > > However this is about javac --release 14 and the compiled classes may > run on JDK 14 that lambda and string concat spin classes that are not > nestmates. 
I have a patch with Jan's help:
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html

which is what the patch above is doing

>
> (you can apply the above patch on valhalla repo "nestmates" branch)
>
> About testing, I wanted to run BridgeMethodsForLambdaTest and
> TestLambdaBytecode test with --release 14 but it turns out not to be
> straightforward.  Any help would be appreciated.
>
> thanks
> Mandy

Vicente

>
> On 3/27/20 2:15 PM, Vicente Romero wrote:
>> Hi Mandy,
>>
>> The patch for nestmates [1] could be used as a reference. There a new
>> method was added to class `com.sun.tools.javac.jvm.Target`, named:
>> `hasNestmateAccess` which checks if a target is ready for nestmates
>> or not. I think that you can follow a similar approach here.
>>
>> Thanks,
>> Vicente
>>
>> [1] http://hg.openjdk.java.net/jdk/jdk/rev/2f2af62dfac7
>>
>> On 3/27/20 12:29 PM, Mandy Chung wrote:
>>> Hi Jan,
>>>
>>> Good point.  The javac change only applies to JDK 15 and later and
>>> the lambda proxy class is not a nestmate when running on JDK 14 or
>>> earlier.
>>>
>>> I probably need the help from langtools team to fix this. I'll give
>>> it a try.
>>>
>>> Mandy
>>>
>>> On 3/27/20 4:31 AM, Jan Lahoda wrote:
>>>> Hi Mandy,
>>>>
>>>> Regarding the javac changes - should those be switched on/off
>>>> depending on the Target? Or, if one compiles with e.g. --release 14,
>>>> will the newly generated output still work on JDK 14?
>>>>
>>>> Jan
>>>>
>>>> On 27. 03. 20 0:57, Mandy Chung wrote:
>>>>> Please review the implementation of JEP 371: Hidden Classes. The
>>>>> main changes are in core-libs and hotspot runtime area.  Small
>>>>> changes are made in javac, VM compiler (intrinsification of
>>>>> Class::isHiddenClass), JFR, JDI, and jcmd.  CSR [1] has been
>>>>> reviewed and is in the finalized state (see specdiff and javadoc
>>>>> below for reference).
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03
>>>>>
>>>>> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's
>>>>> point of view, a hidden class is a normal class except the following:
>>>>>
>>>>> - A hidden class has no initiating class loader and is not
>>>>>   registered in any dictionary.
>>>>> - A hidden class has a name containing an illegal character.
>>>>>   `Class::getName` returns `p.Foo/0x1234` whereas
>>>>>   `GetClassSignature` returns "Lp/Foo.0x1234;".
>>>>> - A hidden class is not modifiable, i.e. cannot be redefined or
>>>>>   retransformed. JVM TI IsModifiableClass returns false on a hidden
>>>>>   class.
>>>>> - Final fields in a hidden class are "final".  The value of final
>>>>>   fields cannot be overridden via reflection. setAccessible(true) can
>>>>>   still be called on reflected objects representing final fields in
>>>>>   a hidden class and its access check will be suppressed but only
>>>>>   have read-access (i.e. can do Field::getXXX but not setXXX).
>>>>>
>>>>> Brief summary of this patch:
>>>>>
>>>>> 1. A new Lookup::defineHiddenClass method is the API to create a
>>>>>    hidden class.
>>>>> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG
>>>>>    option that can be specified when creating a hidden class.
>>>>> 3. A new Class::isHiddenClass method tests if a class is a hidden
>>>>>    class.
>>>>> 4. Field::setXXX method will throw IAE on a final field of a
>>>>>    hidden class regardless of the value of the accessible flag.
>>>>> 5. JVM_LookupDefineClass is the new JVM entry point for
>>>>>    Lookup::defineClass and defineHiddenClass to create a class from
>>>>>    the given bytes.
>>>>> 6. ClassLoaderData implementation is not changed.  There is one
>>>>>    primary CLD that holds the classes strongly referenced by its
>>>>>    defining loader. There can be zero or more additional CLDs - one
>>>>>    per weak class.
>>>>> 7. Nest host determination is updated per revised JVMS 5.4.4.
>>>>>    Access control check no longer throws LinkageError but instead it
>>>>>    will throw IAE with a clear message if a class fails to
>>>>>    resolve/validate the nest host declared in NestHost/NestMembers
>>>>>    attribute.
>>>>> 8. JFR, jcmd, JDI are updated to support hidden classes.
>>>>> 9. update javac LambdaToMethod as lambda proxy starts using nestmates
>>>>>    and generate a bridge method to desugar a method reference to
>>>>>    a protected method in its supertype in a different package
>>>>>
>>>>> This patch also updates StringConcatFactory, LambdaMetaFactory,
>>>>> and LambdaForms to use hidden classes.  The webrev includes changes
>>>>> in nashorn to hidden class and I will update the webrev if JEP 372
>>>>> removes it any time soon.
>>>>>
>>>>> We uncovered a bug in Lookup::defineClass: the spec throws
>>>>> LinkageError and intends to have the newly created class linked.
>>>>> However, the implementation in 14 does not link the class.  A
>>>>> separate CSR [2] proposes to update the implementation to match the
>>>>> spec.  This patch fixes the implementation.
>>>>>
>>>>> The spec update on JVM TI, JDI and Instrumentation will be done as
>>>>> a separate RFE [3].  This patch includes new tests for JVM TI and
>>>>> java.instrument that validate how the existing APIs work for
>>>>> hidden classes.
>>>>>
>>>>> javadoc/specdiff
>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/
>>>>>
>>>>> JVMS 5.4.4 change:
>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf
>>>>>
>>>>> CSR:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>>>>
>>>>> Thanks
>>>>> Mandy
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502

From david.holmes at oracle.com Fri Mar 27 23:01:58 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 28 Mar 2020 09:01:58 +1000
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To:
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
Message-ID: <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>

Hi Mandy,

On 28/03/2020 8:29 am, Mandy Chung wrote:
> Hi Vicente,
>
> hasNestmateAccess is about VM supports static nestmates on JDK release
> >= 11.
>
> However this is about javac --release 14 and the compiled classes may
> run on JDK 14 that lambda and string concat spin classes that are not
> nestmates. I have a patch with Jan's help:
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html

+     /**
+      * The VM does not support access across nested classes (8010319).
+      * Were that ever to change, this should be removed.
+      */
+     boolean isPrivateInOtherClass() {

I'm not at all sure what this means - access across different nests?
(I'm not even sure what that means.)
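For reference, the nest terminology used in this exchange can be illustrated with a small Java sketch. It is not part of the patch; `NestSketch` and its members are made-up names. It assumes a JDK 11 or later compiler and runtime, where `Class::getNestHost` and `Class::isNestmateOf` are available and javac emits a direct private access between nestmates rather than a synthetic bridge method.

```java
public class NestSketch {
    // Private member of the nest host (the top-level class).
    private int secret = 42;

    public static class Inner {
        // Reads a private field of the enclosing class. When javac targets a
        // nestmate-aware release (11 or later) this is a direct field access;
        // for older targets javac has to route it through a synthetic bridge.
        public static int peek(NestSketch outer) {
            return outer.secret;
        }
    }

    // True when Inner and NestSketch are members of the same nest.
    public static boolean sameNest() {
        return Inner.class.getNestHost() == NestSketch.class
                && NestSketch.class.isNestmateOf(Inner.class);
    }

    public static void main(String[] args) {
        System.out.println(Inner.peek(new NestSketch()));
        System.out.println(sameNest());
    }
}
```

Compiled for a pre-11 target, the same source would need a synthetic accessor in `NestSketch` for `peek` to reach `secret`, which appears to be the situation `isPrivateInOtherClass` is meant to detect.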
Thanks,
David

From forax at univ-mlv.fr Fri Mar 27 23:40:59 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 28 Mar 2020 00:40:59 +0100 (CET)
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com> <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>
Message-ID: <405050984.1553152.1585352459094.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "David Holmes"
> To: "mandy chung", "Vicente Romero", "jan lahoda"
> Cc: "serviceability-dev", "hotspot-dev", "core-libs-dev", "valhalla-dev"
> Sent: Saturday, 28 March 2020 00:01:58
> Subject: Re: Review Request: 8238358: Implementation of JEP 371: Hidden Classes

> Hi Mandy,

Hi David,

>
> On 28/03/2020 8:29 am, Mandy Chung wrote:
>> Hi Vicente,
>>
>> hasNestmateAccess is about VM supports static nestmates on JDK release
>> >= 11.
>>
>> However this is about javac --release 14 and the compiled classes may
>> run on JDK 14 that lambda and string concat spin classes that are not
>> nestmates. I have a patch with Jan's help:
>>
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html
>
> +     /**
> +      * The VM does not support access across nested classes (8010319).
> +      * Were that ever to change, this should be removed.
> +      */
> +     boolean isPrivateInOtherClass() {
>
> I'm not at all sure what this means - access across different nests?
> (I'm not even sure what that means.)

Access inside the same nest. As you know, until now, a lambda proxy is a VM anonymous class that can only see the private fields of the class declaring the lambda (the host class) and not the private fields of a class of the nest (the enclosing classes in terms of the Java language).

Rémi

>
> Thanks,
> David

From mandy.chung at oracle.com Fri Mar 27 23:46:00 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 27 Mar 2020 16:46:00 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com> <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>
Message-ID: <09126b64-2aa5-9b0d-ec5a-62753a416ea9@oracle.com>

On 3/27/20 4:01 PM, David Holmes wrote:
> Hi Mandy,
>
> On 28/03/2020 8:29 am, Mandy Chung wrote:
>> Hi Vicente,
>>
>> hasNestmateAccess is about VM supports static nestmates on JDK
>> release >= 11.
>>
>> However this is about javac --release 14 and the compiled classes may
>> run on JDK 14 that lambda and string concat spin classes that are not
>> nestmates. I have a patch with Jan's help:
>>
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html
>
> +     /**
> +      * The VM does not support access across nested classes (8010319).
> +      * Were that ever to change, this should be removed.
> +      */
> +     boolean isPrivateInOtherClass() {
>
> I'm not at all sure what this means - access across different nests?
> (I'm not even sure what that means.)

This just reverts the old code that I removed.

What this method is trying to determine is whether it accesses a private in another class in the same nest (nested classes) that needs a synthetic bridge method to access.

Mandy

From david.holmes at oracle.com Sat Mar 28 01:15:43 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 28 Mar 2020 11:15:43 +1000
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <09126b64-2aa5-9b0d-ec5a-62753a416ea9@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com> <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com> <09126b64-2aa5-9b0d-ec5a-62753a416ea9@oracle.com>
Message-ID: <069c76b6-dd85-f8f5-c2dd-1ba994178084@oracle.com>

Hi Mandy,

On 28/03/2020 9:46 am, Mandy Chung wrote:
> On 3/27/20 4:01 PM, David Holmes wrote:
>> Hi Mandy,
>>
>> On 28/03/2020 8:29 am, Mandy Chung wrote:
>>> Hi Vicente,
>>>
>>> hasNestmateAccess is about VM supports static nestmates on JDK
>>> release >= 11.
>>>
>>> However this is about javac --release 14 and the compiled classes may
>>> run on JDK 14 that lambda and string concat spin classes that are not
>>> nestmates. I have a patch with Jan's help:
>>>
>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html
>>
>> +     /**
>> +      * The VM does not support access across nested classes (8010319).
>> +      * Were that ever to change, this should be removed.
>> +      */
>> +     boolean isPrivateInOtherClass() {
>>
>> I'm not at all sure what this means - access across different nests?
>> (I'm not even sure what that means.)
>
> This just reverts the old code that I removed.

Ah I see. This is ancient pre-nestmate code. Can we at least fix the comment, as it really doesn't make any sense?

> What this method is trying to determine is whether it accesses a private in
> another class in the same nest (nested classes) that needs a synthetic
> bridge method to access.

That would be a good comment to add. Something like:

If compiling for a release where the VM does not support access between nested classes, this method indicates if a synthetic bridge method is needed for access.

Thanks,
David

> Mandy

From paul.sandoz at oracle.com Sat Mar 28 01:39:46 2020
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Fri, 27 Mar 2020 18:39:46 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <41b99011-921a-9fc6-c738-98c47e9959c3@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com> <41b99011-921a-9fc6-c738-98c47e9959c3@oracle.com>
Message-ID: <2667FFDE-44A9-4584-BF16-897B863D89F3@oracle.com>

+1

Paul.

> On Mar 27, 2020, at 3:22 PM, Mandy Chung wrote:
>
> Hi Paul,
>
> This is the delta incorporating your comment:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03-delta-psandoz/
>
> This patch also took Alex's comment to make it clear that the hidden class is the lookup class of the returned Lookup object and drops the sentence you commented on:
>
> On 3/27/20 1:18 PM, Mandy Chung wrote:
>>> MethodHandles.java
>>>
>>> 1786 * (Given the {@code Lookup} object returned this method, its lookup class
>>> 1787 * is a {@code Class} object for which {@link Class#getName()} returns a string
>>> 1788 * that is not a binary name.)
>>>
>>> (The {@code Lookup} object returned from this method has a lookup class that is
>>> a {@code Class} object whose {@link Class#getName()} returns a string
>>> that is not a binary name.)
>
> Mandy
From dean.long at oracle.com Sat Mar 28 02:25:36 2020
From: dean.long at oracle.com (Dean Long)
Date: Fri, 27 Mar 2020 19:25:36 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <13d7ba73-e49f-a55d-7e80-bd10153152a4@oracle.com>

I looked at the AOT, C2, and JVMCI changes and I didn't find any issues.

dl

From chris.plummer at oracle.com Sat Mar 28 03:51:37 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 27 Mar 2020 20:51:37 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com>

Hi Mandy,

A couple of very minor nits in the jvmtiRedefineClasses.cpp comments:

 153     // classes for primitives, arrays, hidden and vm unsafe anonymous classes
 154     // cannot be redefined.  Check here so following code can assume these classes
 155     // are InstanceKlass.
 156     if (!is_modifiable_class(mirror)) {
 157       _res = JVMTI_ERROR_UNMODIFIABLE_CLASS;
 158       return false;
 159     }

I think this code and comment predate anonymous classes. Probably before anonymous classes the check was not for !is_modifiable_class() but instead was just a check for primitive or array class types since they are not an InstanceKlass, and would cause issues when cast to one in the code that lies below this section. When anonymous classes were added, the code got changed to use !is_modifiable_class() and the comment was not correctly updated (anonymous classes are an InstanceKlass).
Then with this webrev the mention of hidden classes was added, also incorrectly implying they are not an InstanceKlass. I think you should just leave off the last sentence of the comment. There's some ambiguity in the application of adjectives in the following: ?297?? // Cannot redefine or retransform a hidden or an unsafe anonymous class. I'd suggest: ?297?? // Cannot redefine or retransform a hidden class or an unsafe anonymous class. There are some places in libjdwp that need to be fixed. I spoke to Serguei about those this afternoon. Basically the convertSignatureToClassname() function needs to be fixed to handle hidden classes. Without the fix classname filtering will have problems if the filter contains a pattern with a '/' to filter on hidden classes. Also CLASS_UNLOAD events will not properly convert hidden class names. We also need tests for these cases. I think these are all things that can be addressed later. I still need to look over the JVMTI tests. thanks, Chris On 3/26/20 4:57 PM, Mandy Chung wrote: > Please review the implementation of JEP 371: Hidden Classes. The main > changes are in core-libs and hotspot runtime area. Small changes are > made in javac, VM compiler (intrinsification of Class::isHiddenClass), > JFR, JDI, and jcmd.? CSR [1]has been reviewed and is in the finalized > state (see specdiff and javadoc below for reference). > > Webrev: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 > > Hidden class is created via `Lookup::defineHiddenClass`. From JVM's point > of view, a hidden class is a normal class except the following: > > - A hidden class has no initiating class loader and is not registered > in any dictionary. > - A hidden class has a name containing an illegal character > `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` > returns "Lp/Foo.0x1234;". > - A hidden class is not modifiable, i.e. cannot be redefined or > retransformed. JVM TI IsModifableClass returns false on a hidden. 
> - Final fields in a hidden class is "final".? The value of final > fields cannot be overriden via reflection.? setAccessible(true) can > still be called on reflected objects representing final fields in a > hidden class and its access check will be suppressed but only have > read-access (i.e. can do Field::getXXX but not setXXX). > > Brief summary of this patch: > > 1. A new Lookup::defineHiddenClass method is the API to create a > hidden class. > 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG > option that > ?? can be specified when creating a hidden class. > 3. A new Class::isHiddenClass method tests if a class is a hidden class. > 4. Field::setXXX method will throw IAE on a final field of a hidden class > ?? regardless of the value of the accessible flag. > 5. JVM_LookupDefineClass is the new JVM entry point for > Lookup::defineClass > ?? and defineHiddenClass to create a class from the given bytes. > 6. ClassLoaderData implementation is not changed.? There is one > primary CLD > ?? that holds the classes strongly referenced by its defining loader.? > There > ?? can be zero or more additional CLDs - one per weak class. > 7. Nest host determination is updated per revised JVMS 5.4.4. Access > control > ?? check no longer throws LinkageError but instead it will throw IAE with > ?? a clear message if a class fails to resolve/validate the nest host > declared > ?? in NestHost/NestMembers attribute. > 8. JFR, jcmd, JDI are updated to support hidden classes. > 9. update javac LambdaToMethod as lambda proxy starts using nestmates > ?? and generate a bridge method to desuger a method reference to a > protected > ?? method in its supertype in a different package > > This patch also updates StringConcatFactory, LambdaMetaFactory, and > LambdaForms > to use hidden classes.? The webrev includes changes in nashorn to > hidden class > and I will update the webrev if JEP 372 removes it any time soon. 
> > We uncovered a bug in Lookup::defineClass spec throws LinkageError and > intends > to have the newly created class linked.? However, the implementation in 14 > does not link the class.? A separate CSR [2] proposes to update the > implementation to match the spec.? This patch fixes the implementation. > > The spec update on JVM TI, JDI and Instrumentation will be done as > a separate RFE [3].? This patch includes new tests for JVM TI and > java.instrument that validates how the existing APIs work for hidden > classes. > > javadoc/specdiff > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ > > JVMS 5.4.4 change: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf > > CSR: > https://bugs.openjdk.java.net/browse/JDK-8238359 > > Thanks > Mandy > [1] https://bugs.openjdk.java.net/browse/JDK-8238359 > [2] https://bugs.openjdk.java.net/browse/JDK-8240338 > [3] https://bugs.openjdk.java.net/browse/JDK-8230502 From mandy.chung at oracle.com Mon Mar 30 02:17:27 2020 From: mandy.chung at oracle.com (Mandy Chung) Date: Sun, 29 Mar 2020 19:17:27 -0700 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com> References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com> Message-ID: On 3/27/20 8:51 PM, Chris Plummer wrote: > Hi Mandy, > > A couple of very minor nits in the jvmtiRedefineClasses.cpp comments: > > ?153???? // classes for primitives, arrays, hidden and vm unsafe > anonymous classes > ?154???? // cannot be redefined.? Check here so following code can > assume these classes > ?155???? // are InstanceKlass. > ?156???? if (!is_modifiable_class(mirror)) { > ?157?????? _res = JVMTI_ERROR_UNMODIFIABLE_CLASS; > ?158?????? return false; > ?159???? 
} > > I think this code and comment predate anonymous classes. Probably > before anonymous classes the check was not for !is_modifiable_class() > but instead was just a check for primitive or array class types since > they are not an InstanceKlass, and would cause issues when cast to one > in the code that lies below this section. When anonymous classes were > added, the code got changed to use !is_modifiable_class() and the > comment was not correctly updated (anonymous classes are an > InstanceKlass). Then with this webrev the mention of hidden classes > was added, also incorrectly implying they are not an InstanceKlass. I > think you should just leave off the last sentence of the comment. > I agree with you that this comment needs an update. Perhaps it should say "primitive, array types and hidden classes are non-modifiable. A modifiable class must be an InstanceKlass." I leave it to Serguei, who may have a different opinion. > There's some ambiguity in the application of adjectives in the following: > > 297 // Cannot redefine or retransform a hidden or an unsafe > anonymous class. > > I'd suggest: > > 297 // Cannot redefine or retransform a hidden class or an unsafe > anonymous class. > +1 > There are some places in libjdwp that need to be fixed. I spoke to > Serguei about those this afternoon. Basically the > convertSignatureToClassname() function needs to be fixed to handle > hidden classes. Without the fix classname filtering will have problems > if the filter contains a pattern with a '/' to filter on hidden > classes. Also CLASS_UNLOAD events will not properly convert hidden > class names. We also need tests for these cases. I think these are all > things that can be addressed later. > Good catch. I have created a subtask under JDK-8230502: https://bugs.openjdk.java.net/browse/JDK-8230502 > I still need to look over the JVMTI tests.
> Thanks Mandy > thanks, > > Chris > > On 3/26/20 4:57 PM, Mandy Chung wrote: >> Please review the implementation of JEP 371: Hidden Classes. The main >> changes are in core-libs and hotspot runtime area. Small changes are >> made in javac, VM compiler (intrinsification of >> Class::isHiddenClass), JFR, JDI, and jcmd. CSR [1] has been reviewed >> and is in the finalized state (see specdiff and javadoc below for >> reference). >> >> Webrev: >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 >> >> >> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's >> point >> of view, a hidden class is a normal class except for the following: >> >> - A hidden class has no initiating class loader and is not registered >> in any dictionary. >> - A hidden class has a name containing an illegal character: >> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` >> returns "Lp/Foo.0x1234;". >> - A hidden class is not modifiable, i.e. cannot be redefined or >> retransformed. JVM TI IsModifiableClass returns false on a hidden class. >> - Final fields in a hidden class are "final". The value of final >> fields cannot be overridden via reflection. setAccessible(true) can >> still be called on reflected objects representing final fields in a >> hidden class and its access check will be suppressed, but they only have >> read-access (i.e. can do Field::getXXX but not setXXX). >> >> Brief summary of this patch: >> >> 1. A new Lookup::defineHiddenClass method is the API to create a >> hidden class. >> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG >> options that >> can be specified when creating a hidden class. >> 3. A new Class::isHiddenClass method tests if a class is a hidden class. >> 4. Field::setXXX method will throw IAE on a final field of a hidden >> class >> regardless of the value of the accessible flag. >> 5. JVM_LookupDefineClass is the new JVM entry point for >> Lookup::defineClass >>
and defineHiddenClass to create a class from the given bytes. >> 6. ClassLoaderData implementation is not changed. There is one >> primary CLD >> that holds the classes strongly referenced by its defining >> loader. There >> can be zero or more additional CLDs - one per weak class. >> 7. Nest host determination is updated per revised JVMS 5.4.4. Access >> control >> check no longer throws LinkageError but instead it will throw IAE >> with >> a clear message if a class fails to resolve/validate the nest host >> declared >> in NestHost/NestMembers attribute. >> 8. JFR, jcmd, JDI are updated to support hidden classes. >> 9. Update javac LambdaToMethod as lambda proxy starts using nestmates >> and generate a bridge method to desugar a method reference to a >> protected >> method in its supertype in a different package. >> >> This patch also updates StringConcatFactory, LambdaMetafactory, and >> LambdaForms >> to use hidden classes. The webrev includes changes in nashorn for >> hidden classes >> and I will update the webrev if JEP 372 removes it any time soon. >> >> We uncovered a bug in Lookup::defineClass: the spec says it throws LinkageError >> and intends >> to have the newly created class linked. However, the implementation >> in 14 >> does not link the class. A separate CSR [2] proposes to update the >> implementation to match the spec. This patch fixes the implementation. >> >> The spec update on JVM TI, JDI and Instrumentation will be done as >> a separate RFE [3]. This patch includes new tests for JVM TI and >> java.instrument that validate how the existing APIs work for hidden >> classes.
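The properties listed in the quoted summary can be tried out with a small program. This is only a sketch under stated assumptions: it needs JDK 15 or later, where the query method shipped as `Class::isHidden` rather than the `Class::isHiddenClass` name used in this webrev; the class name `HiddenClassSketch` and its helper methods are hypothetical and not part of the patch; and reusing the class's own bytecode as input is just a convenient way to get valid class bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;

public class HiddenClassSketch {

    /** Reads this class's own bytecode; it only serves as convenient input bytes. */
    static byte[] ownClassBytes() {
        String binaryName = HiddenClassSketch.class.getName();
        // Resource names are resolved relative to the class's package.
        String resource = binaryName.substring(binaryName.lastIndexOf('.') + 1) + ".class";
        try (InputStream in = HiddenClassSketch.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines a hidden class from the bytes above, without initializing it. */
    static Class<?> defineHiddenCopy() {
        try {
            return MethodHandles.lookup()
                    .defineHiddenClass(ownClassBytes(), false)
                    .lookupClass();
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> hidden = defineHiddenCopy();
        // The hidden class reports itself as hidden.
        System.out.println("isHidden: " + hidden.isHidden());
        // The name carries a '/' plus a suffix, which is illegal in a binary name.
        System.out.println("name:     " + hidden.getName());
    }
}
```

Passing `false` for the `initialize` argument keeps the sketch free of side effects; the NESTMATE and STRONG options mentioned in item 2 would be passed as trailing `Lookup.ClassOption` arguments.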
>> >> javadoc/specdiff >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ >> >> >> JVMS 5.4.4 change: >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf >> >> >> CSR: >> https://bugs.openjdk.java.net/browse/JDK-8238359 >> >> Thanks >> Mandy >> [1] https://bugs.openjdk.java.net/browse/JDK-8238359 >> [2] https://bugs.openjdk.java.net/browse/JDK-8240338 >> [3] https://bugs.openjdk.java.net/browse/JDK-8230502 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Mar 30 03:40:43 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sun, 29 Mar 2020 20:40:43 -0700 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com> Message-ID: <7fe9fd65-8bee-beb3-03af-ab56120a4cc1@oracle.com> Hi Mandy and Chris, On 3/29/20 19:17, Mandy Chung wrote: > > > On 3/27/20 8:51 PM, Chris Plummer wrote: >> Hi Mandy, >> >> A couple of very minor nits in the jvmtiRedefineClasses.cpp comments: >> >> ?153???? // classes for primitives, arrays, hidden and vm unsafe >> anonymous classes >> ?154???? // cannot be redefined.? Check here so following code can >> assume these classes >> ?155???? // are InstanceKlass. >> ?156???? if (!is_modifiable_class(mirror)) { >> ?157?????? _res = JVMTI_ERROR_UNMODIFIABLE_CLASS; >> ?158?????? return false; >> ?159???? } >> >> I think this code and comment predate anonymous classes. Probably >> before anonymous classes the check was not for !is_modifiable_class() >> but instead was just a check for primitive or array class types since >> they are not an InstanceKlass, and would cause issues when cast to >> one in the code that lies below this section. 
When anonymous classes >> were added, the code got changed to use !is_modifiable_class() and >> the comment was not correctly updated (anonymous classes are an >> InstanceKlass). Then with this webrev the mention of hidden classes >> was added, also incorrectly implying they are not an InstanceKlass. I >> think you should just leave off the last sentence of the comment. >> > > I agree with you that this comment needs an update. Perhaps it should > say "primitive, array types and hidden classes are non-modifiable. A > modifiable class must be an InstanceKlass." > > I leave it to Serguei, who may have a different opinion. We already had a chat with Chris about this. This suggestion looks right. >> There's some ambiguity in the application of adjectives in the >> following: >> >> 297 // Cannot redefine or retransform a hidden or an unsafe >> anonymous class. >> >> I'd suggest: >> >> 297 // Cannot redefine or retransform a hidden class or an unsafe >> anonymous class. >> > > +1 +1 >> There are some places in libjdwp that need to be fixed. I spoke to >> Serguei about those this afternoon. Basically the >> convertSignatureToClassname() function needs to be fixed to handle >> hidden classes. Without the fix classname filtering will have >> problems if the filter contains a pattern with a '/' to filter on >> hidden classes. Also CLASS_UNLOAD events will not properly convert >> hidden class names. We also need tests for these cases. I think these >> are all things that can be addressed later. >> > > Good catch. I have created a subtask under JDK-8230502: > https://bugs.openjdk.java.net/browse/JDK-8230502 Yes, it is a good catch. Thank you for filing the subtask. We discussed this with Chris. This was expected to be found with new test coverage and fixed in the JDI chunk of work which we have decided to separate from JEP 371. Thanks, Serguei >> I still need to look over the JVMTI tests.
>> > > Thanks > Mandy >> thanks, >> >> Chris >> >> On 3/26/20 4:57 PM, Mandy Chung wrote: >>> Please review the implementation of JEP 371: Hidden Classes. The >>> main changes are in core-libs and hotspot runtime area. Small >>> changes are made in javac, VM compiler (intrinsification of >>> Class::isHiddenClass), JFR, JDI, and jcmd.? CSR [1]has been reviewed >>> and is in the finalized state (see specdiff and javadoc below for >>> reference). >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 >>> >>> >>> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's >>> point >>> of view, a hidden class is a normal class except the following: >>> >>> - A hidden class has no initiating class loader and is not >>> registered in any dictionary. >>> - A hidden class has a name containing an illegal character >>> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` >>> returns "Lp/Foo.0x1234;". >>> - A hidden class is not modifiable, i.e. cannot be redefined or >>> retransformed. JVM TI IsModifableClass returns false on a hidden. >>> - Final fields in a hidden class is "final".? The value of final >>> fields cannot be overriden via reflection. setAccessible(true) can >>> still be called on reflected objects representing final fields in a >>> hidden class and its access check will be suppressed but only have >>> read-access (i.e. can do Field::getXXX but not setXXX). >>> >>> Brief summary of this patch: >>> >>> 1. A new Lookup::defineHiddenClass method is the API to create a >>> hidden class. >>> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG >>> option that >>> ?? can be specified when creating a hidden class. >>> 3. A new Class::isHiddenClass method tests if a class is a hidden >>> class. >>> 4. Field::setXXX method will throw IAE on a final field of a hidden >>> class >>> ?? regardless of the value of the accessible flag. >>> 5. 
JVM_LookupDefineClass is the new JVM entry point for >>> Lookup::defineClass >>> ?? and defineHiddenClass to create a class from the given bytes. >>> 6. ClassLoaderData implementation is not changed.? There is one >>> primary CLD >>> ?? that holds the classes strongly referenced by its defining >>> loader.? There >>> ?? can be zero or more additional CLDs - one per weak class. >>> 7. Nest host determination is updated per revised JVMS 5.4.4. Access >>> control >>> ?? check no longer throws LinkageError but instead it will throw IAE >>> with >>> ?? a clear message if a class fails to resolve/validate the nest >>> host declared >>> ?? in NestHost/NestMembers attribute. >>> 8. JFR, jcmd, JDI are updated to support hidden classes. >>> 9. update javac LambdaToMethod as lambda proxy starts using nestmates >>> ?? and generate a bridge method to desuger a method reference to a >>> protected >>> ?? method in its supertype in a different package >>> >>> This patch also updates StringConcatFactory, LambdaMetaFactory, and >>> LambdaForms >>> to use hidden classes.? The webrev includes changes in nashorn to >>> hidden class >>> and I will update the webrev if JEP 372 removes it any time soon. >>> >>> We uncovered a bug in Lookup::defineClass spec throws LinkageError >>> and intends >>> to have the newly created class linked.? However, the implementation >>> in 14 >>> does not link the class.? A separate CSR [2] proposes to update the >>> implementation to match the spec.? This patch fixes the implementation. >>> >>> The spec update on JVM TI, JDI and Instrumentation will be done as >>> a separate RFE [3].? This patch includes new tests for JVM TI and >>> java.instrument that validates how the existing APIs work for hidden >>> classes. 
>>> >>> javadoc/specdiff >>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ >>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ >>> >>> >>> JVMS 5.4.4 change: >>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf >>> >>> >>> CSR: >>> https://bugs.openjdk.java.net/browse/JDK-8238359 >>> >>> Thanks >>> Mandy >>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359 >>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338 >>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502 >> >> > From richard.reingruber at sap.com Mon Mar 30 08:10:42 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Mon, 30 Mar 2020 08:10:42 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Hi, this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :) The change affects jvmti, hotspot and c2. Partial reviews are very welcome too. Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/ Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/ Robbin, Martin, please let me know, if anything shouldn't be quite as you wanted it. Also find my comments on your feedback below. Robbin, can I count you as Reviewer for the runtime part? Thanks, Richard. -----Original Message----- From: Doerr, Martin Sent: Donnerstag, 12. 
März 2020 17:28 To: Reingruber, Richard ; 'Robbin Ehn' ; Lindenmaier, Goetz ; David Holmes ; Vladimir Kozlov (vladimir.kozlov at oracle.com) ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi Richard, I managed to find time for an (almost) complete review of webrev.4. (I'll review the tests separately.) First of all, the change seems to be of pretty good quality given its significant complexity. I couldn't find any real bugs. But I'd like to propose minor improvements. I'm convinced that it's mature because we did substantial testing. I like the new functionality for object deoptimization. It can possibly be reused for future escape analysis based optimizations. So I appreciate having it available in the code base. In addition to that, your change makes the JVMTI implementation better integrated into the VM. Now to the details: src/hotspot/share/c1/c1_IR.hpp describe_scope parameters. Ok. src/hotspot/share/ci/ciEnv.cpp src/hotspot/share/ci/ciEnv.hpp Fix for JvmtiExport::can_walk_any_space() capability. Ok. src/hotspot/share/code/compiledMethod.cpp Nice cleanup! src/hotspot/share/code/debugInfoRec.cpp src/hotspot/share/code/debugInfoRec.hpp Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok. src/hotspot/share/code/nmethod.cpp Nice cleanup! src/hotspot/share/code/pcDesc.hpp Additional parameters. Ok. src/hotspot/share/code/scopeDesc.cpp src/hotspot/share/code/scopeDesc.hpp Improved implementation + additional parameters. Ok. src/hotspot/share/compiler/compileBroker.cpp src/hotspot/share/compiler/compileBroker.hpp Extra thread for DeoptimizeObjectsALot.
(Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok. src/hotspot/share/jvmci/jvmciCodeInstaller.cpp Additional parameters. Ok. src/hotspot/share/opto/c2compiler.cpp Make do_escape_analysis independent of JVMCI capabilities. Nice! src/hotspot/share/opto/callnode.hpp Additional fields for MachSafePointNodes. Ok. src/hotspot/share/opto/escape.cpp Annotation for MachSafePointNodes. Your added functionality looks correct. But I'd prefer to move the bulky code out of the large function. I suggest to factor out something like has_not_global_escape and has_arg_escape. So the code could look like this: SafePointNode* sfn = sfn_worklist.at(next); sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn)); if (sfn->is_CallJava()) { CallJavaNode* call = sfn->as_CallJava(); call->set_arg_escape(has_arg_escape(call)); } This would also allow us to get rid of the found_..._escape_in_args variables making the loops better readable. It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok. src/hotspot/share/opto/machnode.hpp Additional fields for MachSafePointNodes. Ok. src/hotspot/share/opto/macro.cpp Allow elimination of non-escaping allocations. Ok. src/hotspot/share/opto/matcher.cpp src/hotspot/share/opto/output.cpp Copy attribute / pass parameters. Ok. src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp Nice cleanup! src/hotspot/share/prims/jvmtiEnv.cpp src/hotspot/share/prims/jvmtiEnvBase.cpp Escape barriers + deoptimize objects for target thread. Good. src/hotspot/share/prims/jvmtiImpl.cpp src/hotspot/share/prims/jvmtiImpl.hpp The sequence is pretty complex: VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation). 
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization). VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread. But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok. VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good. src/hotspot/share/prims/jvmtiTagMap.cpp Escape barriers + deoptimize objects for all threads. Ok. src/hotspot/share/prims/whitebox.cpp Added WB_IsFrameDeoptimized to API. Ok. src/hotspot/share/runtime/deoptimization.cpp Object deoptimization. I have more comments and proposals, here. First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct. Comments are sufficient to understand why things are done as they are implemented. BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal). Anyway, looks correct, too. Typo in comment: "regularily" => "regularly" Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or such kind of issues. EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread(). You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call. I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects. Typo in comment: "we must only deoptimize" => "we only have to deoptimize" "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file. I'll get back to suspend flags, later. 
There are weird cases regarding _self_deoptimization_in_progress. Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired. I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all. I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because the ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request. Change in thread_added: I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach). Your version makes new threads come up with the suspend flag set. That looks correct, too. The advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags. For now, I'm ok with your version. I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()). Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what. Maybe adding suffixes would help a little bit, but I can also live with what you have. Implementation looks correct to me. src/hotspot/share/runtime/deoptimization.hpp Escape barriers and object deoptimization functions. Typo in comment: "helt" => "held" src/hotspot/share/runtime/globals.hpp Addition of develop flag DeoptimizeObjectsALotInterval. Ok. src/hotspot/share/runtime/interfaceSupport.cpp InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok. src/hotspot/share/runtime/interfaceSupport.inline.hpp Addition of deoptimizeAllObjects. Ok. src/hotspot/share/runtime/mutexLocker.cpp src/hotspot/share/runtime/mutexLocker.hpp Addition of EscapeBarrier_lock. Ok. src/hotspot/share/runtime/objectMonitor.cpp Make recursion count relock aware. Ok. src/hotspot/share/runtime/stackValue.hpp Better reinitialization in StackValue. Good. src/hotspot/share/runtime/thread.cpp src/hotspot/share/runtime/thread.hpp src/hotspot/share/runtime/thread.inline.hpp wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects. In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as a temporary solution until async handshakes are available (which takes more time). So I'm ok with your change. You can use MutexLocker with Thread*. JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp. src/hotspot/share/runtime/vframe.cpp Added support for entry frame to new_vframe. Ok. src/hotspot/share/runtime/vframe_hp.cpp src/hotspot/share/runtime/vframe_hp.hpp I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build). jvmtiDeferredLocalVariableSet::update_monitors: Please add a comment explaining that the owner referenced by the original info may be scalar replaced, but it is deoptimized in the vframe. src/hotspot/share/utilities/macros.hpp Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok. test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c New test. Will review separately.
test/jdk/TEST.ROOT Addition of vm.jvmci as required property. Ok. test/jdk/com/sun/jdi/EATests.java test/jdk/com/sun/jdi/EATestsJVMCI.java New test. Will review separately. test/lib/sun/hotspot/WhiteBox.java Added isFrameDeoptimized to API. Ok. That was it. Best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Reingruber, Richard > Sent: Dienstag, 3. M?rz 2020 21:23 > To: 'Robbin Ehn' ; Lindenmaier, Goetz > ; David Holmes ; > Vladimir Kozlov (vladimir.kozlov at oracle.com) > ; serviceability-dev at openjdk.java.net; > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > dev at openjdk.java.net > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better > Performance in the Presence of JVMTI Agents > > Hi Robbin, > > > > I understand that Robbin proposed to replace the usage of > > > _suspend_flag with handshakes. Apparently, async handshakes > > > are needed to do so. We have been waiting a while for removal > > > of the _suspend_flag / introduction of async handshakes [2]. > > > What is the status here? > > > I have an old prototype which I would like to continue to work on. > > So do not assume asynch handshakes will make 15. > > Even if it would, I think there are a lot more investigate work to remove > > _suspend_flag. > > Let us know, if we can be of any help to you and be it only testing. > > > >> Full: > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > > You can move both declaration and definition to that file, no need to > clobber > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > Will do. > > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's > own > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > You are right. It shouldn't be declared in thread.hpp. I will look into that. 
> > > Note that we also think we may have a bug in deopt: > > https://bugs.openjdk.java.net/browse/JDK-8238237 > > > I think it would be best, if possible, to push after that is resolved. > > Sure. > > > Not even nearly a full review :) > > I know :) > > Anyways, thanks a lot, > Richard. > > > -----Original Message----- > From: Robbin Ehn > Sent: Monday, March 2, 2020 11:17 AM > To: Lindenmaier, Goetz ; Reingruber, Richard > ; David Holmes ; > Vladimir Kozlov (vladimir.kozlov at oracle.com) > ; serviceability-dev at openjdk.java.net; > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > dev at openjdk.java.net > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi, > > On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote: > > Hi, > > > > I had a look at the progress of this change. Nothing > > happened since Richard posted his update using more > > handshakes [1]. > > But we (SAP) would appreciate a lot if this change could > > be successfully reviewed and pushed. > > > > I think there is basic understanding that this > > change is helpful. It fixes a number of issues with JVMTI, > > and will deliver the same performance benefits as EA > > does in current production mode for debugging scenarios. > > > > This is important for us as we run our VMs prepared > > for debugging in production mode. > > > > I understand that Robbin proposed to replace the usage of > > _suspend_flag with handshakes. Apparently, async handshakes > > are needed to do so. We have been waiting a while for removal > > of the _suspend_flag / introduction of async handshakes [2]. > > What is the status here? > > I have an old prototype which I would like to continue to work on. > So do not assume asynch handshakes will make 15. > Even if it would, I think there are a lot more investigate work to remove > _suspend_flag. > > > > > I think we should no longer wait, but proceed with > > this change. 
We will look into removing the usage of > > suspend_flag introduced here once it is possible to implement > > it with handshakes. > > Yes, sure. > > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > You can move both declaration and definition to that file, no need to clobber > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's > own > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > Note that we also think we may have a bug in deopt: > https://bugs.openjdk.java.net/browse/JDK-8238237 > > I think it would be best, if possible, to push after that is resolved. > > Not even nearly a full review :) > > Thanks, Robbin > > > >> Incremental: > >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/ > >> > >> I was not able to eliminate the additional suspend flag now. I'll take care > of this > >> as soon as the > >> existing suspend-resume-mechanism is reworked. > >> > >> Testing: > >> > >> Nightly tests @SAP: > >> > >> JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, > Renaissance > >> Suite, SAP specific tests > >> with fastdebug and release builds on all platforms > >> > >> Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x > parallel > >> for 24h > >> > >> Thanks, Richard. > >> > >> > >> More details on the changes: > >> > >> * Hide DeoptimizeObjectsALotThread from external view. > >> > >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock. > >> It used to be _safepoint_check_sometimes, which will be eliminated > sooner or > >> later. > >> I added explicit thread state changes with ThreadBlockInVM to code > paths > >> where we can wait() > >> on EscapeBarrier_lock to become safepoint safe. 
> >> > >> * Use handshake EscapeBarrierSuspendHandshake to suspend target > threads > >> instead of vm operation > >> VM_ThreadSuspendAllForObjDeopt. > >> > >> * Removed uses of Threads_lock. When adding a new thread we suspend > it iff > >> EA optimizations are > >> being reverted. In the previous version we were waiting on > Threads_lock > >> while EA optimizations > >> were reverted. See EscapeBarrier::thread_added(). > >> > >> * Made tests require Xmixed compilation mode. > >> > >> * Made tests agnostic regarding tiered compilation. > >> I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or > >> disabled. > >> > >> * Exercising EATests.java as well with stress test options > >> DeoptimizeObjectsALot* > >> Due to the non-deterministic deoptimizations some tests need to be > skipped. > >> We do this to prevent bit-rot of the stress test code. > >> > >> * Executing EATests.java as well with graal if available. Driver for this is > >> EATestsJVMCI.java. Graal cannot pass all tests, because it does not > provide all > >> the new debug info > >> (namely not_global_escape_in_scope and arg_escape in > scopeDesc.hpp). > >> And graal does not yet support the JVMTI operations force early return > and > >> pop frame. > >> > >> * Removed tracing from new jdi tests in EATests.java. Too much trace > output > >> before the debugging > >> connection is established can cause deadlock because output buffers fill > up. > >> (See https://bugs.openjdk.java.net/browse/JDK-8173304) > >> > >> * Many copyright year changes and smaller clean-up changes of testing > code > >> (trailing white-space and > >> the like). > >> > >> > >> -----Original Message----- > >> From: David Holmes > >> Sent: Donnerstag, 19. 
Dezember 2019 03:12 > >> To: Reingruber, Richard ; serviceability- > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; > hotspot- > >> runtime-dev at openjdk.java.net; Vladimir Kozlov > (vladimir.kozlov at oracle.com) > >> > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > Performance in > >> the Presence of JVMTI Agents > >> > >> Hi Richard, > >> > >> I think my issue is with the way EliminateNestedLocks works so I'm going > >> to look into that more deeply. > >> > >> Thanks for the explanations. > >> > >> David > >> > >> On 18/12/2019 12:47 am, Reingruber, Richard wrote: > >>> Hi David, > >>> > >>> > > > Some further queries/concerns: > >>> > > > > >>> > > > src/hotspot/share/runtime/objectMonitor.cpp > >>> > > > > >>> > > > Can you please explain the changes to ObjectMonitor::wait: > >>> > > > > >>> > > > ! _recursions = save // restore the old recursion count > >>> > > > ! + jt->get_and_reset_relock_count_after_wait(); // > >>> > > > increased by the deferred relock count > >>> > > > > >>> > > > what is the "deferred relock count"? I gather it relates to > >>> > > > > >>> > > > "The code was extended to be able to deoptimize objects of a > >>> > > frame that > >>> > > > is not the top frame and to let another thread than the owning > >>> > > thread do > >>> > > > it." > >>> > > > >>> > > Yes, these relate. Currently EA based optimizations are reverted, > when a > >> compiled frame is > >>> > > replaced with corresponding interpreter frames. Part of this is > relocking > >> objects with eliminated > >>> > > locking. New with the enhancement is that we do this also just > before > >> object references are > >>> > > acquired through JVMTI. In this case we deoptimize also the > owning > >> compiled frame C and we > >>> > > register deoptimized objects as deferred updates. 
When control > returns > >> to C it gets deoptimized, > >>> > > we notice that objects are already deoptimized (reallocated and > >> relocked), so we don't do it again > >>> > > (relocking twice would be incorrect of course). Deferred updates > are > >> copied into the new > >>> > > interpreter frames. > >>> > > > >>> > > Problem: relocking is not possible if the target thread T is waiting > on the > >> monitor that needs to > >>> > > be relocked. This happens only with non-local objects with > >> EliminateNestedLocks. Instead relocking > >>> > > is deferred until T owns the monitor again. This is what the piece of > >> code above does. > >>> > > >>> > Sorry I need some more detail here. How can you wait() on an > object > >>> > monitor if the object allocation and/or locking was optimised away? > And > >>> > what is a "non-local object" in this context? Isn't EA restricted to > >>> > thread-confined objects? > >>> > >>> "Non-local object" is an object that escapes its thread. The issue I'm > >> addressing with the changes > >>> in ObjectMonitor::wait are almost unrelated to EA. They are caused by > >> EliminateNestedLocks, where C2 > >>> eliminates recursive locking of an already owned lock. The lock owning > object > >> exists on the heap, it > >>> is locked and you can call wait() on it. > >>> > >>> EliminateLocks is the C2 option that controls lock elimination based on > EA. > >> Both optimizations have > >>> in common that objects with eliminated locking need to be relocked > when > >> deoptimizing a frame, > >>> i.e. when replacing a compiled frame with equivalent interpreter > >>> frames. Deoptimization::relock_objects does that job for /all/ eliminated > >> locks in scope. /All/ can > >>> be a mix of eliminated nested locks and locks of not-escaping objects. > >>> > >>> New with the enhancement: I call relock_objects earlier, just before > objects > >> pontentially > >>> escape. 
But then later when the owning compiled frame gets deoptimized, I must not do it again:
> >>>
> >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> >>>
> >>>   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> >>>       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> >>>     bool unused;
> >>>     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> >>>   }
> >>>
> >>> Now when calling relock_objects early it is quite possible that I have to relock an object the
> >>> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
> >>> introduce relock_count_after_wait to JavaThread.
> >>>
> >>> > Is it just that some of the locking gets optimized away e.g.
> >>> >
> >>> > synchronized(obj) {
> >>> >   synchronized(obj) {
> >>> >     synchronized(obj) {
> >>> >       obj.wait();
> >>> >     }
> >>> >   }
> >>> > }
> >>> >
> >>> > If this is reduced to a form as-if it were a single lock of the monitor
> >>> > (due to EA) and the wait() triggers a JVM TI event which leads to the
> >>> > escape of "obj" then we need to reconstruct the true lock state, and so
> >>> > when the wait() internally unblocks and reacquires the monitor it has to
> >>> > set the true recursion count to 3, not the 1 that it appeared to be when
> >>> > wait() was initially called. Is that the scenario?
> >>>
> >>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
> >>> triggered by wait.
> >>>
> >>> Add
> >>>
> >>>   LocalObject l1 = new LocalObject();
> >>>
> >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
> >>> question.
> >>>
> >>> See that relocking/reallocating is transactional. If it is done then for /all/ objects in scope and it is
> >>> done at most once.
It wouldn't be quite so easy to split this in relocking > of > >> nested/EA-based > >>> eliminated locks. > >>> > >>> > If so I find this truly awful. Anyone using wait() in a realistic form > >>> > requires a notification and so the object cannot be thread confined. > In > >>> > >>> It is not thread confined. > >>> > >>> > which case I would strongly argue that upon hitting the wait() the > deopt > >>> > should occur unconditionally and so the lock state is correct before > we > >>> > wait and so we don't need to mess with the recursion count > internally > >>> > when we reacquire the monitor. > >>> > > >>> > > > >>> > > > which I don't like the sound of at all when it comes to > ObjectMonitor > >>> > > > state. So I'd like to understand in detail exactly what is going on > here > >>> > > > and why. This is a very intrusive change that seems to badly > break > >>> > > > encapsulation and impacts future changes to ObjectMonitor > that are > >> under > >>> > > > investigation. > >>> > > > >>> > > I would not regard this as breaking encapsulation. Certainly not > badly. > >>> > > > >>> > > I've added a property relock_count_after_wait to JavaThread. The > >> property is well > >>> > > encapsulated. Future ObjectMonitor implementations have to deal > with > >> recursion too. They are free > >>> > > in choosing a way to do that as long as that property is taken into > >> account. This is hardly a > >>> > > limitation. > >>> > > >>> > I do think this badly breaks encapsulation as you have to add a > callout > >>> > from the guts of the ObjectMonitor code to reach into the thread to > get > >>> > this lock count adjustment. I understand why you have had to do > this but > >>> > I would much rather see a change to the EA optimisation strategy so > that > >>> > this is not needed. > >>> > > >>> > > Note also that the property is a straight forward extension of the > >> existing concept of deferred > >>> > > local updates. It is embedded into the structure holding them. 
So > not > >> even the footprint of a > >>> > > JavaThread is enlarged if no deferred updates are generated. > >>> > > >>> > [...] > >>> > > >>> > > > >>> > > I'm actually duplicating the existing external suspend mechanism, > >> because a thread can be > >>> > > suspended at most once. And hey, and don't like that either! But it > >> seems not unlikely that the > >>> > > duplicate can be removed together with the original and the new > type > >> of handshakes that will be > >>> > > used for thread suspend can be used for object deoptimization > too. See > >> today's discussion in > >>> > > JDK-8227745 [2]. > >>> > > >>> > I hope that discussion bears some fruit, at the moment it seems not > to > >>> > be possible to use handshakes here. :( > >>> > > >>> > The external suspend mechanism is a royal pain in the proverbial > that we > >>> > have to carefully live with. The idea that we're duplicating that for > >>> > use in another fringe area of functionality does not thrill me at all. > >>> > > >>> > To be clear, I understand the problem that exists and that you wish > to > >>> > solve, but for the runtime parts I balk at the complexity cost of > >>> > solving it. > >>> > >>> I know it's complex, but by far no rocket science. > >>> > >>> Also I find it hard to imagine another fix for JDK-8233915 besides > changing > >> the JVM TI specification. > >>> > >>> Thanks, Richard. > >>> > >>> -----Original Message----- > >>> From: David Holmes > >>> Sent: Dienstag, 17. 
Dezember 2019 08:03 > >>> To: Reingruber, Richard ; serviceability- > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; > hotspot- > >> runtime-dev at openjdk.java.net; Vladimir Kozlov > (vladimir.kozlov at oracle.com) > >> > >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > Performance > >> in the Presence of JVMTI Agents > >>> > >>> > >>> > >>> David > >>> > >>> On 17/12/2019 4:57 pm, David Holmes wrote: > >>>> Hi Richard, > >>>> > >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote: > >>>>> Hi David, > >>>>> > >>>>> ?? > Some further queries/concerns: > >>>>> ?? > > >>>>> ?? > src/hotspot/share/runtime/objectMonitor.cpp > >>>>> ?? > > >>>>> ?? > Can you please explain the changes to ObjectMonitor::wait: > >>>>> ?? > > >>>>> ?? > !?? _recursions = save????? // restore the old recursion count > >>>>> ?? > !???????????????? + jt->get_and_reset_relock_count_after_wait(); // > >>>>> ?? > increased by the deferred relock count > >>>>> ?? > > >>>>> ?? > what is the "deferred relock count"? I gather it relates to > >>>>> ?? > > >>>>> ?? > "The code was extended to be able to deoptimize objects of a > >>>>> frame that > >>>>> ?? > is not the top frame and to let another thread than the owning > >>>>> thread do > >>>>> ?? > it." > >>>>> > >>>>> Yes, these relate. Currently EA based optimizations are reverted, > when > >>>>> a compiled frame is replaced > >>>>> with corresponding interpreter frames. Part of this is relocking > >>>>> objects with eliminated > >>>>> locking. New with the enhancement is that we do this also just before > >>>>> object references are acquired > >>>>> through JVMTI. In this case we deoptimize also the owning compiled > >>>>> frame C and we register > >>>>> deoptimized objects as deferred updates. 
When control returns to C > it > >>>>> gets deoptimized, we notice > >>>>> that objects are already deoptimized (reallocated and relocked), so > we > >>>>> don't do it again (relocking > >>>>> twice would be incorrect of course). Deferred updates are copied into > >>>>> the new interpreter frames. > >>>>> > >>>>> Problem: relocking is not possible if the target thread T is waiting > >>>>> on the monitor that needs to be > >>>>> relocked. This happens only with non-local objects with > >>>>> EliminateNestedLocks. Instead relocking is > >>>>> deferred until T owns the monitor again. This is what the piece of > >>>>> code above does. > >>>> > >>>> Sorry I need some more detail here. How can you wait() on an object > >>>> monitor if the object allocation and/or locking was optimised away? > And > >>>> what is a "non-local object" in this context? Isn't EA restricted to > >>>> thread-confined objects? > >>>> > >>>> Is it just that some of the locking gets optimized away e.g. > >>>> > >>>> synchronised(obj) { > >>>> ? synchronised(obj) { > >>>> ??? synchronised(obj) { > >>>> ????? obj.wait(); > >>>> ??? } > >>>> ? } > >>>> } > >>>> > >>>> If this is reduced to a form as-if it were a single lock of the monitor > >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the > >>>> escape of "obj" then we need to reconstruct the true lock state, and so > >>>> when the wait() internally unblocks and reacquires the monitor it has to > >>>> set the true recursion count to 3, not the 1 that it appeared to be when > >>>> wait() was initially called. Is that the scenario? > >>>> > >>>> If so I find this truly awful. Anyone using wait() in a realistic form > >>>> requires a notification and so the object cannot be thread confined. 
In > >>>> which case I would strongly argue that upon hitting the wait() the > deopt > >>>> should occur unconditionally and so the lock state is correct before we > >>>> wait and so we don't need to mess with the recursion count internally > >>>> when we reacquire the monitor. > >>>> > >>>>> > >>>>> ?? > which I don't like the sound of at all when it comes to > >>>>> ObjectMonitor > >>>>> ?? > state. So I'd like to understand in detail exactly what is going > >>>>> on here > >>>>> ?? > and why.? This is a very intrusive change that seems to badly > break > >>>>> ?? > encapsulation and impacts future changes to ObjectMonitor that > >>>>> are under > >>>>> ?? > investigation. > >>>>> > >>>>> I would not regard this as breaking encapsulation. Certainly not badly. > >>>>> > >>>>> I've added a property relock_count_after_wait to JavaThread. The > >>>>> property is well > >>>>> encapsulated. Future ObjectMonitor implementations have to deal > with > >>>>> recursion too. They are free in > >>>>> choosing a way to do that as long as that property is taken into > >>>>> account. This is hardly a > >>>>> limitation. > >>>> > >>>> I do think this badly breaks encapsulation as you have to add a callout > >>>> from the guts of the ObjectMonitor code to reach into the thread to > get > >>>> this lock count adjustment. I understand why you have had to do this > but > >>>> I would much rather see a change to the EA optimisation strategy so > that > >>>> this is not needed. > >>>> > >>>>> Note also that the property is a straight forward extension of the > >>>>> existing concept of deferred > >>>>> local updates. It is embedded into the structure holding them. So not > >>>>> even the footprint of a > >>>>> JavaThread is enlarged if no deferred updates are generated. > >>>>> > >>>>> ?? > --- > >>>>> ?? > > >>>>> ?? > src/hotspot/share/runtime/thread.cpp > >>>>> ?? > > >>>>> ?? > Can you please explain why > >>>>> JavaThread::wait_for_object_deoptimization > >>>>> ?? 
> has to be handcrafted in this way rather than using proper > >>>>> transitions. > >>>>> ?? > > >>>>> > >>>>> I wrote wait_for_object_deoptimization taking > >>>>> JavaThread::java_suspend_self_with_safepoint_check > >>>>> as template. So in short: for the same reasons :) > >>>>> > >>>>> Threads reach both methods as part of thread state transitions, > >>>>> therefore special handling is > >>>>> required to change thread state on top of ongoing transitions. > >>>>> > >>>>> ?? > We got rid of "deopt suspend" some time ago and it is disturbing > >>>>> to see > >>>>> ?? > it being added back (effectively). This seems like it may be > >>>>> something > >>>>> ?? > that handshakes could be used for. > >>>>> > >>>>> Deopt suspend used to be something rather different with a similar > >>>>> name[1]. It is not being added back. > >>>> > >>>> I stand corrected. Despite comments in the code to the contrary > >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of > >>>> cleanup in this area 13 years ago :) > >>>> > >>>>> > >>>>> I'm actually duplicating the existing external suspend mechanism, > >>>>> because a thread can be suspended > >>>>> at most once. And hey, and don't like that either! But it seems not > >>>>> unlikely that the duplicate can > >>>>> be removed together with the original and the new type of > handshakes > >>>>> that will be used for > >>>>> thread suspend can be used for object deoptimization too. See > today's > >>>>> discussion in JDK-8227745 [2]. > >>>> > >>>> I hope that discussion bears some fruit, at the moment it seems not to > >>>> be possible to use handshakes here. :( > >>>> > >>>> The external suspend mechanism is a royal pain in the proverbial that > we > >>>> have to carefully live with. The idea that we're duplicating that for > >>>> use in another fringe area of functionality does not thrill me at all. 
> >>>> > >>>> To be clear, I understand the problem that exists and that you wish to > >>>> solve, but for the runtime parts I balk at the complexity cost of > >>>> solving it. > >>>> > >>>> Thanks, > >>>> David > >>>> ----- > >>>> > >>>>> Thanks, Richard. > >>>>> > >>>>> [1] Deopt suspend was something like an async. handshake for > >>>>> architectures with register windows, > >>>>> ???? where patching the return pc for deoptimization of a compiled > >>>>> frame was racy if the owner thread > >>>>> ???? was in native code. Instead a "deopt" suspend flag was set on > >>>>> which the thread patched its own > >>>>> ???? frame upon return from native. So no thread was suspended. It > got > >>>>> its name only from the name of > >>>>> ???? the flags. > >>>>> > >>>>> [2] Discussion about using handshakes to sync. with the target thread: > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK- > >> > 8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.syst > e > >> m.issuetabpanels:comment-tabpanel#comment-14306727 > >>>>> > >>>>> > >>>>> -----Original Message----- > >>>>> From: David Holmes > >>>>> Sent: Freitag, 13. Dezember 2019 00:56 > >>>>> To: Reingruber, Richard ; > >>>>> serviceability-dev at openjdk.java.net; > >>>>> hotspot-compiler-dev at openjdk.java.net; > >>>>> hotspot-runtime-dev at openjdk.java.net > >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >>>>> Performance in the Presence of JVMTI Agents > >>>>> > >>>>> Hi Richard, > >>>>> > >>>>> Some further queries/concerns: > >>>>> > >>>>> src/hotspot/share/runtime/objectMonitor.cpp > >>>>> > >>>>> Can you please explain the changes to ObjectMonitor::wait: > >>>>> > >>>>> !?? _recursions = save????? // restore the old recursion count > >>>>> !???????????????? + jt->get_and_reset_relock_count_after_wait(); // > >>>>> increased by the deferred relock count > >>>>> > >>>>> what is the "deferred relock count"? 
I gather it relates to > >>>>> > >>>>> "The code was extended to be able to deoptimize objects of a frame > that > >>>>> is not the top frame and to let another thread than the owning thread > do > >>>>> it." > >>>>> > >>>>> which I don't like the sound of at all when it comes to ObjectMonitor > >>>>> state. So I'd like to understand in detail exactly what is going on here > >>>>> and why.? This is a very intrusive change that seems to badly break > >>>>> encapsulation and impacts future changes to ObjectMonitor that are > under > >>>>> investigation. > >>>>> > >>>>> --- > >>>>> > >>>>> src/hotspot/share/runtime/thread.cpp > >>>>> > >>>>> Can you please explain why > JavaThread::wait_for_object_deoptimization > >>>>> has to be handcrafted in this way rather than using proper transitions. > >>>>> > >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to > see > >>>>> it being added back (effectively). This seems like it may be something > >>>>> that handshakes could be used for. > >>>>> > >>>>> Thanks, > >>>>> David > >>>>> ----- > >>>>> > >>>>> On 12/12/2019 7:02 am, David Holmes wrote: > >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: > >>>>>>> Hi David, > >>>>>>> > >>>>>>> ??? > Most of the details here are in areas I can comment on in > detail, > >>>>>>> but I > >>>>>>> ??? > did take an initial general look at things. > >>>>>>> > >>>>>>> Thanks for taking the time! > >>>>>> > >>>>>> Apologies the above should read: > >>>>>> > >>>>>> "Most of the details here are in areas I *can't* comment on in detail > >>>>>> ..." > >>>>>> > >>>>>> David > >>>>>> > >>>>>>> ??? > The only thing that jumped out at me is that I think the > >>>>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. > >>>>>>> ??? > > >>>>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } > >>>>>>> > >>>>>>> Yes, it should. Will add the method like above. > >>>>>>> > >>>>>>> ??? 
> Also I don't see any testing of the > DeoptimizeObjectsALotThread. > >>>>>>> Without > >>>>>>> ??? > active testing this will just bit-rot. > >>>>>>> > >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger > >>>>>>> workload. I will add a minimal test > >>>>>>> to keep it fresh. > >>>>>>> > >>>>>>> ??? > Also on the tests I don't understand your @requires clause: > >>>>>>> ??? > > >>>>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & > vm.compiler2.enabled > >> & > >>>>>>> ??? > (vm.opt.TieredCompilation != true)) > >>>>>>> ??? > > >>>>>>> ??? > This seems to require that TieredCompilation is disabled, but > >>>>>>> tiered is > >>>>>>> ??? > our normal mode of operation. ?? > >>>>>>> ??? > > >>>>>>> > >>>>>>> I removed the clause. I guess I wanted to target the tests towards > the > >>>>>>> code they are supposed to > >>>>>>> test, and it's easier to analyze failures w/o tiered compilation and > >>>>>>> with just one compiler thread. > >>>>>>> > >>>>>>> Additionally I will make use of > >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Richard. > >>>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: David Holmes > >>>>>>> Sent: Mittwoch, 11. 
Dezember 2019 08:03 > >>>>>>> To: Reingruber, Richard ; > >>>>>>> serviceability-dev at openjdk.java.net; > >>>>>>> hotspot-compiler-dev at openjdk.java.net; > >>>>>>> hotspot-runtime-dev at openjdk.java.net > >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >>>>>>> Performance in the Presence of JVMTI Agents > >>>>>>> > >>>>>>> Hi Richard, > >>>>>>> > >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> I would like to get reviews please for > >>>>>>>> > >>>>>>>> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ > >>>>>>>> > >>>>>>>> Corresponding RFE: > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 > >>>>>>>> > >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 > >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK- > 8214584 [1] > >>>>>>>> > >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing > without > >>>>>>>> issues (thanks!). In addition the > >>>>>>>> change is being tested at SAP since I posted the first RFR some > >>>>>>>> months ago. > >>>>>>>> > >>>>>>>> The intention of this enhancement is to benefit performance wise > from > >>>>>>>> escape analysis even if JVMTI > >>>>>>>> agents request capabilities that allow them to access local variable > >>>>>>>> values. E.g. if you start-up > >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, > then > >>>>>>>> escape analysis is disabled right > >>>>>>>> from the beginning, well before a debugger attaches -- if ever one > >>>>>>>> should do so. With the > >>>>>>>> enhancement, escape analysis will remain enabled until and after > a > >>>>>>>> debugger attaches. EA based > >>>>>>>> optimizations are reverted just before an agent acquires the > >>>>>>>> reference to an object. In the JBS item > >>>>>>>> you'll find more details. 
> >>>>>>> > >>>>>>> Most of the details here are in areas I can comment on in detail, but > I > >>>>>>> did take an initial general look at things. > >>>>>>> > >>>>>>> The only thing that jumped out at me is that I think the > >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread. > >>>>>>> > >>>>>>> +? bool is_hidden_from_external_view() const { return true; } > >>>>>>> > >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. > >>>>>>> Without > >>>>>>> active testing this will just bit-rot. > >>>>>>> > >>>>>>> Also on the tests I don't understand your @requires clause: > >>>>>>> > >>>>>>> ??? @requires ((vm.compMode != "Xcomp") & > vm.compiler2.enabled & > >>>>>>> (vm.opt.TieredCompilation != true)) > >>>>>>> > >>>>>>> This seems to require that TieredCompilation is disabled, but tiered > is > >>>>>>> our normal mode of operation. ?? > >>>>>>> > >>>>>>> Thanks, > >>>>>>> David > >>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Richard. > >>>>>>>> > >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 > >>>>>>>> > >> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.pa > tc > >> h > >>>>>>>> > >>>>>>>> > >>>>>>>> From richard.reingruber at sap.com Mon Mar 30 08:31:30 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Mon, 30 Mar 2020 08:31:30 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Hi, this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :) The change affects jvmti, hotspot and c2. Partial reviews are very welcome too. 
Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/

Robbin, Martin, please let me know if anything isn't quite as you wanted it. Also find my comments on your feedback below.

Robbin, can I count you as Reviewer for the runtime part?

Thanks, Richard.

--

> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)

Done.

> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in its own
> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.

I moved JvmtiDeferredUpdates to vframe_hp.hpp where the preexisting jvmtiDeferredLocalVariableSet is declared.

> src/hotspot/share/code/compiledMethod.cpp
> Nice cleanup!

Thanks :)

> src/hotspot/share/code/debugInfoRec.cpp
> src/hotspot/share/code/debugInfoRec.hpp
> Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.

I've been thinking about this too and finally stayed with not_global_escape_in_scope. It's supposed to mean that an object whose escape state is not GlobalEscape is in scope.

> src/hotspot/share/compiler/compileBroker.cpp
> src/hotspot/share/compiler/compileBroker.hpp
> Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.

Yes, the change would be a little smaller. And if it helps I'll split it off. In general I prefer patches that bring along a suitable amount of tests.

> src/hotspot/share/opto/c2compiler.cpp
> Make do_escape_analysis independent of JVMCI capabilities. Nice!
It is the main goal of the enhancement. It is done for C2, but could be done for JVMCI compilers with just a small effort as well.

> src/hotspot/share/opto/escape.cpp
> Annotation for MachSafePointNodes. Your added functionality looks correct.
> But I'd prefer to move the bulky code out of the large function.
> I suggest to factor out something like has_not_global_escape and has_arg_escape. So the code could look like this:
>   SafePointNode* sfn = sfn_worklist.at(next);
>   sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
>   if (sfn->is_CallJava()) {
>     CallJavaNode* call = sfn->as_CallJava();
>     call->set_arg_escape(has_arg_escape(call));
>   }
> This would also allow us to get rid of the found_..._escape_in_args variables, making the loops more readable.

Done.

> It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.

Yeah. I copied the snippet.

> src/hotspot/share/prims/jvmtiImpl.cpp
> src/hotspot/share/prims/jvmtiImpl.hpp
> The sequence is pretty complex:
> VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).

Note that the target threads have to be suspended already for VM_GetOrSetLocal*. So it's mainly the synchronization effect of EscapeBarrier::sync_and_suspend_one() that is required here. Also no extra _handshake_ is executed, since sync_and_suspend_one() will find the target threads already suspended.

> VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
> VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
> But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.

> VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.
It's not specifically the top frame, but the frame that is accessed.

> src/hotspot/share/runtime/deoptimization.cpp
> Object deoptimization. I have more comments and proposals, here.
> First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
> Comments are sufficient to understand why things are done as they are implemented.
> BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
> Anyway, looks correct, too.

> Typo in comment: "regularily" => "regularly"

> Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or such kind of issues.

That's correct. The compiled frame for which deferred updates are allocated is always deoptimized before (see EscapeBarrier::deoptimize_objects()). This is also asserted in compiledVFrame::update_deferred_value(). I've added the same assertion to Deoptimization::relock_objects(). So we can be sure that _jvmti_deferred_updates are deallocated again in fetch_unroll_info_helper().

> EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().

Sure, well spotted!

> You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.

Right, good hint. This was recently introduced with 8235678. I even had to resolve conflicts. Should have done this then.

> I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.

Done.

> Typo in comment: "we must only deoptimize" => "we only have to deoptimize"

Replaced with "[...] we deoptimize iff local objects are passed as args"

> "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.

Ok. Done.
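
For readers following the remark on recursive and waiting locks in relock_objects: the deferred relock applied in ObjectMonitor::wait (the "_recursions = save + jt->get_and_reset_relock_count_after_wait()" change quoted earlier in this thread) can be illustrated with a toy model. This is only a sketch under assumptions. ToyThread, ToyMonitor and defer_relock are hypothetical names, not HotSpot classes, and the model ignores real blocking, ownership checks and memory ordering. It also assumes that the recursion counter counts re-entries beyond the first acquisition.

```cpp
#include <cassert>

// Hypothetical stand-in for JavaThread: only carries the deferred relock count.
struct ToyThread {
  int relock_count_after_wait = 0;

  // Hand out the deferred count and clear it, so it is applied at most once.
  int get_and_reset_relock_count_after_wait() {
    int n = relock_count_after_wait;
    relock_count_after_wait = 0;
    return n;
  }
};

// Hypothetical stand-in for ObjectMonitor. 'recursions' counts re-entries
// beyond the first acquisition (an assumption of this model).
struct ToyMonitor {
  ToyThread* owner = nullptr;
  int recursions = 0;

  void enter(ToyThread* t) {
    if (owner == t) { ++recursions; return; }
    owner = t;
    recursions = 0;
  }

  // First half of wait(): save the recursion count and give up the monitor.
  int release_for_wait() {
    int save = recursions;
    recursions = 0;
    owner = nullptr;
    return save;
  }

  // Second half of wait(): reacquire, restore the saved count and add the
  // relocks that were deferred while the thread was waiting.
  void reacquire_after_wait(ToyThread* t, int save) {
    owner = t;
    recursions = save + t->get_and_reset_relock_count_after_wait();
  }
};

// Called on behalf of a waiting thread when eliminated nested locks are
// reverted: relocking is not possible right now, so it is only recorded.
inline void defer_relock(ToyThread* t, int eliminated_nested_locks) {
  t->relock_count_after_wait += eliminated_nested_locks;
}
```

In terms of the nested synchronized example from the thread: if compiled code performed only the outermost of three locks, the waiter enters once, two relocks are deferred while it waits, and after reacquiring the monitor the model reports two recursions on top of the first acquisition.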
> I'll get back to suspend flags, later.

> There are weird cases regarding _self_deoptimization_in_progress.
> Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
> I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.

You're right. We've discussed that face-to-face, but couldn't find a real issue. But now, thinking again, I reckon I found one:

  2808   // Sync with other threads that might be doing deoptimizations
  2809   {
  2810     // Need to switch to _thread_blocked for the wait() call
  2811     ThreadBlockInVM tbivm(_calling_thread);
  2812     MonitorLocker ml(EscapeBarrier_lock, Mutex::_no_safepoint_check_flag);
  2813     while (_self_deoptimization_in_progress) {
  2814       ml.wait();
  2815     }
  2816
  2817     if (self_deopt()) {
  2818       _self_deoptimization_in_progress = true;
  2819     }
  2820
  2821     while (_deoptee_thread->is_ea_obj_deopt_suspend()) {
  2822       ml.wait();
  2823     }
  2824
  2825     if (self_deopt()) {
  2826       return;
  2827     }
  2828
  2829     // set suspend flag for target thread
  2830     _deoptee_thread->set_ea_obj_deopt_flag();
  2831   }

- A waits in 2822
- C is suspended
- B notifies all in resume_one()
- A and C wake up
- C wins over A and sets _self_deoptimization_in_progress = true in 2818
- C does the self deoptimization
- A executes 2830 _deoptee_thread->set_ea_obj_deopt_flag()

C will self suspend at some undefined point. The resulting state is illegal.

> I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.

Yes, it would be nice to have the state change only if needed, but for the reason you mentioned it is not quite as easy as it seems to be. I experimented as well with a second lock, but did not succeed.

> Change in thread_added:
> I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
> Your version makes new threads come up with suspend flag set. That looks correct, too. Advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
> For now, I'm ok with your version.

I had a version that did what you are suggesting. The current version also has the advantage that there are fewer places where a thread has to wait for ongoing object deoptimization. This means fewer places where you have to worry about correct thread state transitions, possible deadlocks, and whether all oops are properly Handle'ed.

> I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).

Done.

> Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
> Maybe adding suffixes would help a little bit, but I can also live with what you have.
> Implementation looks correct to me.

2 are internal. I added the suffix _internal to them. This leaves 2 to choose from.

> src/hotspot/share/runtime/deoptimization.hpp
> Escape barriers and object deoptimization functions.
> Typo in comment: "helt" => "held"

Done in place already.

> src/hotspot/share/runtime/interfaceSupport.cpp
> InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
> I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.

I never used DeoptimizeObjectsALot = 1 that much. It could be more deterministic in single threaded scenarios. I wouldn't object to getting rid of it though.
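
The interleaving of threads A, B and C described earlier in this mail arises because a waiter that wakes up re-checks only the second condition. The "only one wait call" idea can be sketched with standard C++ primitives instead of HotSpot's MonitorLocker: both conditions are tested in a single predicate, and the flag is claimed in the same critical section. This is a minimal sketch under assumptions, not the code in webrev.5. ToyBarrierState and its members are hypothetical, and thread state transitions (ThreadBlockInVM) are left out entirely.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Toy model of the state guarded by EscapeBarrier_lock. The flag names are
// borrowed from the quoted snippet; the class itself is hypothetical.
class ToyBarrierState {
 public:
  // Blocks until no self deoptimization is in progress and the target is not
  // suspended, then claims the barrier without releasing the lock in between.
  void sync_and_suspend_one(bool self_deopt) {
    std::unique_lock<std::mutex> ml(_lock);
    // Single wait: every wake-up re-evaluates both conditions together.
    _cv.wait(ml, [this] {
      return !_self_deoptimization_in_progress && !_target_suspended;
    });
    if (self_deopt) {
      _self_deoptimization_in_progress = true;
    } else {
      _target_suspended = true;  // stands in for set_ea_obj_deopt_flag()
    }
  }

  // Releases whatever sync_and_suspend_one() claimed and wakes all waiters.
  void resume_one(bool self_deopt) {
    {
      std::lock_guard<std::mutex> ml(_lock);
      if (self_deopt) {
        _self_deoptimization_in_progress = false;
      } else {
        _target_suspended = false;
      }
    }
    _cv.notify_all();
  }

  bool self_deoptimization_in_progress() {
    std::lock_guard<std::mutex> ml(_lock);
    return _self_deoptimization_in_progress;
  }

  bool target_suspended() {
    std::lock_guard<std::mutex> ml(_lock);
    return _target_suspended;
  }

 private:
  std::mutex _lock;
  std::condition_variable _cv;
  bool _self_deoptimization_in_progress = false;
  bool _target_suspended = false;
};
```

With this shape, when A and C both wake up after a notify, whichever thread gets the lock first claims the barrier, and the other sees the combined predicate fail and keeps waiting. In the model, the suspend flag can therefore never be set while a self deoptimization is in progress.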
> src/hotspot/share/runtime/stackValue.hpp
> Better reinitialization in StackValue. Good.

StackValue::obj_is_scalar_replaced() should not return true after calling set_obj().

> src/hotspot/share/runtime/thread.cpp
> src/hotspot/share/runtime/thread.hpp
> src/hotspot/share/runtime/thread.inline.hpp
> wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.
> In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.

I'm keen to build the feature on async handshakes when they arrive.

> You can use MutexLocker with Thread*.

Done.

> JVMTIDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.

Done.

> src/hotspot/share/runtime/vframe.cpp
> Added support for entry frame to new_vframe. Ok.
> src/hotspot/share/runtime/vframe_hp.cpp
> src/hotspot/share/runtime/vframe_hp.hpp
> I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).

Done.

> jvmtiDeferredLocalVariableSet::update_monitors:
> Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.

Done.

-----Original Message-----
From: Doerr, Martin
Sent: Thursday, 12 March 2020 17:28
To: Reingruber, Richard; 'Robbin Ehn'; Lindenmaier, Goetz; David Holmes; Vladimir Kozlov (vladimir.kozlov at oracle.com); serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

I managed to find time for an (almost) complete review of webrev.4.
(I'll review the tests separately.)

First of all, the change seems to be of pretty good quality for its significant complexity. I couldn't find any real bugs. But I'd like to propose minor improvements.
I'm convinced that it's mature because we did substantial testing.

I like the new functionality for object deoptimization. It can possibly be reused for future escape analysis based optimizations. So I appreciate having it available in the code base.
In addition to that, your change makes the JVMTI implementation better integrated into the VM.

Now to the details:

src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters. Ok.

src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability. Ok.

src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!

src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.

src/hotspot/share/code/nmethod.cpp
Nice cleanup!

src/hotspot/share/code/pcDesc.hpp
Additional parameters. Ok.

src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters. Ok.

src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.

src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters. Ok.

src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMCI capabilities. Nice!

src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes. Ok.

src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes.
Your added functionality looks correct. But I'd prefer to move the bulky code out of the large function.
I suggest factoring out something like has_not_global_escape and has_arg_escape. So the code could look like this:

  SafePointNode* sfn = sfn_worklist.at(next);
  sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
  if (sfn->is_CallJava()) {
    CallJavaNode* call = sfn->as_CallJava();
    call->set_arg_escape(has_arg_escape(call));
  }

This would also allow us to get rid of the found_..._escape_in_args variables, making the loops more readable.
It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.

src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes. Ok.

src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations. Ok.

src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters. Ok.

src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!

src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread. Good.

src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.
VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.

src/hotspot/share/prims/jvmtiTagMap.cpp
Escape barriers + deoptimize objects for all threads. Ok.

src/hotspot/share/prims/whitebox.cpp
Added WB_IsFrameDeoptimized to API. Ok.

src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals, here.
First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
Comments are sufficient to understand why things are done as they are implemented.
BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
Anyway, looks correct, too.
Typo in comment: "regularily" => "regularly"

Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates gets deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or similar issues.

EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().
You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.
I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.
Typo in comment: "we must only deoptimize" => "we only have to deoptimize"

"bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.

I'll get back to suspend flags, later.
There are weird cases regarding _self_deoptimization_in_progress.
Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.
I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.

Change in thread_added:
I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
Your version makes new threads come up with suspend flag set. That looks correct, too. Advantage is that you only have to change one place (thread_added). It'll be interesting to see how it will look when we use async handshakes instead of suspend flags.
For now, I'm ok with your version.

I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).

Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
Maybe adding suffixes would help a little bit, but I can also live with what you have.
Implementation looks correct to me.

src/hotspot/share/runtime/deoptimization.hpp
Escape barriers and object deoptimization functions.
Typo in comment: "helt" => "held"

src/hotspot/share/runtime/globals.hpp
Addition of develop flag DeoptimizeObjectsALotInterval. Ok.

src/hotspot/share/runtime/interfaceSupport.cpp
InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.

src/hotspot/share/runtime/interfaceSupport.inline.hpp
Addition of deoptimizeAllObjects. Ok.

src/hotspot/share/runtime/mutexLocker.cpp
src/hotspot/share/runtime/mutexLocker.hpp
Addition of EscapeBarrier_lock. Ok.

src/hotspot/share/runtime/objectMonitor.cpp
Make recursion count relock aware. Ok.

src/hotspot/share/runtime/stackValue.hpp
Better reinitialization in StackValue. Good.

src/hotspot/share/runtime/thread.cpp
src/hotspot/share/runtime/thread.hpp
src/hotspot/share/runtime/thread.inline.hpp
wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.
In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.
You can use MutexLocker with Thread*.
JVMTIDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.

src/hotspot/share/runtime/vframe.cpp
Added support for entry frame to new_vframe. Ok.

src/hotspot/share/runtime/vframe_hp.cpp
src/hotspot/share/runtime/vframe_hp.hpp
I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).
jvmtiDeferredLocalVariableSet::update_monitors:
Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.

src/hotspot/share/utilities/macros.hpp
Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.

test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
New test. Will review separately.

test/jdk/TEST.ROOT
Addition of vm.jvmci as required property. Ok.

test/jdk/com/sun/jdi/EATests.java
test/jdk/com/sun/jdi/EATestsJVMCI.java
New test. Will review separately.

test/lib/sun/hotspot/WhiteBox.java
Added isFrameDeoptimized to API. Ok.

That was it.

Best regards,
Martin

> -----Original Message-----
> From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Reingruber, Richard
> Sent: Tuesday, 3 March 2020 21:23
> To: 'Robbin Ehn'; Lindenmaier, Goetz; David Holmes; Vladimir Kozlov (vladimir.kozlov at oracle.com); serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>
> Hi Robbin,
>
> > > I understand that Robbin proposed to replace the usage of
> > > _suspend_flag with handshakes. Apparently, async handshakes
> > > are needed to do so. We have been waiting a while for removal
> > > of the _suspend_flag / introduction of async handshakes [2].
> > > What is the status here?
>
> > I have an old prototype which I would like to continue to work on.
> > So do not assume asynch handshakes will make 15.
> > Even if it would, I think there are a lot more investigate work to remove
> > _suspend_flag.
>
> Let us know, if we can be of any help to you and be it only testing.
>
> > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
>
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
>
> Will do.
>
> > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's own
> > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
>
> You are right. It shouldn't be declared in thread.hpp. I will look into that.
>
> > Note that we also think we may have a bug in deopt:
> > https://bugs.openjdk.java.net/browse/JDK-8238237
>
> > I think it would be best, if possible, to push after that is resolved.
>
> Sure.
>
> > Not even nearly a full review :)
>
> I know :)
>
> Anyways, thanks a lot,
> Richard.
>
>
> -----Original Message-----
> From: Robbin Ehn
> Sent: Monday, March 2, 2020 11:17 AM
> To: Lindenmaier, Goetz; Reingruber, Richard; David Holmes; Vladimir Kozlov (vladimir.kozlov at oracle.com); serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>
> Hi,
>
> On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> > Hi,
> >
> > I had a look at the progress of this change. Nothing
> > happened since Richard posted his update using more
> > handshakes [1].
> > But we (SAP) would appreciate a lot if this change could
> > be successfully reviewed and pushed.
> >
> > I think there is basic understanding that this
> > change is helpful. It fixes a number of issues with JVMTI,
> > and will deliver the same performance benefits as EA
> > does in current production mode for debugging scenarios.
> >
> > This is important for us as we run our VMs prepared
> > for debugging in production mode.
> >
> > I understand that Robbin proposed to replace the usage of
> > _suspend_flag with handshakes. Apparently, async handshakes
> > are needed to do so. We have been waiting a while for removal
> > of the _suspend_flag / introduction of async handshakes [2].
> > What is the status here?
>
> I have an old prototype which I would like to continue to work on.
> So do not assume asynch handshakes will make 15.
> Even if it would, I think there are a lot more investigate work to remove
> _suspend_flag.
>
> >
> > I think we should no longer wait, but proceed with
> > this change. We will look into removing the usage of
> > suspend_flag introduced here once it is possible to implement
> > it with handshakes.
>
> Yes, sure.
>
> >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
>
> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
>
> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's own
> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
>
> Note that we also think we may have a bug in deopt:
> https://bugs.openjdk.java.net/browse/JDK-8238237
>
> I think it would be best, if possible, to push after that is resolved.
>
> Not even nearly a full review :)
>
> Thanks, Robbin
>
>
> >> Incremental:
> >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
> >>
> >> I was not able to eliminate the additional suspend flag now. I'll take care of this as soon as the
> >> existing suspend-resume-mechanism is reworked.
> >>
> >> Testing:
> >>
> >> Nightly tests @SAP:
> >>
> >> JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests
> >> with fastdebug and release builds on all platforms
> >>
> >> Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel for 24h
> >>
> >> Thanks, Richard.
> >>
> >>
> >> More details on the changes:
> >>
> >> * Hide DeoptimizeObjectsALotThread from external view.
> >>
> >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
> >>   It used to be _safepoint_check_sometimes, which will be eliminated sooner or later.
> >>   I added explicit thread state changes with ThreadBlockInVM to code paths where we can wait()
> >>   on EscapeBarrier_lock to become safepoint safe.
> >>
> >> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of vm operation
> >>   VM_ThreadSuspendAllForObjDeopt.
> >>
> >> * Removed uses of Threads_lock. When adding a new thread we suspend it iff EA optimizations are
> >>   being reverted. In the previous version we were waiting on Threads_lock while EA optimizations
> >>   were reverted.
> >>   See EscapeBarrier::thread_added().
> >>
> >> * Made tests require Xmixed compilation mode.
> >>
> >> * Made tests agnostic regarding tiered compilation.
> >>   I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or disabled.
> >>
> >> * Exercising EATests.java as well with stress test options DeoptimizeObjectsALot*
> >>   Due to the non-deterministic deoptimizations some tests need to be skipped.
> >>   We do this to prevent bit-rot of the stress test code.
> >>
> >> * Executing EATests.java as well with graal if available. Driver for this is
> >>   EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all the new debug info
> >>   (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
> >>   And graal does not yet support the JVMTI operations force early return and pop frame.
> >>
> >> * Removed tracing from new jdi tests in EATests.java. Too much trace output before the debugging
> >>   connection is established can cause deadlock because output buffers fill up.
> >>   (See https://bugs.openjdk.java.net/browse/JDK-8173304)
> >>
> >> * Many copyright year changes and smaller clean-up changes of testing code (trailing white-space and
> >>   the like).
> >>
> >>
> >> -----Original Message-----
> >> From: David Holmes
> >> Sent: Thursday, 19 December 2019 03:12
> >> To: Reingruber, Richard; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
> >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> >>
> >> Hi Richard,
> >>
> >> I think my issue is with the way EliminateNestedLocks works so I'm going
> >> to look into that more deeply.
> >>
> >> Thanks for the explanations.
> >>
> >> David
> >>
> >> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> >>> Hi David,
> >>>
> >>> > > > Some further queries/concerns:
> >>> > > >
> >>> > > > src/hotspot/share/runtime/objectMonitor.cpp
> >>> > > >
> >>> > > > Can you please explain the changes to ObjectMonitor::wait:
> >>> > > >
> >>> > > > !   _recursions = save      // restore the old recursion count
> >>> > > > !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> >>> > > >
> >>> > > > what is the "deferred relock count"? I gather it relates to
> >>> > > >
> >>> > > > "The code was extended to be able to deoptimize objects of a frame that is not the top frame and to let another thread than the owning thread do it."
> >>> > >
> >>> > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated locking. New with the enhancement is that we do this also just before object references are acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we register deoptimized objects as deferred updates. When control returns to C it gets deoptimized, we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again (relocking twice would be incorrect of course). Deferred updates are copied into the new interpreter frames.
> >>> > >
> >>> > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking is deferred until T owns the monitor again. This is what the piece of code above does.
> >>> >
> >>> > Sorry I need some more detail here.
> >>> > How can you wait() on an object monitor if the object allocation and/or locking was optimised away? And what is a "non-local object" in this context? Isn't EA restricted to thread-confined objects?
> >>>
> >>> "Non-local object" is an object that escapes its thread. The issue I'm addressing with the changes in ObjectMonitor::wait is almost unrelated to EA. It is caused by EliminateNestedLocks, where C2 eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it is locked and you can call wait() on it.
> >>>
> >>> EliminateLocks is the C2 option that controls lock elimination based on EA. Both optimizations have in common that objects with eliminated locking need to be relocked when deoptimizing a frame, i.e. when replacing a compiled frame with equivalent interpreter frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can be a mix of eliminated nested locks and locks of not-escaping objects.
> >>>
> >>> New with the enhancement: I call relock_objects earlier, just before objects potentially escape. But then later when the owning compiled frame gets deoptimized, I must not do it again:
> >>>
> >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> >>>
> >>>    373   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> >>>    374       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> >>>    375     bool unused;
> >>>    376     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> >>>    377   }
> >>>
> >>> Now when calling relock_objects early it is quite possible that I have to relock an object the target thread currently waits for. Obviously I cannot relock in this case, instead I chose to introduce relock_count_after_wait to JavaThread.
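
The deferred relock count described in the quoted explanation can be pictured with a small single-threaded toy model. This is only a sketch under stated assumptions: ToyThread, ToyMonitor and their methods are hypothetical stand-ins, not HotSpot's JavaThread or ObjectMonitor; only the names relock_count_after_wait and get_and_reset_relock_count_after_wait are taken from the discussion.

```cpp
// Hypothetical per-thread state; only the counter and its accessor are named
// after what the thread discusses.
struct ToyThread {
  int relock_count_after_wait = 0;  // relocks deferred while the thread waits

  int get_and_reset_relock_count_after_wait() {
    int n = relock_count_after_wait;
    relock_count_after_wait = 0;
    return n;
  }
};

// Hypothetical recursive monitor that tracks an owner and a recursion count.
struct ToyMonitor {
  ToyThread* owner = nullptr;
  int recursions = 0;  // re-entries beyond the first acquisition

  void enter(ToyThread* t) {
    if (owner == t) {
      ++recursions;
    } else {
      owner = t;
      recursions = 0;
    }
  }

  // wait() gives up ownership and remembers the recursion count.
  int begin_wait() {
    int save = recursions;
    owner = nullptr;
    recursions = 0;
    return save;
  }

  // Reverting an eliminated nested lock on behalf of thread t. While t waits
  // it does not own the monitor, so the relock is only counted.
  void relock(ToyThread* t, bool t_is_waiting) {
    if (t_is_waiting) {
      ++t->relock_count_after_wait;
    } else {
      enter(t);
    }
  }

  // On reacquisition the saved count is increased by the deferred relocks.
  void end_wait(ToyThread* t, int save) {
    owner = t;
    recursions = save + t->get_and_reset_relock_count_after_wait();
  }
};
```

With one real lock held and two nested locks eliminated by the compiler, two deferred relocks recorded during the wait leave the monitor with a recursion count of 2 once the wait returns, which is the state the interpreter frames expect.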
> >>>
> >>> > Is it just that some of the locking gets optimized away e.g.
> >>> >
> >>> > synchronised(obj) {
> >>> >   synchronised(obj) {
> >>> >     synchronised(obj) {
> >>> >       obj.wait();
> >>> >     }
> >>> >   }
> >>> > }
> >>> >
> >>> > If this is reduced to a form as-if it were a single lock of the monitor (due to EA) and the wait() triggers a JVM TI event which leads to the escape of "obj" then we need to reconstruct the true lock state, and so when the wait() internally unblocks and reacquires the monitor it has to set the true recursion count to 3, not the 1 that it appeared to be when wait() was initially called. Is that the scenario?
> >>>
> >>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event triggered by wait.
> >>>
> >>> Add
> >>>
> >>> LocalObject l1 = new LocalObject();
> >>>
> >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in question.
> >>>
> >>> See that relocking/reallocating is transactional. If it is done then for /all/ objects in scope and it is done at most once. It wouldn't be quite so easy to split this in relocking of nested/EA-based eliminated locks.
> >>>
> >>> > If so I find this truly awful. Anyone using wait() in a realistic form requires a notification and so the object cannot be thread confined. In
> >>>
> >>> It is not thread confined.
> >>>
> >>> > which case I would strongly argue that upon hitting the wait() the deopt should occur unconditionally and so the lock state is correct before we wait and so we don't need to mess with the recursion count internally when we reacquire the monitor.
> >>> >
> >>> > >
> >>> > > > which I don't like the sound of at all when it comes to ObjectMonitor state. So I'd like to understand in detail exactly what is going on here and why.
> >>> > > > This is a very intrusive change that seems to badly break encapsulation and impacts future changes to ObjectMonitor that are under investigation.
> >>> > >
> >>> > > I would not regard this as breaking encapsulation. Certainly not badly.
> >>> > >
> >>> > > I've added a property relock_count_after_wait to JavaThread. The property is well encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free in choosing a way to do that as long as that property is taken into account. This is hardly a limitation.
> >>> >
> >>> > I do think this badly breaks encapsulation as you have to add a callout from the guts of the ObjectMonitor code to reach into the thread to get this lock count adjustment. I understand why you have had to do this but I would much rather see a change to the EA optimisation strategy so that this is not needed.
> >>> >
> >>> > > Note also that the property is a straight forward extension of the existing concept of deferred local updates. It is embedded into the structure holding them. So not even the footprint of a JavaThread is enlarged if no deferred updates are generated.
> >>> >
> >>> > [...]
> >>> >
> >>> > >
> >>> > > I'm actually duplicating the existing external suspend mechanism, because a thread can be suspended at most once. And hey, I don't like that either! But it seems not unlikely that the duplicate can be removed together with the original and the new type of handshakes that will be used for thread suspend can be used for object deoptimization too. See today's discussion in JDK-8227745 [2].
> >>> >
> >>> > I hope that discussion bears some fruit, at the moment it seems not to be possible to use handshakes here.
> >>> > :(
> >>> >
> >>> > The external suspend mechanism is a royal pain in the proverbial that we have to carefully live with. The idea that we're duplicating that for use in another fringe area of functionality does not thrill me at all.
> >>> >
> >>> > To be clear, I understand the problem that exists and that you wish to solve, but for the runtime parts I balk at the complexity cost of solving it.
> >>>
> >>> I know it's complex, but by far no rocket science.
> >>>
> >>> Also I find it hard to imagine another fix for JDK-8233915 besides changing the JVM TI specification.
> >>>
> >>> Thanks, Richard.
> >>>
> >>> -----Original Message-----
> >>> From: David Holmes
> >>> Sent: Tuesday, 17 December 2019 08:03
> >>> To: Reingruber, Richard; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> >>>
> >>>
> >>>
> >>> David
> >>>
> >>> On 17/12/2019 4:57 pm, David Holmes wrote:
> >>>> Hi Richard,
> >>>>
> >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
> >>>>> Hi David,
> >>>>>
> >>>>>    > Some further queries/concerns:
> >>>>>    >
> >>>>>    > src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>    >
> >>>>>    > Can you please explain the changes to ObjectMonitor::wait:
> >>>>>    >
> >>>>>    > !   _recursions = save      // restore the old recursion count
> >>>>>    > !                 + jt->get_and_reset_relock_count_after_wait(); //
> >>>>>    > increased by the deferred relock count
> >>>>>    >
> >>>>>    > what is the "deferred relock count"? I gather it relates to
> >>>>>    >
> >>>>>    > "The code was extended to be able to deoptimize objects of a frame that
> >>>>>    > is not the top frame and to let another thread than the owning thread do it."
> >>>>>
> >>>>> Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated locking. New with the enhancement is that we do this also just before object references are acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we register deoptimized objects as deferred updates. When control returns to C it gets deoptimized, we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again (relocking twice would be incorrect of course). Deferred updates are copied into the new interpreter frames.
> >>>>>
> >>>>> Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking is deferred until T owns the monitor again. This is what the piece of code above does.
> >>>>
> >>>> Sorry I need some more detail here. How can you wait() on an object monitor if the object allocation and/or locking was optimised away? And what is a "non-local object" in this context? Isn't EA restricted to thread-confined objects?
> >>>>
> >>>> Is it just that some of the locking gets optimized away e.g.
> >>>>
> >>>> synchronised(obj) {
> >>>>   synchronised(obj) {
> >>>>     synchronised(obj) {
> >>>>       obj.wait();
> >>>>     }
> >>>>   }
> >>>> }
> >>>>
> >>>> If this is reduced to a form as-if it were a single lock of the monitor (due to EA) and the wait() triggers a JVM TI event which leads to the escape of "obj" then we need to reconstruct the true lock state, and so when the wait() internally unblocks and reacquires the monitor it has to set the true recursion count to 3, not the 1 that it appeared to be when wait() was initially called. Is that the scenario?
> >>>>
> >>>> If so I find this truly awful. Anyone using wait() in a realistic form requires a notification and so the object cannot be thread confined. In which case I would strongly argue that upon hitting the wait() the deopt should occur unconditionally and so the lock state is correct before we wait and so we don't need to mess with the recursion count internally when we reacquire the monitor.
> >>>>
> >>>>>
> >>>>>    > which I don't like the sound of at all when it comes to ObjectMonitor
> >>>>>    > state. So I'd like to understand in detail exactly what is going on here
> >>>>>    > and why.  This is a very intrusive change that seems to badly break
> >>>>>    > encapsulation and impacts future changes to ObjectMonitor that are under
> >>>>>    > investigation.
> >>>>>
> >>>>> I would not regard this as breaking encapsulation. Certainly not badly.
> >>>>>
> >>>>> I've added a property relock_count_after_wait to JavaThread. The property is well encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free in choosing a way to do that as long as that property is taken into account. This is hardly a limitation.
> >>>>
> >>>> I do think this badly breaks encapsulation as you have to add a callout from the guts of the ObjectMonitor code to reach into the thread to get this lock count adjustment.
I understand why you have had to do this > but > >>>> I would much rather see a change to the EA optimisation strategy so > that > >>>> this is not needed. > >>>> > >>>>> Note also that the property is a straight forward extension of the > >>>>> existing concept of deferred > >>>>> local updates. It is embedded into the structure holding them. So not > >>>>> even the footprint of a > >>>>> JavaThread is enlarged if no deferred updates are generated. > >>>>> > >>>>> ?? > --- > >>>>> ?? > > >>>>> ?? > src/hotspot/share/runtime/thread.cpp > >>>>> ?? > > >>>>> ?? > Can you please explain why > >>>>> JavaThread::wait_for_object_deoptimization > >>>>> ?? > has to be handcrafted in this way rather than using proper > >>>>> transitions. > >>>>> ?? > > >>>>> > >>>>> I wrote wait_for_object_deoptimization taking > >>>>> JavaThread::java_suspend_self_with_safepoint_check > >>>>> as template. So in short: for the same reasons :) > >>>>> > >>>>> Threads reach both methods as part of thread state transitions, > >>>>> therefore special handling is > >>>>> required to change thread state on top of ongoing transitions. > >>>>> > >>>>> ?? > We got rid of "deopt suspend" some time ago and it is disturbing > >>>>> to see > >>>>> ?? > it being added back (effectively). This seems like it may be > >>>>> something > >>>>> ?? > that handshakes could be used for. > >>>>> > >>>>> Deopt suspend used to be something rather different with a similar > >>>>> name[1]. It is not being added back. > >>>> > >>>> I stand corrected. Despite comments in the code to the contrary > >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of > >>>> cleanup in this area 13 years ago :) > >>>> > >>>>> > >>>>> I'm actually duplicating the existing external suspend mechanism, > >>>>> because a thread can be suspended > >>>>> at most once. And hey, and don't like that either! 
But it seems not > >>>>> unlikely that the duplicate can > >>>>> be removed together with the original and the new type of > handshakes > >>>>> that will be used for > >>>>> thread suspend can be used for object deoptimization too. See > today's > >>>>> discussion in JDK-8227745 [2]. > >>>> > >>>> I hope that discussion bears some fruit, at the moment it seems not to > >>>> be possible to use handshakes here. :( > >>>> > >>>> The external suspend mechanism is a royal pain in the proverbial that > we > >>>> have to carefully live with. The idea that we're duplicating that for > >>>> use in another fringe area of functionality does not thrill me at all. > >>>> > >>>> To be clear, I understand the problem that exists and that you wish to > >>>> solve, but for the runtime parts I balk at the complexity cost of > >>>> solving it. > >>>> > >>>> Thanks, > >>>> David > >>>> ----- > >>>> > >>>>> Thanks, Richard. > >>>>> > >>>>> [1] Deopt suspend was something like an async. handshake for > >>>>> architectures with register windows, > >>>>> ???? where patching the return pc for deoptimization of a compiled > >>>>> frame was racy if the owner thread > >>>>> ???? was in native code. Instead a "deopt" suspend flag was set on > >>>>> which the thread patched its own > >>>>> ???? frame upon return from native. So no thread was suspended. It > got > >>>>> its name only from the name of > >>>>> ???? the flags. > >>>>> > >>>>> [2] Discussion about using handshakes to sync. with the target thread: > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK- > >> > 8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.syst > e > >> m.issuetabpanels:comment-tabpanel#comment-14306727 > >>>>> > >>>>> > >>>>> -----Original Message----- > >>>>> From: David Holmes > >>>>> Sent: Freitag, 13. 
Dezember 2019 00:56 > >>>>> To: Reingruber, Richard ; > >>>>> serviceability-dev at openjdk.java.net; > >>>>> hotspot-compiler-dev at openjdk.java.net; > >>>>> hotspot-runtime-dev at openjdk.java.net > >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >>>>> Performance in the Presence of JVMTI Agents > >>>>> > >>>>> Hi Richard, > >>>>> > >>>>> Some further queries/concerns: > >>>>> > >>>>> src/hotspot/share/runtime/objectMonitor.cpp > >>>>> > >>>>> Can you please explain the changes to ObjectMonitor::wait: > >>>>> > >>>>> !?? _recursions = save????? // restore the old recursion count > >>>>> !???????????????? + jt->get_and_reset_relock_count_after_wait(); // > >>>>> increased by the deferred relock count > >>>>> > >>>>> what is the "deferred relock count"? I gather it relates to > >>>>> > >>>>> "The code was extended to be able to deoptimize objects of a frame > that > >>>>> is not the top frame and to let another thread than the owning thread > do > >>>>> it." > >>>>> > >>>>> which I don't like the sound of at all when it comes to ObjectMonitor > >>>>> state. So I'd like to understand in detail exactly what is going on here > >>>>> and why.? This is a very intrusive change that seems to badly break > >>>>> encapsulation and impacts future changes to ObjectMonitor that are > under > >>>>> investigation. > >>>>> > >>>>> --- > >>>>> > >>>>> src/hotspot/share/runtime/thread.cpp > >>>>> > >>>>> Can you please explain why > JavaThread::wait_for_object_deoptimization > >>>>> has to be handcrafted in this way rather than using proper transitions. > >>>>> > >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to > see > >>>>> it being added back (effectively). This seems like it may be something > >>>>> that handshakes could be used for. 
> >>>>> > >>>>> Thanks, > >>>>> David > >>>>> ----- > >>>>> > >>>>> On 12/12/2019 7:02 am, David Holmes wrote: > >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: > >>>>>>> Hi David, > >>>>>>> > >>>>>>> ??? > Most of the details here are in areas I can comment on in > detail, > >>>>>>> but I > >>>>>>> ??? > did take an initial general look at things. > >>>>>>> > >>>>>>> Thanks for taking the time! > >>>>>> > >>>>>> Apologies the above should read: > >>>>>> > >>>>>> "Most of the details here are in areas I *can't* comment on in detail > >>>>>> ..." > >>>>>> > >>>>>> David > >>>>>> > >>>>>>> ??? > The only thing that jumped out at me is that I think the > >>>>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. > >>>>>>> ??? > > >>>>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } > >>>>>>> > >>>>>>> Yes, it should. Will add the method like above. > >>>>>>> > >>>>>>> ??? > Also I don't see any testing of the > DeoptimizeObjectsALotThread. > >>>>>>> Without > >>>>>>> ??? > active testing this will just bit-rot. > >>>>>>> > >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger > >>>>>>> workload. I will add a minimal test > >>>>>>> to keep it fresh. > >>>>>>> > >>>>>>> ??? > Also on the tests I don't understand your @requires clause: > >>>>>>> ??? > > >>>>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & > vm.compiler2.enabled > >> & > >>>>>>> ??? > (vm.opt.TieredCompilation != true)) > >>>>>>> ??? > > >>>>>>> ??? > This seems to require that TieredCompilation is disabled, but > >>>>>>> tiered is > >>>>>>> ??? > our normal mode of operation. ?? > >>>>>>> ??? > > >>>>>>> > >>>>>>> I removed the clause. I guess I wanted to target the tests towards > the > >>>>>>> code they are supposed to > >>>>>>> test, and it's easier to analyze failures w/o tiered compilation and > >>>>>>> with just one compiler thread. 
> >>>>>>> > >>>>>>> Additionally I will make use of > >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Richard. > >>>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: David Holmes > >>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03 > >>>>>>> To: Reingruber, Richard ; > >>>>>>> serviceability-dev at openjdk.java.net; > >>>>>>> hotspot-compiler-dev at openjdk.java.net; > >>>>>>> hotspot-runtime-dev at openjdk.java.net > >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >>>>>>> Performance in the Presence of JVMTI Agents > >>>>>>> > >>>>>>> Hi Richard, > >>>>>>> > >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> I would like to get reviews please for > >>>>>>>> > >>>>>>>> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ > >>>>>>>> > >>>>>>>> Corresponding RFE: > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 > >>>>>>>> > >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 > >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK- > 8214584 [1] > >>>>>>>> > >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing > without > >>>>>>>> issues (thanks!). In addition the > >>>>>>>> change is being tested at SAP since I posted the first RFR some > >>>>>>>> months ago. > >>>>>>>> > >>>>>>>> The intention of this enhancement is to benefit performance wise > from > >>>>>>>> escape analysis even if JVMTI > >>>>>>>> agents request capabilities that allow them to access local variable > >>>>>>>> values. E.g. if you start-up > >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, > then > >>>>>>>> escape analysis is disabled right > >>>>>>>> from the beginning, well before a debugger attaches -- if ever one > >>>>>>>> should do so. With the > >>>>>>>> enhancement, escape analysis will remain enabled until and after > a > >>>>>>>> debugger attaches. 
EA based
> >>>>>>>> optimizations are reverted just before an agent acquires the
> >>>>>>>> reference to an object. In the JBS item
> >>>>>>>> you'll find more details.
> >>>>>>>
> >>>>>>> Most of the details here are in areas I can comment on in detail, but I
> >>>>>>> did take an initial general look at things.
> >>>>>>>
> >>>>>>> The only thing that jumped out at me is that I think the
> >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>
> >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
> >>>>>>> active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> Also on the tests I don't understand your @requires clause:
> >>>>>>>
> >>>>>>>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>> (vm.opt.TieredCompilation != true))
> >>>>>>>
> >>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
> >>>>>>> our normal mode of operation. ??
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Richard.
> >>>>>>>>
> >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>

From serguei.spitsyn at oracle.com Mon Mar 30 09:30:54 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 30 Mar 2020 02:30:54 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID:

Hi Mandy,

I have just one comment so far.

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classLoaderHierarchyDCmd.cpp.frames.html
   void add_classes(LoadedClassInfo* first_class, int num_classes, bool has_class_mirror_holder) {
     LoadedClassInfo** p_list_to_add_to;
     bool is_hidden = first_class->_klass->is_hidden();
     if (has_class_mirror_holder) {
       p_list_to_add_to = is_hidden ? &_hidden_weak_classes : &_anon_classes;
     } else {
       p_list_to_add_to = &_classes;
     }
     // Search tail.
     while ((*p_list_to_add_to) != NULL) {
       p_list_to_add_to = &(*p_list_to_add_to)->_next;
     }
     *p_list_to_add_to = first_class;
     if (has_class_mirror_holder) {
       if (is_hidden) {
         _num_hidden_weak_classes += num_classes;
       } else {
         _num_anon_classes += num_classes;
       }
     } else {
       _num_classes += num_classes;
     }
   }

 Q1: I'm just curious, what happens if a cld has arrays of hidden classes?
     Is the bottom_klass always expected to be the first?

Thanks,
Serguei

On 3/26/20 16:57, Mandy Chung wrote:
> Please review the implementation of JEP 371: Hidden Classes. The main
> changes are in core-libs and hotspot runtime area.  Small changes are
> made in javac, VM compiler (intrinsification of Class::isHiddenClass),
> JFR, JDI, and jcmd.  CSR [1] has been reviewed and is in the finalized
> state (see specdiff and javadoc below for reference).
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03
>
>
> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's point
> of view, a hidden class is a normal class except the following:
>
> - A hidden class has no initiating class loader and is not registered
> in any dictionary.
> - A hidden class has a name containing an illegal character
> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
> returns "Lp/Foo.0x1234;".
> - A hidden class is not modifiable, i.e. cannot be redefined or
> retransformed.
JVM TI IsModifiableClass returns false on a hidden class.
> - Final fields in a hidden class are "final".  The value of final
> fields cannot be overridden via reflection.  setAccessible(true) can
> still be called on reflected objects representing final fields in a
> hidden class and their access check will be suppressed, but they only
> have read access (i.e. can do Field::getXXX but not setXXX).
>
> Brief summary of this patch:
>
> 1. A new Lookup::defineHiddenClass method is the API to create a
>    hidden class.
> 2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG
>    options that can be specified when creating a hidden class.
> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
> 4. Field::setXXX methods will throw IAE on a final field of a hidden class
>    regardless of the value of the accessible flag.
> 5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass
>    and defineHiddenClass to create a class from the given bytes.
> 6. ClassLoaderData implementation is not changed.  There is one primary CLD
>    that holds the classes strongly referenced by its defining loader.  There
>    can be zero or more additional CLDs - one per weak class.
> 7. Nest host determination is updated per revised JVMS 5.4.4. The access
>    control check no longer throws LinkageError; instead it throws IAE with
>    a clear message if a class fails to resolve/validate the nest host
>    declared in the NestHost/NestMembers attribute.
> 8. JFR, jcmd, JDI are updated to support hidden classes.
> 9. javac LambdaToMethod is updated, as lambda proxies start using nestmates,
>    to generate a bridge method to desugar a method reference to a protected
>    method in its supertype in a different package.
>
> This patch also updates StringConcatFactory, LambdaMetaFactory, and
> LambdaForms to use hidden classes.  The webrev includes changes in nashorn
> to use hidden classes, and I will update the webrev if JEP 372 removes it
> any time soon.
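As an illustration of the naming rule quoted above (`Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` returns "Lp/Foo.0x1234;"), here is a minimal sketch of mapping a signature back to the `Class::getName` form. The class `HiddenClassNames` and its method are hypothetical helpers, not part of the webrev or of any JDK API, and the sketch assumes that a '.' in a class signature can only come from a hidden-class suffix.

```java
// Hypothetical helper for illustration only; not part of the webrev or the JDK.
final class HiddenClassNames {
    private HiddenClassNames() {}

    /**
     * Maps a class signature such as "Lp/Foo.0x1234;" to the name form
     * "p.Foo/0x1234" that the mail attributes to Class::getName.
     * Assumption: a '.' appears only in the signature of a hidden class,
     * where it separates the binary name from the suffix.
     */
    public static String signatureToClassName(String signature) {
        if (signature.length() < 3
                || signature.charAt(0) != 'L'
                || signature.charAt(signature.length() - 1) != ';') {
            throw new IllegalArgumentException("not a class signature: " + signature);
        }
        // Drop the leading 'L' and the trailing ';'.
        String internal = signature.substring(1, signature.length() - 1);
        int suffixStart = internal.lastIndexOf('.');
        if (suffixStart < 0) {
            // Ordinary class: only the package separators change.
            return internal.replace('/', '.');
        }
        // Hidden class: convert the part before the suffix, then rejoin with '/'.
        String binaryName = internal.substring(0, suffixStart).replace('/', '.');
        return binaryName + "/" + internal.substring(suffixStart + 1);
    }
}
```

For example, `HiddenClassNames.signatureToClassName("Lp/Foo.0x1234;")` yields `p.Foo/0x1234`, while an ordinary signature such as "Ljava/lang/String;" yields `java.lang.String`.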
> > We uncovered a bug in Lookup::defineClass spec throws LinkageError and > intends > to have the newly created class linked.? However, the implementation > in 14 > does not link the class.? A separate CSR [2] proposes to update the > implementation to match the spec.? This patch fixes the implementation. > > The spec update on JVM TI, JDI and Instrumentation will be done as > a separate RFE [3].? This patch includes new tests for JVM TI and > java.instrument that validates how the existing APIs work for hidden > classes. > > javadoc/specdiff > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ > > > JVMS 5.4.4 change: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf > > > CSR: > https://bugs.openjdk.java.net/browse/JDK-8238359 > > Thanks > Mandy > [1] https://bugs.openjdk.java.net/browse/JDK-8238359 > [2] https://bugs.openjdk.java.net/browse/JDK-8240338 > [3] https://bugs.openjdk.java.net/browse/JDK-8230502 From david.holmes at oracle.com Mon Mar 30 09:54:51 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 30 Mar 2020 19:54:51 +1000 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> Message-ID: <546cf8e4-00e4-1d22-6402-6620b8d7b2db@oracle.com> Sorry to jump in on this but it caught my eye though I may be missing a larger context ... On 30/03/2020 7:30 pm, serguei.spitsyn at oracle.com wrote: > Hi Mandy, > > I have just one comment so far. > > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classLoaderHierarchyDCmd.cpp.frames.html > > > ?356?? void add_classes(LoadedClassInfo* first_class, int num_classes, > bool has_class_mirror_holder) { > ?357???? LoadedClassInfo** p_list_to_add_to; > ?358???? 
bool is_hidden = first_class->_klass->is_hidden(); > ?359???? if (has_class_mirror_holder) { > ?360?????? p_list_to_add_to = is_hidden ? &_hidden_weak_classes : > &_anon_classes; > ?361???? } else { > ?362?????? p_list_to_add_to = &_classes; > ?363???? } > ?364???? // Search tail. > ?365???? while ((*p_list_to_add_to) != NULL) { > ?366?????? p_list_to_add_to = &(*p_list_to_add_to)->_next; > ?367???? } > ?368???? *p_list_to_add_to = first_class; > ?369???? if (has_class_mirror_holder) { > ?370?????? if (is_hidden) { > ?371???????? _num_hidden_weak_classes += num_classes; Why does hidden imply weak here? David ----- > ?372?????? } else { > ?373???????? _num_anon_classes += num_classes; > ?374?????? } > ?375???? } else { > ?376?????? _num_classes += num_classes; > ?377???? } > ?378?? } > > ?Q1: I'm just curious, what happens if a cld has arrays of hidden classes? > ???? Is the bottom_klass always expected to be the first? > > > Thanks, > Serguei > > > On 3/26/20 16:57, Mandy Chung wrote: >> Please review the implementation of JEP 371: Hidden Classes. The main >> changes are in core-libs and hotspot runtime area.? Small changes are >> made in javac, VM compiler (intrinsification of Class::isHiddenClass), >> JFR, JDI, and jcmd.? CSR [1]has been reviewed and is in the finalized >> state (see specdiff and javadoc below for reference). >> >> Webrev: >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 >> >> >> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's point >> of view, a hidden class is a normal class except the following: >> >> - A hidden class has no initiating class loader and is not registered >> in any dictionary. >> - A hidden class has a name containing an illegal character >> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` >> returns "Lp/Foo.0x1234;". >> - A hidden class is not modifiable, i.e. cannot be redefined or >> retransformed. JVM TI IsModifableClass returns false on a hidden. 
>> - Final fields in a hidden class is "final".? The value of final >> fields cannot be overriden via reflection.? setAccessible(true) can >> still be called on reflected objects representing final fields in a >> hidden class and its access check will be suppressed but only have >> read-access (i.e. can do Field::getXXX but not setXXX). >> >> Brief summary of this patch: >> >> 1. A new Lookup::defineHiddenClass method is the API to create a >> hidden class. >> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG >> option that >> ?? can be specified when creating a hidden class. >> 3. A new Class::isHiddenClass method tests if a class is a hidden class. >> 4. Field::setXXX method will throw IAE on a final field of a hidden class >> ?? regardless of the value of the accessible flag. >> 5. JVM_LookupDefineClass is the new JVM entry point for >> Lookup::defineClass >> ?? and defineHiddenClass to create a class from the given bytes. >> 6. ClassLoaderData implementation is not changed.? There is one >> primary CLD >> ?? that holds the classes strongly referenced by its defining loader. >> There >> ?? can be zero or more additional CLDs - one per weak class. >> 7. Nest host determination is updated per revised JVMS 5.4.4. Access >> control >> ?? check no longer throws LinkageError but instead it will throw IAE with >> ?? a clear message if a class fails to resolve/validate the nest host >> declared >> ?? in NestHost/NestMembers attribute. >> 8. JFR, jcmd, JDI are updated to support hidden classes. >> 9. update javac LambdaToMethod as lambda proxy starts using nestmates >> ?? and generate a bridge method to desuger a method reference to a >> protected >> ?? method in its supertype in a different package >> >> This patch also updates StringConcatFactory, LambdaMetaFactory, and >> LambdaForms >> to use hidden classes.? The webrev includes changes in nashorn to >> hidden class >> and I will update the webrev if JEP 372 removes it any time soon. 
>> >> We uncovered a bug in Lookup::defineClass spec throws LinkageError and >> intends >> to have the newly created class linked.? However, the implementation >> in 14 >> does not link the class.? A separate CSR [2] proposes to update the >> implementation to match the spec.? This patch fixes the implementation. >> >> The spec update on JVM TI, JDI and Instrumentation will be done as >> a separate RFE [3].? This patch includes new tests for JVM TI and >> java.instrument that validates how the existing APIs work for hidden >> classes. >> >> javadoc/specdiff >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ >> >> >> JVMS 5.4.4 change: >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf >> >> >> CSR: >> https://bugs.openjdk.java.net/browse/JDK-8238359 >> >> Thanks >> Mandy >> [1] https://bugs.openjdk.java.net/browse/JDK-8238359 >> [2] https://bugs.openjdk.java.net/browse/JDK-8240338 >> [3] https://bugs.openjdk.java.net/browse/JDK-8230502 > From magnus.ihse.bursie at oracle.com Mon Mar 30 12:14:44 2020 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Mon, 30 Mar 2020 14:14:44 +0200 Subject: Discussion about fixing deprecation in jdk.hotspot.agent In-Reply-To: References: Message-ID: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com> No opinions on this? /Magnus On 2020-03-25 23:34, Magnus Ihse Bursie wrote: > Hi everyone, > > As a follow-up to the ongoing review for JDK-8241618, I have also > looked at fixing the deprecation warnings in jdk.hotspot.agent. These > fall in three broad categories: > > * Deprecation of the boxing type constructors (e.g. "new Integer(42)"). > > * Deprecation of java.util.Observer and Observable. > > * The rest (mostly Class.newInstance(), and a few number of other odd > deprecations) > > The first category is trivial to fix. The last category need some > special discussion. 
But the overwhelming majority of deprecation
> warnings come from the use of Observer and Observable. This really
> dwarfs anything else, and needs to be handled first, otherwise it's
> hard to even spot the other issues.
>
> My analysis of the situation is that the deprecation of Observer and
> Observable seems a bit harsh, from the PoV of jdk.hotspot.agent. Sure,
> it might be limited, but I think it does exactly what is needed here.
> So the migration suggested in Observable (java.beans or
> java.util.concurrent) seems overkill. If there are genuine threading
> issues at play here, this assumption might be wrong, and then maybe
> going the j.u.c. route is correct.
>
> But if that's not the case, the main goal should be to stay with the
> current implementation. One way to do this is to sprinkle the code with
> @SuppressWarnings. But I think a better way would be to just implement
> our own Observer and Observable. After all, the classes are trivial.
>
> I've made a mock-up of this solution, where I just copied
> java.util.Observer and Observable, and removed the deprecation
> annotations. The only thing needed for the rest of the code is to make
> sure we import these; I've done this for three arbitrarily selected
> classes just to show what the change would typically look like. Here's
> the mock-up:
>
> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>
> Let me know what you think.
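To make the "implement our own Observer and Observable" idea concrete, a minimal sketch follows. It is not the code from the mock-up webrev: the names `SimpleObserver` and `SimpleObservable` are made up for illustration, and the behaviour only approximates java.util.Observable (a changed flag, and observers called outside the lock, here in registration order).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for java.util.Observer, without the deprecation.
interface SimpleObserver {
    void update(SimpleObservable source, Object arg);
}

// Hypothetical stand-in for java.util.Observable.
class SimpleObservable {
    private final List<SimpleObserver> observers = new ArrayList<>();
    private boolean changed;

    public synchronized void addObserver(SimpleObserver observer) {
        if (observer == null) {
            throw new NullPointerException("observer");
        }
        if (!observers.contains(observer)) {
            observers.add(observer);
        }
    }

    public synchronized void deleteObserver(SimpleObserver observer) {
        observers.remove(observer);
    }

    // Marks this object as changed; notifyObservers does nothing otherwise.
    protected synchronized void setChanged() {
        changed = true;
    }

    public synchronized boolean hasChanged() {
        return changed;
    }

    public void notifyObservers(Object arg) {
        SimpleObserver[] snapshot;
        synchronized (this) {
            if (!changed) {
                return;
            }
            // Copy the list so observers run without holding the lock.
            snapshot = observers.toArray(new SimpleObserver[0]);
            changed = false;
        }
        for (SimpleObserver observer : snapshot) {
            observer.update(this, arg);
        }
    }
}
```

With types like these, existing code would only need to switch its imports, which matches the change described for the three sample classes.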
> > /Magnus From coleen.phillimore at oracle.com Mon Mar 30 14:18:58 2020 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Mar 2020 10:18:58 -0400 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: <546cf8e4-00e4-1d22-6402-6620b8d7b2db@oracle.com> References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <546cf8e4-00e4-1d22-6402-6620b8d7b2db@oracle.com> Message-ID: <9cd71367-edc6-efc8-0a53-2e703ffbbfab@oracle.com> On 3/30/20 5:54 AM, David Holmes wrote: > Sorry to jump in on this but it caught my eye though I may be missing > a larger context ... > > On 30/03/2020 7:30 pm, serguei.spitsyn at oracle.com wrote: >> Hi Mandy, >> >> I have just one comment so far. >> >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classLoaderHierarchyDCmd.cpp.frames.html >> >> >> ??356?? void add_classes(LoadedClassInfo* first_class, int >> num_classes, bool has_class_mirror_holder) { >> ??357???? LoadedClassInfo** p_list_to_add_to; >> ??358???? bool is_hidden = first_class->_klass->is_hidden(); >> ??359???? if (has_class_mirror_holder) { >> ??360?????? p_list_to_add_to = is_hidden ? &_hidden_weak_classes : >> &_anon_classes; >> ??361???? } else { >> ??362?????? p_list_to_add_to = &_classes; >> ??363???? } >> ??364???? // Search tail. >> ??365???? while ((*p_list_to_add_to) != NULL) { >> ??366?????? p_list_to_add_to = &(*p_list_to_add_to)->_next; >> ??367???? } >> ??368???? *p_list_to_add_to = first_class; >> ??369???? if (has_class_mirror_holder) { >> ??370?????? if (is_hidden) { >> ??371???????? _num_hidden_weak_classes += num_classes; > > Why does hidden imply weak here? has_class_mirror_holder() implies weak. Coleen > > David > ----- > >> ??372?????? } else { >> ??373???????? _num_anon_classes += num_classes; >> ??374?????? } >> ??375???? } else { >> ??376?????? _num_classes += num_classes; >> ??377???? } >> ??378?? 
} >> >> ??Q1: I'm just curious, what happens if a cld has arrays of hidden >> classes? >> ????? Is the bottom_klass always expected to be the first? >> >> >> Thanks, >> Serguei >> >> >> On 3/26/20 16:57, Mandy Chung wrote: >>> Please review the implementation of JEP 371: Hidden Classes. The >>> main changes are in core-libs and hotspot runtime area.? Small >>> changes are made in javac, VM compiler (intrinsification of >>> Class::isHiddenClass), JFR, JDI, and jcmd.? CSR [1]has been reviewed >>> and is in the finalized state (see specdiff and javadoc below for >>> reference). >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 >>> >>> >>> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's >>> point >>> of view, a hidden class is a normal class except the following: >>> >>> - A hidden class has no initiating class loader and is not >>> registered in any dictionary. >>> - A hidden class has a name containing an illegal character >>> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` >>> returns "Lp/Foo.0x1234;". >>> - A hidden class is not modifiable, i.e. cannot be redefined or >>> retransformed. JVM TI IsModifableClass returns false on a hidden. >>> - Final fields in a hidden class is "final".? The value of final >>> fields cannot be overriden via reflection. setAccessible(true) can >>> still be called on reflected objects representing final fields in a >>> hidden class and its access check will be suppressed but only have >>> read-access (i.e. can do Field::getXXX but not setXXX). >>> >>> Brief summary of this patch: >>> >>> 1. A new Lookup::defineHiddenClass method is the API to create a >>> hidden class. >>> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG >>> option that >>> ?? can be specified when creating a hidden class. >>> 3. A new Class::isHiddenClass method tests if a class is a hidden >>> class. >>> 4. 
>>>    Field::setXXX method will throw IAE on a final field of a hidden class
>>>    regardless of the value of the accessible flag.
>>> 5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass
>>>    and defineHiddenClass to create a class from the given bytes.
>>> 6. ClassLoaderData implementation is not changed. There is one primary CLD
>>>    that holds the classes strongly referenced by its defining loader. There
>>>    can be zero or more additional CLDs - one per weak class.
>>> 7. Nest host determination is updated per revised JVMS 5.4.4. Access control
>>>    check no longer throws LinkageError but instead it will throw IAE with
>>>    a clear message if a class fails to resolve/validate the nest host declared
>>>    in NestHost/NestMembers attribute.
>>> 8. JFR, jcmd, JDI are updated to support hidden classes.
>>> 9. update javac LambdaToMethod as lambda proxy starts using nestmates
>>>    and generate a bridge method to desugar a method reference to a protected
>>>    method in its supertype in a different package
>>>
>>> This patch also updates StringConcatFactory, LambdaMetaFactory, and LambdaForms
>>> to use hidden classes. The webrev includes changes in nashorn to use hidden
>>> classes and I will update the webrev if JEP 372 removes it any time soon.
>>>
>>> We uncovered a bug in Lookup::defineClass: the spec throws LinkageError and
>>> intends to have the newly created class linked. However, the implementation
>>> in 14 does not link the class. A separate CSR [2] proposes to update the
>>> implementation to match the spec. This patch fixes the implementation.
>>>
>>> The spec update on JVM TI, JDI and Instrumentation will be done as
>>> a separate RFE [3]. This patch includes new tests for JVM TI and
>>> java.instrument that validate how the existing APIs work for hidden classes.
>>>
>>> javadoc/specdiff
>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/
>>>
>>> JVMS 5.4.4 change:
>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf
>>>
>>> CSR:
>>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>>
>>> Thanks
>>> Mandy
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
>>

From coleen.phillimore at oracle.com Mon Mar 30 14:20:01 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Mar 2020 10:20:01 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To:
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com>
Message-ID:

Adding back serviceability-dev. Sometimes reply (and myself) remembers it and sometimes it strips it off....

Coleen

On 3/30/20 10:16 AM, coleen.phillimore at oracle.com wrote:
>
>
> On 3/29/20 10:17 PM, Mandy Chung wrote:
>>
>>
>> On 3/27/20 8:51 PM, Chris Plummer wrote:
>>> Hi Mandy,
>>>
>>> A couple of very minor nits in the jvmtiRedefineClasses.cpp comments:
>>>
>>>     // classes for primitives, arrays, hidden and vm unsafe anonymous classes
>>>     // cannot be redefined.  Check here so following code can assume these classes
>>>     // are InstanceKlass.
>>>     if (!is_modifiable_class(mirror)) {
>>>       _res = JVMTI_ERROR_UNMODIFIABLE_CLASS;
>>>       return false;
>>>     }
>>>
>>> I think this code and comment predate anonymous classes. Probably
>>> before anonymous classes the check was not for
>>> !is_modifiable_class() but instead was just a check for primitive or
>>> array class types since they are not an InstanceKlass, and would
>>> cause issues when cast to one in the code that lies below this
>>> section. When anonymous classes were added, the code got changed to
>>> use !is_modifiable_class() and the comment was not correctly updated
>>> (anonymous classes are an InstanceKlass). Then with this webrev the
>>> mention of hidden classes was added, also incorrectly implying they
>>> are not an InstanceKlass. I think you should just leave off the last
>>> sentence of the comment.
>>>
>>
>> I agree with you that this comment needs an update. Perhaps it should
>> say "primitive, array types and hidden classes are non-modifiable. A
>> modifiable class must be an InstanceKlass."
>
> I may have written the last part of that comment (or remember it at
> least). I think Chris's suggestion to remove the last sentence makes
> sense. Anything further will just add unnecessary confusion for the
> reader. Anyone modifying this will get the InstanceKlass::cast()
> assert soon after if they mess up.
>
> Coleen
>
>>
>> I leave it to Serguei who may have another opinion.
>>
>>> There's some ambiguity in the application of adjectives in the
>>> following:
>>>
>>>   // Cannot redefine or retransform a hidden or an unsafe anonymous class.
>>>
>>> I'd suggest:
>>>
>>>   // Cannot redefine or retransform a hidden class or an unsafe anonymous class.
>>>
>>
>> +1
>>
>>> There are some places in libjdwp that need to be fixed. I spoke to
>>> Serguei about those this afternoon. Basically the
>>> convertSignatureToClassname() function needs to be fixed to handle
>>> hidden classes. Without the fix classname filtering will have
>>> problems if the filter contains a pattern with a '/' to filter on
>>> hidden classes. Also CLASS_UNLOAD events will not properly convert
>>> hidden class names.
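For illustration only, and not the libjdwp code (which is C): the name mapping described above could be sketched in Java as follows. The class and method names (`SignatureNames`, `toClassName`) are hypothetical. The sketch assumes only the two forms quoted in this thread, a JVM TI signature such as "Lp/Foo.0x1234;" and a `Class::getName` result such as `p.Foo/0x1234`, so it swaps '/' and '.' while stripping the leading 'L' and trailing ';'.

```java
/** Hypothetical helper; a sketch of the mapping, not the libjdwp implementation. */
public final class SignatureNames {
    private SignatureNames() {}

    /**
     * Converts a class signature of the form "Lpkg/Name;" to a class name.
     * A '.' inside the signature (only seen for hidden classes) becomes '/',
     * so "Lp/Foo.0x1234;" maps to "p.Foo/0x1234".
     */
    public static String toClassName(String signature) {
        if (signature == null || signature.length() < 3
                || signature.charAt(0) != 'L'
                || signature.charAt(signature.length() - 1) != ';') {
            throw new IllegalArgumentException("not a class signature: " + signature);
        }
        StringBuilder name = new StringBuilder(signature.length() - 2);
        // Skip the leading 'L' and the trailing ';'.
        for (int i = 1; i < signature.length() - 1; i++) {
            char c = signature.charAt(i);
            if (c == '/') {
                name.append('.');      // package separator
            } else if (c == '.') {
                name.append('/');      // hidden-class suffix separator
            } else {
                name.append(c);
            }
        }
        return name.toString();
    }
}
```

A filter pattern containing '/' would then be matched against the converted name rather than the raw signature; whether that is the right place for the conversion in libjdwp is left open here.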
>>> We also need tests for these cases. I think
>>> these are all things that can be addressed later.
>>>
>>
>> Good catch. I have created a subtask under JDK-8230502:
>>    https://bugs.openjdk.java.net/browse/JDK-8230502
>>
>>> I still need to look over the JVMTI tests.
>>>
>>
>> Thanks
>> Mandy
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/26/20 4:57 PM, Mandy Chung wrote:
>>>> Please review the implementation of JEP 371: Hidden Classes. The
>>>> main changes are in core-libs and hotspot runtime area. Small
>>>> changes are made in javac, VM compiler (intrinsification of
>>>> Class::isHiddenClass), JFR, JDI, and jcmd. CSR [1] has been
>>>> reviewed and is in the finalized state (see specdiff and javadoc
>>>> below for reference).
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03
>>>>
>>>> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's point
>>>> of view, a hidden class is a normal class except the following:
>>>>
>>>> - A hidden class has no initiating class loader and is not
>>>>   registered in any dictionary.
>>>> - A hidden class has a name containing an illegal character:
>>>>   `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
>>>>   returns "Lp/Foo.0x1234;".
>>>> - A hidden class is not modifiable, i.e. cannot be redefined or
>>>>   retransformed. JVM TI IsModifiableClass returns false on a hidden class.
>>>> - Final fields in a hidden class are "final". The value of final
>>>>   fields cannot be overridden via reflection. setAccessible(true) can
>>>>   still be called on reflected objects representing final fields in a
>>>>   hidden class and its access check will be suppressed but only have
>>>>   read-access (i.e. can do Field::getXXX but not setXXX).
>>>>
>>>> Brief summary of this patch:
>>>>
>>>> 1. A new Lookup::defineHiddenClass method is the API to create a
>>>>    hidden class.
>>>> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG
>>>>    option that can be specified when creating a hidden class.
>>>> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
>>>> 4. Field::setXXX method will throw IAE on a final field of a hidden class
>>>>    regardless of the value of the accessible flag.
>>>> 5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass
>>>>    and defineHiddenClass to create a class from the given bytes.
>>>> 6. ClassLoaderData implementation is not changed. There is one primary CLD
>>>>    that holds the classes strongly referenced by its defining loader. There
>>>>    can be zero or more additional CLDs - one per weak class.
>>>> 7. Nest host determination is updated per revised JVMS 5.4.4. Access control
>>>>    check no longer throws LinkageError but instead it will throw IAE with
>>>>    a clear message if a class fails to resolve/validate the nest host declared
>>>>    in NestHost/NestMembers attribute.
>>>> 8. JFR, jcmd, JDI are updated to support hidden classes.
>>>> 9. update javac LambdaToMethod as lambda proxy starts using nestmates
>>>>    and generate a bridge method to desugar a method reference to a protected
>>>>    method in its supertype in a different package
>>>>
>>>> This patch also updates StringConcatFactory, LambdaMetaFactory, and LambdaForms
>>>> to use hidden classes. The webrev includes changes in nashorn to use hidden
>>>> classes and I will update the webrev if JEP 372 removes it any time soon.
>>>>
>>>> We uncovered a bug in Lookup::defineClass: the spec throws LinkageError and
>>>> intends to have the newly created class linked. However, the implementation
>>>> in 14 does not link the class. A separate CSR [2] proposes to update the
>>>> implementation to match the spec. This patch fixes the implementation.
>>>>
>>>> The spec update on JVM TI, JDI and Instrumentation will be done as
>>>> a separate RFE [3]. This patch includes new tests for JVM TI and
>>>> java.instrument that validate how the existing APIs work for
>>>> hidden classes.
>>>>
>>>> javadoc/specdiff
>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/
>>>>
>>>> JVMS 5.4.4 change:
>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf
>>>>
>>>> CSR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>>>
>>>> Thanks
>>>> Mandy
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
>>>
>>>
>>
>

From magnus.ihse.bursie at oracle.com Mon Mar 30 14:25:13 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Mon, 30 Mar 2020 16:25:13 +0200
Subject: RFR: JDK-8241618 Fix unchecked warning for jdk.hotspot.agent
In-Reply-To: <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com>
References: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com> <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com>
Message-ID:

On 2020-03-25 20:52, Chris Plummer wrote:
> Hi Magnus,
>
> I haven't looked at the changes yet, other than to see that there are many
> files touched, but after reading below (and only partly understanding
> since I don't know this area well), I was wondering if this issue
> wouldn't be better served with multiple passes made to fix the
> warnings. Start with a straightforward one where you are maybe only
> making one or two types of changes, but that affect a large number of
> files and don't cascade into other more complicated changes. This will
> get a lot of the noise out of the way, and then we can focus on some
> of the harder issues you bring up below.

Ok, I did just this. Here is an updated webrev. It contains the bulk of the changes, but all changes are -- I dare not say trivially obvious, but at least no-brainers.
Hopefully it should be easier to review so I can get this pushed and out of the way. This also means that it is not possible to turn on the warning just yet.

http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.02

/Magnus

>
> As for testing, I think the following list will capture all of them,
> but can't say for sure:
>
> open/test/hotspot/jtreg/serviceability/sa
> open/test/hotspot/jtreg/resourcehogs/serviceability/sa
> open/test/jdk/sun/tools/jhsdb
> open/test/jdk/sun/tools/jstack
> open/test/jdk/sun/tools/jmap
> open/test/hotspot/jtreg/gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java
> open/test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java
> open/test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java
>
> Chris
>
> On 3/25/20 12:29 PM, Magnus Ihse Bursie wrote:
>> With the recent fixes in JDK-8241310, JDK-8237746 and JDK-8241073,
>> and the upcoming fixes to remove the deprecated nashorn and jdk.rmi,
>> the JDK build is very close to producing no warnings when compiling
>> the Java classes.
>>
>> The one remaining sinner is jdk.hotspot.agent. Most of the warnings
>> here are turned off, but unchecked and deprecation cannot be
>> completely silenced.
>>
>> Since the poor agent does not seem to receive much love nowadays, I
>> took it upon myself to fix these warnings, so we can finally get a
>> quiet build.
>>
>> I started to address the unchecked warnings. Unfortunately, this was
>> a much bigger task than I anticipated. I had to generify most of the
>> module. On the plus side, the code is so much better now. And most of
>> the changes were trivial, just tedious.
>>
>> There are a few places where I'm not entirely happy with the current
>> solution, and that at least merits some discussion.
>>
>> I have resorted to @SuppressWarnings in four classes: ciMethodData,
>> MethodData, TableModelComparator and VirtualBaseConstructor. All of
>> them have in common that they are doing slightly fishy things with
>> classes in collections. I'm not entirely sure they are bug-free, but
>> this patch leaves the behavior untouched. I made some effort to sort
>> out the logic, but it turned out to be too hairy for me to fix, and
>> it will probably require more substantial changes to the workings of
>> the code.
>>
>> To make the code valid, I have moved ConstMethod to extend Metadata
>> instead of VMObject. My understanding is that this is benign (and
>> likely intended), but I really need someone who knows the code to
>> confirm this. I have also added a FIXME to signal this. I'll remove
>> the FIXME as soon as I get confirmation that this is OK.
>> (The reason for this is the following piece of code from
>> Metadata.java: metadataConstructor.addMapping("ConstMethod",
>> ConstMethod.class))
>>
>> In ObjectListPanel, there is some code that screams "dead" with this
>> change. I added a FIXME to point this out:
>>     for (Iterator iter = elements.iterator(); iter.hasNext(); ) {
>>       if (iter.next() instanceof Array) {
>>         // FIXME: Does not seem possible to happen
>>         hasArrays = true;
>>         return;
>>       }
>> It seems that if you start pulling this thread, even more dead code
>> will unravel, so I'm not so eager to touch this in the current patch.
>> But I can remove the FIXME if you want.
>>
>> My first iteration of this patch tried to generify the IntervalTree
>> and related class hierarchy. However, this turned out to be
>> impossible due to some weird usage in AnnotatedMemoryPanel, where
>> there seemed to be confusion as to whether the tree stored
>> Annotations or Addresses. I'm not entirely convinced the code is
>> correct, it certainly looked and smelled very fishy. However, I
>> reverted these changes since I could not get them to work due to
>> this, and it was not needed for the goal of just getting rid of the
>> warning.
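As a rough illustration of the mechanical kind of change involved in silencing unchecked warnings, and not code taken from the webrev: a raw-`Iterator` loop like the ObjectListPanel fragment quoted above can be written against a typed collection. The names `ArrayScan`, `hasArrays` and the nested `Array` class are hypothetical stand-ins, and the element type is assumed to be unknown (`List<?>`).

```java
import java.util.List;

/** Hypothetical sketch; not the ObjectListPanel code from the SA. */
public final class ArrayScan {
    private ArrayScan() {}

    /** Stand-in for the SA's Array type, defined here so the sketch compiles alone. */
    public static final class Array {}

    /**
     * Typed replacement for a raw-Iterator loop: returns true if any
     * element of the list is an Array. No raw types are used, so javac
     * reports no unchecked warning for this method.
     */
    public static boolean hasArrays(List<?> elements) {
        for (Object element : elements) {
            if (element instanceof Array) {
                return true;
            }
        }
        return false;
    }
}
```

Once the collection is declared with a more precise element type than `?`, it becomes easier to see whether the `instanceof` test can ever succeed, which is the dead-code question the FIXME raises.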
>>
>> Finally, I have done no testing apart from verifying that it builds.
>> Please advise on suitable tests to run.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8241618
>> WebRev:
>> http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.01
>>
>> /Magnus
>
>

From coleen.phillimore at oracle.com Mon Mar 30 15:02:02 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Mar 2020 11:02:02 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <81f90a3e-dfda-0566-a9b2-ea0c4f17e7ac@oracle.com>

Hi, This is great work! I did a prereview and all of my comments were addressed. These are a few minor things I noticed.

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/ci/ciInstanceKlass.hpp.udiff.html

Nit. Can you add 'const' to the is_hidden accessor?

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classFileParser.cpp.udiff.html

+ ID annotation_index(const ClassLoaderData* loader_data, const Symbol* name, const bool can_access_vm_annotations);

'const' bool is weird and unnecessary. Can you remove const here?

+ if (is_hidden()) { // Mark methods in hidden classes as 'hidden'.
+   m->set_hidden(true);
+ }
+

Could be:

+ // Mark methods in hidden classes as 'hidden'.
+ m->set_hidden(is_hidden());
+

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/javaClasses.cpp.udiff.html

+ macro(_classData_offset, k, "classData", object_signature, false); \

Probably should remove trailing backslash here.

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/systemDictionary.cpp.udiff.html

I think in a future RFE, we should add a default parameter to register_loader to make the code in the beginning of parse_stream() cleaner and remove has_class_mirror_holder_cld().

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/prims/jvm.cpp.udiff.html

+ jboolean is_nestmate = (flags & NESTMATE) == NESTMATE;
+ jboolean is_hidden = (flags & HIDDEN_CLASS) == HIDDEN_CLASS;
+ jboolean is_strong = (flags & STRONG_LOADER_LINK) == STRONG_LOADER_LINK;
+ jboolean vm_annotations = (flags & ACCESS_VM_ANNOTATIONS) == ACCESS_VM_ANNOTATIONS;

Instead of jboolean, please use C++ bool here.

+ oop loader = lookup_k->class_loader();
+ Handle class_loader (THREAD, loader);

Can you rewrite it like this to prevent a potential unhandled oop for oop loader?

+ Handle class_loader (THREAD, lookup_k->class_loader());

Here:

+ InstanceKlass::cast(defined_k)->class_loader_data()->dec_keep_alive();

Don't have to cast defined_k to get class_loader_data(), but you probably just want to move this up to remove the rest of the InstanceKlass::cast().

+ InstanceKlass* ik = InstanceKlass::cast(defined_k);

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/runtime/vmStructs.cpp.udiff.html
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/classfile/ClassLoaderData.java.udiff.html

We agreed already that these changes aren't needed by the SA. You can revert these.

These are minor changes. I don't need to see another webrev.

Thanks,
Coleen

From coleen.phillimore at oracle.com Mon Mar 30 15:23:19 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Mar 2020 11:23:19 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <81f90a3e-dfda-0566-a9b2-ea0c4f17e7ac@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <81f90a3e-dfda-0566-a9b2-ea0c4f17e7ac@oracle.com>
Message-ID: <42af77ec-0f03-a4b0-164d-3b25c14c7f37@oracle.com>

Adding back hotspot-dev.

From mandy.chung at oracle.com Mon Mar 30 16:18:48 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 30 Mar 2020 09:18:48 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To:
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com>
Message-ID: <0b585750-54f9-aaa0-19b3-752f723894d1@oracle.com>

On 3/30/20 7:16 AM, coleen.phillimore at oracle.com wrote:
>> I agree with you that this comment needs update. Perhaps it should
>> say "primitive, array types and hidden classes are non-modifiable. A
>> modifiable class must be an InstanceKlass."
>
> I may have written the last part of that comment (or remember it at
> least). I think Chris's suggestion to remove the last sentence makes
> sense. Anything further will just add unnecessary confusion to the
> reader. Anyone modifying this will get the InstanceKlass::cast()
> assert soon after if they mess up.

OK. That's fine too.

Mandy
URL: From serguei.spitsyn at oracle.com Mon Mar 30 16:19:19 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 30 Mar 2020 09:19:19 -0700 Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes In-Reply-To: References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> Message-ID: <50b1658d-2195-53af-ea0b-e13842e00496@oracle.com> On 3/30/20 02:30, serguei.spitsyn at oracle.com wrote: > Hi Mandy, > > I have just one comment so far. > > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classLoaderHierarchyDCmd.cpp.frames.html > > > ?356?? void add_classes(LoadedClassInfo* first_class, int num_classes, > bool has_class_mirror_holder) { > ?357???? LoadedClassInfo** p_list_to_add_to; > ?358???? bool is_hidden = first_class->_klass->is_hidden(); > ?359???? if (has_class_mirror_holder) { > ?360?????? p_list_to_add_to = is_hidden ? &_hidden_weak_classes : > &_anon_classes; > ?361???? } else { > ?362?????? p_list_to_add_to = &_classes; > ?363???? } > ?364???? // Search tail. > ?365???? while ((*p_list_to_add_to) != NULL) { > ?366?????? p_list_to_add_to = &(*p_list_to_add_to)->_next; > ?367???? } > ?368???? *p_list_to_add_to = first_class; > ?369???? if (has_class_mirror_holder) { > ?370?????? if (is_hidden) { > ?371???????? _num_hidden_weak_classes += num_classes; > ?372?????? } else { > ?373???????? _num_anon_classes += num_classes; > ?374?????? } > ?375???? } else { > ?376?????? _num_classes += num_classes; > ?377???? } > ?378?? } > > ?Q1: I'm just curious, what happens if a cld has arrays of hidden > classes? > ???? Is the bottom_klass always expected to be the first? Please, skip it. I've got the answer. The array classes were not included into the LoadedClassInfo* by the classes_do. Thanks, Serguei > > Thanks, > Serguei > > > On 3/26/20 16:57, Mandy Chung wrote: >> Please review the implementation of JEP 371: Hidden Classes. 
The main >> changes are in core-libs and hotspot runtime area.? Small changes are >> made in javac, VM compiler (intrinsification of >> Class::isHiddenClass), JFR, JDI, and jcmd.? CSR [1]has been reviewed >> and is in the finalized state (see specdiff and javadoc below for >> reference). >> >> Webrev: >> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 >> >> >> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's >> point >> of view, a hidden class is a normal class except the following: >> >> - A hidden class has no initiating class loader and is not registered >> in any dictionary. >> - A hidden class has a name containing an illegal character >> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` >> returns "Lp/Foo.0x1234;". >> - A hidden class is not modifiable, i.e. cannot be redefined or >> retransformed. JVM TI IsModifableClass returns false on a hidden. >> - Final fields in a hidden class is "final".? The value of final >> fields cannot be overriden via reflection.? setAccessible(true) can >> still be called on reflected objects representing final fields in a >> hidden class and its access check will be suppressed but only have >> read-access (i.e. can do Field::getXXX but not setXXX). >> >> Brief summary of this patch: >> >> 1. A new Lookup::defineHiddenClass method is the API to create a >> hidden class. >> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG >> option that >> ?? can be specified when creating a hidden class. >> 3. A new Class::isHiddenClass method tests if a class is a hidden class. >> 4. Field::setXXX method will throw IAE on a final field of a hidden >> class >> ?? regardless of the value of the accessible flag. >> 5. JVM_LookupDefineClass is the new JVM entry point for >> Lookup::defineClass >> ?? and defineHiddenClass to create a class from the given bytes. >> 6. ClassLoaderData implementation is not changed.? There is one >> primary CLD >> ?? 
that holds the classes strongly referenced by its defining >> loader.? There >> ?? can be zero or more additional CLDs - one per weak class. >> 7. Nest host determination is updated per revised JVMS 5.4.4. Access >> control >> ?? check no longer throws LinkageError but instead it will throw IAE >> with >> ?? a clear message if a class fails to resolve/validate the nest host >> declared >> ?? in NestHost/NestMembers attribute. >> 8. JFR, jcmd, JDI are updated to support hidden classes. >> 9. update javac LambdaToMethod as lambda proxy starts using nestmates >> ?? and generate a bridge method to desuger a method reference to a >> protected >> ?? method in its supertype in a different package >> >> This patch also updates StringConcatFactory, LambdaMetaFactory, and >> LambdaForms >> to use hidden classes.? The webrev includes changes in nashorn to >> hidden class >> and I will update the webrev if JEP 372 removes it any time soon. >> >> We uncovered a bug in Lookup::defineClass spec throws LinkageError >> and intends >> to have the newly created class linked.? However, the implementation >> in 14 >> does not link the class.? A separate CSR [2] proposes to update the >> implementation to match the spec.? This patch fixes the implementation. >> >> The spec update on JVM TI, JDI and Instrumentation will be done as >> a separate RFE [3].? This patch includes new tests for JVM TI and >> java.instrument that validates how the existing APIs work for hidden >> classes. 
>>
>> javadoc/specdiff
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/
>>
>>
>> JVMS 5.4.4 change:
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf
>>
>>
>> CSR:
>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>
>> Thanks
>> Mandy
>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
>

From chris.plummer at oracle.com Mon Mar 30 18:39:19 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 30 Mar 2020 11:39:19 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To: <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com> <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
Message-ID: <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>

Hi Leonid,

I haven't gone through all the tests yet. I've accumulated enough
questions that I'd like to see them answered or addressed before I
continue on.

This isn't directly related to your changes, but I noticed that users
of JDKToolLauncher do nothing to make sure that default test options
are used. This means we are never running these tools with the test
options being specified with the jtreg run. Is that a bug or intentional?

In the problem lists, is it necessary to list the test multiple times
with #id0, #id1, etc., or could you list it just once and leave that
part off? It seems very error prone.
Also, changing tests like ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops
to split out the testing in this manner seems completely unrelated to
this CR, especially when the tests do not even contain any changes
related to the CR.

    public static LingeredApp startApp(String... additionalJvmOpts) throws IOException {

The default test opts are appended to additionalJvmOpts, and if you
want them prepended you need to call Utils.prependTestJavaOpts(). I would
have thought the opposite would be more desirable and expected default
behavior. Why did you choose this way? I also find it somewhat
confusing that there is even a default mode for where the
additionalJvmOpts go. Maybe it would be best to have
startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make it
explicit. This would also be in line with the existing
startAppExactJvmOpts().

Is ClhsdbFindPC correct? It used to use just -Xcomp or -Xint,
ignoring any default test opts. You've fixed it to include the default
test opts, but they are appended, possibly overriding the -Xcomp or
-Xint. Don't we want the default test opts prepended? Same for
ClhsdbJstack.

thanks,

Chris

On 3/25/20 2:31 PM, Leonid Mesnik wrote:
>
> Igor, Stefan, Ioi
>
> Thank you for your feedback.
>
> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change @run
> main... to @run driver.
>
> Test ClhsdbJstack.java is updated.
>
> Still waiting for review from SVC team.
> > webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/ > > Leonid > > On 3/25/20 12:46 PM, Igor Ignatyev wrote: >> Hi Leonid, >> >> not related related to your patch (but yet somewhat made more obvious >> by it), it seems all (or at least almost all) the tests which >> use?LingeredApp should be run in "driver" mode as they just >> orchestrate execution of other JVMs, so running them w/ main (let >> alone main/othervm) just wastes time, >> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for >> example, will now executed w/ Xcomp which will make it very slow for >> no reasons. since you already got your hands dirty w/ these tests, >> could you please file an RFE to sort this out and list all the >> affected tests there? >> >> re: the patch, could you please update ClhsdbJstack.java test not to >> be run w/ Xcomp and follow the same pattern you used in other tests >> (e.g.?ClhsdbScanOops) ? other than that it looks fine to me, I >> however wouldn't be able to tell if all svc tests continue to do that >> they were supposed to, so I'd prefer for someone from svc team >> to?chime in. >> >> Thanks, >> -- Igor >> >>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik >>> > wrote: >>> >>> Added Ioi, who also proposed new version of startAppVmOpts. >>> >>> Please find new webrev: >>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ >>> >>> Renamed startAppVmOpts/runAppVmOpts to >>> "startAppExactJvmOpts/runAppExactJvmOpts" is used. It should make >>> very clear that this method doesn't use any of test.java.opts, >>> test.vm.opts. >>> >>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java >>> metnioned by Igor, and removed null pointer check as Ioi suggested >>> in startApp method. >>> >>> + public static void startApp(LingeredApp theApp, String... 
>>> additionalJvmOpts) throws IOException { >>> + startAppExactJvmOpts(theApp, >>> Utils.appendTestJavaOpts(additionalJvmOpts)); >>> + } >>> >>> Leonid >>> >>> On 3/25/20 10:14 AM, Stefan Karlsson wrote: >>>> On 2020-03-25 17:40, Igor Ignatyev wrote: >>>>> Hi Leonid, >>>>> >>>>> I have briefly looked at the patch, a few comments so far: >>>>> >>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>>>> ? - at L#114, could you please call static method using class name >>>>> (as the opposite of using instance)? or was it meant to be >>>>> theApp.runAppVmOpts(vmArgs) ? >>>>> >>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>>>> - it seems that code indent of startApp(LingeredApp, String[]) >>>>> isn't correct >>>>> - I don't like startAppVmOpts name, but unfortunately don't have a >>>>> better suggestion (yet) >>>> >>>> I was going to say the same. Jtreg has the concept of "java >>>> options" and "vm options". We have had a fair share of bugs and >>>> wasted time when tests have been using the "vm options" part >>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away from >>>> using that way to pass options. I recently cleaned up some of this >>>> with: >>>> >>>> 8237111: LingeredApp should be started with getTestJavaOpts >>>> >>>> Because of this, I would prefer if we used a name that doesn't >>>> include "VmOpts", because it's too alike the other concept. Some >>>> suggestions: >>>> ?startAppJavaOptions >>>> ?startAppUsingJavaOptions >>>> ?startAppWithJavaOptions >>>> ?startAppExactJavaOptions >>>> ?startAppJvmOptions >>>> >>>> Thanks, >>>> StefanK >>>> >>>>> Thanks, >>>>> -- Igor >>>>> >>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik >>>>>> wrote: >>>>>> >>>>>> Hi >>>>>> >>>>>> Could you please review following fix which change LingeredApp to >>>>>> prepend vm options to java/vm.test.opts when startApp is used and >>>>>> provide startAppVmOpts to override options completely. 
>>>>>>
>>>>>> The intention is to avoid issue like in this bug where test/jtreg
>>>>>> options were ignored by tests. Also I fixed some tests where
>>>>>> intention was to append vm options rather than to override them.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>
>>>>>> Leonid
>>>>>>
>>>>
>>

From coleen.phillimore at oracle.com Mon Mar 30 19:04:34 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Mar 2020 15:04:34 -0400
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
References: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
Message-ID:

I was wondering why this is needed when debugging a core file, which is
the key thing we need the SA for:

  /** This is used by both the debugger and any runtime system. It is
      the basic mechanism by which classes which mimic underlying VM
      functionality cause themselves to be initialized. The given
      observer will be notified (with arguments (null, null)) when the
      VM is re-initialized, as well as when it registers itself with
      the VM. */
  public static void registerVMInitializedObserver(Observer o) {
    vmInitializedObservers.add(o);
    o.update(null, null);
  }

It seems like if it isn't needed, we shouldn't add these classes and
should remove their use instead.

Coleen

On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
> No opinions on this?
>
> /Magnus
>
> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>> Hi everyone,
>>
>> As a follow-up to the ongoing review for JDK-8241618, I have also
>> looked at fixing the deprecation warnings in jdk.hotspot.agent. These
>> fall in three broad categories:
>>
>> * Deprecation of the boxing type constructors (e.g. "new Integer(42)").
>>
>> * Deprecation of java.util.Observer and Observable.
>> >> * The rest (mostly Class.newInstance(), and a few number of other odd >> deprecations) >> >> The first category is trivial to fix. The last category need some >> special discussion. But the overwhelming majority of deprecation >> warnings come from the use of Observer and Observable. This really >> dwarfs anything else, and needs to be handled first, otherwise it's >> hard to even spot the other issues. >> >> My analysis of the situation is that the deprecation of Observer and >> Observable seems a bit harsh, from the PoV of jdk.hotspot.agent. >> Sure, it might be limited, but I think it does exactly what is needed >> here. So the migration suggested in Observable (java.beans or >> java.util.concurrent) seems overkill. If there are genuine threading >> issues at play here, this assumption might be wrong, and then maybe >> going the j.u.c. route is correct. >> >> But if that's not, the main goal should be to stay with the current >> implementation. One way to do this is to sprinkle the code with >> @SuppressWarning. But I think a better way would be to just implement >> our own Observer and Observable. After all, the classes are trivial. >> >> I've made a mock-up of this solution, were I just copied the >> java.util.Observer and Observable, and removed the deprecation >> annotations. The only thing needed for the rest of the code is to >> make sure we import these; I've done this for three arbitrarily >> selected classes just to show what the change would typically look >> like. Here's the mock-up: >> >> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01 >> >> Let me know what you think. 
>>
>> /Magnus
>

From mandy.chung at oracle.com Mon Mar 30 19:13:57 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 30 Mar 2020 12:13:57 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
Message-ID: <528c7933-be32-9863-6cc5-92223a75bbee@oracle.com>

This is the patch to keep the JDK 14 behavior if the target release is 14
(thanks to Jan for helping make the change in javac to get the tests
working):
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/

Mandy

On 3/27/20 9:29 AM, Mandy Chung wrote:
> Hi Jan,
>
> Good point. The javac change only applies to JDK 15 and later and the
> lambda proxy class is not a nestmate when running on JDK 14 or earlier.
>
> I probably need the help from langtools team to fix this. I'll give
> it a try.
>
> Mandy

From daniil.x.titov at oracle.com Mon Mar 30 19:43:01 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 30 Mar 2020 12:43:01 -0700
Subject: RFR: 8241530: com/sun/jdi tests fail due to network issues on OSX 10.15
Message-ID: <84D85D3D-AFCA-42BD-BD02-35604E462D5F@oracle.com>

Please review the change [1] that fixes the failure of the
com/sun/jdi/JdwpListenTest.java and com/sun/jdi/JdwpAttachTest.java tests
on OSX 10.15.

The problem here is similar to the one solved in [4] by additional
filtering of unusual network interfaces in the test library class
jdk.test.lib.NetworkConfiguration. However, the failing com/sun/jdi tests
do not use jdk.test.lib.NetworkConfiguration and instead repeat the same
logic themselves.

The fix changes these tests to start using jdk.test.lib.NetworkConfiguration to find all local addresses.
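
As a rough illustration of the kind of address enumeration the tests
previously duplicated, the sketch below lists local addresses using only
the standard java.net API and skips interfaces that are down. The class
and method names (LocalAddresses, list) are hypothetical. This is not the
jdk.test.lib.NetworkConfiguration implementation, and the extra filtering
of unusual interfaces done in [4] is not reproduced here.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

public class LocalAddresses {
    // Hypothetical helper: collect the addresses of all interfaces that are up.
    static List<InetAddress> list() throws SocketException {
        List<InetAddress> result = new ArrayList<>();
        Enumeration<NetworkInterface> nifs = NetworkInterface.getNetworkInterfaces();
        if (nifs == null) {
            // Some platforms report "no interfaces" as null.
            return result;
        }
        for (NetworkInterface nif : Collections.list(nifs)) {
            if (!nif.isUp()) {
                continue; // an interface that is down cannot be listened on
            }
            result.addAll(Collections.list(nif.getInetAddresses()));
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        for (InetAddress addr : list()) {
            System.out.println(addr);
        }
    }
}
```

Keeping this logic in one shared library class means a platform-specific
filter only has to be added once, which is the motivation stated above.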
Initially the issue [2] also included 3 other failing tests from the
sun/management/jdp package, but these tests fail for a different reason,
so I moved them to a new issue [3] and updated the ProblemList.txt for
them.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8241530/webrev.01/
[2] Jira Issue: https://bugs.openjdk.java.net/browse/JDK-8241530
[3] https://bugs.openjdk.java.net/browse/JDK-8241865
[4] https://bugs.openjdk.java.net/browse/JDK-8241336

Thank you,
Daniil

From alexey.menkov at oracle.com Mon Mar 30 20:06:48 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 30 Mar 2020 13:06:48 -0700
Subject: RFR: 8241530: com/sun/jdi tests fail due to network issues on OSX 10.15
In-Reply-To: <84D85D3D-AFCA-42BD-BD02-35604E462D5F@oracle.com>
References: <84D85D3D-AFCA-42BD-BD02-35604E462D5F@oracle.com>
Message-ID:

Looks good.

--alex

On 03/30/2020 12:43, Daniil Titov wrote:
> Please review the change [1] that fixes the failure of com/sun/jdi/JdwpListenTest.java
> and com/sun/jdi/JdwpAttachTest.java tests on OSX 10.15.
>
> The problem here is the similar to the one solved in [4] by additional filtering
> of unusual network interfaces in the test library class jdk.test.lib.NetworkConfiguration.
> However, the failing com/sun/jdi tests do not use jdk.test.lib.NetworkConfiguration and
> Instead do repeat the same logic themselves.
>
> The fix changes these tests to start using jdk.test.lib.NetworkConfiguration to find all local addresses.
>
> Initially the issue [2] also included 3 other failing tests from sun/management/jdp package, but these tests fail
> for a different reason so I moved them in the new issue [3] and updated the ProblemList.txt for them.
>
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8241530/webrev.01/
> [2] Jira Issue: https://bugs.openjdk.java.net/browse/JDK-8241530
> [3] https://bugs.openjdk.java.net/browse/JDK-8241865
> [4] https://bugs.openjdk.java.net/browse/JDK-8241336
>
> Thank you,
> Daniil
>
>

From leonid.mesnik at oracle.com Tue Mar 31 00:42:11 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Mon, 30 Mar 2020 17:42:11 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To: <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com> <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com> <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>
Message-ID:

Hi

See my comments inline. I will update the webrev after going through all
your comments.

On 3/30/20 11:39 AM, Chris Plummer wrote:
> Hi Leonid,
>
> I haven't gone through all the tests yet. I've accumulated enough
> questions that I'd like to see them answered or addressed before I
> continue on.
>
> This isn't directly related to your changes, but I noticed that users
> of JDKToolLauncher do nothing to make sure that default test options
> are used. This means we are never running these tools with the test
> options being specified with the jtreg run. Is that a bug or intentional?

Which "default test options" do you mean? We have 2 properties to set
JVM options. The idea is to pass test.vm.opts to ALL java processes and
test.java.opts only to tested processes, if applicable. Usually, for
example, we don't want to run jcmd with -Xcomp. test.vm.opts was used
(a long time ago) for options like '-d32/-d64' on Solaris, where the JVM
doesn't start without choosing the correct version.
Also, it is used to reduce the maximum heap for all JVM instances when
tests are running concurrently.

So, probably test.vm.opts (or test.vm.tools.opts) should be added by
JDKToolLauncher, but not test.java.opts. It is a separate topic; there
are a lot of launchers which ignore test.vm.opts now.

>
> In the problem lists, is it necessary to list the test multiple times
> with #id0, #id1, etc, or could you list it just once and leave that
> part off. It seems very error prone. Also, changing tests like
> ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split out the
> testing in this manner seems completely unrelated to this CR,
> especially when the tests do not even contain any changes related to
> the CR.

I think that these changes are related. The startApp(...) was updated,
so some test combinations become invalid or redundant.

ClhsdbFindPC and ClhsdbJstack were always run twice. Now that test
options are passed on by the test, it is not needed to run it twice when
Xcomp is already set by the user.

ClhsdbScanOops is fixed so that it does not run with an incompatible GC
combination.

So I should update these tests by splitting them, or change them to use
startAppExactJvmOpts() if we want to continue to ignore user-given test
options.

It seems that #idN is required by jtreg now; otherwise it just runs the
test.

>
>     public static LingeredApp startApp(String...
> additionalJvmOpts) throws IOException {
>
> The default test opts are appended to additionalJvmOpts, and if you
> want prepended you need to call Utils.prependTestJavaOpts(). I would
> have thought the opposite would be more desirable and expected default
> behavior. Why did you choose this way? I also find it somewhat
> confusing that there is even a default mode for where the
> additionalJvmOpts go. Maybe it would be best to have
> startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make it
> explicit. This would also be in line with the existing
> startAppExactJvmOpts().
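
To make the ordering question above concrete, here is a minimal sketch of
why it matters where the default test options end up relative to the
options a test passes itself. It uses a simplified model that assumes the
last of -Xint/-Xcomp/-Xmixed on the command line is the one that takes
effect. The class OptionOrder and its methods are hypothetical stand-ins
and are not the jdk.test.lib.Utils implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OptionOrder {
    // Test's own options first, default test options last ("append" behavior).
    static List<String> testOptsLast(List<String> testOpts, String... own) {
        List<String> cmd = new ArrayList<>(Arrays.asList(own));
        cmd.addAll(testOpts);
        return cmd;
    }

    // Default test options first, test's own options last ("prepend" behavior).
    static List<String> testOptsFirst(List<String> testOpts, String... own) {
        List<String> cmd = new ArrayList<>(testOpts);
        cmd.addAll(Arrays.asList(own));
        return cmd;
    }

    // Simplified model (an assumption): the last execution-mode flag wins.
    static String effectiveMode(List<String> cmd) {
        String mode = "-Xmixed"; // default when no mode flag is given
        for (String opt : cmd) {
            if (opt.equals("-Xint") || opt.equals("-Xcomp") || opt.equals("-Xmixed")) {
                mode = opt;
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        List<String> testOpts = Arrays.asList("-Xmixed", "-Xmx256m");
        // The test explicitly asks for -Xint in both cases.
        System.out.println(effectiveMode(testOptsLast(testOpts, "-Xint")));
        System.out.println(effectiveMode(testOptsFirst(testOpts, "-Xint")));
    }
}
```

Under this model, placing the default test options last lets them replace
the mode the test asked for, while placing them first leaves the test's
explicit choice in effect.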
>

I've chosen the most popular usage, which was Utils.appendTestJavaOpts.
But I agree that it would be better to change it to prepend. Thanks for
pointing this out.

I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs(), so
as not to complicate things. I think that startApp() should be used in
the cases when test vm options really shouldn't interfere with
user-provided options or overwrite them. So basically the behavior is
the same as for ProcessTools.createJavaProcessBuilder(true, ...) and
jtreg itself.

> Is ClhsdbFindPC correct. It used to use just use -Xcomp or -Xint,
> ignoring any default test opts. You've fixed it to include the default
> test opts, but the are appended, possibly overriding the -Xcomp or
> -Xint. Don't we want the default test opts prepended? Same for
> ClhsdbJstack.

The idea is to not mix Xcomp and Xmixed/Xint, by using a requires filter.
However, ClhsdbFindPC might override Xint with Xmixed if it is set
explicitly. Switching to prepending will fix it.

Leonid

>
> thanks,
>
> Chris
>
> On 3/25/20 2:31 PM, Leonid Mesnik wrote:
>>
>> Igor, Stefan, Ioi
>>
>> Thank you for your feedback.
>>
>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change @run
>> main... to @run driver.
>>
>> Test ClhsdbJstack.java is updated.
>>
>> Still waiting for review from SVC team.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
>>
>> Leonid
>>
>> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>>> Hi Leonid,
>>>
>>> not related related to your patch (but yet somewhat made more
>>> obvious by it), it seems all (or at least almost all) the tests
>>> which use LingeredApp should be run in "driver" mode as they just
>>> orchestrate execution of other JVMs, so running them w/ main (let
>>> alone main/othervm) just wastes time,
>>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for
>>> example, will now executed w/ Xcomp which will make it very slow for
>>> no reasons.
since you already got your hands dirty w/ these tests, >>> could you please file an RFE to sort this out and list all the >>> affected tests there? >>> >>> re: the patch, could you please update ClhsdbJstack.java test not to >>> be run w/ Xcomp and follow the same pattern you used in other tests >>> (e.g.?ClhsdbScanOops) ? other than that it looks fine to me, I >>> however wouldn't be able to tell if all svc tests continue to do >>> that they were supposed to, so I'd prefer for someone from svc team >>> to?chime in. >>> >>> Thanks, >>> -- Igor >>> >>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik >>>> > wrote: >>>> >>>> Added Ioi, who also proposed new version of startAppVmOpts. >>>> >>>> Please find new webrev: >>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ >>>> >>>> Renamed startAppVmOpts/runAppVmOpts to >>>> "startAppExactJvmOpts/runAppExactJvmOpts" is used. It should make >>>> very clear that this method doesn't use any of test.java.opts, >>>> test.vm.opts. >>>> >>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java >>>> metnioned by Igor, and removed null pointer check as Ioi suggested >>>> in startApp method. >>>> >>>> + public static void startApp(LingeredApp theApp, String... >>>> additionalJvmOpts) throws IOException { >>>> + startAppExactJvmOpts(theApp, >>>> Utils.appendTestJavaOpts(additionalJvmOpts)); >>>> + } >>>> >>>> Leonid >>>> >>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote: >>>>> On 2020-03-25 17:40, Igor Ignatyev wrote: >>>>>> Hi Leonid, >>>>>> >>>>>> I have briefly looked at the patch, a few comments so far: >>>>>> >>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>>>>> ? - at L#114, could you please call static method using class >>>>>> name (as the opposite of using instance)? or was it meant to be >>>>>> theApp.runAppVmOpts(vmArgs) ? 
>>>>>> >>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>>>>> - it seems that code indent of startApp(LingeredApp, String[]) >>>>>> isn't correct >>>>>> - I don't like startAppVmOpts name, but unfortunately don't have >>>>>> a better suggestion (yet) >>>>> >>>>> I was going to say the same. Jtreg has the concept of "java >>>>> options" and "vm options". We have had a fair share of bugs and >>>>> wasted time when tests have been using the "vm options" part >>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away >>>>> from using that way to pass options. I recently cleaned up some of >>>>> this with: >>>>> >>>>> 8237111: LingeredApp should be started with getTestJavaOpts >>>>> >>>>> Because of this, I would prefer if we used a name that doesn't >>>>> include "VmOpts", because it's too alike the other concept. Some >>>>> suggestions: >>>>> ?startAppJavaOptions >>>>> ?startAppUsingJavaOptions >>>>> ?startAppWithJavaOptions >>>>> ?startAppExactJavaOptions >>>>> ?startAppJvmOptions >>>>> >>>>> Thanks, >>>>> StefanK >>>>> >>>>>> Thanks, >>>>>> -- Igor >>>>>> >>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik >>>>>>> wrote: >>>>>>> >>>>>>> Hi >>>>>>> >>>>>>> Could you please review following fix which change LingeredApp >>>>>>> to prepend vm options to java/vm.test.opts when startApp is used >>>>>>> and provide startAppVmOpts to override options completely. >>>>>>> >>>>>>> The intention is to avoid issue like in this bug where >>>>>>> test/jtreg options were ignored by tests. Also I fixed some >>>>>>> tests where intention was to append vm options rather than to >>>>>>> override them. 
>>>>>>> >>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/ >>>>>>> >>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698 >>>>>>> >>>>>>> Leonid >>>>>>> >>>>> >>> > > From chris.plummer at oracle.com Tue Mar 31 04:43:13 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 30 Mar 2020 21:43:13 -0700 Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified In-Reply-To: References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com> <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com> <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com> Message-ID: Hi Leonid, On 3/30/20 5:42 PM, Leonid Mesnik wrote: > Hi > > See my comments inline. I will update webrev after go through all your > comments. > > > On 3/30/20 11:39 AM, Chris Plummer wrote: >> Hi Leonid, >> >> I haven't gone through all the tests yet.? I've accumulated enough >> questions that I'd like to see them answered or addressed before I >> continue on. >> >> This isn't directly related to your changes, but I noticed that users >> of JDKToolLauncher do nothing to make sure that default test options >> are used. This means we are never running these tools with the test >> options being specified with the jtreg run. Is that a bug or >> intentional? > > Which "default test options" do you mean? We have 2 properties to set > JVM options. The idea is to pass test.vm.opts to ALL java processes > and test.java.opts? to only tested processes if applicable. Usually, > for example we don't want to run jcmd with -Xcomp. test.vm.opts was > used (a long time ago) for options like '-d32/-d64' on Solaris where > JVM don't start without choosing correct version. 
Also, it is used to
> reduce maximum heap for all JVM instances when tests are running
> concurrently.
>
> So, probably test.vm.opts (or test.vm.tools.opts) should be added by
> JDKToolLauncher but not test.java.opts. It is separate topic, there
> are a lot of launchers which ignore test.vm.opts now.

I always get confused about which set of options these properties
represent, but basically I'm suggesting that if, for example, we are
doing a -Xcomp run in mach5, JDKToolLauncher (at least in some cases)
should be launched with this option. I think this is what you get from
Utils.getTestJavaOpts(). For example, the SA tests use
JDKToolLauncher.createUsingTestJDK("jhsdb"). jhsdb is what is really
being tested here, and it should be launched with the test vm options.
Currently we launch the target process with these options, which is
probably also a good idea. Also, we aren't too concerned with the options
that the test itself is run with, although I'm guessing they also get
run with the test java opts. So we have 3 processes here:

 - jhsdb, which should be getting test java opts but is not
 - the target process, which should be getting test java opts and
   currently is
 - the test itself, where options don't really matter, but is getting
   passed test java opts

However, you could argue that tests like jinfo, jstack, and jcmd, all of
which use the Attach API and do the bulk of their work on the target
process, are not that concerned with the options passed to the command,
but do want the options passed to the target process.

>
>>
>> In the problem lists, is it necessary to list the test multiple times
>> with #id0, #id1, etc, or could you list it just once and leave that
>> part off. It seems very error prone. Also, changing tests like
>> ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split out the
>> testing in this manner seems completely unrelated to this CR,
>> especially when the tests do not even contain any changes related to
>> the CR.
>
> I think, that these chages are related. The startApp(...) was updated
> so some test combinations become invalid or redundant.
>
> ClhsdbFindPC and ClhsdbJstack were always run twice. Now, when test
> options passed in test it is not needed to run it twice when Xcomp is
> already set by user.
>

Ok. I see now that the second test run, which is the non -Xcomp run,
adds '@requires vm.compMode != "Xcomp"'. But this also is strange. The
first test run, which does not have the @requires and is the one that
makes LingeredApp launch with -Xcomp, will always run whether or not it
is an -Xcomp test run. So it will run as part of a regular test run and
as part of a -Xcomp test run. The only difference between the two is
that the -Xcomp run will also run the test with -Xcomp, but that's not
really needed (I think it will also end up passing -Xcomp to the target
process twice). Perhaps '@requires vm.compMode == "Xcomp"' should be
used for the first test run, but that means it no longer gets run until
later tiers when we use -Xcomp. Why not revert it back to a single test,
but also add '@requires vm.compMode != "Xcomp"'? Then it gets run both
ways in an early tier and not run during the -Xcomp run, which isn't
really needed.

> ClhsdbScanOops is fixed to don't allow to run incompatible GC
> combination.

Ok

>
> So I should update these tests by splitting them or change them to
> startAppExactJvmOpts() if we wan't continue to ignore user-given test
> options.

I don't think I was suggesting removing user-given test options. I
don't see why you would.

>
> It seems that #idN are required by jtreg now, otherwise it just run test.

Ok.

>
>>
>>     public static LingeredApp startApp(String...
>> additionalJvmOpts) throws IOException {
>>
>> The default test opts are appended to additionalJvmOpts, and if you
>> want prepended you need to call Utils.prependTestJavaOpts(). I would
>> have thought the opposite would be more desirable and expected
>> default behavior.
Why did you choose this way? I also find it >> somewhat confusing that there is even a default mode for where the >> additionalJvmOpts go. Maybe it would be best to have >> startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make it >> explicit. This would also be in line with the existing >> startAppExactJvmOpts(). >> > I've chosen the most popular usage, which was > Utils.appendTestJavaOpts. But I agree, that it would be better to > change it to prepend. Thanks for pointing to this. > > I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs() > to don't complicate all things. I think that startApp() should be used > in the cases when test vm options really shouldn't interfere with > user-provided options or overwrite them. So basically the behavior is > the same as for ProcessTools.createJavaProcessBuilder(true, ...) and > jtreg itself. > Ok. > >> Is ClhsdbFindPC correct. It used to use just use -Xcomp or -Xint, >> ignoring any default test opts. You've fixed it to include the >> default test opts, but the are appended, possibly overriding the >> -Xcomp or -Xint. Don't we want the default test opts prepended? Same >> for ClhsdbJstack. > > The idea is to don't mix Xcomp and Xmixed/Xint using requires filter. > However ClhsdbFindPC might override Xint with Xmixed if it is set > explicitly. Switching to prepending will fix it. Yes, that's what I was thinking and one reason I thought that should be default behavior. thanks, Chris > > Leonid > >> >> thanks, >> >> Chris >> >> On 3/25/20 2:31 PM, Leonid Mesnik wrote: >>> >>> Igor, Stefan, Ioi >>> >>> Thank you for your feedback. >>> >>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change >>> @run main... to @run driver. >>> >>> Test ClhsdbJstack.java is updated. >>> >>> Still waiting for review from SVC team. 
>>> >>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/ >>> >>> Leonid >>> >>> On 3/25/20 12:46 PM, Igor Ignatyev wrote: >>>> Hi Leonid, >>>> >>>> not related related to your patch (but yet somewhat made more >>>> obvious by it), it seems all (or at least almost all) the tests >>>> which use?LingeredApp should be run in "driver" mode as they just >>>> orchestrate execution of other JVMs, so running them w/ main (let >>>> alone main/othervm) just wastes time, >>>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for >>>> example, will now executed w/ Xcomp which will make it very slow >>>> for no reasons. since you already got your hands dirty w/ these >>>> tests, could you please file an RFE to sort this out and list all >>>> the affected tests there? >>>> >>>> re: the patch, could you please update ClhsdbJstack.java test not >>>> to be run w/ Xcomp and follow the same pattern you used in other >>>> tests (e.g.?ClhsdbScanOops) ? other than that it looks fine to me, >>>> I however wouldn't be able to tell if all svc tests continue to do >>>> that they were supposed to, so I'd prefer for someone from svc team >>>> to?chime in. >>>> >>>> Thanks, >>>> -- Igor >>>> >>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik >>>>> > wrote: >>>>> >>>>> Added Ioi, who also proposed new version of startAppVmOpts. >>>>> >>>>> Please find new webrev: >>>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ >>>>> >>>>> Renamed startAppVmOpts/runAppVmOpts to >>>>> "startAppExactJvmOpts/runAppExactJvmOpts" is used. It should make >>>>> very clear that this method doesn't use any of test.java.opts, >>>>> test.vm.opts. >>>>> >>>>> Also, I fixed >>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java metnioned by >>>>> Igor, and removed null pointer check as Ioi suggested in startApp >>>>> method. >>>>> >>>>> + public static void startApp(LingeredApp theApp, String... 
>>>>> additionalJvmOpts) throws IOException { >>>>> + startAppExactJvmOpts(theApp, >>>>> Utils.appendTestJavaOpts(additionalJvmOpts)); >>>>> + } >>>>> >>>>> Leonid >>>>> >>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote: >>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote: >>>>>>> Hi Leonid, >>>>>>> >>>>>>> I have briefly looked at the patch, a few comments so far: >>>>>>> >>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>>>>>> ? - at L#114, could you please call static method using class >>>>>>> name (as the opposite of using instance)? or was it meant to be >>>>>>> theApp.runAppVmOpts(vmArgs) ? >>>>>>> >>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) >>>>>>> isn't correct >>>>>>> - I don't like startAppVmOpts name, but unfortunately don't have >>>>>>> a better suggestion (yet) >>>>>> >>>>>> I was going to say the same. Jtreg has the concept of "java >>>>>> options" and "vm options". We have had a fair share of bugs and >>>>>> wasted time when tests have been using the "vm options" part >>>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away >>>>>> from using that way to pass options. I recently cleaned up some >>>>>> of this with: >>>>>> >>>>>> 8237111: LingeredApp should be started with getTestJavaOpts >>>>>> >>>>>> Because of this, I would prefer if we used a name that doesn't >>>>>> include "VmOpts", because it's too alike the other concept. 
>>>>>> Some suggestions:
>>>>>>  startAppJavaOptions
>>>>>>  startAppUsingJavaOptions
>>>>>>  startAppWithJavaOptions
>>>>>>  startAppExactJavaOptions
>>>>>>  startAppJvmOptions
>>>>>>
>>>>>> Thanks,
>>>>>> StefanK
>>>>>>
>>>>>>> Thanks,
>>>>>>> -- Igor
>>>>>>>
>>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik wrote:
>>>>>>>>
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Could you please review the following fix, which changes LingeredApp
>>>>>>>> to prepend vm options to java/vm.test.opts when startApp is
>>>>>>>> used and provides startAppVmOpts to override options completely.
>>>>>>>>
>>>>>>>> The intention is to avoid issues like the one in this bug, where
>>>>>>>> test/jtreg options were ignored by tests. Also I fixed some
>>>>>>>> tests where the intention was to append vm options rather than to
>>>>>>>> override them.
>>>>>>>>
>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>>
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>>
>>>>>>>> Leonid
>>>>>>>>
>>>>>>
>>>>
>>
>>

From suenaga at oss.nttdata.com Tue Mar 31 10:06:25 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 31 Mar 2020 19:06:25 +0900
Subject: Thread Local Handshake in JVMTI functions
Message-ID: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>

Hi all,

Many JVMTI functions use a VM Operation to get information. However, some of them need to stop only one thread - they don't need to stop all threads.
So I think we can use Thread Local Handshake as in this webrev. It is an example for GetOneCurrentContendedMonitor().

  http://cr.openjdk.java.net/~ysuenaga/jvmti-thread-local-handshake/

Also I think we can replace the following VM Operations with Thread Local Handshake:

  class VM_GetCurrentLocation
  class VM_EnterInterpOnlyMode
  class VM_UpdateForPopTopFrame
  class VM_SetFramePop
  class VM_GetOwnedMonitorInfo
  class VM_GetCurrentContendedMonitor
  class VM_GetFrameCount
  class VM_GetFrameLocation

What do you think?
If it is acceptable, I will file it in JBS and send a review request.


Thanks,

Yasumasa

From david.holmes at oracle.com Tue Mar 31 10:16:23 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 31 Mar 2020 20:16:23 +1000
Subject: Thread Local Handshake in JVMTI functions
In-Reply-To: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>
References: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

On 31/03/2020 8:06 pm, Yasumasa Suenaga wrote:
> Hi all,
>
> Many JVMTI functions use a VM Operation to get information. However, some
> of them need to stop only one thread - they don't need to stop all threads.
> So I think we can use Thread Local Handshake as in this webrev. It is
> an example for GetOneCurrentContendedMonitor().

True, but at the moment handshakes involve the VMThread. There is work being done to support direct thread-to-thread handshakes and once that is done this kind of conversion should be more easily done. It might be worth waiting for that.

>   http://cr.openjdk.java.net/~ysuenaga/jvmti-thread-local-handshake/

An observation: it seems to me that calling_thread is not used when this is not a VMOperation.

Cheers,
David

> Also I think we can replace the following VM Operations with Thread Local
> Handshake:
>
>   class VM_GetCurrentLocation
>   class VM_EnterInterpOnlyMode
>   class VM_UpdateForPopTopFrame
>   class VM_SetFramePop
>   class VM_GetOwnedMonitorInfo
>   class VM_GetCurrentContendedMonitor
>   class VM_GetFrameCount
>   class VM_GetFrameLocation
>
> What do you think?
> If it is acceptable, I will file it in JBS and send a review request.
>
>
> Thanks,
>
> Yasumasa

From suenaga at oss.nttdata.com Tue Mar 31 11:40:28 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 31 Mar 2020 20:40:28 +0900
Subject: Thread Local Handshake in JVMTI functions
In-Reply-To:
References: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>
Message-ID: <4c9aa3ab-468d-eede-18f7-ac8d352575b6@oss.nttdata.com>

Hi David,

On 2020/03/31 19:16, David Holmes wrote:
> Hi Yasumasa,
>
> On 31/03/2020 8:06 pm, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Many JVMTI functions use a VM Operation to get information. However, some of them need to stop only one thread - they don't need to stop all threads.
>> So I think we can use Thread Local Handshake as in this webrev. It is an example for GetOneCurrentContendedMonitor().
>
> True, but at the moment handshakes involve the VMThread. There is work being done to support direct thread-to-thread handshakes and once that is done this kind of conversion should be more easily done. It might be worth waiting for that.

Thanks, I will come back to this topic when the thread-to-thread handshake work is done.
I wondered at first why handshakes involve the VMThread. That improvement is welcome for me ;)


Cheers,

Yasumasa


>>    http://cr.openjdk.java.net/~ysuenaga/jvmti-thread-local-handshake/
>
> An observation: it seems to me that calling_thread is not used when this is not a VMOperation.
>
> Cheers,
> David
>
>> Also I think we can replace the following VM Operations with Thread Local Handshake:
>>
>>    class VM_GetCurrentLocation
>>    class VM_EnterInterpOnlyMode
>>    class VM_UpdateForPopTopFrame
>>    class VM_SetFramePop
>>    class VM_GetOwnedMonitorInfo
>>    class VM_GetCurrentContendedMonitor
>>    class VM_GetFrameCount
>>    class VM_GetFrameLocation
>>
>> What do you think?
>> If it is acceptable, I will file it in JBS and send a review request.
>> >> >> Thanks, >> >> Yasumasa From coleen.phillimore at oracle.com Tue Mar 31 12:34:46 2020 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Mar 2020 08:34:46 -0400 Subject: Discussion about fixing deprecation in jdk.hotspot.agent In-Reply-To: References: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com> Message-ID: <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com> To answer my own question, this functionality is used to allow detach/reattach from {cl}hsdb.? Which seems to work on linux but not windows with this code removed. The next question is whether this is useful functionality to justify all this code (900+ and this new code that Magnus has added).? Can't you just exit and restart the clhsdb process on the core file or process? For the record, this is me playing with python to remove this code. http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html Thanks, Coleen On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote: > > I was wondering why this is needed when debugging a core file, which > is the key thing we need the SA for: > > ? /** This is used by both the debugger and any runtime system. It is > ????? the basic mechanism by which classes which mimic underlying VM > ????? functionality cause themselves to be initialized. The given > ????? observer will be notified (with arguments (null, null)) when the > ????? VM is re-initialized, as well as when it registers itself with > ????? the VM. */ > ? public static void registerVMInitializedObserver(Observer o) { > ??? vmInitializedObservers.add(o); > ??? o.update(null, null); > ? } > > It seems like if it isn't needed, we shouldn't add these classes and > remove their use. > > Coleen > > On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote: >> No opinions on this? 
>> >> /Magnus >> >> On 2020-03-25 23:34, Magnus Ihse Bursie wrote: >>> Hi everyone, >>> >>> As a follow-up to the ongoing review for JDK-8241618, I have also >>> looked at fixing the deprecation warnings in jdk.hotspot.agent. >>> These fall in three broad categories: >>> >>> * Deprecation of the boxing type constructors (e.g. "new Integer(42)"). >>> >>> * Deprecation of java.util.Observer and Observable. >>> >>> * The rest (mostly Class.newInstance(), and a few number of other >>> odd deprecations) >>> >>> The first category is trivial to fix. The last category need some >>> special discussion. But the overwhelming majority of deprecation >>> warnings come from the use of Observer and Observable. This really >>> dwarfs anything else, and needs to be handled first, otherwise it's >>> hard to even spot the other issues. >>> >>> My analysis of the situation is that the deprecation of Observer and >>> Observable seems a bit harsh, from the PoV of jdk.hotspot.agent. >>> Sure, it might be limited, but I think it does exactly what is >>> needed here. So the migration suggested in Observable (java.beans or >>> java.util.concurrent) seems overkill. If there are genuine threading >>> issues at play here, this assumption might be wrong, and then maybe >>> going the j.u.c. route is correct. >>> >>> But if that's not, the main goal should be to stay with the current >>> implementation. One way to do this is to sprinkle the code with >>> @SuppressWarning. But I think a better way would be to just >>> implement our own Observer and Observable. After all, the classes >>> are trivial. >>> >>> I've made a mock-up of this solution, were I just copied the >>> java.util.Observer and Observable, and removed the deprecation >>> annotations. The only thing needed for the rest of the code is to >>> make sure we import these; I've done this for three arbitrarily >>> selected classes just to show what the change would typically look >>> like. 
Here's the mock-up: >>> >>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01 >>> >>> Let me know what you think. >>> >>> /Magnus >> > From magnus.ihse.bursie at oracle.com Tue Mar 31 12:51:52 2020 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Tue, 31 Mar 2020 14:51:52 +0200 Subject: Discussion about fixing deprecation in jdk.hotspot.agent In-Reply-To: <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com> References: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com> <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com> Message-ID: <51a5b160-1af8-69a3-1dff-deb04c8a2447@oracle.com> On 2020-03-31 14:34, coleen.phillimore at oracle.com wrote: > > To answer my own question, this functionality is used to allow > detach/reattach from {cl}hsdb.? Which seems to work on linux but not > windows with this code removed. > > The next question is whether this is useful functionality to justify > all this code (900+ and this new code that Magnus has added).? Can't > you just exit and restart the clhsdb process on the core file or process? Personally, I'm happy for any solution. All I want to see is that SA stops polluting the build log with warnings that cannot be disabled. My approach was to minimize the amount of code changes that'd allow for this, but if you all can agree that this code is better off removed, then I'm completely OK with it. (And as a rule of thumb, dead and removed code is good code!) /Magnus > > For the record, this is me playing with python to remove this code. > > http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html > > Thanks, > Coleen > > On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote: >> >> I was wondering why this is needed when debugging a core file, which >> is the key thing we need the SA for: >> >> ? /** This is used by both the debugger and any runtime system. It is >> ????? the basic mechanism by which classes which mimic underlying VM >> ????? functionality cause themselves to be initialized. 
The given >> ????? observer will be notified (with arguments (null, null)) when the >> ????? VM is re-initialized, as well as when it registers itself with >> ????? the VM. */ >> ? public static void registerVMInitializedObserver(Observer o) { >> ??? vmInitializedObservers.add(o); >> ??? o.update(null, null); >> ? } >> >> It seems like if it isn't needed, we shouldn't add these classes and >> remove their use. >> >> Coleen >> >> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote: >>> No opinions on this? >>> >>> /Magnus >>> >>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote: >>>> Hi everyone, >>>> >>>> As a follow-up to the ongoing review for JDK-8241618, I have also >>>> looked at fixing the deprecation warnings in jdk.hotspot.agent. >>>> These fall in three broad categories: >>>> >>>> * Deprecation of the boxing type constructors (e.g. "new >>>> Integer(42)"). >>>> >>>> * Deprecation of java.util.Observer and Observable. >>>> >>>> * The rest (mostly Class.newInstance(), and a few number of other >>>> odd deprecations) >>>> >>>> The first category is trivial to fix. The last category need some >>>> special discussion. But the overwhelming majority of deprecation >>>> warnings come from the use of Observer and Observable. This really >>>> dwarfs anything else, and needs to be handled first, otherwise it's >>>> hard to even spot the other issues. >>>> >>>> My analysis of the situation is that the deprecation of Observer >>>> and Observable seems a bit harsh, from the PoV of >>>> jdk.hotspot.agent. Sure, it might be limited, but I think it does >>>> exactly what is needed here. So the migration suggested in >>>> Observable (java.beans or java.util.concurrent) seems overkill. If >>>> there are genuine threading issues at play here, this assumption >>>> might be wrong, and then maybe going the j.u.c. route is correct. >>>> >>>> But if that's not, the main goal should be to stay with the current >>>> implementation. 
One way to do this is to sprinkle the code with >>>> @SuppressWarning. But I think a better way would be to just >>>> implement our own Observer and Observable. After all, the classes >>>> are trivial. >>>> >>>> I've made a mock-up of this solution, were I just copied the >>>> java.util.Observer and Observable, and removed the deprecation >>>> annotations. The only thing needed for the rest of the code is to >>>> make sure we import these; I've done this for three arbitrarily >>>> selected classes just to show what the change would typically look >>>> like. Here's the mock-up: >>>> >>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01 >>>> >>>> Let me know what you think. >>>> >>>> /Magnus >>> >> > From martin.doerr at sap.com Tue Mar 31 14:01:02 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 31 Mar 2020 14:01:02 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Hi Richard, thanks for addressing all my points. I've looked over webrev.5 and I'm satisfied with your changes. I had also promised to review the tests. test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java Thanks for updating the @summary comment. Looks good in webrev.5. test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c JVMTI agent for object tagging and heap iteration. Good. test/jdk/com/sun/jdi/EATests.java This is a substantial amount of tests which is appropriate for a such a large change. Skipping some subtests with UseJVMCICompiler makes sense because it doesn't provide the necessary JVMTI functionality, yet. Nice work! I also like that you test with and without BiasedLocking. Your tests will still be fine after BiasedLocking deprecation. 
Very minor nits: - 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are ommitted" (sounds funny) - You sometimes write "graal" and sometimes "Graal". I guess the capital G is better. (Also in EATestsJVMCI.java.) test/jdk/com/sun/jdi/EATestsJVMCI.java EATests with Graal enabled. Nice that you support Graal to some extent. Maybe Graal folks want to enhance them in the future. I think this is a good starting point. Conclusion: Looks good and not trivial :-) Now, you have one full review. I'd be ok with covering 2nd review by partial reviews. Compiler and JVMTI parts are not too complicated IMHO. Runtime part should get at least one additional careful review. Best regards, Martin > -----Original Message----- > From: Reingruber, Richard > Sent: Montag, 30. M?rz 2020 10:32 > To: Doerr, Martin ; 'Robbin Ehn' > ; Lindenmaier, Goetz > ; David Holmes ; > Vladimir Kozlov (vladimir.kozlov at oracle.com) > ; serviceability-dev at openjdk.java.net; > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > dev at openjdk.java.net > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi, > > this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :) > > The change affects jvmti, hotspot and c2. Partial reviews are very welcome > too. > > Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/ > Delta: > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/ > > Robbin, Martin, please let me know, if anything shouldn't be quite as you > wanted it. Also find my > comments on your feedback below. > > Robbin, can I count you as Reviewer for the runtime part? > > Thanks, Richard. > > -- > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > > You can move both declaration and definition to that file, no need to > clobber > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > Done. 
> > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's > own > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting > jvmtiDeferredLocalVariableSet is > declared. > > > src/hotspot/share/code/compiledMethod.cpp > > Nice cleanup! > > Thanks :) > > > src/hotspot/share/code/debugInfoRec.cpp > > src/hotspot/share/code/debugInfoRec.hpp > > Additional parmeters. (Remark: I think "non_global_escape_in_scope" > would read better than "not_global_escape_in_scope", but your version is > consistent with existing code, so no change request from my side.) Ok. > > I've been thinking about this too and finally stayed with > not_global_escape_in_scope. It's supposed > to mean an object whose escape state is not GlobalEscape is in scope. > > > src/hotspot/share/compiler/compileBroker.cpp > > src/hotspot/share/compiler/compileBroker.hpp > > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into > a follow up change together with the test in order to make this webrev > smaller, but since it is included, I'm reviewing everything at once. Not a big > deal.) Ok. > > Yes the change would be a little smaller. And if it helps I'll split it off. In > general I prefer > patches that bring along a suitable amount of tests. > > > src/hotspot/share/opto/c2compiler.cpp > > Make do_escape_analysis independent of JVMCI capabilities. Nice! > > It is the main goal of the enhancement. It is done for C2, but could be done > for JVMCI compilers > with just a small effort as well. > > > src/hotspot/share/opto/escape.cpp > > Annotation for MachSafePointNodes. Your added functionality looks > correct. > > But I'd prefer to move the bulky code out of the large function. > > I suggest to factor out something like has_not_global_escape and > has_arg_escape. 
So the code could look like this: > > SafePointNode* sfn = sfn_worklist.at(next); > > sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn)); > > if (sfn->is_CallJava()) { > > CallJavaNode* call = sfn->as_CallJava(); > > call->set_arg_escape(has_arg_escape(call)); > > } > > This would also allow us to get rid of the found_..._escape_in_args > variables making the loops better readable. > > Done. > > > It's kind of ugly to use strcmp to recognize uncommon trap, but that seems > to be the way to do it (there are more such places). So it's ok. > > Yeah. I copied the snippet. > > > src/hotspot/share/prims/jvmtiImpl.cpp > > src/hotspot/share/prims/jvmtiImpl.hpp > > The sequence is pretty complex: > > VM_GetOrSetLocal element initialization executes EscapeBarrier code > which suspends the target thread (extra VM Operation). > > Note that the target threads have to be suspended already for > VM_GetOrSetLocal*. So it's mainly the > synchronization effect of EscapeBarrier::sync_and_suspend_one() that is > required here. Also no extra > _handshake_ is executed, since sync_and_suspend_one() will find the > target threads already > suspended. > > > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM > Thread to prepare VM Operation with frame deoptimization). > > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor > which resumes the target thread. > > But I don't have any improvement proposal. Performance is probably not a > concern, here. So it's ok. > > > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it > has non-globally escaping objects and other frames if they have arg escaping > ones. Good. > > It's not specifically the top frame, but the frame that is accessed. > > > src/hotspot/share/runtime/deoptimization.cpp > > Object deoptimization. I have more comments and proposals, here. > > First of all, handling recursive and waiting locks in relock_objects is tricky, > but looks correct. 
> > Comments are sufficient to understand why things are done as they are > implemented. > > > BiasedLocking related parts are complex, but we may get rid of them in the > future (with BiasedLocking removal). > > Anyway, looks correct, too. > > > Typo in comment: "regularily" => "regularly" > > > Deoptimization::fetch_unroll_info_helper is the only place where > _jvmti_deferred_updates get deallocated (except JavaThread destructor). > But I think we always go through it, so I can't see a memory leak or such kind > of issues. > > That's correct. The compiled frame for which deferred updates are allocated > is always deoptimized > before (see EscapeBarrier::deoptimize_objects()). This is also asserted in > compiledVFrame::update_deferred_value(). I've added the same assertion > to > Deoptimization::relock_objects(). So we can be sure that > _jvmti_deferred_updates are deallocated > again in fetch_unroll_info_helper(). > > > EscapeBarrier::deoptimize_objects: ResourceMark should use > calling_thread(). > > Sure, well spotted! > > > You can use MutexLocker and MonitorLocker with Thread* to save the > Thread::current() call. > > Right, good hint. This was recently introduced with 8235678. I even had to > resolve conflicts. Should > have done this then. > > > I'd make set_objs_are_deoptimized static and remove it from the > EscapeBarrier interface because I think it shouldn't be used outside of > EscapeBarrier::deoptimize_objects. > > Done. > > > Typo in comment: "we must only deoptimize" => "we only have to > deoptimize" > > Replaced with "[...] we deoptimize iff local objects are passed as args" > > > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and > barrier_active() is redundant. Implementation can get moved to hpp file. > > Ok. Done. > > > I'll get back to suspend flags, later. > > > There are weird cases regarding _self_deoptimization_in_progress. > > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. 
> > C can set _self_deoptimization_in_progress while A performs the handshake
> > for suspending C. I think this doesn't lead to errors, but it's probably not
> > desired.
> > I think it would be better to use only one "wait" call in
> > sync_and_suspend_one and sync_and_suspend_all.
>
> You're right. We've discussed that face-to-face, but couldn't find a real issue.
> But now, thinking again, I reckon I found one:
>
> 2808   // Sync with other threads that might be doing deoptimizations
> 2809   {
> 2810     // Need to switch to _thread_blocked for the wait() call
> 2811     ThreadBlockInVM tbivm(_calling_thread);
> 2812     MonitorLocker ml(EscapeBarrier_lock, Mutex::_no_safepoint_check_flag);
> 2813     while (_self_deoptimization_in_progress) {
> 2814       ml.wait();
> 2815     }
> 2816
> 2817     if (self_deopt()) {
> 2818       _self_deoptimization_in_progress = true;
> 2819     }
> 2820
> 2821     while (_deoptee_thread->is_ea_obj_deopt_suspend()) {
> 2822       ml.wait();
> 2823     }
> 2824
> 2825     if (self_deopt()) {
> 2826       return;
> 2827     }
> 2828
> 2829     // set suspend flag for target thread
> 2830     _deoptee_thread->set_ea_obj_deopt_flag();
> 2831   }
>
> - A waits in 2822
> - C is suspended
> - B notifies all in resume_one()
> - A and C wake up
> - C wins over A and sets _self_deoptimization_in_progress = true in 2818
> - C does the self deoptimization
> - A executes 2830 _deoptee_thread->set_ea_obj_deopt_flag()
>
> C will self suspend at some undefined point. The resulting state is illegal.
>
> > I first thought it'd be better to move ThreadBlockInVM before wait() to
> > reduce thread state transitions, but that seems to be problematic because
> > ThreadBlockInVM destructor contains a safepoint check which we shouldn't
> > do while holding EscapeBarrier_lock. So no change request.
>
> Yes, would be nice to have the state change only if needed, but for the
> reason you mentioned it is
> not quite as easy as it seems to be. I experimented as well with a second
> lock, but did not succeed.
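
The interleaving listed above (A waiting at 2822 while C claims _self_deoptimization_in_progress at 2818) comes from waiting on two conditions in two separate loops. As a rough illustration of the "only one wait call" suggestion, here is a minimal sketch using plain Java monitors. It is not HotSpot code: the class DeoptGate and all of its methods are hypothetical names chosen for this example, and it only models the two flags, not thread states or handshakes.

```java
// Hypothetical sketch, not HotSpot code: all names are made up for illustration.
final class DeoptGate {
    private boolean selfDeoptInProgress = false; // a thread is deoptimizing its own objects
    private boolean targetSuspended = false;     // the target was suspended by another thread

    // Must be called with the monitor held. A single loop covers both
    // conditions, so after every wake-up both of them are re-checked.
    private void awaitIdle() {
        boolean interrupted = false;
        while (selfDeoptInProgress || targetSuspended) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // remember and keep waiting
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    // Called by the target thread before it deoptimizes its own objects.
    synchronized void beginSelfDeopt() {
        awaitIdle();
        selfDeoptInProgress = true; // claimed in the same critical section as the check
    }

    synchronized void endSelfDeopt() {
        selfDeoptInProgress = false;
        notifyAll();
    }

    // Called by another thread that wants to suspend the target.
    synchronized void suspendTarget() {
        awaitIdle();
        targetSuspended = true;
    }

    synchronized void resumeTarget() {
        targetSuspended = false;
        notifyAll();
    }

    synchronized boolean isSelfDeoptInProgress() { return selfDeoptInProgress; }

    synchronized boolean isTargetSuspended() { return targetSuspended; }

    // The illegal state from the discussion: both flags set at the same time.
    synchronized boolean isConsistent() {
        return !(selfDeoptInProgress && targetSuspended);
    }

    public static void main(String[] args) throws InterruptedException {
        DeoptGate gate = new DeoptGate();
        gate.suspendTarget();             // "A" suspends the target
        Thread c = new Thread(() -> {     // "C" wants to self-deoptimize
            gate.beginSelfDeopt();
            gate.endSelfDeopt();
        });
        c.start();
        gate.resumeTarget();              // lets C proceed if it was waiting
        c.join();
        System.out.println("consistent: " + gate.isConsistent());
    }
}
```

Because the check and the state change happen under the same monitor and every waiter re-evaluates both flags after waking up, a thread that loses the race simply goes back to waiting instead of setting its flag on top of the other one.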
> > > Change in thred_added: > > I think the sequence would be more comprehensive if we waited for > deopt_all_threads in Thread::start and all other places where a new thread > can run into Java code (e.g. JVMTI attach). > > Your version makes new threads come up with suspend flag set. That looks > correct, too. Advantage is that you only have to change one place > (thread_added). It'll be interesting to see how it will look like when we use > async handshakes instead of suspend flags. > > For now, I'm ok with your version. > > I had a version that did what you are suggesting. The current version also has > the advantage, that > there are fewer places where a thread has to wait for ongoing object > deoptimization. This means > viewer places where you have to worry about correct thread state > transitions, possible deadlocks, > and if all oops are properly Handle'ed. > > > I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt- > >is_hidden_from_external_view()). > > Done. > > > Having 4 different deoptimize_objects functions makes it a little hard to > keep an overview of which one is used for what. > > Maybe adding suffixes would help a little bit, but I can also live with what > you have. > > Implementation looks correct to me. > > 2 are internal. I added the suffix _internal to them. This leaves 2 to choose > from. > > > src/hotspot/share/runtime/deoptimization.hpp > > Escape barriers and object deoptimization functions. > > Typo in comment: "helt" => "held" > > Done in place already. > > > src/hotspot/share/runtime/interfaceSupport.cpp > > InterfaceSupport::deoptimizeAllObjects() is only used for > DeoptimizeObjectsALot = 1. > > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad > to have DeoptimizeObjectsALot = 1 in addition. Ok. > > I never used DeoptimizeObjectsALot = 1 that much. It could be more > deterministic in single threaded > scenarios. I wouldn't object to get rid of it though. 
> > > src/hotspot/share/runtime/stackValue.hpp > > Better reinitilization in StackValue. Good. > > StackValue::obj_is_scalar_replaced() should not return true after calling > set_obj(). > > > src/hotspot/share/runtime/thread.cpp > > src/hotspot/share/runtime/thread.hpp > > src/hotspot/share/runtime/thread.inline.hpp > > wait_for_object_deoptimization, suspend flag, deferred updates and test > feature to deoptimize objects. > > > In the long term, we want to get rid of suspend flags, so it's not so nice to > introduce a new one. But I agree with G?tz that it should be acceptable as > temporary solution until async handshakes are available (which takes more > time). So I'm ok with your change. > > I'm keen to build the feature on async handshakes when the arive. > > > You can use MutexLocker with Thread*. > > Done. > > > JVMTIDeferredUpdates: I agree with Robin. It'd be nice to move the class > out of thread.hpp. > > Done. > > > src/hotspot/share/runtime/vframe.cpp > > Added support for entry frame to new_vframe. Ok. > > > > src/hotspot/share/runtime/vframe_hp.cpp > > src/hotspot/share/runtime/vframe_hp.hpp > > > I think code()->as_nmethod() in not_global_escape_in_scope() and > arg_escape() should better be under #ifdef ASSERT or inside the assert > statement (no need for code cache walking in product build). > > Done. > > > jvmtiDeferredLocalVariableSet::update_monitors: > > Please add a comment explaining that owner referenced by original info > may be scalar replaced, but it is deoptimized in the vframe. > > Done. > > -----Original Message----- > From: Doerr, Martin > Sent: Donnerstag, 12. 
M?rz 2020 17:28 > To: Reingruber, Richard ; 'Robbin Ehn' > ; Lindenmaier, Goetz > ; David Holmes ; > Vladimir Kozlov (vladimir.kozlov at oracle.com) > ; serviceability-dev at openjdk.java.net; > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > dev at openjdk.java.net > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi Richard, > > > I managed to find time for a (almost) complete review of webrev.4. (I'll > review the tests separately.) > > First of all, the change seems to be in pretty good quality for its significant > complexity. I couldn't find any real bugs. But I'd like to propose minor > improvements. > I'm convinced that it's mature because we did substantial testing. > > I like the new functionality for object deoptimization. It can possibly be > reused for future escape analysis based optimizations. So I appreciate having > it available in the code base. > In addition to that, your change makes the JVMTI implementation better > integrated into the VM. > > > Now to the details: > > > src/hotspot/share/c1/c1_IR.hpp > describe_scope parameters. Ok. > > > src/hotspot/share/ci/ciEnv.cpp > src/hotspot/share/ci/ciEnv.hpp > Fix for JvmtiExport::can_walk_any_space() capability. Ok. > > > src/hotspot/share/code/compiledMethod.cpp > Nice cleanup! > > > src/hotspot/share/code/debugInfoRec.cpp > src/hotspot/share/code/debugInfoRec.hpp > Additional parmeters. (Remark: I think "non_global_escape_in_scope" > would read better than "not_global_escape_in_scope", but your version is > consistent with existing code, so no change request from my side.) Ok. > > > src/hotspot/share/code/nmethod.cpp > Nice cleanup! > > > src/hotspot/share/code/pcDesc.hpp > Additional parameters. Ok. > > > src/hotspot/share/code/scopeDesc.cpp > src/hotspot/share/code/scopeDesc.hpp > Improved implementation + additional parameters. Ok. 
> > > src/hotspot/share/compiler/compileBroker.cpp > src/hotspot/share/compiler/compileBroker.hpp > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a > follow up change together with the test in order to make this webrev > smaller, but since it is included, I'm reviewing everything at once. Not a big > deal.) Ok. > > > src/hotspot/share/jvmci/jvmciCodeInstaller.cpp > Additional parameters. Ok. > > > src/hotspot/share/opto/c2compiler.cpp > Make do_escape_analysis independent of JVMCI capabilities. Nice! > > > src/hotspot/share/opto/callnode.hpp > Additional fields for MachSafePointNodes. Ok. > > > src/hotspot/share/opto/escape.cpp > Annotation for MachSafePointNodes. Your added functionality looks correct. > But I'd prefer to move the bulky code out of the large function. > I suggest to factor out something like has_not_global_escape and > has_arg_escape. So the code could look like this: > SafePointNode* sfn = sfn_worklist.at(next); > sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn)); > if (sfn->is_CallJava()) { > CallJavaNode* call = sfn->as_CallJava(); > call->set_arg_escape(has_arg_escape(call)); > } > This would also allow us to get rid of the found_..._escape_in_args variables > making the loops better readable. > > It's kind of ugly to use strcmp to recognize uncommon trap, but that seems > to be the way to do it (there are more such places). So it's ok. > > > src/hotspot/share/opto/machnode.hpp > Additional fields for MachSafePointNodes. Ok. > > > src/hotspot/share/opto/macro.cpp > Allow elimination of non-escaping allocations. Ok. > > > src/hotspot/share/opto/matcher.cpp > src/hotspot/share/opto/output.cpp > Copy attribute / pass parameters. Ok. > > > src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp > Nice cleanup! > > > src/hotspot/share/prims/jvmtiEnv.cpp > src/hotspot/share/prims/jvmtiEnvBase.cpp > Escape barriers + deoptimize objects for target thread. Good. 
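A minimal sketch of the escape.cpp refactoring proposed above (two predicates instead of found_..._escape_in_args flags inside the loops), using stand-in types. ToyObject, ToySafePoint, the Escape enum and the bodies of both predicates are assumptions made for illustration only; the review names has_not_global_escape and has_arg_escape but does not show their implementation, and HotSpot's real node classes look different.

```cpp
#include <vector>

// Hypothetical, simplified escape states; not HotSpot's PointsToNode states.
enum class Escape { NoEscape, ArgEscape, GlobalEscape };

// Hypothetical stand-in for an object tracked by escape analysis.
struct ToyObject {
    Escape state;
};

// Hypothetical stand-in for a safepoint node with its debug-info objects.
struct ToySafePoint {
    std::vector<const ToyObject*> scope_objects;  // objects referenced in scope
    std::vector<const ToyObject*> call_args;      // only used for Java calls
    bool is_call_java = false;
    bool not_global_escape_in_scope = false;      // annotation to compute
    bool arg_escape = false;                      // annotation to compute
};

// True if any object in scope does not escape globally (assumed criterion).
bool has_not_global_escape(const ToySafePoint& sfn) {
    for (const ToyObject* obj : sfn.scope_objects) {
        if (obj->state != Escape::GlobalEscape) return true;
    }
    return false;
}

// True if any call argument is an ArgEscape object (assumed criterion).
bool has_arg_escape(const ToySafePoint& call) {
    for (const ToyObject* obj : call.call_args) {
        if (obj->state == Escape::ArgEscape) return true;
    }
    return false;
}

// The worklist loop stays short because the searching lives in the predicates.
void annotate_safepoints(std::vector<ToySafePoint>& worklist) {
    for (ToySafePoint& sfn : worklist) {
        sfn.not_global_escape_in_scope = has_not_global_escape(sfn);
        if (sfn.is_call_java) {
            sfn.arg_escape = has_arg_escape(sfn);
        }
    }
}
```

With early returns inside the helpers, the caller no longer needs boolean flags that are set in one loop and read after it, which is the readability gain the review asks for.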
> > > src/hotspot/share/prims/jvmtiImpl.cpp > src/hotspot/share/prims/jvmtiImpl.hpp > The sequence is pretty complex: > VM_GetOrSetLocal element initialization executes EscapeBarrier code which > suspends the target thread (extra VM Operation). > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM > Thread to prepare VM Operation with frame deoptimization). > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which > resumes the target thread. > But I don't have any improvement proposal. Performance is probably not a > concern, here. So it's ok. > > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has > non-globally escaping objects and other frames if they have arg escaping > ones. Good. > > > src/hotspot/share/prims/jvmtiTagMap.cpp > Escape barriers + deoptimize objects for all threads. Ok. > > > src/hotspot/share/prims/whitebox.cpp > Added WB_IsFrameDeoptimized to API. Ok. > > > src/hotspot/share/runtime/deoptimization.cpp > Object deoptimization. I have more comments and proposals, here. > First of all, handling recursive and waiting locks in relock_objects is tricky, but > looks correct. > Comments are sufficient to understand why things are done as they are > implemented. > > BiasedLocking related parts are complex, but we may get rid of them in the > future (with BiasedLocking removal). > Anyway, looks correct, too. > > Typo in comment: "regularily" => "regularly" > > Deoptimization::fetch_unroll_info_helper is the only place where > _jvmti_deferred_updates get deallocated (except JavaThread destructor). > But I think we always go through it, so I can't see a memory leak or such kind > of issues. > > EscapeBarrier::deoptimize_objects: ResourceMark should use > calling_thread(). > > You can use MutexLocker and MonitorLocker with Thread* to save the > Thread::current() call. 
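The VM_GetOrSetLocal sequence described above (suspend in the constructor, object deoptimization in the prologue, resume in the destructor) follows the usual C++ scope-guard pattern. The following is only a sketch of that pattern with toy types: ToyThread, ToyEscapeBarrier, ToyGetOrSetLocal and run_toy_operation are hypothetical names and do not reflect the real HotSpot classes or signatures.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the target thread; not HotSpot's JavaThread.
struct ToyThread {
    bool suspended = false;
    bool objects_deoptimized = false;
};

// Hypothetical scope guard: suspends the target when constructed and
// resumes it when destroyed, so resume cannot be forgotten on any path.
class ToyEscapeBarrier {
public:
    ToyEscapeBarrier(ToyThread& target, std::vector<std::string>& log)
        : target_(target), log_(log) {
        target_.suspended = true;
        log_.push_back("suspend");
    }
    ~ToyEscapeBarrier() {
        target_.suspended = false;
        log_.push_back("resume");
    }
    ToyEscapeBarrier(const ToyEscapeBarrier&) = delete;
    ToyEscapeBarrier& operator=(const ToyEscapeBarrier&) = delete;

    // Reverts EA-based optimizations for the (suspended) target.
    void deoptimize_objects() {
        target_.objects_deoptimized = true;
        log_.push_back("deoptimize");
    }

private:
    ToyThread& target_;
    std::vector<std::string>& log_;
};

// Hypothetical operation that owns a barrier as a member, so the barrier
// lives exactly as long as the operation object.
class ToyGetOrSetLocal {
public:
    ToyGetOrSetLocal(ToyThread& target, std::vector<std::string>& log)
        : barrier_(target, log), log_(log) {}

    bool prologue() {
        barrier_.deoptimize_objects();
        return true;
    }
    void doit() { log_.push_back("access local"); }

private:
    ToyEscapeBarrier barrier_;
    std::vector<std::string>& log_;
};

// Runs one toy operation and returns the order of the recorded steps.
std::vector<std::string> run_toy_operation(ToyThread& target) {
    std::vector<std::string> log;
    {
        ToyGetOrSetLocal op(target, log);
        if (op.prologue()) {
            op.doit();
        }
    }  // op is destroyed here, which destroys the barrier and resumes the target
    return log;
}
```

The design choice is that the resume step is tied to object lifetime rather than to an explicit call, which matches the remark that the destructor "implicitly" resumes the target thread.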
> > I'd make set_objs_are_deoptimized static and remove it from the > EscapeBarrier interface because I think it shouldn't be used outside of > EscapeBarrier::deoptimize_objects. > > Typo in comment: "we must only deoptimize" => "we only have to > deoptimize" > > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and > barrier_active() is redundant. Implementation can get moved to hpp file. > > I'll get back to suspend flags, later. > > There are weird cases regarding _self_deoptimization_in_progress. > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C > can set _self_deoptimization_in_progress while A performs the handshake > for suspending C. I think this doesn't lead to errors, but it's probably not > desired. > I think it would be better to use only one "wait" call in > sync_and_suspend_one and sync_and_suspend_all. > > I first thought it'd be better to move ThreadBlockInVM before wait() to > reduce thread state transitions, but that seems to be problematic because > the ThreadBlockInVM destructor contains a safepoint check which we shouldn't > do while holding EscapeBarrier_lock. So no change request. > > Change in thread_added: > I think the sequence would be more comprehensible if we waited for > deopt_all_threads in Thread::start and all other places where a new thread > can run into Java code (e.g. JVMTI attach). > Your version makes new threads come up with suspend flag set. That looks > correct, too. The advantage is that you only have to change one place > (thread_added). It'll be interesting to see what it will look like when we use > async handshakes instead of suspend flags. > For now, I'm ok with your version. > > I'd only move MutexLocker ml(EscapeBarrier_lock...) after > if (!jt->is_hidden_from_external_view()). > > Having 4 different deoptimize_objects functions makes it a little hard to keep > an overview of which one is used for what.
> Maybe adding suffixes would help a little bit, but I can also live with what you > have. > Implementation looks correct to me. > > > src/hotspot/share/runtime/deoptimization.hpp > Escape barriers and object deoptimization functions. > Typo in comment: "helt" => "held" > > > src/hotspot/share/runtime/globals.hpp > Addition of develop flag DeoptimizeObjectsALotInterval. Ok. > > > src/hotspot/share/runtime/interfaceSupport.cpp > InterfaceSupport::deoptimizeAllObjects() is only used for > DeoptimizeObjectsALot = 1. > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad > to have DeoptimizeObjectsALot = 1 in addition. Ok. > > > src/hotspot/share/runtime/interfaceSupport.inline.hpp > Addition of deoptimizeAllObjects. Ok. > > > src/hotspot/share/runtime/mutexLocker.cpp > src/hotspot/share/runtime/mutexLocker.hpp > Addition of EscapeBarrier_lock. Ok. > > > src/hotspot/share/runtime/objectMonitor.cpp > Make recursion count relock aware. Ok. > > > src/hotspot/share/runtime/stackValue.hpp > Better reinitialization in StackValue. Good. > > > src/hotspot/share/runtime/thread.cpp > src/hotspot/share/runtime/thread.hpp > src/hotspot/share/runtime/thread.inline.hpp > wait_for_object_deoptimization, suspend flag, deferred updates and test > feature to deoptimize objects. > > In the long term, we want to get rid of suspend flags, so it's not so nice to > introduce a new one. But I agree with Götz that it should be acceptable as > a temporary solution until async handshakes are available (which takes more > time). So I'm ok with your change. > > You can use MutexLocker with Thread*. > > JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class out > of thread.hpp. > > > src/hotspot/share/runtime/vframe.cpp > Added support for entry frame to new_vframe. Ok.
> > > src/hotspot/share/runtime/vframe_hp.cpp > src/hotspot/share/runtime/vframe_hp.hpp > > I think code()->as_nmethod() in not_global_escape_in_scope() and > arg_escape() should better be under #ifdef ASSERT or inside the assert > statement (no need for code cache walking in product build). > > jvmtiDeferredLocalVariableSet::update_monitors: > Please add a comment explaining that owner referenced by original info may > be scalar replaced, but it is deoptimized in the vframe. > > > src/hotspot/share/utilities/macros.hpp > Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok. > > > test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysi > sEnabled.java > test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnal > ysisEnabled.c > New test. Will review separately. > > > test/jdk/TEST.ROOT > Addition of vm.jvmci as required property. Ok. > > > test/jdk/com/sun/jdi/EATests.java > test/jdk/com/sun/jdi/EATestsJVMCI.java > New test. Will review separately. > > > test/lib/sun/hotspot/WhiteBox.java > Added isFrameDeoptimized to API. Ok. > > > That was it. Best regards, > Martin > > > > -----Original Message----- > > From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Reingruber, Richard > > Sent: Dienstag, 3. M?rz 2020 21:23 > > To: 'Robbin Ehn' ; Lindenmaier, Goetz > > ; David Holmes > ; > > Vladimir Kozlov (vladimir.kozlov at oracle.com) > > ; serviceability-dev at openjdk.java.net; > > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > > dev at openjdk.java.net > > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better > > Performance in the Presence of JVMTI Agents > > > > Hi Robbin, > > > > > > I understand that Robbin proposed to replace the usage of > > > > _suspend_flag with handshakes. Apparently, async handshakes > > > > are needed to do so. We have been waiting a while for removal > > > > of the _suspend_flag / introduction of async handshakes [2]. > > > > What is the status here? 
> > > > > I have an old prototype which I would like to continue to work on. > > > So do not assume asynch handshakes will make 15. > > > Even if it would, I think there are a lot more investigate work to remove > > > _suspend_flag. > > > > Let us know, if we can be of any help to you and be it only testing. > > > > > >> Full: > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > > > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > > > You can move both declaration and definition to that file, no need to > > clobber > > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > > > Will do. > > > > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in > it's > > own > > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > > > You are right. It shouldn't be declared in thread.hpp. I will look into that. > > > > > Note that we also think we may have a bug in deopt: > > > https://bugs.openjdk.java.net/browse/JDK-8238237 > > > > > I think it would be best, if possible, to push after that is resolved. > > > > Sure. > > > > > Not even nearly a full review :) > > > > I know :) > > > > Anyways, thanks a lot, > > Richard. > > > > > > -----Original Message----- > > From: Robbin Ehn > > Sent: Monday, March 2, 2020 11:17 AM > > To: Lindenmaier, Goetz ; Reingruber, > Richard > > ; David Holmes > ; > > Vladimir Kozlov (vladimir.kozlov at oracle.com) > > ; serviceability-dev at openjdk.java.net; > > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- > > dev at openjdk.java.net > > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance > > in the Presence of JVMTI Agents > > > > Hi, > > > > On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote: > > > Hi, > > > > > > I had a look at the progress of this change. Nothing > > > happened since Richard posted his update using more > > > handshakes [1]. 
> > > But we (SAP) would appreciate a lot if this change could > > > be successfully reviewed and pushed. > > > > > > I think there is basic understanding that this > > > change is helpful. It fixes a number of issues with JVMTI, > > > and will deliver the same performance benefits as EA > > > does in current production mode for debugging scenarios. > > > > > > This is important for us as we run our VMs prepared > > > for debugging in production mode. > > > > > > I understand that Robbin proposed to replace the usage of > > > _suspend_flag with handshakes. Apparently, async handshakes > > > are needed to do so. We have been waiting a while for removal > > > of the _suspend_flag / introduction of async handshakes [2]. > > > What is the status here? > > > > I have an old prototype which I would like to continue to work on. > > So do not assume asynch handshakes will make 15. > > Even if it would, I think there are a lot more investigate work to remove > > _suspend_flag. > > > > > > > > I think we should no longer wait, but proceed with > > > this change. We will look into removing the usage of > > > suspend_flag introduced here once it is possible to implement > > > it with handshakes. > > > > Yes, sure. > > > > >> Full: > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp. > > You can move both declaration and definition to that file, no need to > clobber > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) > > > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's > > own > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. > > > > Note that we also think we may have a bug in deopt: > > https://bugs.openjdk.java.net/browse/JDK-8238237 > > > > I think it would be best, if possible, to push after that is resolved. 
> > > > Not even nearly a full review :) > > > > Thanks, Robbin > > > > > > >> Incremental: > > >> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/ > > >> > > >> I was not able to eliminate the additional suspend flag now. I'll take care > > of this > > >> as soon as the > > >> existing suspend-resume-mechanism is reworked. > > >> > > >> Testing: > > >> > > >> Nightly tests @SAP: > > >> > > >> JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, > > Renaissance > > >> Suite, SAP specific tests > > >> with fastdebug and release builds on all platforms > > >> > > >> Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x > > parallel > > >> for 24h > > >> > > >> Thanks, Richard. > > >> > > >> > > >> More details on the changes: > > >> > > >> * Hide DeoptimizeObjectsALotThread from external view. > > >> > > >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock. > > >> It used to be _safepoint_check_sometimes, which will be eliminated > > sooner or > > >> later. > > >> I added explicit thread state changes with ThreadBlockInVM to code > > paths > > >> where we can wait() > > >> on EscapeBarrier_lock to become safepoint safe. > > >> > > >> * Use handshake EscapeBarrierSuspendHandshake to suspend target > > threads > > >> instead of vm operation > > >> VM_ThreadSuspendAllForObjDeopt. > > >> > > >> * Removed uses of Threads_lock. When adding a new thread we > suspend > > it iff > > >> EA optimizations are > > >> being reverted. In the previous version we were waiting on > > Threads_lock > > >> while EA optimizations > > >> were reverted. See EscapeBarrier::thread_added(). > > >> > > >> * Made tests require Xmixed compilation mode. > > >> > > >> * Made tests agnostic regarding tiered compilation. > > >> I.e. tc isn't disabled anymore, and the tests can be run with tc enabled > or > > >> disabled. 
> > >> > > >> * Exercising EATests.java as well with stress test options > > >> DeoptimizeObjectsALot* > > >> Due to the non-deterministic deoptimizations some tests need to be > > skipped. > > >> We do this to prevent bit-rot of the stress test code. > > >> > > >> * Executing EATests.java as well with graal if available. Driver for this is > > >> EATestsJVMCI.java. Graal cannot pass all tests, because it does not > > provide all > > >> the new debug info > > >> (namely not_global_escape_in_scope and arg_escape in > > scopeDesc.hpp). > > >> And graal does not yet support the JVMTI operations force early > return > > and > > >> pop frame. > > >> > > >> * Removed tracing from new jdi tests in EATests.java. Too much trace > > output > > >> before the debugging > > >> connection is established can cause deadlock because output buffers > fill > > up. > > >> (See https://bugs.openjdk.java.net/browse/JDK-8173304) > > >> > > >> * Many copyright year changes and smaller clean-up changes of testing > > code > > >> (trailing white-space and > > >> the like). > > >> > > >> > > >> -----Original Message----- > > >> From: David Holmes > > >> Sent: Donnerstag, 19. Dezember 2019 03:12 > > >> To: Reingruber, Richard ; serviceability- > > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; > > hotspot- > > >> runtime-dev at openjdk.java.net; Vladimir Kozlov > > (vladimir.kozlov at oracle.com) > > >> > > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > > Performance in > > >> the Presence of JVMTI Agents > > >> > > >> Hi Richard, > > >> > > >> I think my issue is with the way EliminateNestedLocks works so I'm going > > >> to look into that more deeply. > > >> > > >> Thanks for the explanations. 
> > >> > > >> David > > >> > > >> On 18/12/2019 12:47 am, Reingruber, Richard wrote: > > >>> Hi David, > > >>> > > >>> > > > Some further queries/concerns: > > >>> > > > > > >>> > > > src/hotspot/share/runtime/objectMonitor.cpp > > >>> > > > > > >>> > > > Can you please explain the changes to ObjectMonitor::wait: > > >>> > > > > > >>> > > > ! _recursions = save // restore the old recursion count > > >>> > > > ! + jt->get_and_reset_relock_count_after_wait(); // > > >>> > > > increased by the deferred relock count > > >>> > > > > > >>> > > > what is the "deferred relock count"? I gather it relates to > > >>> > > > > > >>> > > > "The code was extended to be able to deoptimize objects of a > > >>> > > frame that > > >>> > > > is not the top frame and to let another thread than the > owning > > >>> > > thread do > > >>> > > > it." > > >>> > > > > >>> > > Yes, these relate. Currently EA based optimizations are reverted, > > when a > > >> compiled frame is > > >>> > > replaced with corresponding interpreter frames. Part of this is > > relocking > > >> objects with eliminated > > >>> > > locking. New with the enhancement is that we do this also just > > before > > >> object references are > > >>> > > acquired through JVMTI. In this case we deoptimize also the > > owning > > >> compiled frame C and we > > >>> > > register deoptimized objects as deferred updates. When control > > returns > > >> to C it gets deoptimized, > > >>> > > we notice that objects are already deoptimized (reallocated and > > >> relocked), so we don't do it again > > >>> > > (relocking twice would be incorrect of course). Deferred updates > > are > > >> copied into the new > > >>> > > interpreter frames. > > >>> > > > > >>> > > Problem: relocking is not possible if the target thread T is waiting > > on the > > >> monitor that needs to > > >>> > > be relocked. This happens only with non-local objects with > > >> EliminateNestedLocks. 
Instead relocking > > >>> > > is deferred until T owns the monitor again. This is what the piece > of > > >> code above does. > > >>> > > > >>> > Sorry I need some more detail here. How can you wait() on an > > object > > >>> > monitor if the object allocation and/or locking was optimised > away? > > And > > >>> > what is a "non-local object" in this context? Isn't EA restricted to > > >>> > thread-confined objects? > > >>> > > >>> "Non-local object" is an object that escapes its thread. The issue I'm > > >> addressing with the changes > > >>> in ObjectMonitor::wait are almost unrelated to EA. They are caused by > > >> EliminateNestedLocks, where C2 > > >>> eliminates recursive locking of an already owned lock. The lock owning > > object > > >> exists on the heap, it > > >>> is locked and you can call wait() on it. > > >>> > > >>> EliminateLocks is the C2 option that controls lock elimination based on > > EA. > > >> Both optimizations have > > >>> in common that objects with eliminated locking need to be relocked > > when > > >> deoptimizing a frame, > > >>> i.e. when replacing a compiled frame with equivalent interpreter > > >>> frames. Deoptimization::relock_objects does that job for /all/ > eliminated > > >> locks in scope. /All/ can > > >>> be a mix of eliminated nested locks and locks of not-escaping objects. > > >>> > > >>> New with the enhancement: I call relock_objects earlier, just before > > objects > > >> pontentially > > >>> escape. 
But then later when the owning compiled frame gets deoptimized, I must not do it again:
> > >>>
> > >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> > >>>
> > >>>   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> > >>>       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > >>>     bool unused;
> > >>>     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> > >>>   }
> > >>>
> > >>> Now when calling relock_objects early it is quite possible that I have to relock an object the
> > >>> target thread currently waits for. Obviously I cannot relock in this case; instead I chose to
> > >>> introduce relock_count_after_wait to JavaThread.
> > >>>
> > >>> > Is it just that some of the locking gets optimized away e.g.
> > >>> >
> > >>> > synchronized(obj) {
> > >>> >   synchronized(obj) {
> > >>> >     synchronized(obj) {
> > >>> >       obj.wait();
> > >>> >     }
> > >>> >   }
> > >>> > }
> > >>> >
> > >>> > If this is reduced to a form as-if it were a single lock of the monitor
> > >>> > (due to EA) and the wait() triggers a JVM TI event which leads to the
> > >>> > escape of "obj" then we need to reconstruct the true lock state, and so
> > >>> > when the wait() internally unblocks and reacquires the monitor it has to
> > >>> > set the true recursion count to 3, not the 1 that it appeared to be when
> > >>> > wait() was initially called. Is that the scenario?
> > >>>
> > >>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
> > >>> triggered by wait.
> > >>>
> > >>> Add
> > >>>
> > >>>   LocalObject l1 = new LocalObject();
> > >>>
> > >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
> > >>> question.
> > >>>
> > >>> See that relocking/reallocating is transactional.
If it is done then for > /all/ > > >> objects in scope and it is > > >>> done at most once. It wouldn't be quite so easy to split this in relocking > > of > > >> nested/EA-based > > >>> eliminated locks. > > >>> > > >>> > If so I find this truly awful. Anyone using wait() in a realistic form > > >>> > requires a notification and so the object cannot be thread > confined. > > In > > >>> > > >>> It is not thread confined. > > >>> > > >>> > which case I would strongly argue that upon hitting the wait() the > > deopt > > >>> > should occur unconditionally and so the lock state is correct before > > we > > >>> > wait and so we don't need to mess with the recursion count > > internally > > >>> > when we reacquire the monitor. > > >>> > > > >>> > > > > >>> > > > which I don't like the sound of at all when it comes to > > ObjectMonitor > > >>> > > > state. So I'd like to understand in detail exactly what is going > on > > here > > >>> > > > and why. This is a very intrusive change that seems to badly > > break > > >>> > > > encapsulation and impacts future changes to ObjectMonitor > > that are > > >> under > > >>> > > > investigation. > > >>> > > > > >>> > > I would not regard this as breaking encapsulation. Certainly not > > badly. > > >>> > > > > >>> > > I've added a property relock_count_after_wait to JavaThread. > The > > >> property is well > > >>> > > encapsulated. Future ObjectMonitor implementations have to > deal > > with > > >> recursion too. They are free > > >>> > > in choosing a way to do that as long as that property is taken into > > >> account. This is hardly a > > >>> > > limitation. > > >>> > > > >>> > I do think this badly breaks encapsulation as you have to add a > > callout > > >>> > from the guts of the ObjectMonitor code to reach into the thread > to > > get > > >>> > this lock count adjustment. 
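The lock count adjustment discussed here can be modeled in a few lines. This is a single-threaded toy model, not HotSpot code: ToyThread, ToyMonitor, relock_eliminated and demo_recursions_after_wait are hypothetical names, and the convention that recursions counts re-entries beyond the first acquisition is an assumption of the sketch. Only the expression "save + get_and_reset_relock_count_after_wait()" is taken from the quoted change to ObjectMonitor::wait.

```cpp
// Hypothetical stand-in for the waiting thread; not HotSpot's JavaThread.
struct ToyThread {
    int relock_count_after_wait = 0;  // relocks deferred while the thread waits

    int get_and_reset_relock_count_after_wait() {
        int count = relock_count_after_wait;
        relock_count_after_wait = 0;
        return count;
    }
};

// Hypothetical stand-in for a monitor; not HotSpot's ObjectMonitor.
struct ToyMonitor {
    ToyThread* owner = nullptr;
    int recursions = 0;  // assumed: re-entries beyond the first acquisition

    void enter(ToyThread& t) {
        if (owner == &t) {
            ++recursions;
        } else {
            owner = &t;
            recursions = 0;
        }
    }

    // First half of wait(): give up the monitor and remember the count.
    int release_for_wait() {
        int save = recursions;
        recursions = 0;
        owner = nullptr;
        return save;
    }

    // Second half of wait(): reacquire and restore the old recursion count,
    // increased by the deferred relock count.
    void reacquire_after_wait(ToyThread& t, int save) {
        owner = &t;
        recursions = save + t.get_and_reset_relock_count_after_wait();
    }
};

// Models another thread reverting `eliminated` nested locks of thread t.
// If t does not own the monitor (because it waits), relocking is deferred.
void relock_eliminated(ToyMonitor& m, ToyThread& t, int eliminated) {
    if (m.owner == &t) {
        m.recursions += eliminated;
    } else {
        t.relock_count_after_wait += eliminated;
    }
}

// Three nested synchronized blocks where the two inner locks were eliminated:
// only the outer lock is really taken before wait().
int demo_recursions_after_wait() {
    ToyThread t;
    ToyMonitor m;
    m.enter(t);                        // outermost lock, recursions == 0
    int save = m.release_for_wait();   // wait() releases the monitor
    relock_eliminated(m, t, 2);        // deoptimization while t is waiting
    m.reacquire_after_wait(t, save);   // wait() returns
    return m.recursions;               // two re-entries: three lock levels in total
}
```

In this model the deferred count is consumed exactly once, which mirrors the statement that relocking is done at most once.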
I understand why you have had to do > > this but > > >>> > I would much rather see a change to the EA optimisation strategy > so > > that > > >>> > this is not needed. > > >>> > > > >>> > > Note also that the property is a straight forward extension of the > > >> existing concept of deferred > > >>> > > local updates. It is embedded into the structure holding them. So > > not > > >> even the footprint of a > > >>> > > JavaThread is enlarged if no deferred updates are generated. > > >>> > > > >>> > [...] > > >>> > > > >>> > > > > >>> > > I'm actually duplicating the existing external suspend mechanism, > > >> because a thread can be > > >>> > > suspended at most once. And hey, and don't like that either! But > it > > >> seems not unlikely that the > > >>> > > duplicate can be removed together with the original and the new > > type > > >> of handshakes that will be > > >>> > > used for thread suspend can be used for object deoptimization > > too. See > > >> today's discussion in > > >>> > > JDK-8227745 [2]. > > >>> > > > >>> > I hope that discussion bears some fruit, at the moment it seems > not > > to > > >>> > be possible to use handshakes here. :( > > >>> > > > >>> > The external suspend mechanism is a royal pain in the proverbial > > that we > > >>> > have to carefully live with. The idea that we're duplicating that for > > >>> > use in another fringe area of functionality does not thrill me at all. > > >>> > > > >>> > To be clear, I understand the problem that exists and that you > wish > > to > > >>> > solve, but for the runtime parts I balk at the complexity cost of > > >>> > solving it. > > >>> > > >>> I know it's complex, but by far no rocket science. > > >>> > > >>> Also I find it hard to imagine another fix for JDK-8233915 besides > > changing > > >> the JVM TI specification. > > >>> > > >>> Thanks, Richard. > > >>> > > >>> -----Original Message----- > > >>> From: David Holmes > > >>> Sent: Dienstag, 17. 
Dezember 2019 08:03 > > >>> To: Reingruber, Richard ; serviceability- > > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; > > hotspot- > > >> runtime-dev at openjdk.java.net; Vladimir Kozlov > > (vladimir.kozlov at oracle.com) > > >> > > >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > > Performance > > >> in the Presence of JVMTI Agents > > >>> > > >>> > > >>> > > >>> David > > >>> > > >>> On 17/12/2019 4:57 pm, David Holmes wrote: > > >>>> Hi Richard, > > >>>> > > >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote: > > >>>>> Hi David, > > >>>>> > > >>>>> ?? > Some further queries/concerns: > > >>>>> ?? > > > >>>>> ?? > src/hotspot/share/runtime/objectMonitor.cpp > > >>>>> ?? > > > >>>>> ?? > Can you please explain the changes to ObjectMonitor::wait: > > >>>>> ?? > > > >>>>> ?? > !?? _recursions = save????? // restore the old recursion count > > >>>>> ?? > !???????????????? + jt->get_and_reset_relock_count_after_wait(); // > > >>>>> ?? > increased by the deferred relock count > > >>>>> ?? > > > >>>>> ?? > what is the "deferred relock count"? I gather it relates to > > >>>>> ?? > > > >>>>> ?? > "The code was extended to be able to deoptimize objects of a > > >>>>> frame that > > >>>>> ?? > is not the top frame and to let another thread than the owning > > >>>>> thread do > > >>>>> ?? > it." > > >>>>> > > >>>>> Yes, these relate. Currently EA based optimizations are reverted, > > when > > >>>>> a compiled frame is replaced > > >>>>> with corresponding interpreter frames. Part of this is relocking > > >>>>> objects with eliminated > > >>>>> locking. New with the enhancement is that we do this also just > before > > >>>>> object references are acquired > > >>>>> through JVMTI. In this case we deoptimize also the owning compiled > > >>>>> frame C and we register > > >>>>> deoptimized objects as deferred updates. 
When control returns to > C > > it > > >>>>> gets deoptimized, we notice > > >>>>> that objects are already deoptimized (reallocated and relocked), so > > we > > >>>>> don't do it again (relocking > > >>>>> twice would be incorrect of course). Deferred updates are copied > into > > >>>>> the new interpreter frames. > > >>>>> > > >>>>> Problem: relocking is not possible if the target thread T is waiting > > >>>>> on the monitor that needs to be > > >>>>> relocked. This happens only with non-local objects with > > >>>>> EliminateNestedLocks. Instead relocking is > > >>>>> deferred until T owns the monitor again. This is what the piece of > > >>>>> code above does. > > >>>> > > >>>> Sorry I need some more detail here. How can you wait() on an object > > >>>> monitor if the object allocation and/or locking was optimised away? > > And > > >>>> what is a "non-local object" in this context? Isn't EA restricted to > > >>>> thread-confined objects? > > >>>> > > >>>> Is it just that some of the locking gets optimized away e.g. > > >>>> > > >>>> synchronised(obj) { > > >>>> ? synchronised(obj) { > > >>>> ??? synchronised(obj) { > > >>>> ????? obj.wait(); > > >>>> ??? } > > >>>> ? } > > >>>> } > > >>>> > > >>>> If this is reduced to a form as-if it were a single lock of the monitor > > >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the > > >>>> escape of "obj" then we need to reconstruct the true lock state, and > so > > >>>> when the wait() internally unblocks and reacquires the monitor it has > to > > >>>> set the true recursion count to 3, not the 1 that it appeared to be > when > > >>>> wait() was initially called. Is that the scenario? > > >>>> > > >>>> If so I find this truly awful. Anyone using wait() in a realistic form > > >>>> requires a notification and so the object cannot be thread confined. 
> In > > >>>> which case I would strongly argue that upon hitting the wait() the > > deopt > > >>>> should occur unconditionally and so the lock state is correct before > we > > >>>> wait and so we don't need to mess with the recursion count internally > > >>>> when we reacquire the monitor. > > >>>> > > >>>>> > > >>>>> ?? > which I don't like the sound of at all when it comes to > > >>>>> ObjectMonitor > > >>>>> ?? > state. So I'd like to understand in detail exactly what is going > > >>>>> on here > > >>>>> ?? > and why.? This is a very intrusive change that seems to badly > > break > > >>>>> ?? > encapsulation and impacts future changes to ObjectMonitor > that > > >>>>> are under > > >>>>> ?? > investigation. > > >>>>> > > >>>>> I would not regard this as breaking encapsulation. Certainly not > badly. > > >>>>> > > >>>>> I've added a property relock_count_after_wait to JavaThread. The > > >>>>> property is well > > >>>>> encapsulated. Future ObjectMonitor implementations have to deal > > with > > >>>>> recursion too. They are free in > > >>>>> choosing a way to do that as long as that property is taken into > > >>>>> account. This is hardly a > > >>>>> limitation. > > >>>> > > >>>> I do think this badly breaks encapsulation as you have to add a callout > > >>>> from the guts of the ObjectMonitor code to reach into the thread to > > get > > >>>> this lock count adjustment. I understand why you have had to do this > > but > > >>>> I would much rather see a change to the EA optimisation strategy so > > that > > >>>> this is not needed. > > >>>> > > >>>>> Note also that the property is a straight forward extension of the > > >>>>> existing concept of deferred > > >>>>> local updates. It is embedded into the structure holding them. So > not > > >>>>> even the footprint of a > > >>>>> JavaThread is enlarged if no deferred updates are generated. > > >>>>> > > >>>>> ?? > --- > > >>>>> ?? > > > >>>>> ?? > src/hotspot/share/runtime/thread.cpp > > >>>>> ?? > > > >>>>> ?? 
> Can you please explain why > > >>>>> JavaThread::wait_for_object_deoptimization > > >>>>> ?? > has to be handcrafted in this way rather than using proper > > >>>>> transitions. > > >>>>> ?? > > > >>>>> > > >>>>> I wrote wait_for_object_deoptimization taking > > >>>>> JavaThread::java_suspend_self_with_safepoint_check > > >>>>> as template. So in short: for the same reasons :) > > >>>>> > > >>>>> Threads reach both methods as part of thread state transitions, > > >>>>> therefore special handling is > > >>>>> required to change thread state on top of ongoing transitions. > > >>>>> > > >>>>> ?? > We got rid of "deopt suspend" some time ago and it is > disturbing > > >>>>> to see > > >>>>> ?? > it being added back (effectively). This seems like it may be > > >>>>> something > > >>>>> ?? > that handshakes could be used for. > > >>>>> > > >>>>> Deopt suspend used to be something rather different with a similar > > >>>>> name[1]. It is not being added back. > > >>>> > > >>>> I stand corrected. Despite comments in the code to the contrary > > >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot > of > > >>>> cleanup in this area 13 years ago :) > > >>>> > > >>>>> > > >>>>> I'm actually duplicating the existing external suspend mechanism, > > >>>>> because a thread can be suspended > > >>>>> at most once. And hey, and don't like that either! But it seems not > > >>>>> unlikely that the duplicate can > > >>>>> be removed together with the original and the new type of > > handshakes > > >>>>> that will be used for > > >>>>> thread suspend can be used for object deoptimization too. See > > today's > > >>>>> discussion in JDK-8227745 [2]. > > >>>> > > >>>> I hope that discussion bears some fruit, at the moment it seems not > to > > >>>> be possible to use handshakes here. :( > > >>>> > > >>>> The external suspend mechanism is a royal pain in the proverbial that > > we > > >>>> have to carefully live with. 
The idea that we're duplicating that for > > >>>> use in another fringe area of functionality does not thrill me at all. > > >>>> > > >>>> To be clear, I understand the problem that exists and that you wish to > > >>>> solve, but for the runtime parts I balk at the complexity cost of > > >>>> solving it. > > >>>> > > >>>> Thanks, > > >>>> David > > >>>> ----- > > >>>> > > >>>>> Thanks, Richard. > > >>>>> > > >>>>> [1] Deopt suspend was something like an async. handshake for > > >>>>> architectures with register windows, > > >>>>> ???? where patching the return pc for deoptimization of a compiled > > >>>>> frame was racy if the owner thread > > >>>>> ???? was in native code. Instead a "deopt" suspend flag was set on > > >>>>> which the thread patched its own > > >>>>> ???? frame upon return from native. So no thread was suspended. It > > got > > >>>>> its name only from the name of > > >>>>> ???? the flags. > > >>>>> > > >>>>> [2] Discussion about using handshakes to sync. with the target > thread: > > >>>>> > > >>>>> https://bugs.openjdk.java.net/browse/JDK- > > >> > > > 8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.syst > > e > > >> m.issuetabpanels:comment-tabpanel#comment-14306727 > > >>>>> > > >>>>> > > >>>>> -----Original Message----- > > >>>>> From: David Holmes > > >>>>> Sent: Freitag, 13. Dezember 2019 00:56 > > >>>>> To: Reingruber, Richard ; > > >>>>> serviceability-dev at openjdk.java.net; > > >>>>> hotspot-compiler-dev at openjdk.java.net; > > >>>>> hotspot-runtime-dev at openjdk.java.net > > >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > > >>>>> Performance in the Presence of JVMTI Agents > > >>>>> > > >>>>> Hi Richard, > > >>>>> > > >>>>> Some further queries/concerns: > > >>>>> > > >>>>> src/hotspot/share/runtime/objectMonitor.cpp > > >>>>> > > >>>>> Can you please explain the changes to ObjectMonitor::wait: > > >>>>> > > >>>>> !?? _recursions = save????? 
// restore the old recursion count > > >>>>> !???????????????? + jt->get_and_reset_relock_count_after_wait(); // > > >>>>> increased by the deferred relock count > > >>>>> > > >>>>> what is the "deferred relock count"? I gather it relates to > > >>>>> > > >>>>> "The code was extended to be able to deoptimize objects of a > frame > > that > > >>>>> is not the top frame and to let another thread than the owning > thread > > do > > >>>>> it." > > >>>>> > > >>>>> which I don't like the sound of at all when it comes to ObjectMonitor > > >>>>> state. So I'd like to understand in detail exactly what is going on here > > >>>>> and why.? This is a very intrusive change that seems to badly break > > >>>>> encapsulation and impacts future changes to ObjectMonitor that > are > > under > > >>>>> investigation. > > >>>>> > > >>>>> --- > > >>>>> > > >>>>> src/hotspot/share/runtime/thread.cpp > > >>>>> > > >>>>> Can you please explain why > > JavaThread::wait_for_object_deoptimization > > >>>>> has to be handcrafted in this way rather than using proper > transitions. > > >>>>> > > >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to > > see > > >>>>> it being added back (effectively). This seems like it may be > something > > >>>>> that handshakes could be used for. > > >>>>> > > >>>>> Thanks, > > >>>>> David > > >>>>> ----- > > >>>>> > > >>>>> On 12/12/2019 7:02 am, David Holmes wrote: > > >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: > > >>>>>>> Hi David, > > >>>>>>> > > >>>>>>> ??? > Most of the details here are in areas I can comment on in > > detail, > > >>>>>>> but I > > >>>>>>> ??? > did take an initial general look at things. > > >>>>>>> > > >>>>>>> Thanks for taking the time! > > >>>>>> > > >>>>>> Apologies the above should read: > > >>>>>> > > >>>>>> "Most of the details here are in areas I *can't* comment on in > detail > > >>>>>> ..." > > >>>>>> > > >>>>>> David > > >>>>>> > > >>>>>>> ??? 
> The only thing that jumped out at me is that I think the > > >>>>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. > > >>>>>>> ??? > > > >>>>>>> ??? > +? bool is_hidden_from_external_view() const { return true; > } > > >>>>>>> > > >>>>>>> Yes, it should. Will add the method like above. > > >>>>>>> > > >>>>>>> ??? > Also I don't see any testing of the > > DeoptimizeObjectsALotThread. > > >>>>>>> Without > > >>>>>>> ??? > active testing this will just bit-rot. > > >>>>>>> > > >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger > > >>>>>>> workload. I will add a minimal test > > >>>>>>> to keep it fresh. > > >>>>>>> > > >>>>>>> ??? > Also on the tests I don't understand your @requires clause: > > >>>>>>> ??? > > > >>>>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & > > vm.compiler2.enabled > > >> & > > >>>>>>> ??? > (vm.opt.TieredCompilation != true)) > > >>>>>>> ??? > > > >>>>>>> ??? > This seems to require that TieredCompilation is disabled, but > > >>>>>>> tiered is > > >>>>>>> ??? > our normal mode of operation. ?? > > >>>>>>> ??? > > > >>>>>>> > > >>>>>>> I removed the clause. I guess I wanted to target the tests towards > > the > > >>>>>>> code they are supposed to > > >>>>>>> test, and it's easier to analyze failures w/o tiered compilation and > > >>>>>>> with just one compiler thread. > > >>>>>>> > > >>>>>>> Additionally I will make use of > > >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the > tests. > > >>>>>>> > > >>>>>>> Thanks, > > >>>>>>> Richard. > > >>>>>>> > > >>>>>>> -----Original Message----- > > >>>>>>> From: David Holmes > > >>>>>>> Sent: Mittwoch, 11. 
Dezember 2019 08:03 > > >>>>>>> To: Reingruber, Richard ; > > >>>>>>> serviceability-dev at openjdk.java.net; > > >>>>>>> hotspot-compiler-dev at openjdk.java.net; > > >>>>>>> hotspot-runtime-dev at openjdk.java.net > > >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > > >>>>>>> Performance in the Presence of JVMTI Agents > > >>>>>>> > > >>>>>>> Hi Richard, > > >>>>>>> > > >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: > > >>>>>>>> Hi, > > >>>>>>>> > > >>>>>>>> I would like to get reviews please for > > >>>>>>>> > > >>>>>>>> > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ > > >>>>>>>> > > >>>>>>>> Corresponding RFE: > > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 > > >>>>>>>> > > >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 > > >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK- > > 8214584 [1] > > >>>>>>>> > > >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing > > without > > >>>>>>>> issues (thanks!). In addition the > > >>>>>>>> change is being tested at SAP since I posted the first RFR some > > >>>>>>>> months ago. > > >>>>>>>> > > >>>>>>>> The intention of this enhancement is to benefit performance > wise > > from > > >>>>>>>> escape analysis even if JVMTI > > >>>>>>>> agents request capabilities that allow them to access local > variable > > >>>>>>>> values. E.g. if you start-up > > >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, > > then > > >>>>>>>> escape analysis is disabled right > > >>>>>>>> from the beginning, well before a debugger attaches -- if ever > one > > >>>>>>>> should do so. With the > > >>>>>>>> enhancement, escape analysis will remain enabled until and > after > > a > > >>>>>>>> debugger attaches. EA based > > >>>>>>>> optimizations are reverted just before an agent acquires the > > >>>>>>>> reference to an object. In the JBS item > > >>>>>>>> you'll find more details. 
> > >>>>>>> > > >>>>>>> Most of the details here are in areas I can comment on in detail, > but > > I > > >>>>>>> did take an initial general look at things. > > >>>>>>> > > >>>>>>> The only thing that jumped out at me is that I think the > > >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread. > > >>>>>>> > > >>>>>>> +? bool is_hidden_from_external_view() const { return true; } > > >>>>>>> > > >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. > > >>>>>>> Without > > >>>>>>> active testing this will just bit-rot. > > >>>>>>> > > >>>>>>> Also on the tests I don't understand your @requires clause: > > >>>>>>> > > >>>>>>> ??? @requires ((vm.compMode != "Xcomp") & > > vm.compiler2.enabled & > > >>>>>>> (vm.opt.TieredCompilation != true)) > > >>>>>>> > > >>>>>>> This seems to require that TieredCompilation is disabled, but > tiered > > is > > >>>>>>> our normal mode of operation. ?? > > >>>>>>> > > >>>>>>> Thanks, > > >>>>>>> David > > >>>>>>> > > >>>>>>>> Thanks, > > >>>>>>>> Richard. > > >>>>>>>> > > >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 > > >>>>>>>> > > >> > > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.pa > > tc > > >> h > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> From robbin.ehn at oracle.com Tue Mar 31 14:20:41 2020 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 31 Mar 2020 16:20:41 +0200 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: <0a07f87e-ede1-edbd-c754-e7df884e0545@oracle.com> Thanks for cleaning up thread.hpp! /Robbin On 2020-03-30 10:31, Reingruber, Richard wrote: > Hi, > > this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :) > > The change affects jvmti, hotspot and c2. 
Partial reviews are very welcome too.
>
> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
> Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/
>
> Robbin, Martin, please let me know if anything isn't quite as you wanted it. Also find my
> comments on your feedback below.
>
> Robbin, can I count you as Reviewer for the runtime part?
>
> Thanks, Richard.
>
> --
>
>> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
>> You can move both declaration and definition to that file, no need to clobber
>> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
>
> Done.
>
>> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in its own
>> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
>
> I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting jvmtiDeferredLocalVariableSet is
> declared.
>
>> src/hotspot/share/code/compiledMethod.cpp
>> Nice cleanup!
>
> Thanks :)
>
>> src/hotspot/share/code/debugInfoRec.cpp
>> src/hotspot/share/code/debugInfoRec.hpp
>> Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.
>
> I've been thinking about this too and finally stayed with not_global_escape_in_scope. It's supposed
> to mean an object whose escape state is not GlobalEscape is in scope.
>
>> src/hotspot/share/compiler/compileBroker.cpp
>> src/hotspot/share/compiler/compileBroker.hpp
>> Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.
>
> Yes the change would be a little smaller. And if it helps I'll split it off. In general I prefer
> patches that bring along a suitable amount of tests.
>
>> src/hotspot/share/opto/c2compiler.cpp
>> Make do_escape_analysis independent of JVMCI capabilities. Nice!
>
> It is the main goal of the enhancement. It is done for C2, but could be done for JVMCI compilers
> with just a small effort as well.
>
>> src/hotspot/share/opto/escape.cpp
>> Annotation for MachSafePointNodes. Your added functionality looks correct.
>> But I'd prefer to move the bulky code out of the large function.
>> I suggest to factor out something like has_not_global_escape and has_arg_escape. So the code could look like this:
>>   SafePointNode* sfn = sfn_worklist.at(next);
>>   sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
>>   if (sfn->is_CallJava()) {
>>     CallJavaNode* call = sfn->as_CallJava();
>>     call->set_arg_escape(has_arg_escape(call));
>>   }
>> This would also allow us to get rid of the found_..._escape_in_args variables making the loops better readable.
>
> Done.
>
>> It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.
>
> Yeah. I copied the snippet.
>
>> src/hotspot/share/prims/jvmtiImpl.cpp
>> src/hotspot/share/prims/jvmtiImpl.hpp
>> The sequence is pretty complex:
>> VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
>
> Note that the target threads have to be suspended already for VM_GetOrSetLocal*. So it's mainly the
> synchronization effect of EscapeBarrier::sync_and_suspend_one() that is required here. Also no extra
> _handshake_ is executed, since sync_and_suspend_one() will find the target threads already
> suspended.
>
>> VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
>> VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
>> But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.
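
The construct / prologue / destruct sequence described for VM_GetOrSetLocal can be pictured as plain RAII. The following is only a rough sketch under stated assumptions: TargetThread, ScopedEscapeBarrier and GetLocalOp are hypothetical stand-ins, not the classes from the webrev, and the flags merely model "target is held for object deoptimization" and "objects have been reallocated".

```cpp
#include <cassert>

// Hypothetical stand-in for the target JavaThread.
struct TargetThread {
  bool obj_deopt_suspend = false;    // models the extra suspend flag
  bool objects_deoptimized = false;  // models "EA optimizations reverted"
};

// Hypothetical barrier: holds the target from construction to destruction.
class ScopedEscapeBarrier {
  TargetThread& _target;
 public:
  explicit ScopedEscapeBarrier(TargetThread& t) : _target(t) {
    _target.obj_deopt_suspend = true;   // sync with and hold the target
  }
  ~ScopedEscapeBarrier() {
    _target.obj_deopt_suspend = false;  // resume the target
  }
  ScopedEscapeBarrier(const ScopedEscapeBarrier&) = delete;
  ScopedEscapeBarrier& operator=(const ScopedEscapeBarrier&) = delete;

  void deoptimize_objects() {
    // Objects may only be reverted while the target is held.
    assert(_target.obj_deopt_suspend);
    _target.objects_deoptimized = true;
  }
};

// Hypothetical operation that owns the barrier as a member, so the
// operation's destructor implicitly runs the barrier's destructor.
class GetLocalOp {
  ScopedEscapeBarrier _eb;
 public:
  explicit GetLocalOp(TargetThread& t) : _eb(t) {}
  bool prologue() {
    _eb.deoptimize_objects();
    return true;
  }
};
```

The point of the member-barrier layout is that the target cannot stay held by accident: whichever way the operation object goes out of scope, the resume happens.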
>
>> VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.
>
> It's not specifically the top frame, but the frame that is accessed.
>
>> src/hotspot/share/runtime/deoptimization.cpp
>> Object deoptimization. I have more comments and proposals, here.
>> First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
>> Comments are sufficient to understand why things are done as they are implemented.
>
>> BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
>> Anyway, looks correct, too.
>
>> Typo in comment: "regularily" => "regularly"
>
>> Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or such kind of issues.
>
> That's correct. The compiled frame for which deferred updates are allocated is always deoptimized
> before (see EscapeBarrier::deoptimize_objects()). This is also asserted in
> compiledVFrame::update_deferred_value(). I've added the same assertion to
> Deoptimization::relock_objects(). So we can be sure that _jvmti_deferred_updates are deallocated
> again in fetch_unroll_info_helper().
>
>> EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().
>
> Sure, well spotted!
>
>> You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.
>
> Right, good hint. This was recently introduced with 8235678. I even had to resolve conflicts. Should
> have done this then.
>
>> I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.
>
> Done.
>
>> Typo in comment: "we must only deoptimize" => "we only have to deoptimize"
>
> Replaced with "[...]
we deoptimize iff local objects are passed as args"
>
>> "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.
>
> Ok. Done.
>
>> I'll get back to suspend flags, later.
>
>> There are weird cases regarding _self_deoptimization_in_progress.
>> Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
>> I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.
>
> You're right. We've discussed that face-to-face, but couldn't find a real issue. But now, thinking again, I reckon I found one:
>
> 2808    // Sync with other threads that might be doing deoptimizations
> 2809    {
> 2810      // Need to switch to _thread_blocked for the wait() call
> 2811      ThreadBlockInVM tbivm(_calling_thread);
> 2812      MonitorLocker ml(EscapeBarrier_lock, Mutex::_no_safepoint_check_flag);
> 2813      while (_self_deoptimization_in_progress) {
> 2814        ml.wait();
> 2815      }
> 2816
> 2817      if (self_deopt()) {
> 2818        _self_deoptimization_in_progress = true;
> 2819      }
> 2820
> 2821      while (_deoptee_thread->is_ea_obj_deopt_suspend()) {
> 2822        ml.wait();
> 2823      }
> 2824
> 2825      if (self_deopt()) {
> 2826        return;
> 2827      }
> 2828
> 2829      // set suspend flag for target thread
> 2830      _deoptee_thread->set_ea_obj_deopt_flag();
> 2831    }
>
> - A waits in 2822
> - C is suspended
> - B notifies all in resume_one()
> - A and C wake up
> - C wins over A and sets _self_deoptimization_in_progress = true in 2818
> - C does the self deoptimization
> - A executes 2830 _deoptee_thread->set_ea_obj_deopt_flag()
>
> C will self suspend at some undefined point. The resulting state is illegal.
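
The interleaving listed above comes from testing the two conditions in two separate wait loops. What follows is only a sketch of one possible reading of Martin's "use only one wait call" suggestion, not the code in the webrev: both conditions are checked in a single predicate under the lock, so a thread that wakes up re-evaluates everything before it claims the deoptee. BarrierState, sync_and_claim and release are hypothetical names, and std::mutex / std::condition_variable stand in for EscapeBarrier_lock and MonitorLocker.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the state guarded by EscapeBarrier_lock.
struct BarrierState {
  std::mutex lock;
  std::condition_variable cv;
  bool self_deopt_in_progress = false;  // deoptee reverts its own EA optimizations
  bool deoptee_claimed = false;         // another thread set the deoptee's suspend flag
};

// Waits until nobody else is working on the deoptee, then claims it.
// Both conditions are tested in one wait loop, so no thread can slip in
// between "self deoptimization finished" and "suspend flag cleared".
// Returns true if the caller is the deoptee itself (nothing to suspend).
bool sync_and_claim(BarrierState& s, bool self_deopt) {
  std::unique_lock<std::mutex> ml(s.lock);
  s.cv.wait(ml, [&s] {
    return !s.self_deopt_in_progress && !s.deoptee_claimed;
  });
  if (self_deopt) {
    s.self_deopt_in_progress = true;
  } else {
    s.deoptee_claimed = true;  // corresponds to setting the suspend flag
  }
  return self_deopt;
}

// Gives the deoptee back and wakes all waiters so they re-check the predicate.
void release(BarrierState& s, bool self_deopt) {
  {
    std::lock_guard<std::mutex> guard(s.lock);
    // Releasing something that was never claimed would be a caller bug.
    assert(self_deopt ? s.self_deopt_in_progress : s.deoptee_claimed);
    if (self_deopt) {
      s.self_deopt_in_progress = false;
    } else {
      s.deoptee_claimed = false;
    }
  }
  s.cv.notify_all();
}
```

With this shape, A and C from the scenario above cannot both proceed after B's notify: whichever of them reacquires the lock first flips one of the two flags, and the other finds the combined predicate false and keeps waiting.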
>
>> I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.
>
> Yes, would be nice to have the state change only if needed, but for the reason you mentioned it is
> not quite as easy as it seems to be. I experimented as well with a second lock, but did not succeed.
>
>> Change in thread_added:
>> I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
>> Your version makes new threads come up with suspend flag set. That looks correct, too. Advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
>> For now, I'm ok with your version.
>
> I had a version that did what you are suggesting. The current version also has the advantage that
> there are fewer places where a thread has to wait for ongoing object deoptimization. This means
> fewer places where you have to worry about correct thread state transitions, possible deadlocks,
> and if all oops are properly Handle'ed.
>
>> I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).
>
> Done.
>
>> Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
>> Maybe adding suffixes would help a little bit, but I can also live with what you have.
>> Implementation looks correct to me.
>
> 2 are internal. I added the suffix _internal to them. This leaves 2 to choose from.
>
>> src/hotspot/share/runtime/deoptimization.hpp
>> Escape barriers and object deoptimization functions.
>> Typo in comment: "helt" => "held"
>
> Done in place already.
>
>> src/hotspot/share/runtime/interfaceSupport.cpp
>> InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
>> I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.
>
> I never used DeoptimizeObjectsALot = 1 that much. It could be more deterministic in single threaded
> scenarios. I wouldn't object to getting rid of it though.
>
>> src/hotspot/share/runtime/stackValue.hpp
>> Better reinitialization in StackValue. Good.
>
> StackValue::obj_is_scalar_replaced() should not return true after calling set_obj().
>
>> src/hotspot/share/runtime/thread.cpp
>> src/hotspot/share/runtime/thread.hpp
>> src/hotspot/share/runtime/thread.inline.hpp
>> wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.
>
>> In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.
>
> I'm keen to build the feature on async handshakes when they arrive.
>
>> You can use MutexLocker with Thread*.
>
> Done.
>
>> JVMTIDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.
>
> Done.
>
>> src/hotspot/share/runtime/vframe.cpp
>> Added support for entry frame to new_vframe. Ok.
>
>
>> src/hotspot/share/runtime/vframe_hp.cpp
>> src/hotspot/share/runtime/vframe_hp.hpp
>
>> I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).
>
> Done.
>
>> jvmtiDeferredLocalVariableSet::update_monitors:
>> Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.
>
> Done.
> > -----Original Message----- > From: Doerr, Martin > Sent: Donnerstag, 12. M?rz 2020 17:28 > To: Reingruber, Richard ; 'Robbin Ehn' ; Lindenmaier, Goetz ; David Holmes ; Vladimir Kozlov (vladimir.kozlov at oracle.com) ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents > > Hi Richard, > > > I managed to find time for a (almost) complete review of webrev.4. (I'll review the tests separately.) > > First of all, the change seems to be in pretty good quality for its significant complexity. I couldn't find any real bugs. But I'd like to propose minor improvements. > I'm convinced that it's mature because we did substantial testing. > > I like the new functionality for object deoptimization. It can possibly be reused for future escape analysis based optimizations. So I appreciate having it available in the code base. > In addition to that, your change makes the JVMTI implementation better integrated into the VM. > > > Now to the details: > > > src/hotspot/share/c1/c1_IR.hpp > describe_scope parameters. Ok. > > > src/hotspot/share/ci/ciEnv.cpp > src/hotspot/share/ci/ciEnv.hpp > Fix for JvmtiExport::can_walk_any_space() capability. Ok. > > > src/hotspot/share/code/compiledMethod.cpp > Nice cleanup! > > > src/hotspot/share/code/debugInfoRec.cpp > src/hotspot/share/code/debugInfoRec.hpp > Additional parmeters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok. > > > src/hotspot/share/code/nmethod.cpp > Nice cleanup! > > > src/hotspot/share/code/pcDesc.hpp > Additional parameters. Ok. > > > src/hotspot/share/code/scopeDesc.cpp > src/hotspot/share/code/scopeDesc.hpp > Improved implementation + additional parameters. Ok. 
> > > src/hotspot/share/compiler/compileBroker.cpp > src/hotspot/share/compiler/compileBroker.hpp > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok. > > > src/hotspot/share/jvmci/jvmciCodeInstaller.cpp > Additional parameters. Ok. > > > src/hotspot/share/opto/c2compiler.cpp > Make do_escape_analysis independent of JVMCI capabilities. Nice! > > > src/hotspot/share/opto/callnode.hpp > Additional fields for MachSafePointNodes. Ok. > > > src/hotspot/share/opto/escape.cpp > Annotation for MachSafePointNodes. Your added functionality looks correct. > But I'd prefer to move the bulky code out of the large function. > I suggest to factor out something like has_not_global_escape and has_arg_escape. So the code could look like this: > SafePointNode* sfn = sfn_worklist.at(next); > sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn)); > if (sfn->is_CallJava()) { > CallJavaNode* call = sfn->as_CallJava(); > call->set_arg_escape(has_arg_escape(call)); > } > This would also allow us to get rid of the found_..._escape_in_args variables making the loops better readable. > > It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok. > > > src/hotspot/share/opto/machnode.hpp > Additional fields for MachSafePointNodes. Ok. > > > src/hotspot/share/opto/macro.cpp > Allow elimination of non-escaping allocations. Ok. > > > src/hotspot/share/opto/matcher.cpp > src/hotspot/share/opto/output.cpp > Copy attribute / pass parameters. Ok. > > > src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp > Nice cleanup! > > > src/hotspot/share/prims/jvmtiEnv.cpp > src/hotspot/share/prims/jvmtiEnvBase.cpp > Escape barriers + deoptimize objects for target thread. Good. 
> > > src/hotspot/share/prims/jvmtiImpl.cpp > src/hotspot/share/prims/jvmtiImpl.hpp > The sequence is pretty complex: > VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation). > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization). > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread. > But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok. > > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good. > > > src/hotspot/share/prims/jvmtiTagMap.cpp > Escape barriers + deoptimize objects for all threads. Ok. > > > src/hotspot/share/prims/whitebox.cpp > Added WB_IsFrameDeoptimized to API. Ok. > > > src/hotspot/share/runtime/deoptimization.cpp > Object deoptimization. I have more comments and proposals, here. > First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct. > Comments are sufficient to understand why things are done as they are implemented. > > BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal). > Anyway, looks correct, too. > > Typo in comment: "regularily" => "regularly" > > Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or such kind of issues. > > EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread(). > > You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call. 
> > I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects. > > Typo in comment: "we must only deoptimize" => "we only have to deoptimize" > > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file. > > I'll get back to suspend flags, later. > > There are weird cases regarding _self_deoptimization_in_progress. > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired. > I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all. > > I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request. > > Change in thred_added: > I think the sequence would be more comprehensive if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach). > Your version makes new threads come up with suspend flag set. That looks correct, too. Advantage is that you only have to change one place (thread_added). It'll be interesting to see how it will look like when we use async handshakes instead of suspend flags. > For now, I'm ok with your version. > > I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()). > > Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what. > Maybe adding suffixes would help a little bit, but I can also live with what you have. 
> Implementation looks correct to me. > > > src/hotspot/share/runtime/deoptimization.hpp > Escape barriers and object deoptimization functions. > Typo in comment: "helt" => "held" > > > src/hotspot/share/runtime/globals.hpp > Addition of develop flag DeoptimizeObjectsALotInterval. Ok. > > > src/hotspot/share/runtime/interfaceSupport.cpp > InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1. > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok. > > > src/hotspot/share/runtime/interfaceSupport.inline.hpp > Addition of deoptimizeAllObjects. Ok. > > > src/hotspot/share/runtime/mutexLocker.cpp > src/hotspot/share/runtime/mutexLocker.hpp > Addition of EscapeBarrier_lock. Ok. > > > src/hotspot/share/runtime/objectMonitor.cpp > Make recursion count relock aware. Ok. > > > src/hotspot/share/runtime/stackValue.hpp > Better reinitilization in StackValue. Good. > > > src/hotspot/share/runtime/thread.cpp > src/hotspot/share/runtime/thread.hpp > src/hotspot/share/runtime/thread.inline.hpp > wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects. > > In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with G?tz that it should be acceptable as temporary solution until async handshakes are available (which takes more time). So I'm ok with your change. > > You can use MutexLocker with Thread*. > > JVMTIDeferredUpdates: I agree with Robin. It'd be nice to move the class out of thread.hpp. > > > src/hotspot/share/runtime/vframe.cpp > Added support for entry frame to new_vframe. Ok. > > > src/hotspot/share/runtime/vframe_hp.cpp > src/hotspot/share/runtime/vframe_hp.hpp > > I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build). 
> > jvmtiDeferredLocalVariableSet::update_monitors: > Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe. > > > src/hotspot/share/utilities/macros.hpp > Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok. > > > test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java > test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c > New test. Will review separately. > > > test/jdk/TEST.ROOT > Addition of vm.jvmci as required property. Ok. > > > test/jdk/com/sun/jdi/EATests.java > test/jdk/com/sun/jdi/EATestsJVMCI.java > New test. Will review separately. > > > test/lib/sun/hotspot/WhiteBox.java > Added isFrameDeoptimized to API. Ok. > > > That was it. Best regards, > Martin > > >> -----Original Message----- >> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Reingruber, Richard >> Sent: Dienstag, 3. M?rz 2020 21:23 >> To: 'Robbin Ehn' ; Lindenmaier, Goetz >> ; David Holmes ; >> Vladimir Kozlov (vladimir.kozlov at oracle.com) >> ; serviceability-dev at openjdk.java.net; >> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- >> dev at openjdk.java.net >> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better >> Performance in the Presence of JVMTI Agents >> >> Hi Robbin, >> >>>> I understand that Robbin proposed to replace the usage of >>>> _suspend_flag with handshakes. Apparently, async handshakes >>>> are needed to do so. We have been waiting a while for removal >>>> of the _suspend_flag / introduction of async handshakes [2]. >>>> What is the status here? >> >>> I have an old prototype which I would like to continue to work on. >>> So do not assume asynch handshakes will make 15. >>> Even if it would, I think there are a lot more investigate work to remove >>> _suspend_flag. >> >> Let us know, if we can be of any help to you and be it only testing. 
>> >>>>> Full: >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ >> >>> DeoptimizeObjectsALotThread is only used in compileBroker.cpp. >>> You can move both declaration and definition to that file, no need to >> clobber >>> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) >> >> Will do. >> >>> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's >> own >>> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. >> >> You are right. It shouldn't be declared in thread.hpp. I will look into that. >> >>> Note that we also think we may have a bug in deopt: >>> https://bugs.openjdk.java.net/browse/JDK-8238237 >> >>> I think it would be best, if possible, to push after that is resolved. >> >> Sure. >> >>> Not even nearly a full review :) >> >> I know :) >> >> Anyways, thanks a lot, >> Richard. >> >> >> -----Original Message----- >> From: Robbin Ehn >> Sent: Monday, March 2, 2020 11:17 AM >> To: Lindenmaier, Goetz ; Reingruber, Richard >> ; David Holmes ; >> Vladimir Kozlov (vladimir.kozlov at oracle.com) >> ; serviceability-dev at openjdk.java.net; >> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime- >> dev at openjdk.java.net >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance >> in the Presence of JVMTI Agents >> >> Hi, >> >> On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> I had a look at the progress of this change. Nothing >>> happened since Richard posted his update using more >>> handshakes [1]. >>> But we (SAP) would appreciate a lot if this change could >>> be successfully reviewed and pushed. >>> >>> I think there is basic understanding that this >>> change is helpful. It fixes a number of issues with JVMTI, >>> and will deliver the same performance benefits as EA >>> does in current production mode for debugging scenarios. >>> >>> This is important for us as we run our VMs prepared >>> for debugging in production mode. 
>>> >>> I understand that Robbin proposed to replace the usage of >>> _suspend_flag with handshakes. Apparently, async handshakes >>> are needed to do so. We have been waiting a while for removal >>> of the _suspend_flag / introduction of async handshakes [2]. >>> What is the status here? >> >> I have an old prototype which I would like to continue to work on. >> So do not assume asynch handshakes will make 15. >> Even if it would, I think there are a lot more investigate work to remove >> _suspend_flag. >> >>> >>> I think we should no longer wait, but proceed with >>> this change. We will look into removing the usage of >>> suspend_flag introduced here once it is possible to implement >>> it with handshakes. >> >> Yes, sure. >> >>>> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ >> >> DeoptimizeObjectsALotThread is only used in compileBroker.cpp. >> You can move both declaration and definition to that file, no need to clobber >> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry) >> >> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's >> own >> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp. >> >> Note that we also think we may have a bug in deopt: >> https://bugs.openjdk.java.net/browse/JDK-8238237 >> >> I think it would be best, if possible, to push after that is resolved. >> >> Not even nearly a full review :) >> >> Thanks, Robbin >> >> >>>> Incremental: >>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/ >>>> >>>> I was not able to eliminate the additional suspend flag now. I'll take care >> of this >>>> as soon as the >>>> existing suspend-resume-mechanism is reworked. 
>>>> >>>> Testing: >>>> >>>> Nightly tests @SAP: >>>> >>>> JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, >> Renaissance >>>> Suite, SAP specific tests >>>> with fastdebug and release builds on all platforms >>>> >>>> Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x >> parallel >>>> for 24h >>>> >>>> Thanks, Richard. >>>> >>>> >>>> More details on the changes: >>>> >>>> * Hide DeoptimizeObjectsALotThread from external view. >>>> >>>> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock. >>>> It used to be _safepoint_check_sometimes, which will be eliminated >> sooner or >>>> later. >>>> I added explicit thread state changes with ThreadBlockInVM to code >> paths >>>> where we can wait() >>>> on EscapeBarrier_lock to become safepoint safe. >>>> >>>> * Use handshake EscapeBarrierSuspendHandshake to suspend target >> threads >>>> instead of vm operation >>>> VM_ThreadSuspendAllForObjDeopt. >>>> >>>> * Removed uses of Threads_lock. When adding a new thread we suspend >> it iff >>>> EA optimizations are >>>> being reverted. In the previous version we were waiting on >> Threads_lock >>>> while EA optimizations >>>> were reverted. See EscapeBarrier::thread_added(). >>>> >>>> * Made tests require Xmixed compilation mode. >>>> >>>> * Made tests agnostic regarding tiered compilation. >>>> I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or >>>> disabled. >>>> >>>> * Exercising EATests.java as well with stress test options >>>> DeoptimizeObjectsALot* >>>> Due to the non-deterministic deoptimizations some tests need to be >> skipped. >>>> We do this to prevent bit-rot of the stress test code. >>>> >>>> * Executing EATests.java as well with graal if available. Driver for this is >>>> EATestsJVMCI.java. Graal cannot pass all tests, because it does not >> provide all >>>> the new debug info >>>> (namely not_global_escape_in_scope and arg_escape in >> scopeDesc.hpp). 
>>>> And graal does not yet support the JVMTI operations force early return >> and >>>> pop frame. >>>> >>>> * Removed tracing from new jdi tests in EATests.java. Too much trace >> output >>>> before the debugging >>>> connection is established can cause deadlock because output buffers fill >> up. >>>> (See https://bugs.openjdk.java.net/browse/JDK-8173304) >>>> >>>> * Many copyright year changes and smaller clean-up changes of testing >> code >>>> (trailing white-space and >>>> the like). >>>> >>>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Donnerstag, 19. Dezember 2019 03:12 >>>> To: Reingruber, Richard ; serviceability- >>>> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; >> hotspot- >>>> runtime-dev at openjdk.java.net; Vladimir Kozlov >> (vladimir.kozlov at oracle.com) >>>> >>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >> Performance in >>>> the Presence of JVMTI Agents >>>> >>>> Hi Richard, >>>> >>>> I think my issue is with the way EliminateNestedLocks works so I'm going >>>> to look into that more deeply. >>>> >>>> Thanks for the explanations. >>>> >>>> David >>>> >>>> On 18/12/2019 12:47 am, Reingruber, Richard wrote: >>>>> Hi David, >>>>> >>>>> > > > Some further queries/concerns: >>>>> > > > >>>>> > > > src/hotspot/share/runtime/objectMonitor.cpp >>>>> > > > >>>>> > > > Can you please explain the changes to ObjectMonitor::wait: >>>>> > > > >>>>> > > > ! _recursions = save // restore the old recursion count >>>>> > > > ! + jt->get_and_reset_relock_count_after_wait(); // >>>>> > > > increased by the deferred relock count >>>>> > > > >>>>> > > > what is the "deferred relock count"? I gather it relates to >>>>> > > > >>>>> > > > "The code was extended to be able to deoptimize objects of a >>>>> > > frame that >>>>> > > > is not the top frame and to let another thread than the owning >>>>> > > thread do >>>>> > > > it." >>>>> > > >>>>> > > Yes, these relate. 
Currently EA based optimizations are reverted, >> when a >>>> compiled frame is >>>>> > > replaced with corresponding interpreter frames. Part of this is >> relocking >>>> objects with eliminated >>>>> > > locking. New with the enhancement is that we do this also just >> before >>>> object references are >>>>> > > acquired through JVMTI. In this case we deoptimize also the >> owning >>>> compiled frame C and we >>>>> > > register deoptimized objects as deferred updates. When control >> returns >>>> to C it gets deoptimized, >>>>> > > we notice that objects are already deoptimized (reallocated and >>>> relocked), so we don't do it again >>>>> > > (relocking twice would be incorrect of course). Deferred updates >> are >>>> copied into the new >>>>> > > interpreter frames. >>>>> > > >>>>> > > Problem: relocking is not possible if the target thread T is waiting >> on the >>>> monitor that needs to >>>>> > > be relocked. This happens only with non-local objects with >>>> EliminateNestedLocks. Instead relocking >>>>> > > is deferred until T owns the monitor again. This is what the piece of >>>> code above does. >>>>> > >>>>> > Sorry I need some more detail here. How can you wait() on an >> object >>>>> > monitor if the object allocation and/or locking was optimised away? >> And >>>>> > what is a "non-local object" in this context? Isn't EA restricted to >>>>> > thread-confined objects? >>>>> >>>>> "Non-local object" is an object that escapes its thread. The issue I'm >>>> addressing with the changes >>>>> in ObjectMonitor::wait are almost unrelated to EA. They are caused by >>>> EliminateNestedLocks, where C2 >>>>> eliminates recursive locking of an already owned lock. The lock owning >> object >>>> exists on the heap, it >>>>> is locked and you can call wait() on it. >>>>> >>>>> EliminateLocks is the C2 option that controls lock elimination based on >> EA. 
>>>> Both optimizations have >>>>> in common that objects with eliminated locking need to be relocked when deoptimizing a frame, >>>>> i.e. when replacing a compiled frame with equivalent interpreter >>>>> frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can >>>>> be a mix of eliminated nested locks and locks of non-escaping objects. >>>>> >>>>> New with the enhancement: I call relock_objects earlier, just before objects potentially >>>>> escape. But then later when the owning compiled frame gets deoptimized, I must not do it again: >>>>> >>>>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp: >>>>> >>>>> if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks)) >>>>> && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) { >>>>> bool unused; >>>>> eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused); >>>>> } >>>>> >>>>> Now when calling relock_objects early it is quite possible that I have to relock an object the >>>>> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to >>>>> introduce relock_count_after_wait to JavaThread. >>>>> >>>>> > Is it just that some of the locking gets optimized away e.g. >>>>> > >>>>> > synchronized(obj) { >>>>> > synchronized(obj) { >>>>> > synchronized(obj) { >>>>> > obj.wait(); >>>>> > } >>>>> > } >>>>> > } >>>>> > >>>>> > If this is reduced to a form as-if it were a single lock of the monitor >>>>> > (due to EA) and the wait() triggers a JVM TI event which leads to the >>>>> > escape of "obj" then we need to reconstruct the true lock state, and so >>>>> > when the wait() internally unblocks and reacquires the monitor it has to >>>>> > set the true recursion count to 3, not the 1 that it appeared to be when >>>>> > wait() was initially called. Is that the scenario?
>>>>> >>>>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event >>>>> triggered by wait. >>>>> >>>>> Add >>>>> >>>>> LocalObject l1 = new LocalObject(); >>>>> >>>>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in >>>>> question. >>>>> >>>>> See that relocking/reallocating is transactional. If it is done then for /all/ objects in scope and it is >>>>> done at most once. It wouldn't be quite so easy to split this in relocking of nested/EA-based >>>>> eliminated locks. >>>>> >>>>> > If so I find this truly awful. Anyone using wait() in a realistic form >>>>> > requires a notification and so the object cannot be thread confined. In >>>>> >>>>> It is not thread confined. >>>>> >>>>> > which case I would strongly argue that upon hitting the wait() the deopt >>>>> > should occur unconditionally and so the lock state is correct before we >>>>> > wait and so we don't need to mess with the recursion count internally >>>>> > when we reacquire the monitor. >>>>> > >>>>> > > >>>>> > > > which I don't like the sound of at all when it comes to ObjectMonitor >>>>> > > > state. So I'd like to understand in detail exactly what is going on here >>>>> > > > and why. This is a very intrusive change that seems to badly break >>>>> > > > encapsulation and impacts future changes to ObjectMonitor that are under >>>>> > > > investigation. >>>>> > > >>>>> > > I would not regard this as breaking encapsulation. Certainly not badly. >>>>> > > >>>>> > > I've added a property relock_count_after_wait to JavaThread. The property is well >>>>> > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free >>>>> > > in choosing a way to do that as long as that property is taken into account. This is hardly a >>>>> > > limitation.
>>>>> > >>>>> > I do think this badly breaks encapsulation as you have to add a >> callout >>>>> > from the guts of the ObjectMonitor code to reach into the thread to >> get >>>>> > this lock count adjustment. I understand why you have had to do >> this but >>>>> > I would much rather see a change to the EA optimisation strategy so >> that >>>>> > this is not needed. >>>>> > >>>>> > > Note also that the property is a straight forward extension of the >>>> existing concept of deferred >>>>> > > local updates. It is embedded into the structure holding them. So >> not >>>> even the footprint of a >>>>> > > JavaThread is enlarged if no deferred updates are generated. >>>>> > >>>>> > [...] >>>>> > >>>>> > > >>>>> > > I'm actually duplicating the existing external suspend mechanism, >>>> because a thread can be >>>>> > > suspended at most once. And hey, and don't like that either! But it >>>> seems not unlikely that the >>>>> > > duplicate can be removed together with the original and the new >> type >>>> of handshakes that will be >>>>> > > used for thread suspend can be used for object deoptimization >> too. See >>>> today's discussion in >>>>> > > JDK-8227745 [2]. >>>>> > >>>>> > I hope that discussion bears some fruit, at the moment it seems not >> to >>>>> > be possible to use handshakes here. :( >>>>> > >>>>> > The external suspend mechanism is a royal pain in the proverbial >> that we >>>>> > have to carefully live with. The idea that we're duplicating that for >>>>> > use in another fringe area of functionality does not thrill me at all. >>>>> > >>>>> > To be clear, I understand the problem that exists and that you wish >> to >>>>> > solve, but for the runtime parts I balk at the complexity cost of >>>>> > solving it. >>>>> >>>>> I know it's complex, but by far no rocket science. >>>>> >>>>> Also I find it hard to imagine another fix for JDK-8233915 besides >> changing >>>> the JVM TI specification. >>>>> >>>>> Thanks, Richard. 
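To make the deferred relock count discussed above more concrete, here is a toy model in plain Java. It is only a sketch and not HotSpot code: ToyThread, ToyMonitor and their methods are hypothetical names, and the counter is assumed to count recursive entries beyond the first one. Only the final assignment mirrors the quoted change, `_recursions = save + jt->get_and_reset_relock_count_after_wait()`.

```java
// Toy model only: all class and method names are hypothetical, not HotSpot code.
public class DeferredRelockSketch {

    /** Stands in for the per-thread relock_count_after_wait property. */
    static final class ToyThread {
        private int relockCountAfterWait;

        void deferRelock() {
            relockCountAfterWait++;
        }

        int getAndResetRelockCountAfterWait() {
            int n = relockCountAfterWait;
            relockCountAfterWait = 0;
            return n;
        }
    }

    /** Stands in for a monitor that tracks recursive entries by its owner. */
    static final class ToyMonitor {
        int recursions;        // assumed: recursive entries beyond the first one
        ToyThread owner;
        ToyThread waiter;
        private int saved;

        void enter(ToyThread t) {
            if (owner == t) {
                recursions++;
            } else {
                owner = t;
                recursions = 0;
            }
        }

        /** The owner starts waiting: it gives up the monitor and remembers its count. */
        void beginWait(ToyThread t) {
            saved = recursions;
            recursions = 0;
            owner = null;
            waiter = t;
        }

        /** Called once per eliminated lock when optimizations are reverted. */
        void relock(ToyThread target) {
            if (waiter == target) {
                target.deferRelock();   // target does not own the monitor now, so defer
            } else {
                enter(target);
            }
        }

        /** The waiter reacquires the monitor: restore the old count plus deferred relocks. */
        void endWait(ToyThread t) {
            waiter = null;
            owner = t;
            recursions = saved + t.getAndResetRelockCountAfterWait();
        }
    }

    static int simulate() {
        ToyThread t = new ToyThread();
        ToyMonitor m = new ToyMonitor();
        m.enter(t);        // compiled code: only the outermost of three nested locks is real
        m.beginWait(t);    // obj.wait()
        m.relock(t);       // reverting the two eliminated nested locks while t waits
        m.relock(t);
        m.endWait(t);
        return m.recursions;
    }

    public static void main(String[] args) {
        System.out.println("recursions after wait: " + simulate());
    }
}
```

In this model the thread ends up with two recursive entries on top of the first one, which corresponds to the three nested synchronized blocks of the example.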
>>>>> >>>>> -----Original Message----- >>>>> From: David Holmes >>>>> Sent: Dienstag, 17. Dezember 2019 08:03 >>>>> To: Reingruber, Richard ; serviceability- >>>> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; >> hotspot- >>>> runtime-dev at openjdk.java.net; Vladimir Kozlov >> (vladimir.kozlov at oracle.com) >>>> >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >> Performance >>>> in the Presence of JVMTI Agents >>>>> >>>>> >>>>> >>>>> David >>>>> >>>>> On 17/12/2019 4:57 pm, David Holmes wrote: >>>>>> Hi Richard, >>>>>> >>>>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> ?? > Some further queries/concerns: >>>>>>> ?? > >>>>>>> ?? > src/hotspot/share/runtime/objectMonitor.cpp >>>>>>> ?? > >>>>>>> ?? > Can you please explain the changes to ObjectMonitor::wait: >>>>>>> ?? > >>>>>>> ?? > !?? _recursions = save????? // restore the old recursion count >>>>>>> ?? > !???????????????? + jt->get_and_reset_relock_count_after_wait(); // >>>>>>> ?? > increased by the deferred relock count >>>>>>> ?? > >>>>>>> ?? > what is the "deferred relock count"? I gather it relates to >>>>>>> ?? > >>>>>>> ?? > "The code was extended to be able to deoptimize objects of a >>>>>>> frame that >>>>>>> ?? > is not the top frame and to let another thread than the owning >>>>>>> thread do >>>>>>> ?? > it." >>>>>>> >>>>>>> Yes, these relate. Currently EA based optimizations are reverted, >> when >>>>>>> a compiled frame is replaced >>>>>>> with corresponding interpreter frames. Part of this is relocking >>>>>>> objects with eliminated >>>>>>> locking. New with the enhancement is that we do this also just before >>>>>>> object references are acquired >>>>>>> through JVMTI. In this case we deoptimize also the owning compiled >>>>>>> frame C and we register >>>>>>> deoptimized objects as deferred updates. 
When control returns to C >> it >>>>>>> gets deoptimized, we notice >>>>>>> that objects are already deoptimized (reallocated and relocked), so >> we >>>>>>> don't do it again (relocking >>>>>>> twice would be incorrect of course). Deferred updates are copied into >>>>>>> the new interpreter frames. >>>>>>> >>>>>>> Problem: relocking is not possible if the target thread T is waiting >>>>>>> on the monitor that needs to be >>>>>>> relocked. This happens only with non-local objects with >>>>>>> EliminateNestedLocks. Instead relocking is >>>>>>> deferred until T owns the monitor again. This is what the piece of >>>>>>> code above does. >>>>>> >>>>>> Sorry I need some more detail here. How can you wait() on an object >>>>>> monitor if the object allocation and/or locking was optimised away? >> And >>>>>> what is a "non-local object" in this context? Isn't EA restricted to >>>>>> thread-confined objects? >>>>>> >>>>>> Is it just that some of the locking gets optimized away e.g. >>>>>> >>>>>> synchronised(obj) { >>>>>> ? synchronised(obj) { >>>>>> ??? synchronised(obj) { >>>>>> ????? obj.wait(); >>>>>> ??? } >>>>>> ? } >>>>>> } >>>>>> >>>>>> If this is reduced to a form as-if it were a single lock of the monitor >>>>>> (due to EA) and the wait() triggers a JVM TI event which leads to the >>>>>> escape of "obj" then we need to reconstruct the true lock state, and so >>>>>> when the wait() internally unblocks and reacquires the monitor it has to >>>>>> set the true recursion count to 3, not the 1 that it appeared to be when >>>>>> wait() was initially called. Is that the scenario? >>>>>> >>>>>> If so I find this truly awful. Anyone using wait() in a realistic form >>>>>> requires a notification and so the object cannot be thread confined. 
In >>>>>> which case I would strongly argue that upon hitting the wait() the >> deopt >>>>>> should occur unconditionally and so the lock state is correct before we >>>>>> wait and so we don't need to mess with the recursion count internally >>>>>> when we reacquire the monitor. >>>>>> >>>>>>> >>>>>>> ?? > which I don't like the sound of at all when it comes to >>>>>>> ObjectMonitor >>>>>>> ?? > state. So I'd like to understand in detail exactly what is going >>>>>>> on here >>>>>>> ?? > and why.? This is a very intrusive change that seems to badly >> break >>>>>>> ?? > encapsulation and impacts future changes to ObjectMonitor that >>>>>>> are under >>>>>>> ?? > investigation. >>>>>>> >>>>>>> I would not regard this as breaking encapsulation. Certainly not badly. >>>>>>> >>>>>>> I've added a property relock_count_after_wait to JavaThread. The >>>>>>> property is well >>>>>>> encapsulated. Future ObjectMonitor implementations have to deal >> with >>>>>>> recursion too. They are free in >>>>>>> choosing a way to do that as long as that property is taken into >>>>>>> account. This is hardly a >>>>>>> limitation. >>>>>> >>>>>> I do think this badly breaks encapsulation as you have to add a callout >>>>>> from the guts of the ObjectMonitor code to reach into the thread to >> get >>>>>> this lock count adjustment. I understand why you have had to do this >> but >>>>>> I would much rather see a change to the EA optimisation strategy so >> that >>>>>> this is not needed. >>>>>> >>>>>>> Note also that the property is a straight forward extension of the >>>>>>> existing concept of deferred >>>>>>> local updates. It is embedded into the structure holding them. So not >>>>>>> even the footprint of a >>>>>>> JavaThread is enlarged if no deferred updates are generated. >>>>>>> >>>>>>> ?? > --- >>>>>>> ?? > >>>>>>> ?? > src/hotspot/share/runtime/thread.cpp >>>>>>> ?? > >>>>>>> ?? > Can you please explain why >>>>>>> JavaThread::wait_for_object_deoptimization >>>>>>> ?? 
> has to be handcrafted in this way rather than using proper >>>>>>> transitions. >>>>>>> ?? > >>>>>>> >>>>>>> I wrote wait_for_object_deoptimization taking >>>>>>> JavaThread::java_suspend_self_with_safepoint_check >>>>>>> as template. So in short: for the same reasons :) >>>>>>> >>>>>>> Threads reach both methods as part of thread state transitions, >>>>>>> therefore special handling is >>>>>>> required to change thread state on top of ongoing transitions. >>>>>>> >>>>>>> ?? > We got rid of "deopt suspend" some time ago and it is disturbing >>>>>>> to see >>>>>>> ?? > it being added back (effectively). This seems like it may be >>>>>>> something >>>>>>> ?? > that handshakes could be used for. >>>>>>> >>>>>>> Deopt suspend used to be something rather different with a similar >>>>>>> name[1]. It is not being added back. >>>>>> >>>>>> I stand corrected. Despite comments in the code to the contrary >>>>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of >>>>>> cleanup in this area 13 years ago :) >>>>>> >>>>>>> >>>>>>> I'm actually duplicating the existing external suspend mechanism, >>>>>>> because a thread can be suspended >>>>>>> at most once. And hey, and don't like that either! But it seems not >>>>>>> unlikely that the duplicate can >>>>>>> be removed together with the original and the new type of >> handshakes >>>>>>> that will be used for >>>>>>> thread suspend can be used for object deoptimization too. See >> today's >>>>>>> discussion in JDK-8227745 [2]. >>>>>> >>>>>> I hope that discussion bears some fruit, at the moment it seems not to >>>>>> be possible to use handshakes here. :( >>>>>> >>>>>> The external suspend mechanism is a royal pain in the proverbial that >> we >>>>>> have to carefully live with. The idea that we're duplicating that for >>>>>> use in another fringe area of functionality does not thrill me at all. 
>>>>>> >>>>>> To be clear, I understand the problem that exists and that you wish to >>>>>> solve, but for the runtime parts I balk at the complexity cost of >>>>>> solving it. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Thanks, Richard. >>>>>>> >>>>>>> [1] Deopt suspend was something like an async. handshake for >>>>>>> architectures with register windows, >>>>>>> ???? where patching the return pc for deoptimization of a compiled >>>>>>> frame was racy if the owner thread >>>>>>> ???? was in native code. Instead a "deopt" suspend flag was set on >>>>>>> which the thread patched its own >>>>>>> ???? frame upon return from native. So no thread was suspended. It >> got >>>>>>> its name only from the name of >>>>>>> ???? the flags. >>>>>>> >>>>>>> [2] Discussion about using handshakes to sync. with the target thread: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK- >>>> >> 8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.syst >> e >>>> m.issuetabpanels:comment-tabpanel#comment-14306727 >>>>>>> >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: David Holmes >>>>>>> Sent: Freitag, 13. Dezember 2019 00:56 >>>>>>> To: Reingruber, Richard ; >>>>>>> serviceability-dev at openjdk.java.net; >>>>>>> hotspot-compiler-dev at openjdk.java.net; >>>>>>> hotspot-runtime-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>>>>> Performance in the Presence of JVMTI Agents >>>>>>> >>>>>>> Hi Richard, >>>>>>> >>>>>>> Some further queries/concerns: >>>>>>> >>>>>>> src/hotspot/share/runtime/objectMonitor.cpp >>>>>>> >>>>>>> Can you please explain the changes to ObjectMonitor::wait: >>>>>>> >>>>>>> !?? _recursions = save????? // restore the old recursion count >>>>>>> !???????????????? + jt->get_and_reset_relock_count_after_wait(); // >>>>>>> increased by the deferred relock count >>>>>>> >>>>>>> what is the "deferred relock count"? 
I gather it relates to >>>>>>> >>>>>>> "The code was extended to be able to deoptimize objects of a frame >> that >>>>>>> is not the top frame and to let another thread than the owning thread >> do >>>>>>> it." >>>>>>> >>>>>>> which I don't like the sound of at all when it comes to ObjectMonitor >>>>>>> state. So I'd like to understand in detail exactly what is going on here >>>>>>> and why.? This is a very intrusive change that seems to badly break >>>>>>> encapsulation and impacts future changes to ObjectMonitor that are >> under >>>>>>> investigation. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/hotspot/share/runtime/thread.cpp >>>>>>> >>>>>>> Can you please explain why >> JavaThread::wait_for_object_deoptimization >>>>>>> has to be handcrafted in this way rather than using proper transitions. >>>>>>> >>>>>>> We got rid of "deopt suspend" some time ago and it is disturbing to >> see >>>>>>> it being added back (effectively). This seems like it may be something >>>>>>> that handshakes could be used for. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>> On 12/12/2019 7:02 am, David Holmes wrote: >>>>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> ??? > Most of the details here are in areas I can comment on in >> detail, >>>>>>>>> but I >>>>>>>>> ??? > did take an initial general look at things. >>>>>>>>> >>>>>>>>> Thanks for taking the time! >>>>>>>> >>>>>>>> Apologies the above should read: >>>>>>>> >>>>>>>> "Most of the details here are in areas I *can't* comment on in detail >>>>>>>> ..." >>>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>>> ??? > The only thing that jumped out at me is that I think the >>>>>>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. >>>>>>>>> ??? > >>>>>>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } >>>>>>>>> >>>>>>>>> Yes, it should. Will add the method like above. >>>>>>>>> >>>>>>>>> ??? 
> Also I don't see any testing of the >> DeoptimizeObjectsALotThread. >>>>>>>>> Without >>>>>>>>> ??? > active testing this will just bit-rot. >>>>>>>>> >>>>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger >>>>>>>>> workload. I will add a minimal test >>>>>>>>> to keep it fresh. >>>>>>>>> >>>>>>>>> ??? > Also on the tests I don't understand your @requires clause: >>>>>>>>> ??? > >>>>>>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & >> vm.compiler2.enabled >>>> & >>>>>>>>> ??? > (vm.opt.TieredCompilation != true)) >>>>>>>>> ??? > >>>>>>>>> ??? > This seems to require that TieredCompilation is disabled, but >>>>>>>>> tiered is >>>>>>>>> ??? > our normal mode of operation. ?? >>>>>>>>> ??? > >>>>>>>>> >>>>>>>>> I removed the clause. I guess I wanted to target the tests towards >> the >>>>>>>>> code they are supposed to >>>>>>>>> test, and it's easier to analyze failures w/o tiered compilation and >>>>>>>>> with just one compiler thread. >>>>>>>>> >>>>>>>>> Additionally I will make use of >>>>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Richard. >>>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: David Holmes >>>>>>>>> Sent: Mittwoch, 11. 
Dezember 2019 08:03 >>>>>>>>> To: Reingruber, Richard ; >>>>>>>>> serviceability-dev at openjdk.java.net; >>>>>>>>> hotspot-compiler-dev at openjdk.java.net; >>>>>>>>> hotspot-runtime-dev at openjdk.java.net >>>>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>>>>>>> Performance in the Presence of JVMTI Agents >>>>>>>>> >>>>>>>>> Hi Richard, >>>>>>>>> >>>>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I would like to get reviews please for >>>>>>>>>> >>>>>>>>>> >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>>>>>>>> >>>>>>>>>> Corresponding RFE: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>>>>>>>> >>>>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK- >> 8214584 [1] >>>>>>>>>> >>>>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing >> without >>>>>>>>>> issues (thanks!). In addition the >>>>>>>>>> change is being tested at SAP since I posted the first RFR some >>>>>>>>>> months ago. >>>>>>>>>> >>>>>>>>>> The intention of this enhancement is to benefit performance wise >> from >>>>>>>>>> escape analysis even if JVMTI >>>>>>>>>> agents request capabilities that allow them to access local variable >>>>>>>>>> values. E.g. if you start-up >>>>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, >> then >>>>>>>>>> escape analysis is disabled right >>>>>>>>>> from the beginning, well before a debugger attaches -- if ever one >>>>>>>>>> should do so. With the >>>>>>>>>> enhancement, escape analysis will remain enabled until and after >> a >>>>>>>>>> debugger attaches. EA based >>>>>>>>>> optimizations are reverted just before an agent acquires the >>>>>>>>>> reference to an object. In the JBS item >>>>>>>>>> you'll find more details. 
>>>>>>>>> >>>>>>>>> Most of the details here are in areas I can comment on in detail, but I >>>>>>>>> did take an initial general look at things. >>>>>>>>> >>>>>>>>> The only thing that jumped out at me is that I think the >>>>>>>>> DeoptimizeObjectsALotThread should be a hidden thread. >>>>>>>>> >>>>>>>>> + bool is_hidden_from_external_view() const { return true; } >>>>>>>>> >>>>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>>>>>>> Without >>>>>>>>> active testing this will just bit-rot. >>>>>>>>> >>>>>>>>> Also on the tests I don't understand your @requires clause: >>>>>>>>> >>>>>>>>> @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>>>>>> (vm.opt.TieredCompilation != true)) >>>>>>>>> >>>>>>>>> This seems to require that TieredCompilation is disabled, but tiered is >>>>>>>>> our normal mode of operation. ?? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Richard. >>>>>>>>>> >>>>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 >>>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> From daniel.daugherty at oracle.com Tue Mar 31 14:41:01 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 31 Mar 2020 10:41:01 -0400 Subject: Thread Local Handshake in JVMTI functions In-Reply-To: <4c9aa3ab-468d-eede-18f7-ac8d352575b6@oss.nttdata.com> References: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com> <4c9aa3ab-468d-eede-18f7-ac8d352575b6@oss.nttdata.com> Message-ID: <64323780-b96e-0e3a-5f61-8f1dd74a1805@oracle.com> Add Robbin to this thread... This reminded me of the following RFE that Robbin filed: JDK-8201641 JVMTI: GetThreadListStackTraces should use Thread-Local Handshakes https://bugs.openjdk.java.net/browse/JDK-8201641 We could update 8201641 to include everything that Yasumasa-san is requesting. Would be a good place to track it...
Dan On 3/31/20 7:40 AM, Yasumasa Suenaga wrote: > Hi David, > > On 2020/03/31 19:16, David Holmes wrote: >> Hi Yasumasa, >> >> On 31/03/2020 8:06 pm, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> Many JVMTI functions use VM Operations to get information. However >>> some of them need to stop only one thread - they don't need to stop >>> all threads. >>> So I think we can use Thread Local Handshakes as in this webrev. It is >>> an example for GetOneCurrentContendedMonitor(). >> >> True, but at the moment handshakes involve the VMThread. There is >> work being done to support direct thread-to-thread handshakes and >> once that is done this kind of conversion should be more easily done. >> It might be worth waiting for that. > > Thanks, I will come back to this topic when thread-to-thread handshake > is done. > I wondered at first why the VMThread is involved in handshakes. That improvement > is welcome for me ;) > > > Cheers, > > Yasumasa > > >>> http://cr.openjdk.java.net/~ysuenaga/jvmti-thread-local-handshake/ >> >> An observation, it seems to me that calling_thread is not used when >> this is not a VMOperation. >> >> Cheers, >> David >> >>> Also I think we can replace the following VM Operations with Thread Local >>> Handshakes: >>> >>> class VM_GetCurrentLocation >>> class VM_EnterInterpOnlyMode >>> class VM_UpdateForPopTopFrame >>> class VM_SetFramePop >>> class VM_GetOwnedMonitorInfo >>> class VM_GetCurrentContendedMonitor >>> class VM_GetFrameCount >>> class VM_GetFrameLocation >>> >>> What do you think? >>> If it is acceptable, I will file it in JBS and send a review request.
>>> >>> >>> Thanks, >>> >>> Yasumasa From poonam.bajaj at oracle.com Tue Mar 31 16:19:23 2020 From: poonam.bajaj at oracle.com (Poonam Parhar) Date: Tue, 31 Mar 2020 09:19:23 -0700 Subject: Discussion about fixing deprecation in jdk.hotspot.agent In-Reply-To: <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com> References: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com> <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com> Message-ID: Hello Coleen, Does the removal of this code only impact the 'reattach' functionality, and it does not affect any commands available in 'clhsdb' once it is attached to a core file? If that's true, then I think it should be okay to remove this code. Thanks, Poonam On 3/31/20 5:34 AM, coleen.phillimore at oracle.com wrote: > > To answer my own question, this functionality is used to allow > detach/reattach from {cl}hsdb.? Which seems to work on linux but not > windows with this code removed. > > The next question is whether this is useful functionality to justify > all this code (900+ and this new code that Magnus has added).? Can't > you just exit and restart the clhsdb process on the core file or process? > > For the record, this is me playing with python to remove this code. > > http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html > > Thanks, > Coleen > > On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote: >> >> I was wondering why this is needed when debugging a core file, which >> is the key thing we need the SA for: >> >> ? /** This is used by both the debugger and any runtime system. It is >> ????? the basic mechanism by which classes which mimic underlying VM >> ????? functionality cause themselves to be initialized. The given >> ????? observer will be notified (with arguments (null, null)) when the >> ????? VM is re-initialized, as well as when it registers itself with >> ????? the VM. */ >> ? public static void registerVMInitializedObserver(Observer o) { >> ??? vmInitializedObservers.add(o); >> ??? 
o.update(null, null); >> ? } >> >> It seems like if it isn't needed, we shouldn't add these classes and >> remove their use. >> >> Coleen >> >> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote: >>> No opinions on this? >>> >>> /Magnus >>> >>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote: >>>> Hi everyone, >>>> >>>> As a follow-up to the ongoing review for JDK-8241618, I have also >>>> looked at fixing the deprecation warnings in jdk.hotspot.agent. >>>> These fall in three broad categories: >>>> >>>> * Deprecation of the boxing type constructors (e.g. "new >>>> Integer(42)"). >>>> >>>> * Deprecation of java.util.Observer and Observable. >>>> >>>> * The rest (mostly Class.newInstance(), and a few number of other >>>> odd deprecations) >>>> >>>> The first category is trivial to fix. The last category need some >>>> special discussion. But the overwhelming majority of deprecation >>>> warnings come from the use of Observer and Observable. This really >>>> dwarfs anything else, and needs to be handled first, otherwise it's >>>> hard to even spot the other issues. >>>> >>>> My analysis of the situation is that the deprecation of Observer >>>> and Observable seems a bit harsh, from the PoV of >>>> jdk.hotspot.agent. Sure, it might be limited, but I think it does >>>> exactly what is needed here. So the migration suggested in >>>> Observable (java.beans or java.util.concurrent) seems overkill. If >>>> there are genuine threading issues at play here, this assumption >>>> might be wrong, and then maybe going the j.u.c. route is correct. >>>> >>>> But if that's not, the main goal should be to stay with the current >>>> implementation. One way to do this is to sprinkle the code with >>>> @SuppressWarning. But I think a better way would be to just >>>> implement our own Observer and Observable. After all, the classes >>>> are trivial. 
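To give a sense of how small a private replacement for the deprecated pair could be, here is a minimal sketch. The names `SimpleObserver` and `SimpleObservable` are hypothetical; they are not taken from the mock-up webrev, and the sketch only assumes the same add / setChanged / notify contract that java.util.Observable documents.

```java
import java.util.ArrayList;
import java.util.List;

public class ObserverSketch {
    // Hypothetical stand-in for java.util.Observer.
    interface SimpleObserver {
        void update(SimpleObservable source, Object arg);
    }

    // Hypothetical stand-in for java.util.Observable.
    static class SimpleObservable {
        private final List<SimpleObserver> observers = new ArrayList<>();
        private boolean changed;

        synchronized void addObserver(SimpleObserver o) {
            if (o == null) {
                throw new NullPointerException();
            }
            if (!observers.contains(o)) {
                observers.add(o);
            }
        }

        synchronized void deleteObserver(SimpleObserver o) {
            observers.remove(o);
        }

        synchronized void setChanged() {
            changed = true;
        }

        // Notifies observers only if setChanged() was called since the last notification.
        void notifyObservers(Object arg) {
            List<SimpleObserver> snapshot;
            synchronized (this) {
                if (!changed) {
                    return;
                }
                snapshot = new ArrayList<>(observers);
                changed = false;
            }
            // Call back outside the lock so an observer may add or remove observers.
            for (SimpleObserver o : snapshot) {
                o.update(this, arg);
            }
        }
    }

    public static void main(String[] args) {
        SimpleObservable vm = new SimpleObservable();
        List<Object> seen = new ArrayList<>();
        vm.addObserver((source, arg) -> seen.add(arg));
        vm.setChanged();
        vm.notifyObservers("initialized");
        System.out.println(seen);
    }
}
```

Whether a copy like this is preferable to @SuppressWarnings is the judgment call under discussion in this thread; the sketch only shows that the amount of code involved is modest.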
>>>>
>>>> I've made a mock-up of this solution, where I just copied the
>>>> java.util.Observer and Observable, and removed the deprecation
>>>> annotations. The only thing needed for the rest of the code is to
>>>> make sure we import these; I've done this for three arbitrarily
>>>> selected classes just to show what the change would typically look
>>>> like. Here's the mock-up:
>>>>
>>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>>>
>>>> Let me know what you think.
>>>>
>>>> /Magnus
>>>
>>
>

From serguei.spitsyn at oracle.com Tue Mar 31 16:59:52 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 31 Mar 2020 09:59:52 -0700
Subject: Thread Local Handshake in JVMTI functions
In-Reply-To: <64323780-b96e-0e3a-5f61-8f1dd74a1805@oracle.com>
References: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com> <4c9aa3ab-468d-eede-18f7-ac8d352575b6@oss.nttdata.com> <64323780-b96e-0e3a-5f61-8f1dd74a1805@oracle.com>
Message-ID: <8ced0d82-4125-7179-bac2-c8ce54807274@oracle.com>

Hi Yasumasa,

Yes, this work needs to be done.
I'll take a look at your webrev.

Thanks,
Serguei

From mandy.chung at oracle.com Tue Mar 31 18:06:58 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 31 Mar 2020 11:06:58 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <940c6907-612e-8744-376c-5362991d4a42@oracle.com>

This patch addresses Joe's feedback on the CSR [1]:

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03-delta-jdarcy/

Specifically, it adds to the class specification of java.lang.Class to describe how the relevant methods behave for hidden classes. In addition, use the new inline @jvms tag.
Thanks Mandy [1] https://bugs.openjdk.java.net/browse/JDK-8238359 On 3/26/20 4:57 PM, Mandy Chung wrote: > Please review the implementation of JEP 371: Hidden Classes. The main > changes are in core-libs and hotspot runtime area.? Small changes are > made in javac, VM compiler (intrinsification of Class::isHiddenClass), > JFR, JDI, and jcmd.? CSR [1]has been reviewed and is in the finalized > state (see specdiff and javadoc below for reference). > > Webrev: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 > > > Hidden class is created via `Lookup::defineHiddenClass`. From JVM's point > of view, a hidden class is a normal class except the following: > > - A hidden class has no initiating class loader and is not registered > in any dictionary. > - A hidden class has a name containing an illegal character > `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` > returns "Lp/Foo.0x1234;". > - A hidden class is not modifiable, i.e. cannot be redefined or > retransformed. JVM TI IsModifableClass returns false on a hidden. > - Final fields in a hidden class is "final".? The value of final > fields cannot be overriden via reflection.? setAccessible(true) can > still be called on reflected objects representing final fields in a > hidden class and its access check will be suppressed but only have > read-access (i.e. can do Field::getXXX but not setXXX). > > Brief summary of this patch: > > 1. A new Lookup::defineHiddenClass method is the API to create a > hidden class. > 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG > option that > ?? can be specified when creating a hidden class. > 3. A new Class::isHiddenClass method tests if a class is a hidden class. > 4. Field::setXXX method will throw IAE on a final field of a hidden class > ?? regardless of the value of the accessible flag. > 5. JVM_LookupDefineClass is the new JVM entry point for > Lookup::defineClass > ?? 
and defineHiddenClass to create a class from the given bytes. > 6. ClassLoaderData implementation is not changed.? There is one > primary CLD > ?? that holds the classes strongly referenced by its defining loader.? > There > ?? can be zero or more additional CLDs - one per weak class. > 7. Nest host determination is updated per revised JVMS 5.4.4. Access > control > ?? check no longer throws LinkageError but instead it will throw IAE with > ?? a clear message if a class fails to resolve/validate the nest host > declared > ?? in NestHost/NestMembers attribute. > 8. JFR, jcmd, JDI are updated to support hidden classes. > 9. update javac LambdaToMethod as lambda proxy starts using nestmates > ?? and generate a bridge method to desuger a method reference to a > protected > ?? method in its supertype in a different package > > This patch also updates StringConcatFactory, LambdaMetaFactory, and > LambdaForms > to use hidden classes.? The webrev includes changes in nashorn to > hidden class > and I will update the webrev if JEP 372 removes it any time soon. > > We uncovered a bug in Lookup::defineClass spec throws LinkageError and > intends > to have the newly created class linked.? However, the implementation > in 14 > does not link the class.? A separate CSR [2] proposes to update the > implementation to match the spec.? This patch fixes the implementation. > > The spec update on JVM TI, JDI and Instrumentation will be done as > a separate RFE [3].? This patch includes new tests for JVM TI and > java.instrument that validates how the existing APIs work for hidden > classes. 
> > javadoc/specdiff > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ > > > JVMS 5.4.4 change: > http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf > > > CSR: > https://bugs.openjdk.java.net/browse/JDK-8238359 > > Thanks > Mandy > [1] https://bugs.openjdk.java.net/browse/JDK-8238359 > [2] https://bugs.openjdk.java.net/browse/JDK-8240338 > [3] https://bugs.openjdk.java.net/browse/JDK-8230502 -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonid.mesnik at oracle.com Tue Mar 31 19:09:30 2020 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Tue, 31 Mar 2020 12:09:30 -0700 Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified In-Reply-To: References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com> <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com> <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com> Message-ID: Hi On 3/30/20 9:43 PM, Chris Plummer wrote: > Hi Leonid, > > On 3/30/20 5:42 PM, Leonid Mesnik wrote: >> Hi >> >> See my comments inline. I will update webrev after go through all >> your comments. >> >> >> On 3/30/20 11:39 AM, Chris Plummer wrote: >>> Hi Leonid, >>> >>> I haven't gone through all the tests yet.? I've accumulated enough >>> questions that I'd like to see them answered or addressed before I >>> continue on. >>> >>> This isn't directly related to your changes, but I noticed that >>> users of JDKToolLauncher do nothing to make sure that default test >>> options are used. This means we are never running these tools with >>> the test options being specified with the jtreg run. 
Is that a bug >>> or intentional? >> >> Which "default test options" do you mean? We have 2 properties to set >> JVM options. The idea is to pass test.vm.opts to ALL java processes >> and test.java.opts? to only tested processes if applicable. Usually, >> for example we don't want to run jcmd with -Xcomp. test.vm.opts was >> used (a long time ago) for options like '-d32/-d64' on Solaris where >> JVM don't start without choosing correct version. Also, it is used to >> reduce maximum heap for all JVM instances when tests are running >> concurrently. >> >> So, probably test.vm.opts (or test.vm.tools.opts) should be added by >> JDKToolLauncher but not test.java.opts. It is separate topic, there >> are a lot of launchers which ignore test.vm.opts now. > I always get confused about which set of options these properties > represent, but basically I'm suggesting that if for example we are > doing a -Xcomp run in mach5, JDKToolLauncher (at least in some cases) > should be launched with this option. I think this is what you get from > Utils.getTestJavaOpts(),. > > For example the SA tests use > JDKToolLauncher.createUsingTestJDK("jhsdb"). jhsdb is what is really > being tested here, and it should be launched with the test vm options. > Currently we launch the target process with these options, which is > probably also a good idea.? Also we aren't too concerned with the > options that the test itself is run with, although I'm guessing they > also get run with the test java opts. 
So we have 3 processes here: > ?- jhsdb, which should be getting test java opts but is not > ?- the target process, which should be getting test java opts and > currently is > ?- the test itself, where options don't really matter, but is getting > passed test java opts > > However, you could argue that tests like jinfo, jstack, and jcmd, all > of which use the Attach API and the bulk of the work is done on the > target process, are not that concerned with the options passed to the > command, but do want the options passed to the target process. Well, it is a good question if we want to run jhsdb tool itself with additional slow options like Xcomp. Does it help us to improve coverage? IIRC the original idea of adding test.java/vm.opts was to don't waste time executing javac and debuggers in slow mode on SPARC. Anyway, it is a separate question which is out of scope of this change. We might want to review all debugger/debugee tests to find better way to deal with this. >> >>> >>> In the problem lists, is it necessary to list the test multiple >>> times with #id0, #id1, etc, or could you list it just once and leave >>> that part off. It seems very error prone. Also, changing tests like >>> ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split out the >>> testing in this manner seems completely unrelated to this CR, >>> especially when the tests do not even contain any changes related to >>> the CR. >> >> I think, that these chages are related. The startApp(...) was updated >> so some test combinations become invalid or redundant. >> >> ClhsdbFindPC and ClhsdbJstack were always run twice. Now, when test >> options passed in test it is not needed to run it twice when Xcomp is >> already set by user. >> > Ok. I see now that the second test run, which is the non -Xcomp run, > adds '@requires vm.compMode != "Xcomp"'. But this also is strange. 
The > first test run, which does not have the @requires and is the one that > makes LingeredApp launch with -Xcomp, will always run whether or not > it is an -Xcomp test run. So it will run as part of the a regular test > run and as part of a -Xcomp test run. The only difference between the > two is the -Xcomp run will also run the test with -Xcomp, but that's > not really needed (I think it will also end up passing -Xcomp to the > target processs twice). Perhaps '@requires vm.compMode == "Xcomp"' > should be used for the first test run, but that means it no longer > gets run until later tiers when we use -Xcomp. Why not revert it back > to a single test, but also add '@requires vm.compMode != "Xcomp"'. > Then it gets run both ways in an early tier and not run during the > -Xcomp run, which isn't really needed. There several flag which are executed with Xcomp only: "-XX:-DoEscapeAnalysis",? "-XX:-UseBiasedLocking", "-XX:+DeoptimizeALot" where this test is going to be skipped. So we never run test with these options. The original idea is to run test with given options and with added Xcomp.? I left logic the same and only skip run with "Xcomp" when it is set already by user. I agree that we have some duplication here and it could be improved, but it could be done separately. If you are ok with this let me file separate RFE for this. > >> ClhsdbScanOops is fixed to don't allow to run incompatible GC >> combination. > Ok >> >> So I should update these tests by splitting them or change them to? >> startAppExactJvmOpts() if we wan't continue to ignore user-given test >> options. > I don't think I was suggesting removing user-given test options. I > don't see why you would. I just wanted to say that these tests are affected by my changes and should be fixed anyway. Leonid >> >> It seems that #idN are required by jtreg now, otherwise it just run >> test. > Ok. >> >>> >>> ?426???? public static LingeredApp startApp(String... 
>>> additionalJvmOpts) throws IOException { >>> >>> The default test opts are appended to additionalJvmOpts, and if you >>> want prepended you need to call Utils.prependTestJavaOpts(). I would >>> have thought the opposite would be more desirable and expected >>> default behavior. Why did you choose this way? I also find it >>> somewhat confusing that there is even a default mode for where the >>> additionalJvmOpts go. Maybe it would be best to have >>> startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make it >>> explicit. This would also be in line with the existing >>> startAppExactJvmOpts(). >>> >> I've chosen the most popular usage, which was >> Utils.appendTestJavaOpts. But I agree, that it would be better to >> change it to prepend. Thanks for pointing to this. >> >> I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs() >> to don't complicate all things. I think that startApp() should be >> used in the cases when test vm options really shouldn't interfere >> with user-provided options or overwrite them. So basically the >> behavior is the same as for >> ProcessTools.createJavaProcessBuilder(true, ...) and jtreg itself. >> > Ok. >> >>> Is ClhsdbFindPC correct. It used to use just use -Xcomp or -Xint, >>> ignoring any default test opts. You've fixed it to include the >>> default test opts, but the are appended, possibly overriding the >>> -Xcomp or -Xint. Don't we want the default test opts prepended? Same >>> for ClhsdbJstack. >> >> The idea is to don't mix Xcomp and Xmixed/Xint using requires filter. >> However ClhsdbFindPC might override Xint with Xmixed if it is set >> explicitly. Switching to prepending will fix it. > Yes, that's what I was thinking and one reason I thought that should > be default behavior. > > thanks, > > Chris >> >> Leonid >> >>> >>> thanks, >>> >>> Chris >>> >>> On 3/25/20 2:31 PM, Leonid Mesnik wrote: >>>> >>>> Igor, Stefan, Ioi >>>> >>>> Thank you for your feedback. 
>>>> >>>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change >>>> @run main... to @run driver. >>>> >>>> Test ClhsdbJstack.java is updated. >>>> >>>> Still waiting for review from SVC team. >>>> >>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/ >>>> >>>> Leonid >>>> >>>> On 3/25/20 12:46 PM, Igor Ignatyev wrote: >>>>> Hi Leonid, >>>>> >>>>> not related related to your patch (but yet somewhat made more >>>>> obvious by it), it seems all (or at least almost all) the tests >>>>> which use?LingeredApp should be run in "driver" mode as they just >>>>> orchestrate execution of other JVMs, so running them w/ main (let >>>>> alone main/othervm) just wastes time, >>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for >>>>> example, will now executed w/ Xcomp which will make it very slow >>>>> for no reasons. since you already got your hands dirty w/ these >>>>> tests, could you please file an RFE to sort this out and list all >>>>> the affected tests there? >>>>> >>>>> re: the patch, could you please update ClhsdbJstack.java test not >>>>> to be run w/ Xcomp and follow the same pattern you used in other >>>>> tests (e.g.?ClhsdbScanOops) ? other than that it looks fine to me, >>>>> I however wouldn't be able to tell if all svc tests continue to do >>>>> that they were supposed to, so I'd prefer for someone from svc >>>>> team to?chime in. >>>>> >>>>> Thanks, >>>>> -- Igor >>>>> >>>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik >>>>>> > wrote: >>>>>> >>>>>> Added Ioi, who also proposed new version of startAppVmOpts. >>>>>> >>>>>> Please find new webrev: >>>>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ >>>>>> >>>>>> Renamed startAppVmOpts/runAppVmOpts to >>>>>> "startAppExactJvmOpts/runAppExactJvmOpts" is used. It should make >>>>>> very clear that this method doesn't use any of test.java.opts, >>>>>> test.vm.opts. 
>>>>>> >>>>>> Also, I fixed >>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java metnioned >>>>>> by Igor, and removed null pointer check as Ioi suggested in >>>>>> startApp method. >>>>>> >>>>>> + public static void startApp(LingeredApp theApp, String... >>>>>> additionalJvmOpts) throws IOException { >>>>>> + startAppExactJvmOpts(theApp, >>>>>> Utils.appendTestJavaOpts(additionalJvmOpts)); >>>>>> + } >>>>>> >>>>>> Leonid >>>>>> >>>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote: >>>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote: >>>>>>>> Hi Leonid, >>>>>>>> >>>>>>>> I have briefly looked at the patch, a few comments so far: >>>>>>>> >>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>>>>>>> ? - at L#114, could you please call static method using class >>>>>>>> name (as the opposite of using instance)? or was it meant to be >>>>>>>> theApp.runAppVmOpts(vmArgs) ? >>>>>>>> >>>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) >>>>>>>> isn't correct >>>>>>>> - I don't like startAppVmOpts name, but unfortunately don't >>>>>>>> have a better suggestion (yet) >>>>>>> >>>>>>> I was going to say the same. Jtreg has the concept of "java >>>>>>> options" and "vm options". We have had a fair share of bugs and >>>>>>> wasted time when tests have been using the "vm options" part >>>>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away >>>>>>> from using that way to pass options. I recently cleaned up some >>>>>>> of this with: >>>>>>> >>>>>>> 8237111: LingeredApp should be started with getTestJavaOpts >>>>>>> >>>>>>> Because of this, I would prefer if we used a name that doesn't >>>>>>> include "VmOpts", because it's too alike the other concept. 
Some
>>>>>>> suggestions:
>>>>>>>  startAppJavaOptions
>>>>>>>  startAppUsingJavaOptions
>>>>>>>  startAppWithJavaOptions
>>>>>>>  startAppExactJavaOptions
>>>>>>>  startAppJvmOptions
>>>>>>>
>>>>>>> Thanks,
>>>>>>> StefanK
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -- Igor
>>>>>>>>
>>>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> Could you please review following fix which change LingeredApp
>>>>>>>>> to prepend vm options to java/vm.test.opts when startApp is
>>>>>>>>> used and provide startAppVmOpts to override options completely.
>>>>>>>>>
>>>>>>>>> The intention is to avoid issue like in this bug where
>>>>>>>>> test/jtreg options were ignored by tests. Also I fixed some
>>>>>>>>> tests where intention was to append vm options rather than to
>>>>>>>>> override them.
>>>>>>>>>
>>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>>>
>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>>>
>>>>>>>>> Leonid
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>>
>
>

From mandy.chung at oracle.com Tue Mar 31 19:25:53 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 31 Mar 2020 12:25:53 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <940c6907-612e-8744-376c-5362991d4a42@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com> <940c6907-612e-8744-376c-5362991d4a42@oracle.com>
Message-ID:

Alex's feedback: rename isHiddenClass to isHidden, as it can be a hidden class or interface.

`isLocalClass` and `isAnonymousClass` are correct because the Java language only has local classes and anonymous classes, not local interfaces or anonymous interfaces. `isHidden` is like `isSynthetic`: it could be a class or an interface.

Although isHiddenClass seems clearer, I'm okay to rename it to `isHidden`.

Mandy

From chris.plummer at oracle.com Tue Mar 31 20:32:34 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 31 Mar 2020 13:32:34 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified
In-Reply-To:
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com> <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com> <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>
Message-ID: <8d0cfc5c-f622-2ac9-aecc-8d398f6d3f2e@oracle.com>

On 3/31/20 12:09 PM, Leonid Mesnik wrote:
> Hi
>
> On 3/30/20 9:43 PM, Chris Plummer wrote:
>> Hi Leonid,
>>
>> On 3/30/20 5:42 PM, Leonid Mesnik wrote:
>>> Hi
>>>
>>> See my comments inline. I will update webrev after go through all
>>> your comments.
>>>
>>>
>>> On 3/30/20 11:39 AM, Chris Plummer wrote:
>>>> Hi Leonid,
>>>>
>>>> I haven't gone through all the tests yet. I've accumulated enough
>>>> questions that I'd like to see them answered or addressed before I
>>>> continue on.
>>>> >>>> This isn't directly related to your changes, but I noticed that >>>> users of JDKToolLauncher do nothing to make sure that default test >>>> options are used. This means we are never running these tools with >>>> the test options being specified with the jtreg run. Is that a bug >>>> or intentional? >>> >>> Which "default test options" do you mean? We have 2 properties to >>> set JVM options. The idea is to pass test.vm.opts to ALL java >>> processes and test.java.opts? to only tested processes if >>> applicable. Usually, for example we don't want to run jcmd with >>> -Xcomp. test.vm.opts was used (a long time ago) for options like >>> '-d32/-d64' on Solaris where JVM don't start without choosing >>> correct version. Also, it is used to reduce maximum heap for all JVM >>> instances when tests are running concurrently. >>> >>> So, probably test.vm.opts (or test.vm.tools.opts) should be added by >>> JDKToolLauncher but not test.java.opts. It is separate topic, there >>> are a lot of launchers which ignore test.vm.opts now. >> I always get confused about which set of options these properties >> represent, but basically I'm suggesting that if for example we are >> doing a -Xcomp run in mach5, JDKToolLauncher (at least in some cases) >> should be launched with this option. I think this is what you get >> from Utils.getTestJavaOpts(),. >> >> For example the SA tests use >> JDKToolLauncher.createUsingTestJDK("jhsdb"). jhsdb is what is really >> being tested here, and it should be launched with the test vm >> options. Currently we launch the target process with these options, >> which is probably also a good idea.? Also we aren't too concerned >> with the options that the test itself is run with, although I'm >> guessing they also get run with the test java opts. 
>> So we have 3 processes here:
>>  - jhsdb, which should be getting test java opts but is not
>>  - the target process, which should be getting test java opts and currently is
>>  - the test itself, where options don't really matter, but is getting passed test java opts
>>
>> However, you could argue that tests like jinfo, jstack, and jcmd, all
>> of which use the Attach API and the bulk of the work is done on the
>> target process, are not that concerned with the options passed to the
>> command, but do want the options passed to the target process.
>
> Well, it is a good question if we want to run jhsdb tool itself with
> additional slow options like Xcomp. Does it help us to improve
> coverage? IIRC the original idea of adding test.java/vm.opts was to
> don't waste time executing javac and debuggers in slow mode on SPARC.
>
> Anyway, it is a separate question which is out of scope of this
> change. We might want to review all debugger/debuggee tests to find
> better way to deal with this.
Might be good to get an RFE filed for this.
>
>>>
>>>>
>>>> In the problem lists, is it necessary to list the test multiple
>>>> times with #id0, #id1, etc, or could you list it just once and
>>>> leave that part off. It seems very error prone. Also, changing
>>>> tests like ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split
>>>> out the testing in this manner seems completely unrelated to this
>>>> CR, especially when the tests do not even contain any changes
>>>> related to the CR.
>>>
>>> I think, that these changes are related. The startApp(...) was
>>> updated so some test combinations become invalid or redundant.
>>>
>>> ClhsdbFindPC and ClhsdbJstack were always run twice. Now, when test
>>> options passed in test it is not needed to run it twice when Xcomp
>>> is already set by user.
>>>
>> Ok. I see now that the second test run, which is the non -Xcomp run,
>> adds '@requires vm.compMode != "Xcomp"'. But this also is strange.
>> The first test run, which does not have the @requires and is the one
>> that makes LingeredApp launch with -Xcomp, will always run whether or
>> not it is an -Xcomp test run. So it will run as part of a regular
>> test run and as part of a -Xcomp test run. The only difference
>> between the two is the -Xcomp run will also run the test with -Xcomp,
>> but that's not really needed (I think it will also end up passing
>> -Xcomp to the target process twice). Perhaps '@requires vm.compMode
>> == "Xcomp"' should be used for the first test run, but that means it
>> no longer gets run until later tiers when we use -Xcomp. Why not
>> revert it back to a single test, but also add '@requires vm.compMode
>> != "Xcomp"'. Then it gets run both ways in an early tier and not run
>> during the -Xcomp run, which isn't really needed.
>
> There are several flags which are executed with Xcomp only:
> "-XX:-DoEscapeAnalysis", "-XX:-UseBiasedLocking",
> "-XX:+DeoptimizeALot" where this test is going to be skipped. So we
> never run test with these options.
>
> The original idea is to run test with given options and with added
> Xcomp. I left logic the same and only skip run with "Xcomp" when it
> is set already by user. I agree that we have some duplication here and
> it could be improved, but it could be done separately. If you are ok
> with this let me file separate RFE for this.
Ok.
>
>>
>>> ClhsdbScanOops is fixed to don't allow to run incompatible GC
>>> combination.
>> Ok
>>>
>>> So I should update these tests by splitting them or change them to
>>> startAppExactJvmOpts() if we want to continue to ignore user-given
>>> test options.
>> I don't think I was suggesting removing user-given test options. I
>> don't see why you would.
>
> I just wanted to say that these tests are affected by my changes and
> should be fixed anyway.
Ok.

So I think the one change you agreed to make is have the default be to append test vm opts rather than prepend them.
Let me know when you have a new webrev.

thanks,

Chris
>
> Leonid
>
>>>
>>> It seems that #idN are required by jtreg now, otherwise it just run
>>> test.
>> Ok.
>>>
>>>>
>>>>  426     public static LingeredApp startApp(String... additionalJvmOpts) throws IOException {
>>>>
>>>> The default test opts are appended to additionalJvmOpts, and if you
>>>> want prepended you need to call Utils.prependTestJavaOpts(). I
>>>> would have thought the opposite would be more desirable and
>>>> expected default behavior. Why did you choose this way? I also find
>>>> it somewhat confusing that there is even a default mode for where
>>>> the additionalJvmOpts go. Maybe it would be best to have
>>>> startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make
>>>> it explicit. This would also be in line with the existing
>>>> startAppExactJvmOpts().
>>>>
>>> I've chosen the most popular usage, which was
>>> Utils.appendTestJavaOpts. But I agree, that it would be better to
>>> change it to prepend. Thanks for pointing to this.
>>>
>>> I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs()
>>> to don't complicate all things. I think that startApp() should be
>>> used in the cases when test vm options really shouldn't interfere
>>> with user-provided options or overwrite them. So basically the
>>> behavior is the same as for
>>> ProcessTools.createJavaProcessBuilder(true, ...) and jtreg itself.
>>>
>> Ok.
>>>
>>>> Is ClhsdbFindPC correct? It used to just use -Xcomp or -Xint,
>>>> ignoring any default test opts. You've fixed it to include the
>>>> default test opts, but they are appended, possibly overriding the
>>>> -Xcomp or -Xint. Don't we want the default test opts prepended?
>>>> Same for ClhsdbJstack.
>>>
>>> The idea is to don't mix Xcomp and Xmixed/Xint using requires
>>> filter. However ClhsdbFindPC might override Xint with Xmixed if it
>>> is set explicitly. Switching to prepending will fix it.
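
[Illustration, not part of the original messages.] The ordering question discussed above can be sketched in a few lines of Java. This is only a rough sketch under stated assumptions: the helper names `prependDefaults`, `appendDefaults` and `effectiveMode` are hypothetical and are not the `Utils` or `LingeredApp` API, and the "last execution-mode flag on the command line wins" rule is assumed here because it is the behavior the thread is worried about (default options placed after `-Xint` overriding it).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helpers, for illustration only; not the jdk.test.lib API.
class OptionOrderSketch {

    /** Default (jtreg-supplied) options first, then the test's own options. */
    static List<String> prependDefaults(List<String> defaults, String... testOpts) {
        List<String> all = new ArrayList<>(defaults);
        all.addAll(Arrays.asList(testOpts));
        return all;
    }

    /** The test's own options first, then the default (jtreg-supplied) options. */
    static List<String> appendDefaults(List<String> defaults, String... testOpts) {
        List<String> all = new ArrayList<>(Arrays.asList(testOpts));
        all.addAll(defaults);
        return all;
    }

    /** Assumption: among -Xint, -Xcomp and -Xmixed, the last occurrence takes effect. */
    static String effectiveMode(List<String> commandLine) {
        String mode = "-Xmixed"; // assumed default when no mode flag is given
        for (String opt : commandLine) {
            if (opt.equals("-Xint") || opt.equals("-Xcomp") || opt.equals("-Xmixed")) {
                mode = opt;
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        // Pretend the user ran the test suite with an explicit -Xmixed.
        List<String> defaults = Arrays.asList("-Xmixed", "-Xmx256m");

        // Defaults appended: they come last and override the test's -Xint.
        System.out.println(effectiveMode(appendDefaults(defaults, "-Xint")));  // -Xmixed

        // Defaults prepended: the test's -Xint comes last and is kept.
        System.out.println(effectiveMode(prependDefaults(defaults, "-Xint"))); // -Xint
    }
}
```

Under these assumptions, a test that must run its target in a specific mode keeps that mode only when the default options are placed before its own, which matches the reasoning given for making prepending the default.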
>> Yes, that's what I was thinking and one reason I thought that should
>> be default behavior.
>>
>> thanks,
>>
>> Chris
>>>
>>> Leonid
>>>
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 3/25/20 2:31 PM, Leonid Mesnik wrote:
>>>>>
>>>>> Igor, Stefan, Ioi
>>>>>
>>>>> Thank you for your feedback.
>>>>>
>>>>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change
>>>>> @run main... to @run driver.
>>>>>
>>>>> Test ClhsdbJstack.java is updated.
>>>>>
>>>>> Still waiting for review from SVC team.
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
>>>>>
>>>>> Leonid
>>>>>
>>>>> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>>>>>> Hi Leonid,
>>>>>>
>>>>>> not related to your patch (but yet somewhat made more
>>>>>> obvious by it), it seems all (or at least almost all) the tests
>>>>>> which use LingeredApp should be run in "driver" mode as they just
>>>>>> orchestrate execution of other JVMs, so running them w/ main (let
>>>>>> alone main/othervm) just wastes time,
>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for
>>>>>> example, will now be executed w/ Xcomp which will make it very slow
>>>>>> for no reasons. since you already got your hands dirty w/ these
>>>>>> tests, could you please file an RFE to sort this out and list all
>>>>>> the affected tests there?
>>>>>>
>>>>>> re: the patch, could you please update ClhsdbJstack.java test not
>>>>>> to be run w/ Xcomp and follow the same pattern you used in other
>>>>>> tests (e.g. ClhsdbScanOops) ? other than that it looks fine to
>>>>>> me, I however wouldn't be able to tell if all svc tests continue
>>>>>> to do what they were supposed to, so I'd prefer for someone from
>>>>>> svc team to chime in.
>>>>>>
>>>>>> Thanks,
>>>>>> -- Igor
>>>>>>
>>>>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik wrote:
>>>>>>>
>>>>>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>>>>>>
>>>>>>> Please find new webrev:
>>>>>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>>>>>>
>>>>>>> Renamed startAppVmOpts/runAppVmOpts to
>>>>>>> "startAppExactJvmOpts/runAppExactJvmOpts". It should
>>>>>>> make very clear that this method doesn't use any of
>>>>>>> test.java.opts, test.vm.opts.
>>>>>>>
>>>>>>> Also, I fixed
>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java mentioned
>>>>>>> by Igor, and removed null pointer check as Ioi suggested in
>>>>>>> startApp method.
>>>>>>>
>>>>>>> + public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
>>>>>>> +     startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
>>>>>>> + }
>>>>>>>
>>>>>>> Leonid
>>>>>>>
>>>>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>>>>>>> Hi Leonid,
>>>>>>>>>
>>>>>>>>> I have briefly looked at the patch, a few comments so far:
>>>>>>>>>
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>>>>>>   - at L#114, could you please call static method using class
>>>>>>>>> name (as the opposite of using instance)? or was it meant to
>>>>>>>>> be theApp.runAppVmOpts(vmArgs) ?
>>>>>>>>>
>>>>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>>>>>>> - it seems that code indent of startApp(LingeredApp, String[])
>>>>>>>>> isn't correct
>>>>>>>>> - I don't like startAppVmOpts name, but unfortunately don't
>>>>>>>>> have a better suggestion (yet)
>>>>>>>>
>>>>>>>> I was going to say the same. Jtreg has the concept of "java
>>>>>>>> options" and "vm options". We have had a fair share of bugs and
>>>>>>>> wasted time when tests have been using the "vm options" part
>>>>>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away
>>>>>>>> from using that way to pass options.
>>>>>>>> I recently cleaned up some
>>>>>>>> of this with:
>>>>>>>>
>>>>>>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>>>>>>
>>>>>>>> Because of this, I would prefer if we used a name that doesn't
>>>>>>>> include "VmOpts", because it's too alike the other concept.
>>>>>>>> Some suggestions:
>>>>>>>>  startAppJavaOptions
>>>>>>>>  startAppUsingJavaOptions
>>>>>>>>  startAppWithJavaOptions
>>>>>>>>  startAppExactJavaOptions
>>>>>>>>  startAppJvmOptions
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> StefanK
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -- Igor
>>>>>>>>>
>>>>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik wrote:
>>>>>>>>>>
>>>>>>>>>> Hi
>>>>>>>>>>
>>>>>>>>>> Could you please review following fix which change
>>>>>>>>>> LingeredApp to prepend vm options to java/vm.test.opts when
>>>>>>>>>> startApp is used and provide startAppVmOpts to override
>>>>>>>>>> options completely.
>>>>>>>>>>
>>>>>>>>>> The intention is to avoid issue like in this bug where
>>>>>>>>>> test/jtreg options were ignored by tests. Also I fixed some
>>>>>>>>>> tests where intention was to append vm options rather than to
>>>>>>>>>> override them.
>>>>>>>>>>
>>>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>>>>
>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>>>>
>>>>>>>>>> Leonid
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>>
>>
>>

From coleen.phillimore at oracle.com Tue Mar 31 20:32:45 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Mar 2020 16:32:45 -0400
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To:
References: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com> <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com>
Message-ID: <12e28226-136b-3391-ca01-e9e04058a2a8@oracle.com>

On 3/31/20 12:19 PM, Poonam Parhar wrote:
> Hello Coleen,
>
> Does the removal of this code only impact the 'reattach'
> functionality, and it does not affect any commands available in
> 'clhsdb' once it is attached to a core file? If that's true, then I
> think it should be okay to remove this code.

Hi Poonam, Thank you for answering. Yes, this patch only removes the reattach functionality. I tried out the other clhsdb commands from your wiki page, and they worked fine, including object and heap inspection.

Thanks,
Coleen
>
> Thanks,
> Poonam
>
> On 3/31/20 5:34 AM, coleen.phillimore at oracle.com wrote:
>>
>> To answer my own question, this functionality is used to allow
>> detach/reattach from {cl}hsdb. Which seems to work on linux but not
>> windows with this code removed.
>>
>> The next question is whether this is useful functionality to justify
>> all this code (900+ and this new code that Magnus has added). Can't
>> you just exit and restart the clhsdb process on the core file or
>> process?
>>
>> For the record, this is me playing with python to remove this code.
>>
>> http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html
>>
>> Thanks,
>> Coleen
>>
>> On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> I was wondering why this is needed when debugging a core file, which
>>> is the key thing we need the SA for:
>>>
>>>   /** This is used by both the debugger and any runtime system. It is
>>>       the basic mechanism by which classes which mimic underlying VM
>>>       functionality cause themselves to be initialized. The given
>>>       observer will be notified (with arguments (null, null)) when the
>>>       VM is re-initialized, as well as when it registers itself with
>>>       the VM. */
>>>   public static void registerVMInitializedObserver(Observer o) {
>>>     vmInitializedObservers.add(o);
>>>     o.update(null, null);
>>>   }
>>>
>>> It seems like if it isn't needed, we shouldn't add these classes and
>>> remove their use.
>>>
>>> Coleen
>>>
>>> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
>>>> No opinions on this?
>>>>
>>>> /Magnus
>>>>
>>>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>>>>> Hi everyone,
>>>>>
>>>>> As a follow-up to the ongoing review for JDK-8241618, I have also
>>>>> looked at fixing the deprecation warnings in jdk.hotspot.agent.
>>>>> These fall in three broad categories:
>>>>>
>>>>> * Deprecation of the boxing type constructors (e.g. "new
>>>>> Integer(42)").
>>>>>
>>>>> * Deprecation of java.util.Observer and Observable.
>>>>>
>>>>> * The rest (mostly Class.newInstance(), and a few number of other
>>>>> odd deprecations)
>>>>>
>>>>> The first category is trivial to fix. The last category need some
>>>>> special discussion. But the overwhelming majority of deprecation
>>>>> warnings come from the use of Observer and Observable. This really
>>>>> dwarfs anything else, and needs to be handled first, otherwise
>>>>> it's hard to even spot the other issues.
>>>>>
>>>>> My analysis of the situation is that the deprecation of Observer
>>>>> and Observable seems a bit harsh, from the PoV of
>>>>> jdk.hotspot.agent. Sure, it might be limited, but I think it does
>>>>> exactly what is needed here. So the migration suggested in
>>>>> Observable (java.beans or java.util.concurrent) seems overkill. If
>>>>> there are genuine threading issues at play here, this assumption
>>>>> might be wrong, and then maybe going the j.u.c. route is correct.
>>>>>
>>>>> But if that's not the case, the main goal should be to stay with the
>>>>> current implementation. One way to do this is to sprinkle the code
>>>>> with @SuppressWarnings. But I think a better way would be to just
>>>>> implement our own Observer and Observable. After all, the classes
>>>>> are trivial.
>>>>>
>>>>> I've made a mock-up of this solution, where I just copied the
>>>>> java.util.Observer and Observable, and removed the deprecation
>>>>> annotations. The only thing needed for the rest of the code is to
>>>>> make sure we import these; I've done this for three arbitrarily
>>>>> selected classes just to show what the change would typically look
>>>>> like. Here's the mock-up:
>>>>>
>>>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>>>>
>>>>> Let me know what you think.
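
[Illustration, not part of the original messages.] To give a feel for how small a local Observer/Observable pair can be, here is a minimal sketch. It is not the code in the webrev linked above: the names `LocalObserver`, `LocalObservable` and `VmEventSource` are hypothetical, and unlike a straight copy of `java.util.Observable` this version notifies observers in registration order and omits the less commonly used methods.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the deprecated java.util.Observer. */
interface LocalObserver {
    void update(LocalObservable source, Object arg);
}

/** Hypothetical stand-in for the deprecated java.util.Observable. */
class LocalObservable {
    private final List<LocalObserver> observers = new ArrayList<>();
    private boolean changed;

    public synchronized void addObserver(LocalObserver o) {
        if (o == null) {
            throw new NullPointerException("observer must not be null");
        }
        if (!observers.contains(o)) {
            observers.add(o);
        }
    }

    public synchronized void deleteObserver(LocalObserver o) {
        observers.remove(o);
    }

    protected synchronized void setChanged() {
        changed = true;
    }

    public synchronized boolean hasChanged() {
        return changed;
    }

    /** Notifies observers only if setChanged() was called since the last notification. */
    public void notifyObservers(Object arg) {
        LocalObserver[] snapshot;
        synchronized (this) {
            if (!changed) {
                return;
            }
            snapshot = observers.toArray(new LocalObserver[0]);
            changed = false;
        }
        // Call observers outside the lock so they may add or remove observers safely.
        for (LocalObserver o : snapshot) {
            o.update(this, arg);
        }
    }
}

/** Hypothetical subject used only to exercise the sketch. */
class VmEventSource extends LocalObservable {
    public void fire(Object arg) {
        setChanged();
        notifyObservers(arg);
    }
}

class LocalObserverDemo {
    public static void main(String[] args) {
        VmEventSource source = new VmEventSource();
        source.addObserver((src, arg) -> System.out.println("notified with " + arg));
        source.fire("vm-initialized");
    }
}
```

Client code would then only need to import the local types instead of the `java.util` ones, which is the kind of change the mock-up describes.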
>>>>>
>>>>> /Magnus
>>>>
>>>
>>
>

From chris.plummer at oracle.com Tue Mar 31 20:55:57 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 31 Mar 2020 13:55:57 -0700
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <12e28226-136b-3391-ca01-e9e04058a2a8@oracle.com>
References: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com> <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com> <12e28226-136b-3391-ca01-e9e04058a2a8@oracle.com>
Message-ID: <71739959-0aaf-c973-10d2-36f4295ddc37@oracle.com>

On 3/31/20 1:32 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 3/31/20 12:19 PM, Poonam Parhar wrote:
>> Hello Coleen,
>>
>> Does the removal of this code only impact the 'reattach'
>> functionality, and it does not affect any commands available in
>> 'clhsdb' once it is attached to a core file? If that's true, then I
>> think it should be okay to remove this code.
>
> Hi Poonam, Thank you for answering. Yes, this patch only removes the
> reattach functionality. I tried out the other clhsdb commands from
> your wiki page, and they worked fine, including object and heap
> inspection.
I'm trying to understand exactly when all these static initializers are triggered. Is it only after you do an attach?

The implementation of clhsdb reattach is exactly the same as doing a detach followed by an attach to the same process. I'm not sure how much value it has, but I think in general the removal of this code means you can't detach and then attach to anything, even a different pid. So "detach" might as well become "detach-and-exit", because your clhsdb session is dead once you detach. Do we really want to do this?

Chris
>
> Thanks,
> Coleen
>>
>> Thanks,
>> Poonam
>>
>> On 3/31/20 5:34 AM, coleen.phillimore at oracle.com wrote:
>>>
>>> To answer my own question, this functionality is used to allow
>>> detach/reattach from {cl}hsdb. Which seems to work on linux but not
>>> windows with this code removed.
>>>
>>> The next question is whether this is useful functionality to justify
>>> all this code (900+ and this new code that Magnus has added). Can't
>>> you just exit and restart the clhsdb process on the core file or
>>> process?
>>>
>>> For the record, this is me playing with python to remove this code.
>>>
>>> http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html
>>>
>>> Thanks,
>>> Coleen
>>>
>>> On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> I was wondering why this is needed when debugging a core file,
>>>> which is the key thing we need the SA for:
>>>>
>>>>   /** This is used by both the debugger and any runtime system. It is
>>>>       the basic mechanism by which classes which mimic underlying VM
>>>>       functionality cause themselves to be initialized. The given
>>>>       observer will be notified (with arguments (null, null)) when the
>>>>       VM is re-initialized, as well as when it registers itself with
>>>>       the VM. */
>>>>   public static void registerVMInitializedObserver(Observer o) {
>>>>     vmInitializedObservers.add(o);
>>>>     o.update(null, null);
>>>>   }
>>>>
>>>> It seems like if it isn't needed, we shouldn't add these classes
>>>> and remove their use.
>>>>
>>>> Coleen
>>>>
>>>> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
>>>>> No opinions on this?
>>>>>
>>>>> /Magnus
>>>>>
>>>>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>>>>>> Hi everyone,
>>>>>>
>>>>>> As a follow-up to the ongoing review for JDK-8241618, I have also
>>>>>> looked at fixing the deprecation warnings in jdk.hotspot.agent.
>>>>>> These fall in three broad categories:
>>>>>>
>>>>>> * Deprecation of the boxing type constructors (e.g. "new
>>>>>> Integer(42)").
>>>>>>
>>>>>> * Deprecation of java.util.Observer and Observable.
>>>>>>
>>>>>> * The rest (mostly Class.newInstance(), and a few number of other
>>>>>> odd deprecations)
>>>>>>
>>>>>> The first category is trivial to fix.
>>>>>> The last category need some
>>>>>> special discussion. But the overwhelming majority of deprecation
>>>>>> warnings come from the use of Observer and Observable. This
>>>>>> really dwarfs anything else, and needs to be handled first,
>>>>>> otherwise it's hard to even spot the other issues.
>>>>>>
>>>>>> My analysis of the situation is that the deprecation of Observer
>>>>>> and Observable seems a bit harsh, from the PoV of
>>>>>> jdk.hotspot.agent. Sure, it might be limited, but I think it does
>>>>>> exactly what is needed here. So the migration suggested in
>>>>>> Observable (java.beans or java.util.concurrent) seems overkill.
>>>>>> If there are genuine threading issues at play here, this
>>>>>> assumption might be wrong, and then maybe going the j.u.c. route
>>>>>> is correct.
>>>>>>
>>>>>> But if that's not the case, the main goal should be to stay with the
>>>>>> current implementation. One way to do this is to sprinkle the
>>>>>> code with @SuppressWarnings. But I think a better way would be to
>>>>>> just implement our own Observer and Observable. After all, the
>>>>>> classes are trivial.
>>>>>>
>>>>>> I've made a mock-up of this solution, where I just copied the
>>>>>> java.util.Observer and Observable, and removed the deprecation
>>>>>> annotations. The only thing needed for the rest of the code is to
>>>>>> make sure we import these; I've done this for three arbitrarily
>>>>>> selected classes just to show what the change would typically
>>>>>> look like. Here's the mock-up:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>>>>>
>>>>>> Let me know what you think.
>>>>>>
>>>>>> /Magnus
>>>>>
>>>>
>>>
>>
>

From coleen.phillimore at oracle.com Tue Mar 31 21:20:19 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Mar 2020 17:20:19 -0400
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <71739959-0aaf-c973-10d2-36f4295ddc37@oracle.com>
References: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com> <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com> <12e28226-136b-3391-ca01-e9e04058a2a8@oracle.com> <71739959-0aaf-c973-10d2-36f4295ddc37@oracle.com>
Message-ID: <1cc556ce-67ed-1e6f-ee53-36d8227d0e1e@oracle.com>

On 3/31/20 4:55 PM, Chris Plummer wrote:
> On 3/31/20 1:32 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 3/31/20 12:19 PM, Poonam Parhar wrote:
>>> Hello Coleen,
>>>
>>> Does the removal of this code only impact the 'reattach'
>>> functionality, and it does not affect any commands available in
>>> 'clhsdb' once it is attached to a core file? If that's true, then I
>>> think it should be okay to remove this code.
>>
>> Hi Poonam, Thank you for answering. Yes, this patch only removes the
>> reattach functionality. I tried out the other clhsdb commands from
>> your wiki page, and they worked fine, including object and heap
>> inspection.
> I'm trying to understand exactly when all these static initializers are
> triggered. Is it only after you do an attach?
>
> The implementation of clhsdb reattach is exactly the same as doing a
> detach followed by an attach to the same process. I'm not sure how
> much value it has, but I think in general the removal of this code
> means you can't detach and then attach to anything, even a different
> pid. So "detach" might as well become "detach-and-exit", because your
> clhsdb session is dead once you detach. Do we really want to do this?

Well, that was my question. It seems like you could just exit and start up jhsdb again and that's more like something someone would do just as easily.
Given the use cases that we've seen from sustaining, this appears to be unneeded functionality. The original mail was proposing adding more code to work around the deprecation messages. It seems like more code should not be added for something that is unused.

thanks,
Coleen
>
> Chris
>>
>> Thanks,
>> Coleen
>>>
>>> Thanks,
>>> Poonam
>>>
>>> On 3/31/20 5:34 AM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> To answer my own question, this functionality is used to allow
>>>> detach/reattach from {cl}hsdb. Which seems to work on linux but
>>>> not windows with this code removed.
>>>>
>>>> The next question is whether this is useful functionality to
>>>> justify all this code (900+ and this new code that Magnus has
>>>> added). Can't you just exit and restart the clhsdb process on the
>>>> core file or process?
>>>>
>>>> For the record, this is me playing with python to remove this code.
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> I was wondering why this is needed when debugging a core file,
>>>>> which is the key thing we need the SA for:
>>>>>
>>>>>   /** This is used by both the debugger and any runtime system. It is
>>>>>       the basic mechanism by which classes which mimic underlying VM
>>>>>       functionality cause themselves to be initialized. The given
>>>>>       observer will be notified (with arguments (null, null)) when
>>>>>       the VM is re-initialized, as well as when it registers itself
>>>>>       with the VM. */
>>>>>   public static void registerVMInitializedObserver(Observer o) {
>>>>>     vmInitializedObservers.add(o);
>>>>>     o.update(null, null);
>>>>>   }
>>>>>
>>>>> It seems like if it isn't needed, we shouldn't add these classes
>>>>> and remove their use.
>>>>>
>>>>> Coleen
>>>>>
>>>>> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
>>>>>> No opinions on this?
>>>>>> >>>>>> /Magnus >>>>>> >>>>>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote: >>>>>>> Hi everyone, >>>>>>> >>>>>>> As a follow-up to the ongoing review for JDK-8241618, I have >>>>>>> also looked at fixing the deprecation warnings in >>>>>>> jdk.hotspot.agent. These fall in three broad categories: >>>>>>> >>>>>>> * Deprecation of the boxing type constructors (e.g. "new >>>>>>> Integer(42)"). >>>>>>> >>>>>>> * Deprecation of java.util.Observer and Observable. >>>>>>> >>>>>>> * The rest (mostly Class.newInstance(), and a few number of >>>>>>> other odd deprecations) >>>>>>> >>>>>>> The first category is trivial to fix. The last category need >>>>>>> some special discussion. But the overwhelming majority of >>>>>>> deprecation warnings come from the use of Observer and >>>>>>> Observable. This really dwarfs anything else, and needs to be >>>>>>> handled first, otherwise it's hard to even spot the other issues. >>>>>>> >>>>>>> My analysis of the situation is that the deprecation of Observer >>>>>>> and Observable seems a bit harsh, from the PoV of >>>>>>> jdk.hotspot.agent. Sure, it might be limited, but I think it >>>>>>> does exactly what is needed here. So the migration suggested in >>>>>>> Observable (java.beans or java.util.concurrent) seems overkill. >>>>>>> If there are genuine threading issues at play here, this >>>>>>> assumption might be wrong, and then maybe going the j.u.c. route >>>>>>> is correct. >>>>>>> >>>>>>> But if that's not, the main goal should be to stay with the >>>>>>> current implementation. One way to do this is to sprinkle the >>>>>>> code with @SuppressWarning. But I think a better way would be to >>>>>>> just implement our own Observer and Observable. After all, the >>>>>>> classes are trivial. >>>>>>> >>>>>>> I've made a mock-up of this solution, were I just copied the >>>>>>> java.util.Observer and Observable, and removed the deprecation >>>>>>> annotations. 
The only thing needed for the rest of the code is >>>>>>> to make sure we import these; I've done this for three >>>>>>> arbitrarily selected classes just to show what the change would >>>>>>> typically look like. Here's the mock-up: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01 >>>>>>> >>>>>>> Let me know what you think. >>>>>>> >>>>>>> /Magnus >>>>>> >>>>> >>>> >>> >> > > From leonid.mesnik at oracle.com Tue Mar 31 23:12:32 2020 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Tue, 31 Mar 2020 16:12:32 -0700 Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the children process if vmArguments is already specified In-Reply-To: <8d0cfc5c-f622-2ac9-aecc-8d398f6d3f2e@oracle.com> References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com> <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com> <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com> <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com> <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com> <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com> <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com> <8d0cfc5c-f622-2ac9-aecc-8d398f6d3f2e@oracle.com> Message-ID: <8B98C7C4-C2BD-4E21-B79B-CDAD9C1C2E97@oracle.com> Here is new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.03/ The only difference is updated startApp() method and it's comments: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.03/test/lib/jdk/test/lib/apps/LingeredApp.java.udiff.html Leonid > On Mar 31, 2020, at 1:32 PM, Chris Plummer wrote: > > On 3/31/20 12:09 PM, Leonid Mesnik wrote: >> Hi >> >> On 3/30/20 9:43 PM, Chris Plummer wrote: >>> Hi Leonid, >>> >>> On 3/30/20 5:42 PM, Leonid Mesnik wrote: >>>> Hi >>>> >>>> See my comments inline. I will update webrev after go through all your comments. >>>> >>>> >>>> On 3/30/20 11:39 AM, Chris Plummer wrote: >>>>> Hi Leonid, >>>>> >>>>> I haven't gone through all the tests yet. 
I've accumulated enough questions that I'd like to see them answered or addressed before I continue on. >>>>> >>>>> This isn't directly related to your changes, but I noticed that users of JDKToolLauncher do nothing to make sure that default test options are used. This means we are never running these tools with the test options being specified with the jtreg run. Is that a bug or intentional? >>>> >>>> Which "default test options" do you mean? We have 2 properties to set JVM options. The idea is to pass test.vm.opts to ALL java processes and test.java.opts to only tested processes if applicable. Usually, for example we don't want to run jcmd with -Xcomp. test.vm.opts was used (a long time ago) for options like '-d32/-d64' on Solaris where JVM don't start without choosing correct version. Also, it is used to reduce maximum heap for all JVM instances when tests are running concurrently. >>>> >>>> So, probably test.vm.opts (or test.vm.tools.opts) should be added by JDKToolLauncher but not test.java.opts. It is separate topic, there are a lot of launchers which ignore test.vm.opts now. >>> I always get confused about which set of options these properties represent, but basically I'm suggesting that if for example we are doing a -Xcomp run in mach5, JDKToolLauncher (at least in some cases) should be launched with this option. I think this is what you get from Utils.getTestJavaOpts(),. >>> >>> For example the SA tests use JDKToolLauncher.createUsingTestJDK("jhsdb"). jhsdb is what is really being tested here, and it should be launched with the test vm options. Currently we launch the target process with these options, which is probably also a good idea. Also we aren't too concerned with the options that the test itself is run with, although I'm guessing they also get run with the test java opts. 
So we have 3 processes here: >>> - jhsdb, which should be getting test java opts but is not >>> - the target process, which should be getting test java opts and currently is >>> - the test itself, where options don't really matter, but is getting passed test java opts >>> >>> However, you could argue that tests like jinfo, jstack, and jcmd, all of which use the Attach API and the bulk of the work is done on the target process, are not that concerned with the options passed to the command, but do want the options passed to the target process. >> >> Well, it is a good question if we want to run jhsdb tool itself with additional slow options like Xcomp. Does it help us to improve coverage? IIRC the original idea of adding test.java/vm.opts was to don't waste time executing javac and debuggers in slow mode on SPARC. >> >> Anyway, it is a separate question which is out of scope of this change. We might want to review all debugger/debugee tests to find better way to deal with this. > Might be good to get an RFE filed for this. >> >>>> >>>>> >>>>> In the problem lists, is it necessary to list the test multiple times with #id0, #id1, etc, or could you list it just once and leave that part off. It seems very error prone. Also, changing tests like ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split out the testing in this manner seems completely unrelated to this CR, especially when the tests do not even contain any changes related to the CR. >>>> >>>> I think, that these chages are related. The startApp(...) was updated so some test combinations become invalid or redundant. >>>> >>>> ClhsdbFindPC and ClhsdbJstack were always run twice. Now, when test options passed in test it is not needed to run it twice when Xcomp is already set by user. >>>> >>> Ok. I see now that the second test run, which is the non -Xcomp run, adds '@requires vm.compMode != "Xcomp"'. But this also is strange. 
The first test run, which does not have the @requires and is the one that makes LingeredApp launch with -Xcomp, will always run whether or not it is an -Xcomp test run. So it will run as part of a regular test run and as part of a -Xcomp test run. The only difference between the two is the -Xcomp run will also run the test with -Xcomp, but that's not really needed (I think it will also end up passing -Xcomp to the target process twice). Perhaps '@requires vm.compMode == "Xcomp"' should be used for the first test run, but that means it no longer gets run until later tiers when we use -Xcomp. Why not revert it back to a single test, but also add '@requires vm.compMode != "Xcomp"'? Then it gets run both ways in an early tier and not run during the -Xcomp run, which isn't really needed. >> >> There are several flags which are executed with Xcomp only: "-XX:-DoEscapeAnalysis", "-XX:-UseBiasedLocking", "-XX:+DeoptimizeALot", where this test is going to be skipped. So we would never run the test with these options. >> >> The original idea is to run the test with the given options and with Xcomp added. I left the logic the same and only skip the run with "Xcomp" when it is already set by the user. I agree that we have some duplication here and it could be improved, but it could be done separately. If you are ok with this, let me file a separate RFE for it. > Ok. >> >>> >>>> ClhsdbScanOops is fixed to not allow running an incompatible GC combination. >>> Ok >>>> >>>> So I should update these tests by splitting them or change them to startAppExactJvmOpts() if we want to continue to ignore user-given test options. >>> I don't think I was suggesting removing user-given test options. I don't see why you would. >> >> I just wanted to say that these tests are affected by my changes and should be fixed anyway. > Ok. > > So I think the one change you agreed to make is to have the default be to append test vm opts rather than prepend them. Let me know when you have a new webrev.
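[Editor's sketch, not part of the original message.] Since the append/prepend wording in this thread is easy to misread, here is a minimal Java sketch of the two orderings. It assumes that when two conflicting mode flags such as -Xint and -Xmixed both appear on the java command line, the later one takes effect. The class, the methods and the TEST_JAVA_OPTS constant are hypothetical stand-ins for illustration; they are not the jdk.test.lib.Utils implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OptionOrderSketch {
    // Hypothetical stand-in for the options jtreg would pass to the test run.
    static final String[] TEST_JAVA_OPTS = {"-Xmixed"};

    // Test options first, the test's own options last.
    // Under the "later flag wins" assumption, the test's own options take precedence.
    static String[] prependTestOpts(String... userOpts) {
        List<String> all = new ArrayList<>(Arrays.asList(TEST_JAVA_OPTS));
        all.addAll(Arrays.asList(userOpts));
        return all.toArray(new String[0]);
    }

    // The test's own options first, test options last.
    // Under the same assumption, the jtreg-provided options take precedence.
    static String[] appendTestOpts(String... userOpts) {
        List<String> all = new ArrayList<>(Arrays.asList(userOpts));
        all.addAll(Arrays.asList(TEST_JAVA_OPTS));
        return all.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints [-Xmixed, -Xint]: the test's -Xint comes last
        System.out.println(Arrays.toString(prependTestOpts("-Xint")));
        // prints [-Xint, -Xmixed]: an explicit -Xmixed from the run comes last
        System.out.println(Arrays.toString(appendTestOpts("-Xint")));
    }
}
```

The second ordering is the case raised for ClhsdbFindPC, where a test that asks for -Xint could end up running in mixed mode if -Xmixed is given explicitly for the run.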
> > thanks, > > Chris >> >> Leonid >> >>>> >>>> It seems that #idN are required by jtreg now, otherwise it just runs the test. >>> Ok. >>>> >>>>> >>>>> 426 public static LingeredApp startApp(String... additionalJvmOpts) throws IOException { >>>>> >>>>> The default test opts are appended to additionalJvmOpts, and if you want prepended you need to call Utils.prependTestJavaOpts(). I would have thought the opposite would be more desirable and expected default behavior. Why did you choose this way? I also find it somewhat confusing that there is even a default mode for where the additionalJvmOpts go. Maybe it would be best to have startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make it explicit. This would also be in line with the existing startAppExactJvmOpts(). >>>>> >>>> I've chosen the most popular usage, which was Utils.appendTestJavaOpts. But I agree that it would be better to change it to prepend. Thanks for pointing this out. >>>> >>>> I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs(), so as not to complicate things. I think that startApp() should be used in the cases when test vm options really shouldn't interfere with user-provided options or overwrite them. So basically the behavior is the same as for ProcessTools.createJavaProcessBuilder(true, ...) and jtreg itself. >>>> >>> Ok. >>>> >>>>> Is ClhsdbFindPC correct? It used to just use -Xcomp or -Xint, ignoring any default test opts. You've fixed it to include the default test opts, but they are appended, possibly overriding the -Xcomp or -Xint. Don't we want the default test opts prepended? Same for ClhsdbJstack. >>>> >>>> The idea is to not mix Xcomp and Xmixed/Xint, using a requires filter. However, ClhsdbFindPC might override Xint with Xmixed if it is set explicitly. Switching to prepending will fix it. >>> Yes, that's what I was thinking and one reason I thought that should be default behavior.
>>> >>> thanks, >>> >>> Chris >>>> >>>> Leonid >>>> >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 3/25/20 2:31 PM, Leonid Mesnik wrote: >>>>>> >>>>>> Igor, Stefan, Ioi >>>>>> >>>>>> Thank you for your feedback. >>>>>> >>>>>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 to change @run main... to @run driver. >>>>>> >>>>>> Test ClhsdbJstack.java is updated. >>>>>> >>>>>> Still waiting for review from SVC team. >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/ >>>>>> >>>>>> Leonid >>>>>> >>>>>> On 3/25/20 12:46 PM, Igor Ignatyev wrote: >>>>>>> Hi Leonid, >>>>>>> >>>>>>> not related to your patch (but yet somewhat made more obvious by it), it seems all (or at least almost all) the tests which use LingeredApp should be run in "driver" mode as they just orchestrate execution of other JVMs, so running them w/ main (let alone main/othervm) just wastes time, test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for example, will now be executed w/ Xcomp which will make it very slow for no reason. since you already got your hands dirty w/ these tests, could you please file an RFE to sort this out and list all the affected tests there? >>>>>>> >>>>>>> re: the patch, could you please update ClhsdbJstack.java test not to be run w/ Xcomp and follow the same pattern you used in other tests (e.g. ClhsdbScanOops)? other than that it looks fine to me, I however wouldn't be able to tell if all svc tests continue to do what they were supposed to, so I'd prefer for someone from svc team to chime in. >>>>>>> >>>>>>> Thanks, >>>>>>> -- Igor >>>>>>> >>>>>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik wrote: >>>>>>>> >>>>>>>> Added Ioi, who also proposed new version of startAppVmOpts. >>>>>>>> >>>>>>>> Please find new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ >>>>>>>> >>>>>>>> Renamed startAppVmOpts/runAppVmOpts; "startAppExactJvmOpts/runAppExactJvmOpts" is now used.
It should make it very clear that this method doesn't use any of test.java.opts, test.vm.opts. >>>>>>>> >>>>>>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java mentioned by Igor, and removed null pointer check as Ioi suggested in startApp method. >>>>>>>> >>>>>>>> + public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException { >>>>>>>> + startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts)); >>>>>>>> + } >>>>>>>> >>>>>>>> Leonid >>>>>>>> >>>>>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote: >>>>>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote: >>>>>>>>>> Hi Leonid, >>>>>>>>>> >>>>>>>>>> I have briefly looked at the patch, a few comments so far: >>>>>>>>>> >>>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: >>>>>>>>>> - at L#114, could you please call static method using class name (as opposed to using an instance)? or was it meant to be theApp.runAppVmOpts(vmArgs) ? >>>>>>>>>> >>>>>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: >>>>>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) isn't correct >>>>>>>>>> - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet) >>>>>>>>> >>>>>>>>> I was going to say the same. Jtreg has the concept of "java options" and "vm options". We have had a fair share of bugs and wasted time when tests have been using the "vm options" part (VM_OPTIONS, test.vm.options, etc), and we've been moving away from using that way to pass options. I recently cleaned up some of this with: >>>>>>>>> >>>>>>>>> 8237111: LingeredApp should be started with getTestJavaOpts >>>>>>>>> >>>>>>>>> Because of this, I would prefer if we used a name that doesn't include "VmOpts", because it's too similar to the other concept.
Some suggestions: >>>>>>>>> startAppJavaOptions >>>>>>>>> startAppUsingJavaOptions >>>>>>>>> startAppWithJavaOptions >>>>>>>>> startAppExactJavaOptions >>>>>>>>> startAppJvmOptions >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> StefanK >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> -- Igor >>>>>>>>>> >>>>>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik wrote: >>>>>>>>>>> >>>>>>>>>>> Hi >>>>>>>>>>> >>>>>>>>>>> Could you please review the following fix, which changes LingeredApp to prepend vm options to java/vm.test.opts when startApp is used and provides startAppVmOpts to override options completely. >>>>>>>>>>> >>>>>>>>>>> The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. Also I fixed some tests where the intention was to append vm options rather than to override them. >>>>>>>>>>> >>>>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/ >>>>>>>>>>> >>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698 >>>>>>>>>>> >>>>>>>>>>> Leonid
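[Editor's sketch, not part of the original message.] The two entry points discussed in this thread differ only in whether the jtreg-provided options are added to the command line. Here is a rough Java sketch of that split, building option lists instead of launching a process. The class and method names are hypothetical, and the order in which test.vm.opts and test.java.opts are combined is an assumption; the real LingeredApp and Utils code may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LaunchOptsSketch {
    // Splits a whitespace-separated property value into individual options.
    static List<String> splitOpts(String value) {
        List<String> opts = new ArrayList<>();
        for (String s : value.trim().split("\\s+")) {
            if (!s.isEmpty()) {
                opts.add(s);
            }
        }
        return opts;
    }

    // Options supplied for the test run: test.vm.opts followed by test.java.opts
    // (this ordering is an assumption of the sketch).
    static List<String> testOpts() {
        List<String> opts = splitOpts(System.getProperty("test.vm.opts", ""));
        opts.addAll(splitOpts(System.getProperty("test.java.opts", "")));
        return opts;
    }

    // "Exact" mode: only the options the test passes in, nothing from jtreg.
    static List<String> exactJvmOpts(String... jvmOpts) {
        return new ArrayList<>(Arrays.asList(jvmOpts));
    }

    // Default mode in the style of the webrev.01 snippet: the test's additional
    // options first, then the options supplied for the test run.
    static List<String> defaultJvmOpts(String... additionalJvmOpts) {
        List<String> opts = exactJvmOpts(additionalJvmOpts);
        opts.addAll(testOpts());
        return opts;
    }

    public static void main(String[] args) {
        System.out.println("exact:   " + exactJvmOpts("-Xint"));
        System.out.println("default: " + defaultJvmOpts("-Xint"));
    }
}
```

Keeping the exact variant as the single place where the final option list is used, with the default variant delegating to it, mirrors the shape of the startApp snippet quoted earlier in the thread.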