From robbin.ehn at oracle.com  Mon Mar  2 10:16:44 2020
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 2 Mar 2020 11:16:44 +0100
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
Message-ID: <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>

Hi,

On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> Hi,
> 
> I had a look at the progress of this change. Nothing
> happened since Richard posted his update using more
> handshakes [1].
> But we (SAP) would appreciate it a lot if this change could
> be successfully reviewed and pushed.
> 
> I think there is basic understanding that this
> change is helpful. It fixes a number of issues with JVMTI,
> and will deliver the same performance benefits as EA
> does in current production mode for debugging scenarios.
> 
> This is important for us as we run our VMs prepared
> for debugging in production mode.
> 
> I understand that Robbin proposed to replace the usage of
> _suspend_flag with handshakes. Apparently, async handshakes
> are needed to do so. We have been waiting a while for removal
> of the _suspend_flag / introduction of async handshakes [2].
> What is the status here?

I have an old prototype that I would like to continue working on,
but do not assume async handshakes will make JDK 15.
Even if they do, I think a lot more investigation is needed before
_suspend_flag can be removed.

> 
> I think we should no longer wait, but proceed with
> this change. We will look into removing the usage of
> suspend_flag introduced here once it is possible to implement
> it with handshakes.

Yes, sure.

>> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/

DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
You can move both its declaration and definition (and the static function
deopt_objs_alot_thread_entry) to that file; there is no need to clutter
thread.[c|h]pp.

Does JvmtiDeferredUpdates really need to be in thread.hpp? Can't it be in its own
hpp file? It doesn't seem right to add JVM TI classes to thread.hpp.
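
To illustrate the kind of split I mean, here is a minimal single-file sketch,
not the actual HotSpot layout. ToyJavaThread, DeferredUpdates and the file
names in the comments are hypothetical stand-ins. It assumes the thread only
needs a pointer to the deferred updates, so a forward declaration would be
enough in the thread header.

```cpp
// --- would live in the thread header (hypothetical layout) ---
class DeferredUpdates;  // forward declaration only, no JVM TI details here

class ToyJavaThread {
 public:
  ToyJavaThread();
  ~ToyJavaThread();  // defined where DeferredUpdates is a complete type
  ToyJavaThread(const ToyJavaThread&) = delete;
  ToyJavaThread& operator=(const ToyJavaThread&) = delete;

  void add_deferred_update();
  int deferred_update_count() const;

 private:
  DeferredUpdates* _deferred_updates;  // pointer to an incomplete type is fine
};

// --- would live in its own header (hypothetical deferredUpdates.hpp) ---
class DeferredUpdates {
 public:
  int count = 0;
};

// --- would live in the thread .cpp file, which includes both headers ---
ToyJavaThread::ToyJavaThread() : _deferred_updates(nullptr) {}

ToyJavaThread::~ToyJavaThread() { delete _deferred_updates; }

void ToyJavaThread::add_deferred_update() {
  // Allocate lazily, so a thread without deferred updates pays nothing extra.
  if (_deferred_updates == nullptr) {
    _deferred_updates = new DeferredUpdates();
  }
  ++_deferred_updates->count;
}

int ToyJavaThread::deferred_update_count() const {
  return _deferred_updates == nullptr ? 0 : _deferred_updates->count;
}
```

With a layout like this, only the .cpp files that actually touch the deferred
updates would need to include the extra header.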

Note that we also think we may have a bug in deopt:
https://bugs.openjdk.java.net/browse/JDK-8238237

I think it would be best, if possible, to push after that is resolved.

Not even nearly a full review :)

Thanks, Robbin


>> Incremental:
>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
>>
>> I was not able to eliminate the additional suspend flag now. I'll take care of this
>> as soon as the existing suspend-resume-mechanism is reworked.
>>
>> Testing:
>>
>> Nightly tests @SAP:
>>
>>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite,
>>    SAP specific tests with fastdebug and release builds on all platforms
>>
>>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel for 24h
>>
>> Thanks, Richard.
>>
>>
>> More details on the changes:
>>
>> * Hide DeoptimizeObjectsALotThread from external view.
>>
>> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
>>    It used to be _safepoint_check_sometimes, which will be eliminated sooner or later.
>>    I added explicit thread state changes with ThreadBlockInVM to code paths where we
>>    can wait() on EscapeBarrier_lock to become safepoint safe.
>>
>> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of
>>    vm operation VM_ThreadSuspendAllForObjDeopt.
>>
>> * Removed uses of Threads_lock. When adding a new thread we suspend it iff EA
>>    optimizations are being reverted. In the previous version we were waiting on
>>    Threads_lock while EA optimizations were reverted. See EscapeBarrier::thread_added().
>>
>> * Made tests require Xmixed compilation mode.
>>
>> * Made tests agnostic regarding tiered compilation.
>>    I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or disabled.
>>
>> * Exercising EATests.java as well with stress test options DeoptimizeObjectsALot*
>>    Due to the non-deterministic deoptimizations some tests need to be skipped.
>>    We do this to prevent bit-rot of the stress test code.
>>
>> * Executing EATests.java as well with graal if available. Driver for this is
>>    EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all the
>>    new debug info (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
>>    And graal does not yet support the JVMTI operations force early return and pop frame.
>>
>> * Removed tracing from new jdi tests in EATests.java. Too much trace output before the
>>    debugging connection is established can cause deadlock because output buffers fill up.
>>    (See https://bugs.openjdk.java.net/browse/JDK-8173304)
>>
>> * Many copyright year changes and smaller clean-up changes of testing code (trailing
>>    white-space and the like).
>>
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Donnerstag, 19. Dezember 2019 03:12
>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>> serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
>> hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
>> <vladimir.kozlov at oracle.com>
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
>> the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> I think my issue is with the way EliminateNestedLocks works so I'm going
>> to look into that more deeply.
>>
>> Thanks for the explanations.
>>
>> David
>>
>> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>     > >    > Some further queries/concerns:
>>>     > >    >
>>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
>>>     > >    >
>>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
>>>     > >    >
>>>     > >    > !   _recursions = save      // restore the old recursion count
>>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>     > >    > increased by the deferred relock count
>>>     > >    >
>>>     > >    > what is the "deferred relock count"? I gather it relates to
>>>     > >    >
>>>     > >    > "The code was extended to be able to deoptimize objects of a
>>>     > > frame that
>>>     > >    > is not the top frame and to let another thread than the owning
>>>     > > thread do
>>>     > >    > it."
>>>     > >
>>>     > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
>>>     > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
>>>     > > locking. New with the enhancement is that we do this also just before object references are
>>>     > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we
>>>     > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized,
>>>     > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again
>>>     > > (relocking twice would be incorrect of course). Deferred updates are copied into the new
>>>     > > interpreter frames.
>>>     > >
>>>     > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to
>>>     > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking
>>>     > > is deferred until T owns the monitor again. This is what the piece of code above does.
>>>     >
>>>     >  Sorry I need some more detail here. How can you wait() on an object
>>>     >  monitor if the object allocation and/or locking was optimised away? And
>>>     >  what is a "non-local object" in this context? Isn't EA restricted to
>>>     >  thread-confined objects?
>>>
>>> "Non-local object" is an object that escapes its thread. The issue I'm addressing with the changes
>>> in ObjectMonitor::wait is almost unrelated to EA. It is caused by EliminateNestedLocks, where C2
>>> eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it
>>> is locked and you can call wait() on it.
>>>
>>> EliminateLocks is the C2 option that controls lock elimination based on EA. Both optimizations have
>>> in common that objects with eliminated locking need to be relocked when deoptimizing a frame,
>>> i.e. when replacing a compiled frame with equivalent interpreter frames.
>>> Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can
>>> be a mix of eliminated nested locks and locks of not-escaping objects.
>>>
>>> New with the enhancement: I call relock_objects earlier, just before objects potentially
>>> escape. But then later when the owning compiled frame gets deoptimized, I must not do it again:
>>>
>>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
>>>
>>>    if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
>>>        && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
>>>      bool unused;
>>>      eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
>>>    }
>>>
>>> Now when calling relock_objects early it is quite possible that I have to relock an object the
>>> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
>>> introduce relock_count_after_wait to JavaThread.
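
The deferred relock count described above can be sketched with a toy model.
This is not HotSpot code: ToyThread and ToyMonitor are hypothetical stand-ins,
and only the accessor name get_and_reset_relock_count_after_wait is taken from
the quoted patch. The sketch assumes that "recursions" counts re-entries beyond
the first acquisition, and that another thread records the relocks it could
not perform while the owner was waiting.

```cpp
#include <cassert>

// Hypothetical stand-in for a Java thread.
struct ToyThread {
  int relock_count_after_wait = 0;  // relocks deferred while this thread waits

  // Name taken from the quoted patch; the body is an assumption.
  int get_and_reset_relock_count_after_wait() {
    int n = relock_count_after_wait;
    relock_count_after_wait = 0;
    return n;
  }
};

// Hypothetical stand-in for an object monitor.
struct ToyMonitor {
  ToyThread* owner = nullptr;
  int recursions = 0;  // re-entries beyond the first acquisition

  void enter(ToyThread* t) {
    if (owner == t) { ++recursions; return; }
    assert(owner == nullptr);
    owner = t;
    recursions = 0;
  }

  // First half of wait(): remember the recursion count and release fully.
  int release_for_wait(ToyThread* t) {
    assert(owner == t);
    int save = recursions;
    recursions = 0;
    owner = nullptr;
    return save;
  }

  // Second half of wait(): reacquire, restore the old recursion count and
  // add the relocks that were deferred while the thread was waiting.
  void reacquire_after_wait(ToyThread* t, int save) {
    assert(owner == nullptr);
    owner = t;
    recursions = save + t->get_and_reset_relock_count_after_wait();
  }
};

// Three nested synchronized blocks on one object, where the two inner locks
// were eliminated, so the monitor was really entered only once.
int demo_recursions_after_wait() {
  ToyThread t;
  ToyMonitor m;
  m.enter(&t);                        // only the outermost lock is real
  int save = m.release_for_wait(&t);  // t is now waiting on the monitor
  t.relock_count_after_wait += 2;     // two eliminated locks get reverted
  m.reacquire_after_wait(&t, save);
  return m.recursions;                // one acquisition plus two re-entries
}
```

In this model the waiting thread ends up with the same recursion count it
would have had if the nested locks had never been eliminated.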
>>>
>>>     >  Is it just that some of the locking gets optimized away e.g.
>>>     >
>>>     >  synchronized(obj) {
>>>     >     synchronized(obj) {
>>>     >       synchronized(obj) {
>>>     >         obj.wait();
>>>     >       }
>>>     >     }
>>>     >  }
>>>     >
>>>     >  If this is reduced to a form as-if it were a single lock of the monitor
>>>     >  (due to EA) and the wait() triggers a JVM TI event which leads to the
>>>     >  escape of "obj" then we need to reconstruct the true lock state, and so
>>>     >  when the wait() internally unblocks and reacquires the monitor it has to
>>>     >  set the true recursion count to 3, not the 1 that it appeared to be when
>>>     >  wait() was initially called. Is that the scenario?
>>>
>>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
>>> triggered by wait.
>>>
>>> Add
>>>
>>> LocalObject l1 = new LocalObject();
>>>
>>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
>>> question.
>>>
>>> See that relocking/reallocating is transactional. If it is done, then it is done for /all/ objects
>>> in scope, and it is done at most once. It wouldn't be quite so easy to split this into relocking of
>>> nested and EA-based eliminated locks.
>>>
>>>     >  If so I find this truly awful. Anyone using wait() in a realistic form
>>>     >  requires a notification and so the object cannot be thread confined. In
>>>
>>> It is not thread confined.
>>>
>>>     >  which case I would strongly argue that upon hitting the wait() the deopt
>>>     >  should occur unconditionally and so the lock state is correct before we
>>>     >  wait and so we don't need to mess with the recursion count internally
>>>     >  when we reacquire the monitor.
>>>     >
>>>     > >
>>>     > >    > which I don't like the sound of at all when it comes to ObjectMonitor
>>>     > >    > state. So I'd like to understand in detail exactly what is going on here
>>>     > >    > and why.  This is a very intrusive change that seems to badly break
>>>     > >    > encapsulation and impacts future changes to ObjectMonitor that are under
>>>     > >    > investigation.
>>>     > >
>>>     > > I would not regard this as breaking encapsulation. Certainly not badly.
>>>     > >
>>>     > > I've added a property relock_count_after_wait to JavaThread. The property is well
>>>     > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free
>>>     > > in choosing a way to do that as long as that property is taken into account. This is hardly a
>>>     > > limitation.
>>>     >
>>>     >  I do think this badly breaks encapsulation as you have to add a callout
>>>     >  from the guts of the ObjectMonitor code to reach into the thread to get
>>>     >  this lock count adjustment. I understand why you have had to do this but
>>>     >  I would much rather see a change to the EA optimisation strategy so that
>>>     >  this is not needed.
>>>     >
>>>     > > Note also that the property is a straightforward extension of the existing concept of deferred
>>>     > > local updates. It is embedded into the structure holding them. So not even the footprint of a
>>>     > > JavaThread is enlarged if no deferred updates are generated.
>>>     >
>>>     > [...]
>>>     >
>>>     > >
>>>     > > I'm actually duplicating the existing external suspend mechanism, because a thread can be
>>>     > > suspended at most once. And hey, I don't like that either! But it seems not unlikely that the
>>>     > > duplicate can be removed together with the original and the new type of handshakes that will be
>>>     > > used for thread suspend can be used for object deoptimization too. See today's discussion in
>>>     > > JDK-8227745 [2].
>>>     >
>>>     >  I hope that discussion bears some fruit, at the moment it seems not to
>>>     >  be possible to use handshakes here. :(
>>>     >
>>>     >  The external suspend mechanism is a royal pain in the proverbial that we
>>>     >  have to carefully live with. The idea that we're duplicating that for
>>>     >  use in another fringe area of functionality does not thrill me at all.
>>>     >
>>>     >  To be clear, I understand the problem that exists and that you wish to
>>>     >  solve, but for the runtime parts I balk at the complexity cost of
>>>     >  solving it.
>>>
>>> I know it's complex, but it is far from rocket science.
>>>
>>> Also I find it hard to imagine another fix for JDK-8233915 besides changing the JVM TI specification.
>>>
>>> Thanks, Richard.
>>>
>>> -----Original Message-----
>>> From: David Holmes <david.holmes at oracle.com>
>>> Sent: Dienstag, 17. Dezember 2019 08:03
>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>> serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
>>> hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
>>> <vladimir.kozlov at oracle.com>
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
>>> in the Presence of JVMTI Agents
>>>
>>> <resend as my mailer crashed during last send>
>>>
>>> David
>>>
>>> On 17/12/2019 4:57 pm, David Holmes wrote:
>>>> Hi Richard,
>>>>
>>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
>>>>> Hi David,
>>>>>
>>>>>      > Some further queries/concerns:
>>>>>      >
>>>>>      > src/hotspot/share/runtime/objectMonitor.cpp
>>>>>      >
>>>>>      > Can you please explain the changes to ObjectMonitor::wait:
>>>>>      >
>>>>>      > !   _recursions = save      // restore the old recursion count
>>>>>      > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>>      > increased by the deferred relock count
>>>>>      >
>>>>>      > what is the "deferred relock count"? I gather it relates to
>>>>>      >
>>>>>      > "The code was extended to be able to deoptimize objects of a frame that
>>>>>      > is not the top frame and to let another thread than the owning thread do
>>>>>      > it."
>>>>>
>>>>> Yes, these relate. Currently EA based optimizations are reverted, when
>>>>> a compiled frame is replaced
>>>>> with corresponding interpreter frames. Part of this is relocking
>>>>> objects with eliminated
>>>>> locking. New with the enhancement is that we do this also just before
>>>>> object references are acquired
>>>>> through JVMTI. In this case we deoptimize also the owning compiled
>>>>> frame C and we register
>>>>> deoptimized objects as deferred updates. When control returns to C it
>>>>> gets deoptimized, we notice
>>>>> that objects are already deoptimized (reallocated and relocked), so we
>>>>> don't do it again (relocking
>>>>> twice would be incorrect of course). Deferred updates are copied into
>>>>> the new interpreter frames.
>>>>>
>>>>> Problem: relocking is not possible if the target thread T is waiting
>>>>> on the monitor that needs to be
>>>>> relocked. This happens only with non-local objects with
>>>>> EliminateNestedLocks. Instead relocking is
>>>>> deferred until T owns the monitor again. This is what the piece of
>>>>> code above does.
>>>>
>>>> Sorry I need some more detail here. How can you wait() on an object
>>>> monitor if the object allocation and/or locking was optimised away? And
>>>> what is a "non-local object" in this context? Isn't EA restricted to
>>>> thread-confined objects?
>>>>
>>>> Is it just that some of the locking gets optimized away e.g.
>>>>
>>>> synchronized(obj) {
>>>>     synchronized(obj) {
>>>>       synchronized(obj) {
>>>>         obj.wait();
>>>>       }
>>>>     }
>>>> }
>>>>
>>>> If this is reduced to a form as-if it were a single lock of the monitor
>>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
>>>> escape of "obj" then we need to reconstruct the true lock state, and so
>>>> when the wait() internally unblocks and reacquires the monitor it has to
>>>> set the true recursion count to 3, not the 1 that it appeared to be when
>>>> wait() was initially called. Is that the scenario?
>>>>
>>>> If so I find this truly awful. Anyone using wait() in a realistic form
>>>> requires a notification and so the object cannot be thread confined. In
>>>> which case I would strongly argue that upon hitting the wait() the deopt
>>>> should occur unconditionally and so the lock state is correct before we
>>>> wait and so we don't need to mess with the recursion count internally
>>>> when we reacquire the monitor.
>>>>
>>>>>
>>>>>      > which I don't like the sound of at all when it comes to ObjectMonitor
>>>>>      > state. So I'd like to understand in detail exactly what is going on here
>>>>>      > and why. This is a very intrusive change that seems to badly break
>>>>>      > encapsulation and impacts future changes to ObjectMonitor that are under
>>>>>      > investigation.
>>>>>
>>>>> I would not regard this as breaking encapsulation. Certainly not badly.
>>>>>
>>>>> I've added a property relock_count_after_wait to JavaThread. The
>>>>> property is well
>>>>> encapsulated. Future ObjectMonitor implementations have to deal with
>>>>> recursion too. They are free in
>>>>> choosing a way to do that as long as that property is taken into
>>>>> account. This is hardly a
>>>>> limitation.
>>>>
>>>> I do think this badly breaks encapsulation as you have to add a callout
>>>> from the guts of the ObjectMonitor code to reach into the thread to get
>>>> this lock count adjustment. I understand why you have had to do this but
>>>> I would much rather see a change to the EA optimisation strategy so that
>>>> this is not needed.
>>>>
>>>>> Note also that the property is a straightforward extension of the
>>>>> existing concept of deferred
>>>>> local updates. It is embedded into the structure holding them. So not
>>>>> even the footprint of a
>>>>> JavaThread is enlarged if no deferred updates are generated.
>>>>>
>>>>>      > ---
>>>>>      >
>>>>>      > src/hotspot/share/runtime/thread.cpp
>>>>>      >
>>>>>      > Can you please explain why JavaThread::wait_for_object_deoptimization
>>>>>      > has to be handcrafted in this way rather than using proper transitions.
>>>>>      >
>>>>>
>>>>> I wrote wait_for_object_deoptimization taking
>>>>> JavaThread::java_suspend_self_with_safepoint_check
>>>>> as template. So in short: for the same reasons :)
>>>>>
>>>>> Threads reach both methods as part of thread state transitions,
>>>>> therefore special handling is
>>>>> required to change thread state on top of ongoing transitions.
>>>>>
>>>>>      > We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>>>      > it being added back (effectively). This seems like it may be something
>>>>>      > that handshakes could be used for.
>>>>>
>>>>> Deopt suspend used to be something rather different with a similar
>>>>> name[1]. It is not being added back.
>>>>
>>>> I stand corrected. Despite comments in the code to the contrary
>>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
>>>> cleanup in this area 13 years ago :)
>>>>
>>>>>
>>>>> I'm actually duplicating the existing external suspend mechanism,
>>>>> because a thread can be suspended
>>>>> at most once. And hey, I don't like that either! But it seems not
>>>>> unlikely that the duplicate can
>>>>> be removed together with the original and the new type of handshakes
>>>>> that will be used for
>>>>> thread suspend can be used for object deoptimization too. See today's
>>>>> discussion in JDK-8227745 [2].
>>>>
>>>> I hope that discussion bears some fruit, at the moment it seems not to
>>>> be possible to use handshakes here. :(
>>>>
>>>> The external suspend mechanism is a royal pain in the proverbial that we
>>>> have to carefully live with. The idea that we're duplicating that for
>>>> use in another fringe area of functionality does not thrill me at all.
>>>>
>>>> To be clear, I understand the problem that exists and that you wish to
>>>> solve, but for the runtime parts I balk at the complexity cost of
>>>> solving it.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Thanks, Richard.
>>>>>
>>>>> [1] Deopt suspend was something like an async. handshake for architectures with
>>>>>     register windows, where patching the return pc for deoptimization of a compiled
>>>>>     frame was racy if the owner thread was in native code. Instead a "deopt" suspend
>>>>>     flag was set on which the thread patched its own frame upon return from native.
>>>>>     So no thread was suspended. It got its name only from the name of the flags.
>>>>>
>>>>> [2] Discussion about using handshakes to sync. with the target thread:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>> Sent: Freitag, 13. Dezember 2019 00:56
>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>> serviceability-dev at openjdk.java.net;
>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>> Performance in the Presence of JVMTI Agents
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> Some further queries/concerns:
>>>>>
>>>>> src/hotspot/share/runtime/objectMonitor.cpp
>>>>>
>>>>> Can you please explain the changes to ObjectMonitor::wait:
>>>>>
>>>>> !   _recursions = save      // restore the old recursion count
>>>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>> increased by the deferred relock count
>>>>>
>>>>> what is the "deferred relock count"? I gather it relates to
>>>>>
>>>>> "The code was extended to be able to deoptimize objects of a frame that
>>>>> is not the top frame and to let another thread than the owning thread do
>>>>> it."
>>>>>
>>>>> which I don't like the sound of at all when it comes to ObjectMonitor
>>>>> state. So I'd like to understand in detail exactly what is going on here
>>>>> and why. This is a very intrusive change that seems to badly break
>>>>> encapsulation and impacts future changes to ObjectMonitor that are under
>>>>> investigation.
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/thread.cpp
>>>>>
>>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
>>>>> has to be handcrafted in this way rather than using proper transitions.
>>>>>
>>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>>> it being added back (effectively). This seems like it may be something
>>>>> that handshakes could be used for.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>> On 12/12/2019 7:02 am, David Holmes wrote:
>>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>>       > Most of the details here are in areas I can comment on in detail, but I
>>>>>>>       > did take an initial general look at things.
>>>>>>>
>>>>>>> Thanks for taking the time!
>>>>>>
>>>>>> Apologies the above should read:
>>>>>>
>>>>>> "Most of the details here are in areas I *can't* comment on in detail
>>>>>> ..."
>>>>>>
>>>>>> David
>>>>>>
>>>>>>>       > The only thing that jumped out at me is that I think the
>>>>>>>       > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>>       >
>>>>>>>       > +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>
>>>>>>> Yes, it should. Will add the method like above.
>>>>>>>
>>>>>>>       > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>>>>       > active testing this will just bit-rot.
>>>>>>>
>>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>>>>> workload. I will add a minimal test
>>>>>>> to keep it fresh.
>>>>>>>
>>>>>>>       > Also on the tests I don't understand your @requires clause:
>>>>>>>       >
>>>>>>>       >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>>       > (vm.opt.TieredCompilation != true))
>>>>>>>       >
>>>>>>>       > This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>>       > our normal mode of operation. ??
>>>>>>>       >
>>>>>>>
>>>>>>> I removed the clause. I guess I wanted to target the tests towards the
>>>>>>> code they are supposed to
>>>>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>>>>> with just one compiler thread.
>>>>>>>
>>>>>>> Additionally I will make use of
>>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
>>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>>>> serviceability-dev at openjdk.java.net;
>>>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>>>> Performance in the Presence of JVMTI Agents
>>>>>>>
>>>>>>> Hi Richard,
>>>>>>>
>>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I would like to get reviews please for
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>>>>
>>>>>>>> Corresponding RFE:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>>>>
>>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>>>>
>>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>>>>> issues (thanks!). In addition the
>>>>>>>> change is being tested at SAP since I posted the first RFR some
>>>>>>>> months ago.
>>>>>>>>
>>>>>>>> The intention of this enhancement is to benefit performance wise from
>>>>>>>> escape analysis even if JVMTI
>>>>>>>> agents request capabilities that allow them to access local variable
>>>>>>>> values. E.g. if you start-up
>>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>>>>> escape analysis is disabled right
>>>>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>>>>> should do so. With the
>>>>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>>>>> debugger attaches. EA based
>>>>>>>> optimizations are reverted just before an agent acquires the
>>>>>>>> reference to an object. In the JBS item
>>>>>>>> you'll find more details.
>>>>>>>
>>>>>>> Most of the details here are in areas I can comment on in detail, but I
>>>>>>> did take an initial general look at things.
>>>>>>>
>>>>>>> The only thing that jumped out at me is that I think the
>>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>>
>>>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>
>>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
>>>>>>> Without
>>>>>>> active testing this will just bit-rot.
>>>>>>>
>>>>>>> Also on the tests I don't understand your @requires clause:
>>>>>>>
>>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>> (vm.opt.TieredCompilation != true))
>>>>>>>
>>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>> our normal mode of operation. ??
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>>>>
>>>>>>>>
>>>>>>>>

From kevin.walls at oracle.com  Mon Mar  2 10:47:16 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Mon, 2 Mar 2020 10:47:16 +0000
Subject: RFR(S): hs_err elapsed time in seconds is not accurate enough
Message-ID: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>

Hi,

(s11y and runtime opinions both relevant)

A few times in the last month I've really wanted to compare the Events 
logged in the hs_err file, and the time of the JVM's crash.

"elapsed time" in hs_err is only accurate to one second, and has been 
since before jdk5 was created.

The diff below changes the format string and uses the non-rounded time 
value (I don't see a need to change the other integer arithmetic here), 
and we can enjoy hs_errs with detail like:

...
Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds (0d 0h 0m 5s)
...

Thanks
Kevin


/jdk/open$ hg diff
diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp
--- a/src/hotspot/share/runtime/os.cpp
+++ b/src/hotspot/share/runtime/os.cpp
@@ -1016,9 +1016,8 @@
   }
 
   double t = os::elapsedTime();
-  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
-  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
-  //       before printf. We lost some precision, but who cares?
+  // NOTE: a crash using printf("%f",...) on Linux was historically noted here
+  //       (before the jdk5 repo was created).
   int eltime = (int)t;  // elapsed time in seconds
 
   // print elapsed time in a human-readable format:
@@ -1029,7 +1028,7 @@
   int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
   int minute_secs = elmins * secs_per_min;
   int elsecs = (eltime - day_secs - hour_secs - minute_secs);
-  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs);
+  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, eldays, elhours, elmins, elsecs);
 }


From kevin.walls at oracle.com  Mon Mar  2 10:48:13 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Mon, 2 Mar 2020 10:48:13 +0000
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate
 enough
In-Reply-To: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
Message-ID: <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>

Oops, and with the bug ID in the title and JBS link:
https://bugs.openjdk.java.net/browse/JDK-8240295



From linzang at tencent.com  Mon Mar  2 13:56:52 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Mon, 2 Mar 2020 13:56:52 +0000
Subject: JDK-8215624 add parallel heap inspection support for jmap
 histo(G1)(Internet mail)
In-Reply-To: <c75874892032465b90fbd03ad29242d2@tencent.com>
References: <11bca96c0e7745f5b2558cc49b42b996@tencent.com>
 <d3369b66-481a-9c8e-4b7c-4ce8bd37b1cc@oracle.com>
 <fff142ab407a4a808cacc4952fb476df@tencent.com>
 <e4175fbf-868b-14fe-39f9-05cc852fa203@oracle.com>
 <c75874892032465b90fbd03ad29242d2@tencent.com>
Message-ID: <2EDF28BF-94D5-4F2E-B96E-2C45948AD454@tencent.com>

Dear all, 
      Let me try to ease the review work with some explanation :P
      The patch aims to speed up heap iteration for jmap -histo. In my experience this is necessary when investigating large heaps. E.g. in a big-data scenario I ran jmap -histo against a 180GB heap, and it took quite a while.
      And if my understanding is correct, even jmap -histo without the "live" option does heap inspection with the heap lock acquired, so it is very likely to block mutator threads in allocation-sensitive scenarios. I would say the faster the heap inspection runs, the shorter the mutators are blocked. This is why parallel iteration for jmap is necessary.
      I think parallel heap inspection should be applied to all kinds of heap. However, since the heap layouts differ between GCs, much time is required to understand all of them and make the whole change. IMO, it is not wise to have one huge patch for the whole solution at once, and it would be even harder to review. So I plan to implement it incrementally. The first patch (this one) is meant to settle the implementation details: how jmap accepts the new option and passes it to the attachListener of the JVM process, how to make the parallel inspection closure generic enough to extend easily to different heap layouts, and how to implement the heap inspection in a specific GC's heap. This patch uses G1's heap as the beginning.
      This patch actually does several things:
      1. Add an option "parallelThreadNum=<N>" to jmap -histo. The default is N = 0, which lets the JVM decide how many threads to use for heap inspection. Setting this option to 1 disables parallel heap inspection. (More details in the CSR: https://bugs.openjdk.java.net/browse/JDK-8239290)
      2. Change how jmap passes arguments (see http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html). Originally it passed the options as separate arguments to attachListener; with this patch all options are composed into a single string. So arg_count_max in attachListener.hpp does not need to be changed, which avoids the compatibility issue discussed at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html
      3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp. Its work(uint worker_id) method prepares the data structure (KlassInfoTable) needed by every parallel worker thread and then calls do_object_iterate_parallel(), which is the heap-specific implementation. I also added some mechanisms in KlassInfoTable to support parallel iteration, such as merge().
      4. In the specific heap (G1 in this patch), create a subclass of ParHeapInspectTask and implement do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invokes g1CollectedHeap's object_iterate_parallel().
      5. Add related tests.
      6. It may be easy to extend this patch to other kinds of heap by creating a subclass of ParHeapInspectTask and implementing do_object_iterate_parallel().
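
The per-worker table plus merge() scheme from items 3 and 4 can be
sketched outside HotSpot roughly as follows. This is a toy model under
stated assumptions, not the code in the webrev: ToyObject, ToyRegion,
ToyKlassTable and inspect_parallel are made-up names, the "heap" is just
a vector of regions, and std::thread stands in for the GC worker threads.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <thread>
#include <vector>

// Toy stand-ins; none of these types exist in HotSpot.
struct ToyObject {
  std::string klass;  // class name
  std::size_t bytes;  // object size in bytes
};
using ToyRegion = std::vector<ToyObject>;

struct KlassStats {
  std::size_t count = 0;
  std::size_t bytes = 0;
};

// Plays the role of a per-worker KlassInfoTable (assumption: one per worker).
class ToyKlassTable {
 public:
  void record(const ToyObject& o) {
    KlassStats& s = _stats[o.klass];
    s.count += 1;
    s.bytes += o.bytes;
  }
  // Fold another worker's results into this table.
  void merge(const ToyKlassTable& other) {
    for (const auto& kv : other._stats) {
      KlassStats& s = _stats[kv.first];
      s.count += kv.second.count;
      s.bytes += kv.second.bytes;
    }
  }
  // Returns zeroed stats for a class that was never recorded.
  KlassStats get(const std::string& klass) const {
    auto it = _stats.find(klass);
    return it == _stats.end() ? KlassStats() : it->second;
  }
 private:
  std::map<std::string, KlassStats> _stats;
};

// Each worker claims regions through a shared atomic index and fills a
// private table; the caller merges the private tables after all workers
// have been joined, so no table is ever written by two threads.
inline ToyKlassTable inspect_parallel(const std::vector<ToyRegion>& heap,
                                      unsigned num_workers) {
  assert(num_workers > 0);
  std::vector<ToyKlassTable> local(num_workers);
  std::atomic<std::size_t> next_region(0);
  std::vector<std::thread> workers;
  for (unsigned id = 0; id < num_workers; id++) {
    workers.emplace_back([&heap, &local, &next_region, id]() {
      for (;;) {
        std::size_t r = next_region.fetch_add(1);
        if (r >= heap.size()) break;
        for (const ToyObject& o : heap[r]) local[id].record(o);
      }
    });
  }
  for (std::thread& t : workers) t.join();
  ToyKlassTable result;
  for (const ToyKlassTable& t : local) result.merge(t);
  return result;
}
```

Calling inspect_parallel(heap, 1) degenerates to a serial walk, which is
one way to picture what setting parallelThreadNum to 1 means in item 1.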

Hope this info helps with the code review and initiates the discussion :-)
Thanks!
 
BRs,
Lin

>On 2020/2/19, 9:40 AM, "linzang(臧琳)" <linzang at tencent.com> wrote:
>
>  Re-post this RFR with correct enhancement number to make it trackable.
>  please ignore the previous wrong post. sorry for troubles. 
>    
>   webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/
>   bug: https://bugs.openjdk.java.net/browse/JDK-8215624
>    CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
>    --------------
>    Lin
>    >Hi Lin,
>    >
>    >Could you, please, re-post your RFR with the right enhancement number in
>    >the message subject?
>    >It will be more trackable this way.
>    >
>    >Thanks,
>    >Serguei
>    >
>    >
>    >On 2/17/20 10:29 PM, linzang(臧琳) wrote:
>    >> Dear David,
>    >>        Thanks a lot!
>    >>       I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/.
>    >>        IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration.
>    >>        Maybe we can firstly use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap.
>    >>    
>    >> Thanks,
>    >> --------------
>    >> Lin
>    >>> Hi Lin,
>    >>>
>    >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC
>    >>> worker threads, and whether it needs to be extended beyond G1.
>    >>>
>   >>> I happened to spot one nit when browsing:
>    >>>
>    >>> src/hotspot/share/gc/shared/collectedHeap.hpp
>    >>>
>    >>> +   virtual bool run_par_heap_inspect_task(KlassInfoTable* cit,
>    >>> +                                          BoolObjectClosure* filter,
>    >>> +                                          size_t* missed_count,
>    >>> +                                          size_t thread_num) {
>    >>> +     return NULL;
>    >>>
>    >>> s/NULL/false/
>    >>>
>    >>> Cheers,
>    >>> David
>    >>>
>    >>> On 18/02/2020 2:15 pm, linzang(臧琳) wrote:
>    >>>> Dear All,
>    >>>>         May I ask your help to review the follow changes:
>    >>>>         webrev:
>    >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/
>    >>>>      bug: https://bugs.openjdk.java.net/browse/JDK-8215624
>    >>>>      related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
>    >>>>         This patch enable parallel heap inspection of G1 for jmap histo.
>    >>>>         my simple test shown it can speed up 2x of jmap -histo with
>    >>>> parallelThreadNum set to 2 for heap at ~500M on 4-core platform.
>    >>>>
>    >>>> ------------------------------------------------------------------------
>    >>>> BRs,
>    >>>> Lin
>    >> >
>    >


From david.holmes at oracle.com  Tue Mar  3 01:11:02 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 3 Mar 2020 11:11:02 +1000
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate
 enough
In-Reply-To: <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
 <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>
Message-ID: <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com>

Hi Kevin,

On 2/03/2020 8:48 pm, Kevin Walls wrote:
> Oops, and with the bug ID in the title and JBS link:
> https://bugs.openjdk.java.net/browse/JDK-8240295
> 
> 
> On 02/03/2020 10:47, Kevin Walls wrote:
>> Hi,
>>
>> (s11y and runtime opinions both relevant)
>>
>> A few times in the last month I've really wanted to compare the Events 
>> logged in the hs_err file, and the time of the JVM's crash.
>>
>> "elapsed time" in hs_err is only accurate to one second, and has been 
>> since before jdk5 was created.
>>
>> The diff below changes the format string and uses the non-rounded time 
>> value (I don't see a need to change the other integer arithmetic 
>> here), and we can enjoy hs_errs with detail like:
>>
>> ...
>> Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds (0d 0h 0m 5s)
>> ...
>>
>> Thanks
>> Kevin
>>
>>
>> /jdk/open$ hg diff
>> diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp
>> --- a/src/hotspot/share/runtime/os.cpp
>> +++ b/src/hotspot/share/runtime/os.cpp
>> @@ -1016,9 +1016,8 @@
>>    }
>>
>>    double t = os::elapsedTime();
>> -  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
>> -  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
>> -  //       before printf. We lost some precision, but who cares?
>> +  // NOTE: a crash using printf("%f",...) on Linux was historically noted here
>> +  //       (before the jdk5 repo was created).

Just because it is old doesn't mean it no longer applies. printf is not 
async-signal-safe - we know that but we try to use it anyway. Maybe %f 
is even less async-signal-safe?

This may get through testing okay but cause problems with real crashes 
in the field.

What about breaking the time up into two ints: seconds and nanos?
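
A minimal sketch of that suggestion, under stated assumptions: the names
ElapsedParts, split_elapsed and format_elapsed are hypothetical and not
part of HotSpot, and snprintf merely stands in for st->print_cr. The only
point it illustrates is that the fractional part can be printed with
integer conversions, so no %f is needed. snprintf itself is no more
async-signal-safe than printf.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper type, not part of HotSpot.
struct ElapsedParts {
  long seconds;  // whole seconds
  long nanos;    // fractional part, 0 .. 999999999
};

// Split a non-negative elapsed time (in seconds) into two integers so that
// it can be printed using integer conversions only.
inline ElapsedParts split_elapsed(double t) {
  assert(t >= 0.0);
  ElapsedParts p;
  p.seconds = static_cast<long>(t);  // truncates toward zero
  double frac = t - static_cast<double>(p.seconds);
  p.nanos = static_cast<long>(frac * 1e9);
  if (p.nanos > 999999999L) {
    p.nanos = 999999999L;  // guard against floating-point rounding
  }
  return p;
}

// Format as "<seconds>.<nanos, zero-padded to 9 digits> seconds".
// Returns what snprintf returns: the number of characters that the full
// string needs, excluding the terminating NUL.
inline int format_elapsed(double t, char* buf, std::size_t len) {
  ElapsedParts p = split_elapsed(t);
  return std::snprintf(buf, len, "%ld.%09ld seconds", p.seconds, p.nanos);
}
```

The zero padding in %09ld matters: without it, 5 seconds and 1000
nanoseconds would print as "5.1000" rather than "5.000001000".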

Cheers,
David
-----

>>    int eltime = (int)t;  // elapsed time in seconds
>>
>>    // print elapsed time in a human-readable format:
>> @@ -1029,7 +1028,7 @@
>>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>    int minute_secs = elmins * secs_per_min;
>>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs);
>> +  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, eldays, elhours, elmins, elsecs);
>>  }
>>
>>

From ramkumar.sunderbabu at oracle.com  Tue Mar  3 08:52:18 2020
From: ramkumar.sunderbabu at oracle.com (Ramkumar Sunderbabu)
Date: Tue, 3 Mar 2020 00:52:18 -0800 (PST)
Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test
 javax/management/loading/MletParserLocaleTest.java reduce default timeout
Message-ID: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default>

Hi all,

Please review this patch.
Removed "timeout=5" from the tests so that the default timeout is used.

JBS: https://bugs.openjdk.java.net/browse/JDK-8153430
Webrev: http://cr.openjdk.java.net/~rsunderbabu/8153430/webrev.00/

Testing: Locally tested with "-Xcomp" option on a linux-64 machine.

Thanks,
Ram

From richard.reingruber at sap.com  Tue Mar  3 20:22:46 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 3 Mar 2020 20:22:46 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
Message-ID: <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Robbin,

> > I understand that Robbin proposed to replace the usage of
> > _suspend_flag with handshakes. Apparently, async handshakes
> > are needed to do so. We have been waiting a while for removal
> > of the _suspend_flag / introduction of async handshakes [2].
> > What is the status here?

> I have an old prototype which I would like to continue to work on.
> So do not assume asynch handshakes will make 15.
> Even if it would, I think there is a lot more investigative work to remove
> _suspend_flag.

Let us know, if we can be of any help to you and be it only testing.

> >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/

> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)

Will do.

> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its own
> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.

You are right. It shouldn't be declared in thread.hpp. I will look into that.

> Note that we also think we may have a bug in deopt:
> https://bugs.openjdk.java.net/browse/JDK-8238237

> I think it would be best, if possible, to push after that is resolved.

Sure.

> Not even nearly a full review :)

I know :)

Anyways, thanks a lot,
Richard.


-----Original Message-----
From: Robbin Ehn <robbin.ehn at oracle.com> 
Sent: Monday, March 2, 2020 11:17 AM
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard <richard.reingruber at sap.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi,

On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> Hi,
> 
> I had a look at the progress of this change. Nothing
> happened since Richard posted his update using more
> handshakes [1].
> But we (SAP) would appreciate a lot if this change could
> be successfully reviewed and pushed.
> 
> I think there is basic understanding that this
> change is helpful. It fixes a number of issues with JVMTI,
> and will deliver the same performance benefits as EA
> does in current production mode for debugging scenarios.
> 
> This is important for us as we run our VMs prepared
> for debugging in production mode.
> 
> I understand that Robbin proposed to replace the usage of
> _suspend_flag with handshakes. Apparently, async handshakes
> are needed to do so. We have been waiting a while for removal
> of the _suspend_flag / introduction of async handshakes [2].
> What is the status here?

I have an old prototype which I would like to continue to work on.
So do not assume asynch handshakes will make 15.
Even if it would, I think there is a lot more investigative work to remove
_suspend_flag.

> 
> I think we should no longer wait, but proceed with
> this change. We will look into removing the usage of
> suspend_flag introduced here once it is possible to implement
> it with handshakes.

Yes, sure.

>> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/

DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
You can move both declaration and definition to that file, no need to clobber 
thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)

Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its own
hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.

Note that we also think we may have a bug in deopt:
https://bugs.openjdk.java.net/browse/JDK-8238237

I think it would be best, if possible, to push after that is resolved.

Not even nearly a full review :)

Thanks, Robbin


>> Incremental:
>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
>>
>> I was not able to eliminate the additional suspend flag now. I'll take care of this as soon as the
>> existing suspend-resume-mechanism is reworked.
>>
>> Testing:
>>
>> Nightly tests @SAP:
>>
>>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests
>>    with fastdebug and release builds on all platforms
>>
>>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel for 24h
>>
>> Thanks, Richard.
>>
>>
>> More details on the changes:
>>
>> * Hide DeoptimizeObjectsALotThread from external view.
>>
>> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
>>    It used to be _safepoint_check_sometimes, which will be eliminated sooner or later.
>>    I added explicit thread state changes with ThreadBlockInVM to code paths where we can wait()
>>    on EscapeBarrier_lock to become safepoint safe.
>>
>> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of vm operation
>>    VM_ThreadSuspendAllForObjDeopt.
>>
>> * Removed uses of Threads_lock. When adding a new thread we suspend it iff EA optimizations are
>>    being reverted. In the previous version we were waiting on Threads_lock while EA optimizations
>>    were reverted. See EscapeBarrier::thread_added().
>>
>> * Made tests require Xmixed compilation mode.
>>
>> * Made tests agnostic regarding tiered compilation.
>>    I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or disabled.
>>
>> * Exercising EATests.java as well with stress test options DeoptimizeObjectsALot*
>>    Due to the non-deterministic deoptimizations some tests need to be skipped.
>>    We do this to prevent bit-rot of the stress test code.
>>
>> * Executing EATests.java as well with graal if available. Driver for this is
>>    EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all the new debug info
>>    (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
>>    And graal does not yet support the JVMTI operations force early return and pop frame.
>>
>> * Removed tracing from new jdi tests in EATests.java. Too much trace output before the debugging
>>    connection is established can cause deadlock because output buffers fill up.
>>    (See https://bugs.openjdk.java.net/browse/JDK-8173304)
>>
>> * Many copyright year changes and smaller clean-up changes of testing code (trailing white-space and
>>    the like).
>>
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Thursday, 19 December 2019 03:12
>> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> I think my issue is with the way EliminateNestedLocks works so I'm going
>> to look into that more deeply.
>>
>> Thanks for the explanations.
>>
>> David
>>
>> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>     > >    > Some further queries/concerns:
>>>     > >    >
>>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
>>>     > >    >
>>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
>>>     > >    >
>>>     > >    > !   _recursions = save      // restore the old recursion count
>>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>     > >    > increased by the deferred relock count
>>>     > >    >
>>>     > >    > what is the "deferred relock count"? I gather it relates to
>>>     > >    >
>>>     > >    > "The code was extended to be able to deoptimize objects of a
>>>     > > frame that
>>>     > >    > is not the top frame and to let another thread than the owning
>>>     > > thread do
>>>     > >    > it."
>>>     > >
>>>     > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
>>>     > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
>>>     > > locking. New with the enhancement is that we do this also just before object references are
>>>     > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we
>>>     > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized,
>>>     > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again
>>>     > > (relocking twice would be incorrect of course). Deferred updates are copied into the new
>>>     > > interpreter frames.
>>>     > >
>>>     > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to
>>>     > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking
>>>     > > is deferred until T owns the monitor again. This is what the piece of code above does.
>>>     >
>>>     >  Sorry I need some more detail here. How can you wait() on an object
>>>     >  monitor if the object allocation and/or locking was optimised away? And
>>>     >  what is a "non-local object" in this context? Isn't EA restricted to
>>>     >  thread-confined objects?
>>>
>>> "Non-local object" is an object that escapes its thread. The issue I'm addressing with the changes
>>> in ObjectMonitor::wait is almost unrelated to EA. It is caused by EliminateNestedLocks, where C2
>>> eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it
>>> is locked and you can call wait() on it.
>>>
>>> EliminateLocks is the C2 option that controls lock elimination based on EA. Both optimizations have
>>> in common that objects with eliminated locking need to be relocked when deoptimizing a frame,
>>> i.e. when replacing a compiled frame with equivalent interpreter
>>> frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can
>>> be a mix of eliminated nested locks and locks of not-escaping objects.
>>>
>>> New with the enhancement: I call relock_objects earlier, just before objects potentially
>>> escape. But then later when the owning compiled frame gets deoptimized, I must not do it again:
>>>
>>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
>>>
>>>    373   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
>>>    374       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
>>>    375     bool unused;
>>>    376     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
>>>    377   }
>>>
>>> Now when calling relock_objects early it is quite possible that I have to relock an object the
>>> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
>>> introduce relock_count_after_wait to JavaThread.
>>>
>>>     >  Is it just that some of the locking gets optimized away e.g.
>>>     >
>>>     >  synchronised(obj) {
>>>     >     synchronised(obj) {
>>>     >       synchronised(obj) {
>>>     >         obj.wait();
>>>     >       }
>>>     >     }
>>>     >  }
>>>     >
>>>     >  If this is reduced to a form as-if it were a single lock of the monitor
>>>     >  (due to EA) and the wait() triggers a JVM TI event which leads to the
>>>     >  escape of "obj" then we need to reconstruct the true lock state, and so
>>>     >  when the wait() internally unblocks and reacquires the monitor it has to
>>>     >  set the true recursion count to 3, not the 1 that it appeared to be when
>>>     >  wait() was initially called. Is that the scenario?
>>>
>>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
>>> triggered by wait.
>>>
>>> Add
>>>
>>> LocalObject l1 = new LocalObject();
>>>
>>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
>>> question.
>>>
>>> See that relocking/reallocating is transactional. If it is done, then it is done for /all/ objects in
>>> scope and it is done at most once. It wouldn't be quite so easy to split this into relocking of nested
>>> and relocking of EA-based eliminated locks.
>>>
>>>     >  If so I find this truly awful. Anyone using wait() in a realistic form
>>>     >  requires a notification and so the object cannot be thread confined. In
>>>
>>> It is not thread confined.
>>>
>>>     >  which case I would strongly argue that upon hitting the wait() the deopt
>>>     >  should occur unconditionally and so the lock state is correct before we
>>>     >  wait and so we don't need to mess with the recursion count internally
>>>     >  when we reacquire the monitor.
>>>     >
>>>     > >
>>>     > >    > which I don't like the sound of at all when it comes to ObjectMonitor
>>>     > >    > state. So I'd like to understand in detail exactly what is going on here
>>>     > >    > and why.  This is a very intrusive change that seems to badly break
>>>     > >    > encapsulation and impacts future changes to ObjectMonitor that are under
>>>     > >    > investigation.
>>>     > >
>>>     > > I would not regard this as breaking encapsulation. Certainly not badly.
>>>     > >
>>>     > > I've added a property relock_count_after_wait to JavaThread. The property is well
>>>     > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free
>>>     > > in choosing a way to do that as long as that property is taken into account. This is hardly a
>>>     > > limitation.
>>>     >
>>>     >  I do think this badly breaks encapsulation as you have to add a callout
>>>     >  from the guts of the ObjectMonitor code to reach into the thread to get
>>>     >  this lock count adjustment. I understand why you have had to do this but
>>>     >  I would much rather see a change to the EA optimisation strategy so that
>>>     >  this is not needed.
>>>     >
>>>     > > Note also that the property is a straight forward extension of the existing concept of deferred
>>>     > > local updates. It is embedded into the structure holding them. So not even the footprint of a
>>>     > > JavaThread is enlarged if no deferred updates are generated.
>>>     >
>>>     > [...]
>>>     >
>>>     > >
>>>     > > I'm actually duplicating the existing external suspend mechanism, because a thread can be
>>>     > > suspended at most once. And hey, I don't like that either! But it seems not unlikely that the
>>>     > > duplicate can be removed together with the original and the new type of handshakes that will be
>>>     > > used for thread suspend can be used for object deoptimization too. See today's discussion in
>>>     > > JDK-8227745 [2].
>>>     >
>>>     >  I hope that discussion bears some fruit, at the moment it seems not to
>>>     >  be possible to use handshakes here. :(
>>>     >
>>>     >  The external suspend mechanism is a royal pain in the proverbial that we
>>>     >  have to carefully live with. The idea that we're duplicating that for
>>>     >  use in another fringe area of functionality does not thrill me at all.
>>>     >
>>>     >  To be clear, I understand the problem that exists and that you wish to
>>>     >  solve, but for the runtime parts I balk at the complexity cost of
>>>     >  solving it.
>>>
>>> I know it's complex, but by far no rocket science.
>>>
>>> Also I find it hard to imagine another fix for JDK-8233915 besides changing the JVM TI specification.
>>>
>>> Thanks, Richard.
>>>
>>> -----Original Message-----
>>> From: David Holmes <david.holmes at oracle.com>
>>> Sent: Tuesday, 17 December 2019 08:03
>>> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>>>
>>> <resend as my mailer crashed during last send>
>>>
>>> David
>>>
>>> On 17/12/2019 4:57 pm, David Holmes wrote:
>>>> Hi Richard,
>>>>
>>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
>>>>> Hi David,
>>>>>
>>>>>      > Some further queries/concerns:
>>>>>      >
>>>>>      > src/hotspot/share/runtime/objectMonitor.cpp
>>>>>      >
>>>>>      > Can you please explain the changes to ObjectMonitor::wait:
>>>>>      >
>>>>>      > !   _recursions = save      // restore the old recursion count
>>>>>      > !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
>>>>>      >
>>>>>      > what is the "deferred relock count"? I gather it relates to
>>>>>      >
>>>>>      > "The code was extended to be able to deoptimize objects of a frame that
>>>>>      > is not the top frame and to let another thread than the owning thread do
>>>>>      > it."
>>>>>
>>>>> Yes, these relate. Currently EA based optimizations are reverted, when
>>>>> a compiled frame is replaced
>>>>> with corresponding interpreter frames. Part of this is relocking
>>>>> objects with eliminated
>>>>> locking. New with the enhancement is that we do this also just before
>>>>> object references are acquired
>>>>> through JVMTI. In this case we deoptimize also the owning compiled
>>>>> frame C and we register
>>>>> deoptimized objects as deferred updates. When control returns to C it
>>>>> gets deoptimized, we notice
>>>>> that objects are already deoptimized (reallocated and relocked), so we
>>>>> don't do it again (relocking
>>>>> twice would be incorrect of course). Deferred updates are copied into
>>>>> the new interpreter frames.
>>>>>
>>>>> Problem: relocking is not possible if the target thread T is waiting
>>>>> on the monitor that needs to be
>>>>> relocked. This happens only with non-local objects with
>>>>> EliminateNestedLocks. Instead relocking is
>>>>> deferred until T owns the monitor again. This is what the piece of
>>>>> code above does.
>>>>
>>>> Sorry I need some more detail here. How can you wait() on an object
>>>> monitor if the object allocation and/or locking was optimised away? And
>>>> what is a "non-local object" in this context? Isn't EA restricted to
>>>> thread-confined objects?
>>>>
>>>> Is it just that some of the locking gets optimized away e.g.
>>>>
>>>> synchronized(obj) {
>>>>   synchronized(obj) {
>>>>     synchronized(obj) {
>>>>       obj.wait();
>>>>     }
>>>>   }
>>>> }
>>>>
>>>> If this is reduced to a form as-if it were a single lock of the monitor
>>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
>>>> escape of "obj" then we need to reconstruct the true lock state, and so
>>>> when the wait() internally unblocks and reacquires the monitor it has to
>>>> set the true recursion count to 3, not the 1 that it appeared to be when
>>>> wait() was initially called. Is that the scenario?
>>>>
>>>> If so I find this truly awful. Anyone using wait() in a realistic form
>>>> requires a notification and so the object cannot be thread confined. In
>>>> which case I would strongly argue that upon hitting the wait() the deopt
>>>> should occur unconditionally and so the lock state is correct before we
>>>> wait and so we don't need to mess with the recursion count internally
>>>> when we reacquire the monitor.
>>>>
>>>>>
>>>>> > which I don't like the sound of at all when it comes to ObjectMonitor
>>>>> > state. So I'd like to understand in detail exactly what is going on here
>>>>> > and why. This is a very intrusive change that seems to badly break
>>>>> > encapsulation and impacts future changes to ObjectMonitor that are under
>>>>> > investigation.
>>>>>
>>>>> I would not regard this as breaking encapsulation. Certainly not badly.
>>>>>
>>>>> I've added a property relock_count_after_wait to JavaThread. The
>>>>> property is well
>>>>> encapsulated. Future ObjectMonitor implementations have to deal with
>>>>> recursion too. They are free in
>>>>> choosing a way to do that as long as that property is taken into
>>>>> account. This is hardly a
>>>>> limitation.
>>>>
>>>> I do think this badly breaks encapsulation as you have to add a callout
>>>> from the guts of the ObjectMonitor code to reach into the thread to get
>>>> this lock count adjustment. I understand why you have had to do this but
>>>> I would much rather see a change to the EA optimisation strategy so that
>>>> this is not needed.
>>>>
>>>>> Note also that the property is a straight forward extension of the
>>>>> existing concept of deferred
>>>>> local updates. It is embedded into the structure holding them. So not
>>>>> even the footprint of a
>>>>> JavaThread is enlarged if no deferred updates are generated.
>>>>>
>>>>> > ---
>>>>> >
>>>>> > src/hotspot/share/runtime/thread.cpp
>>>>> >
>>>>> > Can you please explain why JavaThread::wait_for_object_deoptimization
>>>>> > has to be handcrafted in this way rather than using proper transitions.
>>>>> >
>>>>>
>>>>> I wrote wait_for_object_deoptimization taking
>>>>> JavaThread::java_suspend_self_with_safepoint_check
>>>>> as template. So in short: for the same reasons :)
>>>>>
>>>>> Threads reach both methods as part of thread state transitions,
>>>>> therefore special handling is
>>>>> required to change thread state on top of ongoing transitions.
>>>>>
>>>>> > We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>>> > it being added back (effectively). This seems like it may be something
>>>>> > that handshakes could be used for.
>>>>>
>>>>> Deopt suspend used to be something rather different with a similar
>>>>> name[1]. It is not being added back.
>>>>
>>>> I stand corrected. Despite comments in the code to the contrary
>>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
>>>> cleanup in this area 13 years ago :)
>>>>
>>>>>
>>>>> I'm actually duplicating the existing external suspend mechanism,
>>>>> because a thread can be suspended
>>>>> at most once. And hey, I don't like that either! But it seems not
>>>>> unlikely that the duplicate can
>>>>> be removed together with the original and the new type of handshakes
>>>>> that will be used for
>>>>> thread suspend can be used for object deoptimization too. See today's
>>>>> discussion in JDK-8227745 [2].
>>>>
>>>> I hope that discussion bears some fruit, at the moment it seems not to
>>>> be possible to use handshakes here. :(
>>>>
>>>> The external suspend mechanism is a royal pain in the proverbial that we
>>>> have to carefully live with. The idea that we're duplicating that for
>>>> use in another fringe area of functionality does not thrill me at all.
>>>>
>>>> To be clear, I understand the problem that exists and that you wish to
>>>> solve, but for the runtime parts I balk at the complexity cost of
>>>> solving it.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Thanks, Richard.
>>>>>
>>>>> [1] Deopt suspend was something like an async. handshake for
>>>>>     architectures with register windows, where patching the return pc
>>>>>     for deoptimization of a compiled frame was racy if the owner thread
>>>>>     was in native code. Instead a "deopt" suspend flag was set on which
>>>>>     the thread patched its own frame upon return from native. So no
>>>>>     thread was suspended. It got its name only from the name of the flags.
>>>>>
>>>>> [2] Discussion about using handshakes to sync. with the target thread:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>> Sent: Freitag, 13. Dezember 2019 00:56
>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>> serviceability-dev at openjdk.java.net;
>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>> Performance in the Presence of JVMTI Agents
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> Some further queries/concerns:
>>>>>
>>>>> src/hotspot/share/runtime/objectMonitor.cpp
>>>>>
>>>>> Can you please explain the changes to ObjectMonitor::wait:
>>>>>
>>>>> !   _recursions = save      // restore the old recursion count
>>>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>> increased by the deferred relock count
>>>>>
>>>>> what is the "deferred relock count"? I gather it relates to
>>>>>
>>>>> "The code was extended to be able to deoptimize objects of a frame that
>>>>> is not the top frame and to let another thread than the owning thread do
>>>>> it."
>>>>>
>>>>> which I don't like the sound of at all when it comes to ObjectMonitor
>>>>> state. So I'd like to understand in detail exactly what is going on here
>>>>> and why. This is a very intrusive change that seems to badly break
>>>>> encapsulation and impacts future changes to ObjectMonitor that are under
>>>>> investigation.
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/thread.cpp
>>>>>
>>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
>>>>> has to be handcrafted in this way rather than using proper transitions.
>>>>>
>>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>>> it being added back (effectively). This seems like it may be something
>>>>> that handshakes could be used for.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>> On 12/12/2019 7:02 am, David Holmes wrote:
>>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> > Most of the details here are in areas I can comment on in detail, but I
>>>>>>> > did take an initial general look at things.
>>>>>>>
>>>>>>> Thanks for taking the time!
>>>>>>
>>>>>> Apologies the above should read:
>>>>>>
>>>>>> "Most of the details here are in areas I *can't* comment on in detail
>>>>>> ..."
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> > The only thing that jumped out at me is that I think the
>>>>>>> > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>> >
>>>>>>> > +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>
>>>>>>> Yes, it should. Will add the method like above.
>>>>>>>
>>>>>>> > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>>>> > active testing this will just bit-rot.
>>>>>>>
>>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>>>>> workload. I will add a minimal test
>>>>>>> to keep it fresh.
>>>>>>>
>>>>>>> > Also on the tests I don't understand your @requires clause:
>>>>>>> >
>>>>>>> >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>> > (vm.opt.TieredCompilation != true))
>>>>>>> >
>>>>>>> > This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>> > our normal mode of operation. ??
>>>>>>> >
>>>>>>>
>>>>>>> I removed the clause. I guess I wanted to target the tests towards the
>>>>>>> code they are supposed to
>>>>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>>>>> with just one compiler thread.
>>>>>>>
>>>>>>> Additionally I will make use of
>>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
>>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>>>> serviceability-dev at openjdk.java.net;
>>>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>>>> Performance in the Presence of JVMTI Agents
>>>>>>>
>>>>>>> Hi Richard,
>>>>>>>
>>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I would like to get reviews please for
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>>>>
>>>>>>>> Corresponding RFE:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>>>>
>>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>>>>
>>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>>>>> issues (thanks!). In addition the
>>>>>>>> change is being tested at SAP since I posted the first RFR some
>>>>>>>> months ago.
>>>>>>>>
>>>>>>>> The intention of this enhancement is to benefit performance wise from
>>>>>>>> escape analysis even if JVMTI
>>>>>>>> agents request capabilities that allow them to access local variable
>>>>>>>> values. E.g. if you start-up
>>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>>>>> escape analysis is disabled right
>>>>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>>>>> should do so. With the
>>>>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>>>>> debugger attaches. EA based
>>>>>>>> optimizations are reverted just before an agent acquires the
>>>>>>>> reference to an object. In the JBS item
>>>>>>>> you'll find more details.
>>>>>>>
>>>>>>> Most of the details here are in areas I can comment on in detail, but I
>>>>>>> did take an initial general look at things.
>>>>>>>
>>>>>>> The only thing that jumped out at me is that I think the
>>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>>
>>>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>
>>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
>>>>>>> Without
>>>>>>> active testing this will just bit-rot.
>>>>>>>
>>>>>>> Also on the tests I don't understand your @requires clause:
>>>>>>>
>>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>> (vm.opt.TieredCompilation != true))
>>>>>>>
>>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>> our normal mode of operation. ??
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>>>>
>>>>>>>>
>>>>>>>>

From kevin.walls at oracle.com  Tue Mar  3 22:44:14 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Tue, 3 Mar 2020 22:44:14 +0000
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate
 enough
In-Reply-To: <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
 <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>
 <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com>
Message-ID: <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com>


Thanks David -

Yes there are situations where hs_err fails, and few people are sadder
than me when that happens 8-) , so I was thinking about how scared to be
by the comment.

With the safety net of the error handler for the steps of the hs_err file
(which works well, we see it invoked frequently), it looks reasonable to use
%f as we might do other slightly questionable things for a signal handler.

Corrupting locale information or floating point state might possibly cause
problems, but if I cause a fake crash in print_date_and_time the error
handler recovers and the report continues.


Thinking about printing with two ints, seconds and fractions:
I don't see anything already that returns such a time in two components
in the JVM, so we might implement a new form of os::javaTimeNanos() or
similar that returns the two parts, and do that on each platform.

I didn't yet come up with anything to do in os::print_date_and_time()
which will take the fractional part of the double, and print just the
fraction as an int, without using any library / %f facilities.
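
For illustration only, here is a rough sketch of what "no library
facilities" could look like: the digits are produced by integer division
into a caller-supplied buffer, so no printf-family function is involved.
The names format_elapsed and elapsed_string are hypothetical and are not
part of HotSpot; the value is truncated (not rounded) to microseconds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not HotSpot code): writes "<seconds>.<6 digits>" into
// buf using only integer arithmetic, with no printf-family call.
// Returns the number of characters written, or 0 if buf is too small.
std::size_t format_elapsed(double t, char* buf, std::size_t len) {
  assert(t >= 0.0);
  unsigned long long secs = (unsigned long long)t;   // truncates toward zero
  unsigned int micros = (unsigned int)((t - (double)secs) * 1000000.0);
  if (micros > 999999) micros = 999999;              // guard against rounding up

  char tmp[32];                                      // digits, built in reverse
  std::size_t n = 0;
  for (int i = 0; i < 6; i++) {                      // always six fraction digits
    tmp[n++] = (char)('0' + micros % 10);
    micros /= 10;
  }
  tmp[n++] = '.';
  do {                                               // at least one integer digit
    tmp[n++] = (char)('0' + secs % 10);
    secs /= 10;
  } while (secs != 0);

  if (n + 1 > len) return 0;                         // need room for the NUL
  for (std::size_t i = 0; i < n; i++) buf[i] = tmp[n - 1 - i];
  buf[n] = '\0';
  return n;
}

// Convenience wrapper for demonstration only; it allocates, so it is not
// something a signal handler should call.
std::string elapsed_string(double t) {
  char buf[32];
  std::size_t n = format_elapsed(t, buf, sizeof(buf));
  return std::string(buf, n);
}
```

In an error-reporting path only format_elapsed would be relevant, since
it touches nothing but its arguments and a small stack buffer.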

If you're still concerned I could revisit these or some other idea.

Genuine laugh out loud moment for me, I backported the elapsed time
logging from 6u4 to 5u19 (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007).
(I said before jdk5 was created, I should have said before it was in
mercurial.)

Thanks
Kevin


On 03/03/2020 01:11, David Holmes wrote:
> Hi Kevin,
>
> On 2/03/2020 8:48 pm, Kevin Walls wrote:
>> Oops, and with the bug ID in the title and JBS link:
>> https://bugs.openjdk.java.net/browse/JDK-8240295
>>
>>
>> On 02/03/2020 10:47, Kevin Walls wrote:
>>> Hi,
>>>
>>> (s11y and runtime opinions both relevant)
>>>
>>> A few times in the last month I've really wanted to compare the 
>>> Events logged in the hs_err file, and the time of the JVM's crash.
>>>
>>> "elapsed time" in hs_err is only accurate to one second, and has 
>>> been since before jdk5 was created.
>>>
>>> The diff below changes the format string and uses the non-rounded 
>>> time value (I don't see a need to change the other integer 
>>> arithmetic here), and we can enjoy hs_errs with detail like:
>>>
>>> ...
>>> Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds 
>>> (0d 0h 0m 5s)
>>> ...
>>>
>>> Thanks
>>> Kevin
>>>
>>>
>>> /jdk/open$ hg diff
>>> diff --git a/src/hotspot/share/runtime/os.cpp 
>>> b/src/hotspot/share/runtime/os.cpp
>>> --- a/src/hotspot/share/runtime/os.cpp
>>> +++ b/src/hotspot/share/runtime/os.cpp
>>> @@ -1016,9 +1016,8 @@
>>>    }
>>>
>>>    double t = os::elapsedTime();
>>> -  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
>>> -  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
>>> -  //       before printf. We lost some precision, but who cares?
>>> +  // NOTE: a crash using printf("%f",...) on Linux was historically noted here
>>> +  //       (before the jdk5 repo was created).
>
> Just because it is old doesn't mean it no longer applies. printf is 
> not async-signal-safe - we know that but we try to use it anyway. 
> Maybe %f is even less async-signal-safe?
>
> This may get through testing okay but cause problems with real crashes 
> in the field.
>
> What about breaking the time up into two ints: seconds and nanos?
>
> Cheers,
> David
> -----
>
>>>    int eltime = (int)t;  // elapsed time in seconds
>>>
>>>    // print elapsed time in a human-readable format:
>>> @@ -1029,7 +1028,7 @@
>>>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>>    int minute_secs = elmins * secs_per_min;
>>>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs);
>>> +  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, eldays, elhours, elmins, elsecs);
>>>  }
>>>
>>>

From ramkumar.sunderbabu at oracle.com  Wed Mar  4 05:24:33 2020
From: ramkumar.sunderbabu at oracle.com (Ramkumar Sunderbabu)
Date: Tue, 3 Mar 2020 21:24:33 -0800 (PST)
Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test
 javax/management/loading/MletParserLocaleTest.java reduce default timeout
In-Reply-To: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default>
References: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default>
Message-ID: <bf77cafb-a813-4bad-9094-d8b59b5ccb0f@default>

Request to look into the change. It is very simple.

Thanks,

Ram

From: Ramkumar Sunderbabu 
Sent: Tuesday, March 3, 2020 2:22 PM
To: serviceability-dev at openjdk.java.net
Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test javax/management/loading/MletParserLocaleTest.java reduce default timeout

Hi all,

Please review this patch.

Removed "timeout=5" from the Tests so that default timeout is used.

JBS: https://bugs.openjdk.java.net/browse/JDK-8153430

Webrev: http://cr.openjdk.java.net/~rsunderbabu/8153430/webrev.00/

Testing: Locally tested with "-Xcomp" option on a linux-64 machine.

Thanks,

Ram

From chris.plummer at oracle.com  Wed Mar  4 06:10:12 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 3 Mar 2020 22:10:12 -0800
Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test
 javax/management/loading/MletParserLocaleTest.java reduce default timeout
In-Reply-To: <bf77cafb-a813-4bad-9094-d8b59b5ccb0f@default>
References: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default>
 <bf77cafb-a813-4bad-9094-d8b59b5ccb0f@default>
Message-ID: <0653e5d4-893b-ce78-f0cf-5905a659bb10@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200303/023fd9c1/attachment.htm>

From alexey.menkov at oracle.com  Thu Mar  5 00:30:41 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 4 Mar 2020 16:30:41 -0800
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
Message-ID: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>

Hi all,

please review the fix for
https://bugs.openjdk.java.net/browse/JDK-8240340
webrev:
http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/

changes:
- assertThreadState method: don't re-read the thread state when throwing the
exception (as we got weird errors like "Thread WaitingThread is at
WAITING state but is expected to be in Thread.State = WAITING");
- added proper test shutdown on error (made all threads "daemon", 
interrupt waiting thread if CheckerThread throws exception);
- if CheckerThread detects error, propagate the exception to main thread;
- fixed LockFreeLogger class - it should work for logging from several 
threads, but it doesn't. I prefer to simplify it just to keep 
ConcurrentLinkedQueue<String>.
LockFreeLogger is also used by ThreadMXBeanStateTest test, but only by a 
single thread.

--alex

From david.holmes at oracle.com  Thu Mar  5 00:57:35 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 5 Mar 2020 10:57:35 +1000
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate
 enough
In-Reply-To: <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
 <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>
 <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com>
 <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com>
Message-ID: <c37c0dfd-4c81-b05d-537f-abdac913e45f@oracle.com>

On 4/03/2020 8:44 am, Kevin Walls wrote:
> 
> Thanks David -
> 
> Yes there are situations where hs_err fails, and few people are sadder 
> than me
> when that happens 8-) , so I was thinking about how scared to be by the 
> comment.
> 
> With the safety net of the error handler for the steps of the hs_err file
> (which works well, we see it invoked frequently), it looks reasonable to 
> use
> %f as we might do other slightly questionable things for a signal handler.
> 
> Corrupting locale information or floating point state might possibly cause
> problems, but if I cause a fake crash in print_date_and_time the error
> handler recovers and the report continues.

That is good to know.

> Thinking about printing with two ints, seconds and fractions:
> I don't see anything already that returns such a time in two components 
> in the
> JVM, so we might implement a new form of os::javaTimeNanos() or similar 
> that
> returns the two parts, and do that on each platform.

I was thinking of something simple/crude ...

> I didn't yet come up with anything to do in os::print_date_and_time()
> which will take the fractional part of the double, and print just the 
> fraction as an int, without using any library / %f facilities.

... just using e.g. (untested)

double t = os::elapsedTime();
int secs =  (int) t;
int micros =  (int)((t - secs) * 1000000);
printf("%d.%d", secs, micros);

with appropriate width specifiers to get the formatting right.
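
As an illustration of the width specifier point, a small sketch follows.
It assumes a scale factor of 1000000 (microseconds per second) and uses
snprintf in place of the real st->print_cr() call; elapsed_with_micros is
a hypothetical name, not an existing HotSpot function.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the hs_err print call; illustrative only.
std::string elapsed_with_micros(double t) {
  assert(t >= 0.0);
  int secs = (int)t;                          // whole seconds, truncated
  int micros = (int)((t - secs) * 1000000);   // fraction as microseconds, truncated
  char buf[64];
  // %06d zero-pads the fraction, so 976 microseconds prints as ".000976".
  // A plain %d would print ".976", which reads as 976 milliseconds.
  std::snprintf(buf, sizeof(buf), "%d.%06d seconds", secs, micros);
  return std::string(buf);
}
```

Because the fraction is truncated rather than rounded, the last digit can
differ from what %f would print for the same value.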

Cheers,
David

> 
> If you're still concerned I could revisit these or some other idea.
> 
> Genuine laugh out loud moment for me, I backported the elapsed time 
> logging from
> 6u4 to 5u19 (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007).
> (I said before jdk5 was created, I should have said before it was in 
> mercurial.)
> 
> Thanks
> Kevin
> 
> 
> On 03/03/2020 01:11, David Holmes wrote:
>> Hi Kevin,
>>
>> On 2/03/2020 8:48 pm, Kevin Walls wrote:
>>> Oops, and with the bug ID in the title and JBS link:
>>> https://bugs.openjdk.java.net/browse/JDK-8240295
>>>
>>>
>>> On 02/03/2020 10:47, Kevin Walls wrote:
>>>> Hi,
>>>>
>>>> (s11y and runtime opinions both relevant)
>>>>
>>>> A few times in the last month I've really wanted to compare the 
>>>> Events logged in the hs_err file, and the time of the JVM's crash.
>>>>
>>>> "elapsed time" in hs_err is only accurate to one second, and has 
>>>> been since before jdk5 was created.
>>>>
>>>> The diff below changes the format string and uses the non-rounded 
>>>> time value (I don't see a need to change the other integer 
>>>> arithmetic here), and we can enjoy hs_errs with detail like:
>>>>
>>>> ...
>>>> Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds 
>>>> (0d 0h 0m 5s)
>>>> ...
>>>>
>>>> Thanks
>>>> Kevin
>>>>
>>>>
>>>> /jdk/open$ hg diff
>>>> diff --git a/src/hotspot/share/runtime/os.cpp 
>>>> b/src/hotspot/share/runtime/os.cpp
>>>> --- a/src/hotspot/share/runtime/os.cpp
>>>> +++ b/src/hotspot/share/runtime/os.cpp
>>>> @@ -1016,9 +1016,8 @@
>>>>    }
>>>>
>>>>    double t = os::elapsedTime();
>>>> -  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
>>>> -  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
>>>> -  //       before printf. We lost some precision, but who cares?
>>>> +  // NOTE: a crash using printf("%f",...) on Linux was historically noted here
>>>> +  //       (before the jdk5 repo was created).
>>
>> Just because it is old doesn't mean it no longer applies. printf is 
>> not async-signal-safe - we know that but we try to use it anyway. 
>> Maybe %f is even less async-signal-safe?
>>
>> This may get through testing okay but cause problems with real crashes 
>> in the field.
>>
>> What about breaking the time up into two ints: seconds and nanos?
>>
>> Cheers,
>> David
>> -----
>>
>>>>    int eltime = (int)t;  // elapsed time in seconds
>>>>
>>>>    // print elapsed time in a human-readable format:
>>>> @@ -1029,7 +1028,7 @@
>>>>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>>>    int minute_secs = elmins * secs_per_min;
>>>>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>>>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs);
>>>> +  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, eldays, elhours, elmins, elsecs);
>>>>  }
>>>>
>>>>

From david.holmes at oracle.com  Thu Mar  5 01:50:11 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 5 Mar 2020 11:50:11 +1000
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
Message-ID: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>

Hi Alex,

On 5/03/2020 10:30 am, Alex Menkov wrote:
> Hi all,
> 
> please review the fix for
> https://bugs.openjdk.java.net/browse/JDK-8240340
> webrev:
> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/
> 
> changes:
> - assertThreadState method: don't re-read thread state throwing 
> exception (as we got weird error like "Thread WaitingThread is at 
> WAITING state but is expected to be in Thread.State = WAITING");
> - added proper test shutdown on error (made all threads "daemon", 
> interrupt waiting thread if CheckerThread throws exception);
> - if CheckerThread detects error, propagate the exception to main thread;

The test changes seem fine.

> - fixed LockFreeLogger class - it should work for logging from several 
> threads, but it doesn't. I prefer to simplify it just to keep 
> ConcurrentLinkedQueue<String>.
> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only by a 
> single thread.

I don't understand your changes here as you've completely changed the 
intended design of the logger. The original accumulates log entries 
per-thread and then spits them all out (though I'm not clear on the 
exact ordering - I don't know how to read that stream stuff). The new code 
just creates a single queue of log records interleaving entries from 
different threads. The simple logger may be all that is needed but it 
seems quite different to the intent of the original.
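
To make the difference in output shape concrete, here is a toy,
single-threaded sketch. It is not the test's LockFreeLogger: LogRecord,
render_interleaved and render_per_thread are hypothetical names, and the
per-thread ordering shown (by thread id) is an assumption, since the
original's exact ordering is not established here.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record type: which thread logged, and what it logged.
struct LogRecord {
  int thread_id;
  std::string msg;
};

// Single-queue design: entries are printed in arrival order, so lines from
// different threads interleave.
std::string render_interleaved(const std::vector<LogRecord>& arrivals) {
  std::string out;
  for (const LogRecord& r : arrivals) {
    out += "T" + std::to_string(r.thread_id) + ": " + r.msg + "\n";
  }
  return out;
}

// Per-thread design: entries are grouped by thread, keeping each thread's
// own order. Groups are emitted by ascending thread id (an assumption).
std::string render_per_thread(const std::vector<LogRecord>& arrivals) {
  std::map<int, std::vector<std::string>> by_thread;
  for (const LogRecord& r : arrivals) {
    by_thread[r.thread_id].push_back(r.msg);
  }
  std::string out;
  for (const auto& entry : by_thread) {
    for (const std::string& msg : entry.second) {
      out += "T" + std::to_string(entry.first) + ": " + msg + "\n";
    }
  }
  return out;
}
```

With arrivals (T1, a), (T2, b), (T1, c), the first function keeps the
order a, b, c while the second prints both T1 entries before the T2 entry.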

Thanks,
David

> --alex

From kevin.walls at oracle.com  Thu Mar  5 10:00:24 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Thu, 5 Mar 2020 10:00:24 +0000
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate
 enough
In-Reply-To: <c37c0dfd-4c81-b05d-537f-abdac913e45f@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
 <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>
 <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com>
 <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com>
 <c37c0dfd-4c81-b05d-537f-abdac913e45f@oracle.com>
Message-ID: <c5d54667-5f7a-a8d6-0e47-517986fbf1a5@oracle.com>

Thanks -

I had tried some ideas in the simple fashion, and we can use %06d 
formatting.... OK maybe such formatting is not as "bad" as %f...

(glibc parses the int width specified without allocation. We provide
the output buffer, I don't think we will cause vfprintf code to alloca
or malloc.)

I can offer a second version below that uses %d only. Testing alongside
%f in the same line, it retains the same value and position, e.g.

Time: Thu Mar  5 08:57:50 2020 UTC elapsed time: f: 2.001065 int: 
2.001065 (raw int: 1065) seconds (0d 0h 0m 2s)

Output example from the hg diff below (not from the same run):

Time: Thu Mar  5 09:28:01 2020 UTC elapsed time: 2.000611 seconds (0d 0h 
0m 2s)


Thanks!
Kevin


$ hg diff
diff --git a/src/hotspot/share/runtime/os.cpp 
b/src/hotspot/share/runtime/os.cpp
--- a/src/hotspot/share/runtime/os.cpp
+++ b/src/hotspot/share/runtime/os.cpp
@@ -1016,10 +1016,9 @@
   }

   double t = os::elapsedTime();
-  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
-  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
-  //       before printf. We lost some precision, but who cares?
+  // NOTE: a crash using printf("%f",...) on Linux was historically noted here.
   int eltime = (int)t;  // elapsed time in seconds
+  int eltimeFraction = (int) ((t - eltime) * 1000000);

   // print elapsed time in a human-readable format:
   int eldays = eltime / secs_per_day;
@@ -1029,7 +1028,7 @@
   int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
   int minute_secs = elmins * secs_per_min;
   int elsecs = (eltime - day_secs - hour_secs - minute_secs);
-  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs);
+  st->print_cr(" elapsed time: %d.%06d seconds (%dd %dh %dm %ds)", eltime, eltimeFraction, eldays, elhours, elmins, elsecs);
 }


On 05/03/2020 00:57, David Holmes wrote:
> On 4/03/2020 8:44 am, Kevin Walls wrote:
>>
>> Thanks David -
>>
>> Yes there are situations where hs_err fails, and few people are 
>> sadder than me
>> when that happens 8-) , so I was thinking about how scared to be by 
>> the comment.
>>
>> With the safety net of the error handler for the steps of the hs_err 
>> file
>> (which works well, we see it invoked frequently), it looks reasonable 
>> to use
>> %f as we might do other slightly questionable things for a signal 
>> handler.
>>
>> Corrupting locale information or floating point state might possibly 
>> cause
>> problems, but if I cause a fake crash in print_date_and_time the error
>> handler recovers and the report continues.
>
> That is good to know.
>
>> Thinking about printing with two ints, seconds and fractions:
>> I don't see anything already that returns such a time in two 
>> components in the
>> JVM, so we might implement a new form of os::javaTimeNanos() or 
>> similar that
>> returns the two parts, and do that on each platform.
>
> I was thinking of something simple/crude ...
>
>> I didn't yet come up with anything to do in os::print_date_and_time()
>> which will take the fractional part of the double, and print just the 
>> fraction as an int, without using any library / %f facilities.
>
> ... just using e.g. (untested)
>
> double t = os::elapsedTime();
> int secs =  (int) t;
> int micros =  (int)((t - secs) * 1000000);
> printf("%d.%d", secs, micros);
>
> with appropriate width specifiers to get the formatting right.
>
> Cheers,
> David
>
>>
>> If you're still concerned I could revisit these or some other idea.
>>
>> Genuine laugh out loud moment for me, I backported the elapsed time 
>> logging from
>> 6u4 to 5u19 (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007).
>> (I said before jdk5 was created, I should have said before it was in 
>> mercurial.)
>>
>> Thanks
>> Kevin
>>
>>
>> On 03/03/2020 01:11, David Holmes wrote:
>>> Hi Kevin,
>>>
>>> On 2/03/2020 8:48 pm, Kevin Walls wrote:
>>>> Oops, and with the bug ID in the title and JBS link:
>>>> https://bugs.openjdk.java.net/browse/JDK-8240295
>>>>
>>>>
>>>> On 02/03/2020 10:47, Kevin Walls wrote:
>>>>> Hi,
>>>>>
>>>>> (s11y and runtime opinions both relevant)
>>>>>
>>>>> A few times in the last month I've really wanted to compare the 
>>>>> Events logged in the hs_err file, and the time of the JVM's crash.
>>>>>
>>>>> "elapsed time" in hs_err is only accurate to one second, and has 
>>>>> been since before jdk5 was created.
>>>>>
>>>>> The diff below changes the format string and uses the non-rounded 
>>>>> time value (I don't see a need to change the other integer 
>>>>> arithmetic here), and we can enjoy hs_errs with detail like:
>>>>>
>>>>> ...
>>>>> Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds 
>>>>> (0d 0h 0m 5s)
>>>>> ...
>>>>>
>>>>> Thanks
>>>>> Kevin
>>>>>
>>>>>
>>>>> /jdk/open$ hg diff
>>>>> diff --git a/src/hotspot/share/runtime/os.cpp 
>>>>> b/src/hotspot/share/runtime/os.cpp
>>>>> --- a/src/hotspot/share/runtime/os.cpp
>>>>> +++ b/src/hotspot/share/runtime/os.cpp
>>>>> @@ -1016,9 +1016,8 @@
>>>>>    }
>>>>>
>>>>>    double t = os::elapsedTime();
>>>>> -  // NOTE: It tends to crash after a SEGV if we want to 
>>>>> printf("%f",...) in
>>>>> -  //       Linux. Must be a bug in glibc ? Workaround is to round 
>>>>> "t" to int
>>>>> -  //       before printf. We lost some precision, but who cares?
>>>>> +  // NOTE: a crash using printf("%f",...) on Linux was 
>>>>> historically noted here
>>>>> +  //       (before the jdk5 repo was created).
>>>
>>> Just because it is old doesn't mean it no longer applies. printf is 
>>> not async-signal-safe - we know that but we try to use it anyway. 
>>> Maybe %f is even less async-signal-safe?
>>>
>>> This may get through testing okay but cause problems with real 
>>> crashes in the field.
>>>
>>> What about breaking the time up into two ints: seconds and nanos?
>>>
>>> Cheers,
>>> David
>>> -----
>>>
>>>>>    int eltime = (int)t;  // elapsed time in seconds
>>>>>
>>>>>    // print elapsed time in a human-readable format:
>>>>> @@ -1029,7 +1028,7 @@
>>>>>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>>>>    int minute_secs = elmins * secs_per_min;
>>>>>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>>>>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", 
>>>>> eltime, eldays, elhours, elmins, elsecs);
>>>>> +  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, 
>>>>> eldays, elhours, elmins, elsecs);
>>>>>  }
>>>>>
>>>>>

From david.holmes at oracle.com  Thu Mar  5 10:38:58 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 5 Mar 2020 20:38:58 +1000
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate
 enough
In-Reply-To: <c5d54667-5f7a-a8d6-0e47-517986fbf1a5@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
 <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>
 <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com>
 <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com>
 <c37c0dfd-4c81-b05d-537f-abdac913e45f@oracle.com>
 <c5d54667-5f7a-a8d6-0e47-517986fbf1a5@oracle.com>
Message-ID: <20045e23-c736-5289-866e-9df5a09101a8@oracle.com>

Thanks Kevin. I think this is the less risky change and achieves the goal.

David

On 5/03/2020 8:00 pm, Kevin Walls wrote:
> Thanks -
> 
> I had tried some ideas in the simple fashion, and we can use %06d 
> formatting.... OK maybe such formatting is not as "bad" as %f...
> 
> (glibc parses the int width specified without allocation.  We provide 
> the output buffer, I don't think we will cause vfprintf code to alloca 
> or malloc.)
> 
> I can offer a second version below that uses %d only.  Testing alongside 
> %f in the same line, it retains the same value and position, e.g.
> 
> Time: Thu Mar  5 08:57:50 2020 UTC elapsed time: f: 2.001065 int: 
> 2.001065 (raw int: 1065) seconds (0d 0h 0m 2s)
> 
> Output example from the hg diff below (not from the same run):
> 
> Time: Thu Mar  5 09:28:01 2020 UTC elapsed time: 2.000611 seconds (0d 0h 
> 0m 2s)
> 
> 
> Thanks!
> Kevin
> 
> 
> $ hg diff
> diff --git a/src/hotspot/share/runtime/os.cpp 
> b/src/hotspot/share/runtime/os.cpp
> --- a/src/hotspot/share/runtime/os.cpp
> +++ b/src/hotspot/share/runtime/os.cpp
> @@ -1016,10 +1016,9 @@
>     }
> 
>     double t = os::elapsedTime();
> -  // NOTE: It tends to crash after a SEGV if we want to 
> printf("%f",...) in
> -  //       Linux. Must be a bug in glibc ? Workaround is to round "t" 
> to int
> -  //       before printf. We lost some precision, but who cares?
> +  // NOTE: a crash using printf("%f",...) on Linux was historically 
> noted here.
>     int eltime = (int)t;  // elapsed time in seconds
> +  int eltimeFraction = (int) ((t - eltime) * 1000000);
> 
>     // print elapsed time in a human-readable format:
>     int eldays = eltime / secs_per_day;
> @@ -1029,7 +1028,7 @@
>     int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>     int minute_secs = elmins * secs_per_min;
>     int elsecs = (eltime - day_secs - hour_secs - minute_secs);
> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, 
> eldays, elhours, elmins, elsecs);
> +  st->print_cr(" elapsed time: %d.%06d seconds (%dd %dh %dm %ds)", 
> eltime, eltimeFraction, eldays, elhours, elmins, elsecs);
>   }
> 
> 
> 
> On 05/03/2020 00:57, David Holmes wrote:
>> On 4/03/2020 8:44 am, Kevin Walls wrote:
>>>
>>> Thanks David -
>>>
>>> Yes there are situations where hs_err fails, and few people are 
>>> sadder than me
>>> when that happens 8-) , so I was thinking about how scared to be by 
>>> the comment.
>>>
>>> With the safety net of the error handler for the steps of the hs_err 
>>> file
>>> (which works well, we see it invoked frequently), it looks reasonable 
>>> to use
>>> %f as we might do other slightly questionable things for a signal 
>>> handler.
>>>
>>> Corrupting locale information or floating point state might possibly 
>>> cause
>>> problems, but if I cause a fake crash in print_date_and_time the error
>>> handler recovers and the report continues.
>>
>> That is good to know.
>>
>>> Thinking about printing with two ints, seconds and fractions:
>>> I don't see anything already that returns such a time in two 
>>> components in the
>>> JVM, so we might implement a new form of os::javaTimeNanos() or 
>>> similar that
>>> returns the two parts, and do that on each platform.
>>
>> I was thinking of something simple/crude ...
>>
>>> I didn't yet come up with anything to do in os::print_date_and_time()
>>> which will take the fractional part of the double, and print just the 
>>> fraction as an int, without using any library / %f facilities.
>>
>> ... just using e.g. (untested)
>>
>> double t = os::elapsedTime();
>> int secs = (int) t;
>> int micros = (int)((t - secs) * 1000000);
>> printf("%d.%d", secs, micros);
>>
>> with appropriate width specifiers to get the formatting right.
>>
>> Cheers,
>> David
>>
>>>
>>> If you're still concerned I could revisit these or some other idea.
>>>
>>> Genuine laugh out loud moment for me, I backported the elapsed time 
>>> logging from
>>> 6u4 to 5u19 (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007).
>>> (I said before jdk5 was created, I should have said before it was in 
>>> mercurial.)
>>>
>>> Thanks
>>> Kevin
>>>
>>>
>>> On 03/03/2020 01:11, David Holmes wrote:
>>>> Hi Kevin,
>>>>
>>>> On 2/03/2020 8:48 pm, Kevin Walls wrote:
>>>>> Oops, and with the bug ID in the title and JBS link:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8240295
>>>>>
>>>>>
>>>>> On 02/03/2020 10:47, Kevin Walls wrote:
>>>>>> Hi,
>>>>>>
>>>>>> (s11y and runtime opinions both relevant)
>>>>>>
>>>>>> A few times in the last month I've really wanted to compare the 
>>>>>> Events logged in the hs_err file, and the time of the JVM's crash.
>>>>>>
>>>>>> "elapsed time" in hs_err is only accurate to one second, and has 
>>>>>> been since before jdk5 was created.
>>>>>>
>>>>>> The diff below changes the format string and uses the non-rounded 
>>>>>> time value (I don't see a need to change the other integer 
>>>>>> arithmetic here), and we can enjoy hs_errs with detail like:
>>>>>>
>>>>>> ...
>>>>>> Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds 
>>>>>> (0d 0h 0m 5s)
>>>>>> ...
>>>>>>
>>>>>> Thanks
>>>>>> Kevin
>>>>>>
>>>>>>
>>>>>> /jdk/open$ hg diff
>>>>>> diff --git a/src/hotspot/share/runtime/os.cpp 
>>>>>> b/src/hotspot/share/runtime/os.cpp
>>>>>> --- a/src/hotspot/share/runtime/os.cpp
>>>>>> +++ b/src/hotspot/share/runtime/os.cpp
>>>>>> @@ -1016,9 +1016,8 @@
>>>>>>    }
>>>>>>
>>>>>>    double t = os::elapsedTime();
>>>>>> -  // NOTE: It tends to crash after a SEGV if we want to 
>>>>>> printf("%f",...) in
>>>>>> -  //       Linux. Must be a bug in glibc ? Workaround is to round 
>>>>>> "t" to int
>>>>>> -  //       before printf. We lost some precision, but who cares?
>>>>>> +  // NOTE: a crash using printf("%f",...) on Linux was 
>>>>>> historically noted here
>>>>>> +  //       (before the jdk5 repo was created).
>>>>
>>>> Just because it is old doesn't mean it no longer applies. printf is 
>>>> not async-signal-safe - we know that but we try to use it anyway. 
>>>> Maybe %f is even less async-signal-safe?
>>>>
>>>> This may get through testing okay but cause problems with real 
>>>> crashes in the field.
>>>>
>>>> What about breaking the time up into two ints: seconds and nanos?
>>>>
>>>> Cheers,
>>>> David
>>>> -----
>>>>
>>>>>>    int eltime = (int)t;  // elapsed time in seconds
>>>>>>
>>>>>>    // print elapsed time in a human-readable format:
>>>>>> @@ -1029,7 +1028,7 @@
>>>>>>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>>>>>    int minute_secs = elmins * secs_per_min;
>>>>>>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>>>>>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", 
>>>>>> eltime, eldays, elhours, elmins, elsecs);
>>>>>> +  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, 
>>>>>> eldays, elhours, elmins, elsecs);
>>>>>>  }
>>>>>>
>>>>>>

From daniel.fuchs at oracle.com  Thu Mar  5 14:27:36 2020
From: daniel.fuchs at oracle.com (Daniel Fuchs)
Date: Thu, 5 Mar 2020 14:27:36 +0000
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
Message-ID: <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>

Hi Alexander,

Fixes to JMX and the management agent are reviewed on
serviceability-dev (added in To:) these days.

best regards,

-- daniel

On 05/03/2020 13:17, Alexander Scherbatiy wrote:
> Hello,
> 
> Could you review a small enhancement where the test CustomLauncherTest 
> is updated to build the binary launcher file from the launcher.c file.
> The file launcher.c is renamed to exelauncher.c to follow the naming 
> convention for executable test files built by the JDK make system.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604
> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00
> 
> The changes for obsolete binary files from 
> sun/management/jmxremote/bootstrap/linux-* and solaris-* are not 
> included in the webrev. They need to be removed manually.
> 
> The test passes on Ubuntu 18.04 x86-64, Solaris SPARC v9 11.2, and 
> Solaris x64 11.4 systems.
> 
> The test is excluded from Windows and Mac OS X systems.
> 
> Thanks,
> Alexander.


From alexey.menkov at oracle.com  Thu Mar  5 18:54:20 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 5 Mar 2020 10:54:20 -0800
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
 <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>
Message-ID: <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>

Hi David,

Thank you for the review.

On 03/04/2020 17:50, David Holmes wrote:
> Hi Alex,
> 
> On 5/03/2020 10:30 am, Alex Menkov wrote:
>> Hi all,
>>
>> please review the fix for
>> https://bugs.openjdk.java.net/browse/JDK-8240340
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/
>>
>> changes:
>> - assertThreadState method: don't re-read the thread state when throwing 
>> the exception (as we got a weird error like "Thread WaitingThread is at 
>> WAITING state but is expected to be in Thread.State = WAITING");
>> - added proper test shutdown on error (made all threads "daemon", 
>> interrupt waiting thread if CheckerThread throws exception);
>> - if CheckerThread detects error, propagate the exception to main thread;
> 
> The test changes seem fine.
> 
>> - fixed LockFreeLogger class - it should work for logging from several 
>> threads, but it doesn't. I prefer to simplify it just to keep 
>> ConcurrentLinkedQueue<String>.
>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only by 
>> a single thread.
> 
> I don't understand your changes here as you've completely changed the 
> intended design of the logger. The original accumulates log entries 
> per-thread and then spits them all out (though I'm not clear on the 
> exact ordering - I don't know how to read that stream stuff). The new code 
> just creates a single queue of log records interleaving entries from 
> different threads. The simple logger may be all that is needed but it 
> seems quite different to the intent of the original.

While testing my changes to the test I discovered that something is wrong 
with the logger: it printed only some of the records. So I had to look 
at the LockFreeLogger class, and I don't understand how it was supposed 
to work.
About ordering in the cumulative log: each record has an Integer which is 
used to sort log entries from all threads (i.e. records from different 
threads are printed in the order in which log() was called).
Looking at the allRecords/records stuff, I don't understand how it should 
be used. To get logs from different threads in one logger, we need one 
instance. So we create a LockFreeLogger (in the main thread), and the ctor 
creates a ThreadLocal records map and registers it in allRecords. Logging 
from the main thread works fine, but if any other thread tries to log, its 
first log() call creates its own ThreadLocal records map (via records.get()) 
and log records from this thread go there. But this ThreadLocal records map 
is not registered in allRecords, so this logging won't be included in the 
final log.
Looks like we need to change log() to something like

Map<Integer, String> recs = records.get();
if (recs.isEmpty()) {
     allRecords.add(recs);
}
recs.put(id, String.format(format, params));

But all this stuff does exactly the same as a simple ConcurrentLinkedQueue 
(i.e. a lock-free ordered list).
At least I don't see any other rationale for it.
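
To make the comparison concrete, the simplified logger amounts to roughly the following. This is only a sketch of the idea: the class name SimpleQueueLogger and its size() helper are made up here, and the code is not copied from the webrev.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of a queue-backed logger; not the webrev code.
class SimpleQueueLogger {
    // ConcurrentLinkedQueue is a lock-free FIFO queue, so entries from all
    // threads end up in one list, in the order in which they were added.
    private final Queue<String> records = new ConcurrentLinkedQueue<>();

    // Format the message and append it; safe to call from any thread.
    void log(String format, Object... params) {
        records.add(String.format(format, params));
    }

    // Number of records logged so far.
    int size() {
        return records.size();
    }

    // Concatenate all records in insertion order.
    @Override
    public String toString() {
        return String.join("", records);
    }
}
```

There is no per-thread state to register, so a record logged from any thread cannot be lost the way it is with the unregistered ThreadLocal maps described above.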

--alex

> 
> Thanks,
> David
> 
>> --alex

From kevin.walls at oracle.com  Thu Mar  5 21:01:53 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Thu, 5 Mar 2020 21:01:53 +0000
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate
 enough
In-Reply-To: <20045e23-c736-5289-866e-9df5a09101a8@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
 <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>
 <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com>
 <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com>
 <c37c0dfd-4c81-b05d-537f-abdac913e45f@oracle.com>
 <c5d54667-5f7a-a8d6-0e47-517986fbf1a5@oracle.com>
 <20045e23-c736-5289-866e-9df5a09101a8@oracle.com>
Message-ID: <f17a6065-3ace-5767-5d82-b8b7ac35e570@oracle.com>

Great, thanks David.

On 05/03/2020 10:38, David Holmes wrote:
> Thanks Kevin. I think this is the less risky change and achieves the 
> goal.
>
> David
>
> On 5/03/2020 8:00 pm, Kevin Walls wrote:
>> Thanks -
>>
>> I had tried some ideas in the simple fashion, and we can use %06d 
>> formatting.... OK maybe such formatting is not as "bad" as %f...
>>
>> (glibc parses the int width specified without allocation.  We provide 
>> the output buffer, I don't think we will cause vfprintf code to 
>> alloca or malloc.)
>>
>> I can offer a second version below that uses %d only.  Testing 
>> alongside %f in the same line, it retains the same value and 
>> position, e.g.
>>
>> Time: Thu Mar  5 08:57:50 2020 UTC elapsed time: f: 2.001065 int: 
>> 2.001065 (raw int: 1065) seconds (0d 0h 0m 2s)
>>
>> Output example from the hg diff below (not from the same run):
>>
>> Time: Thu Mar  5 09:28:01 2020 UTC elapsed time: 2.000611 seconds (0d 
>> 0h 0m 2s)
>>
>>
>> Thanks!
>> Kevin
>>
>>
>> $ hg diff
>> diff --git a/src/hotspot/share/runtime/os.cpp 
>> b/src/hotspot/share/runtime/os.cpp
>> --- a/src/hotspot/share/runtime/os.cpp
>> +++ b/src/hotspot/share/runtime/os.cpp
>> @@ -1016,10 +1016,9 @@
>>     }
>>
>>     double t = os::elapsedTime();
>> -  // NOTE: It tends to crash after a SEGV if we want to 
>> printf("%f",...) in
>> -  //       Linux. Must be a bug in glibc ? Workaround is to round 
>> "t" to int
>> -  //       before printf. We lost some precision, but who cares?
>> +  // NOTE: a crash using printf("%f",...) on Linux was historically 
>> noted here.
>>     int eltime = (int)t;  // elapsed time in seconds
>> +  int eltimeFraction = (int) ((t - eltime) * 1000000);
>>
>>     // print elapsed time in a human-readable format:
>>     int eldays = eltime / secs_per_day;
>> @@ -1029,7 +1028,7 @@
>>     int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>     int minute_secs = elmins * secs_per_min;
>>     int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", 
>> eltime, eldays, elhours, elmins, elsecs);
>> +  st->print_cr(" elapsed time: %d.%06d seconds (%dd %dh %dm %ds)", 
>> eltime, eltimeFraction, eldays, elhours, elmins, elsecs);
>>   }
>>
>>
>>
>> On 05/03/2020 00:57, David Holmes wrote:
>>> On 4/03/2020 8:44 am, Kevin Walls wrote:
>>>>
>>>> Thanks David -
>>>>
>>>> Yes there are situations where hs_err fails, and few people are 
>>>> sadder than me
>>>> when that happens 8-) , so I was thinking about how scared to be by 
>>>> the comment.
>>>>
>>>> With the safety net of the error handler for the steps of the 
>>>> hs_err file
>>>> (which works well, we see it invoked frequently), it looks 
>>>> reasonable to use
>>>> %f as we might do other slightly questionable things for a signal 
>>>> handler.
>>>>
>>>> Corrupting locale information or floating point state might 
>>>> possibly cause
>>>> problems, but if I cause a fake crash in print_date_and_time the error
>>>> handler recovers and the report continues.
>>>
>>> That is good to know.
>>>
>>>> Thinking about printing with two ints, seconds and fractions:
>>>> I don't see anything already that returns such a time in two 
>>>> components in the
>>>> JVM, so we might implement a new form of os::javaTimeNanos() or 
>>>> similar that
>>>> returns the two parts, and do that on each platform.
>>>
>>> I was thinking of something simple/crude ...
>>>
>>>> I didn't yet come up with anything to do in os::print_date_and_time()
>>>> which will take the fractional part of the double, and print just 
>>>> the fraction as an int, without using any library / %f facilities.
>>>
>>> ... just using e.g. (untested)
>>>
>>> double t = os::elapsedTime();
>>> int secs = (int) t;
>>> int micros = (int)((t - secs) * 1000000);
>>> printf("%d.%d", secs, micros);
>>>
>>> with appropriate width specifiers to get the formatting right.
>>>
>>> Cheers,
>>> David
>>>
>>>>
>>>> If you're still concerned I could revisit these or some other idea.
>>>>
>>>> Genuine laugh out loud moment for me, I backported the elapsed time 
>>>> logging from
>>>> 6u4 to 5u19 (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007).
>>>> (I said before jdk5 was created, I should have said before it was 
>>>> in mercurial.)
>>>>
>>>> Thanks
>>>> Kevin
>>>>
>>>>
>>>> On 03/03/2020 01:11, David Holmes wrote:
>>>>> Hi Kevin,
>>>>>
>>>>> On 2/03/2020 8:48 pm, Kevin Walls wrote:
>>>>>> Oops, and with the bug ID in the title and JBS link:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240295
>>>>>>
>>>>>>
>>>>>> On 02/03/2020 10:47, Kevin Walls wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> (s11y and runtime opinions both relevant)
>>>>>>>
>>>>>>> A few times in the last month I've really wanted to compare the 
>>>>>>> Events logged in the hs_err file, and the time of the JVM's crash.
>>>>>>>
>>>>>>> "elapsed time" in hs_err is only accurate to one second, and has 
>>>>>>> been since before jdk5 was created.
>>>>>>>
>>>>>>> The diff below changes the format string and uses the 
>>>>>>> non-rounded time value (I don't see a need to change the other 
>>>>>>> integer arithmetic here), and we can enjoy hs_errs with detail 
>>>>>>> like:
>>>>>>>
>>>>>>> ...
>>>>>>> Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 
>>>>>>> seconds (0d 0h 0m 5s)
>>>>>>> ...
>>>>>>>
>>>>>>> Thanks
>>>>>>> Kevin
>>>>>>>
>>>>>>>
>>>>>>> /jdk/open$ hg diff
>>>>>>> diff --git a/src/hotspot/share/runtime/os.cpp 
>>>>>>> b/src/hotspot/share/runtime/os.cpp
>>>>>>> --- a/src/hotspot/share/runtime/os.cpp
>>>>>>> +++ b/src/hotspot/share/runtime/os.cpp
>>>>>>> @@ -1016,9 +1016,8 @@
>>>>>>>    }
>>>>>>>
>>>>>>>    double t = os::elapsedTime();
>>>>>>> -  // NOTE: It tends to crash after a SEGV if we want to 
>>>>>>> printf("%f",...) in
>>>>>>> -  //       Linux. Must be a bug in glibc ? Workaround is to 
>>>>>>> round "t" to int
>>>>>>> -  //       before printf. We lost some precision, but who cares?
>>>>>>> +  // NOTE: a crash using printf("%f",...) on Linux was 
>>>>>>> historically noted here
>>>>>>> +  //       (before the jdk5 repo was created).
>>>>>
>>>>> Just because it is old doesn't mean it no longer applies. printf 
>>>>> is not async-signal-safe - we know that but we try to use it 
>>>>> anyway. Maybe %f is even less async-signal-safe?
>>>>>
>>>>> This may get through testing okay but cause problems with real 
>>>>> crashes in the field.
>>>>>
>>>>> What about breaking the time up into two ints: seconds and nanos?
>>>>>
>>>>> Cheers,
>>>>> David
>>>>> -----
>>>>>
>>>>>>>    int eltime = (int)t;  // elapsed time in seconds
>>>>>>>
>>>>>>>    // print elapsed time in a human-readable format:
>>>>>>> @@ -1029,7 +1028,7 @@
>>>>>>>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>>>>>>    int minute_secs = elmins * secs_per_min;
>>>>>>>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>>>>>>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", 
>>>>>>> eltime, eldays, elhours, elmins, elsecs);
>>>>>>> +  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", 
>>>>>>> t, eldays, elhours, elmins, elsecs);
>>>>>>>  }
>>>>>>>
>>>>>>>

From daniil.x.titov at oracle.com  Fri Mar  6 01:15:12 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 05 Mar 2020 17:15:12 -0800
Subject: 8196751: Add jhsdb option to specify debug server RMI connector
 port
In-Reply-To: <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
Message-ID: <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>

Hi Yasumasa, Serguei and Alex,

Please review a new version of the fix [1] that addresses your comments. In addition to the RMI connector
port option, the new version introduces two more options to specify the RMI registry port and the RMI connector host name.
Currently, these last two settings can be specified using system properties, but system properties have the following
disadvantages compared to command line options:
   - It's hard to know about them: they are not listed in the tool's help.
   - They have long names that are hard to remember.
   - It is easy to mistype them on the command line, and you will not get any warning about it.

The CSR [2] was also updated and needs to be reviewed.

Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
[2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
[3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751

Thank you,
Daniil

On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:

    Hi Daniil,
    
       - SALauncher::buildAttachArgs not only builds the arguments but also checks their consistency.
         Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
    
       - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
         But you can use the same port number as the RMI registry (1099).
         It is the same as the relation between jmxremote.port and jmxremote.rmi.port.
    
    
    Thanks,
    
    Yasumasa
    
    
    On 2020/02/24 13:21, Daniil Titov wrote:
    > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
    > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
    > 
    > New CSR [3] was created for this change and it needs to be reviewed as well.
    > 
    > Man pages for jhsdb will be updated in a separate issue.
    > 
    > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
    > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
    > 
    >         // delegate to the actual SA debug server.
    >         DebugServer.main(newArgArray.toArray(new String[0]));
    > 
    > However,  sun.jvm.hotspot.DebugServer  doesn't support named options and that prevents from efficiently adding new options to the tool.
    > I found it more suitable to start Hotspot agent directly in  SALauncher rather than  adding a new option in  both sun.jvm.hotspot.SALauncher
    >   and sun.jvm.hotspot.DebugServer and  delegating the call.  With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated
    > but I would prefer to address it in a separate issue.
    > 
    > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >                  container  and connecting  to it with the GUI debugger.
    >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    > 
    > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
    > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    > 
    > Thank you,
    > Daniil
    > 
    > 
    

From suenaga at oss.nttdata.com  Fri Mar  6 08:30:05 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 6 Mar 2020 17:30:05 +0900
Subject: 8196751: Add jhsdb option to specify debug server RMI connector
 port
In-Reply-To: <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
Message-ID: <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>

Hi Daniil,


- SALauncher.java
     - Is checkBasicOptions() needed? I think you can remove this method and inline it in the caller.
     - I think registryPort should be checked with Integer.parseInt() like the others (rmiPort and pid) rather than with a regex.
     - The shutdown hook is a very good idea. You can implement it more simply if you use a lambda expression.

- SADebugDTest.java
     - Please add the bug ID to @bug.
     - Why do you declare portInUse and testResult as arrays? Their length is 1, so I think you don't need to use arrays.
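
The two SALauncher suggestions could look roughly like this. It is a minimal sketch: LauncherSketch, parsePort and onShutdown are hypothetical names, not code that exists in SALauncher, and the 0-65535 range check is an assumption about what counts as a valid port here.

```java
// Hypothetical helpers illustrating the review comments; not SALauncher code.
class LauncherSketch {
    // Validate a port option with Integer.parseInt rather than a regex.
    static int parsePort(String value) {
        int port;
        try {
            port = Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid port: " + value, e);
        }
        if (port < 0 || port > 65535) {  // assumed valid range
            throw new IllegalArgumentException("Port out of range: " + value);
        }
        return port;
    }

    // Register cleanup code as a shutdown hook using a lambda.
    // Returns the hook thread so the caller can remove it again if needed.
    static Thread onShutdown(Runnable cleanup) {
        Thread hook = new Thread(() -> cleanup.run());
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

A caller would then write something like LauncherSketch.onShutdown(() -> agent.shutdownServer()), where agent stands for whatever object needs cleaning up.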


Thanks,

Yasumasa


On 2020/03/06 10:15, Daniil Titov wrote:
> Hi Yasumasa, Serguei and Alex,
> 
> Please review a new version of the fix [1] that addresses your comments. In addition to the RMI connector
> port option, the new version introduces two more options to specify the RMI registry port and the RMI connector host name.
> Currently, these last two settings can be specified using system properties, but system properties have the following
> disadvantages compared to command line options:
>     - It's hard to know about them: they are not listed in the tool's help.
>     - They have long names that are hard to remember.
>     - It is easy to mistype them on the command line, and you will not get any warning about it.
> 
> The CSR [2] was also updated and needs to be reviewed.
> 
> Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
> container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> 
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
> [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
> [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
> 
> Thank you,
> Daniil
> 
> On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
> 
>      Hi Daniil,
>      
>         - SALauncher::buildAttachArgs not only builds the arguments but also checks their consistency.
>           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
>      
>         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
>           But you can use the same port number as the RMI registry (1099).
>           It is the same as the relation between jmxremote.port and jmxremote.rmi.port.
>      
>      
>      Thanks,
>      
>      Yasumasa
>      
>      
>      On 2020/02/24 13:21, Daniil Titov wrote:
>      > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
>      > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
>      >
>      > New CSR [3] was created for this change and it needs to be reviewed as well.
>      >
>      > Man pages for jhsdb will be updated in a separate issue.
>      >
>      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
>      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
>      >
>      >         // delegate to the actual SA debug server.
>      >         DebugServer.main(newArgArray.toArray(new String[0]));
>      >
>      > However,  sun.jvm.hotspot.DebugServer  doesn't support named options and that prevents from efficiently adding new options to the tool.
>      > I found it more suitable to start Hotspot agent directly in  SALauncher rather than  adding a new option in  both sun.jvm.hotspot.SALauncher
>      >   and sun.jvm.hotspot.DebugServer and  delegating the call.  With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated
>      > but I would prefer to address it in a separate issue.
>      >
>      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >                  container  and connecting  to it with the GUI debugger.
>      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >
>      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
>      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >
>      > Thank you,
>      > Daniil
>      >
>      >
>      
> 
> 

From chiroito107 at gmail.com  Fri Mar  6 15:24:08 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Sat, 7 Mar 2020 00:24:08 +0900
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable
 paths on Windows
In-Reply-To: <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com>
 <CAE_05uzqHJ5VUKvYgwDyaUz=gvMjDd_Xk_+SE94_Z6PRqv6yDw@mail.gmail.com>
 <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com>
 <CAE_05uzOq3t7hS-rxw5Q+dMjh4E0pZFxkXC23mRfVZVBH-aUjw@mail.gmail.com>
 <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
Message-ID: <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>

Hi Serguei,

Could you review this again, please?

Regards,
Chihiro


On 2020/02/27 22:11, Chihiro Ito <chiroito107 at gmail.com> wrote:
>
> Hi Ralf,
>
> Thank you for your advice.
>
> 1.
> The comment of serializePropertiesToByteArray in VMSupport is "The stream written to the byte array is ISO 8859-1 encoded.".
> But the previous implementation does not keep this. I think we need to implement encode by ISO 8859-1.
>
> 2.
> According to help, the feature of VM.system_properties is just "Print system properties". The users should not use this output for loading. The users use it when they want to see System Properties soon.
>
> Regards,
> Chihiro
>
>
> On 2020/02/26 18:53, Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
>>
>> Hi Chihiro,
>>
>> I have two remarks:
>>
>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create ISO Latin 1 String, it really only will create printable ASCII characters  (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then send to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters. But a ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test.
>>
>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as:
>> C\:\\test\\new
>> And now it is:
>> C:\test\new
>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters.
>>
>> Best regards,
>> Ralf
>>
>>
>> From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net> On Behalf Of Chihiro Ito
>> Sent: Dienstag, 25. Februar 2020 04:45
>> To: serguei.spitsyn at oracle.com
>> Cc: serviceability-dev at openjdk.java.net
>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
>>
>> Hi Serguei,
>>
>> Thanks for your review and advice.
>>
>> I modified these.
>> Could you review this again, please?
>>
>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
>>
>> Regards,
>> Chihiro
>>
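
Ralf's two remarks in the quoted thread can be reproduced with a small, self-contained sketch. It is not taken from the webrev; the class and method names (PropertiesEscapeDemo, store, load) are hypothetical and only exercise the public java.util.Properties API. It shows that Properties.store() escapes the backslashes and the colon, that the escaped form loads back unchanged, that an unescaped Windows path is silently altered by Properties.load(), and that a non-ASCII Latin-1 character is written as an ASCII escape sequence.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class; not part of the JDK or of the change under review.
class PropertiesEscapeDemo {

    /** Serializes one property with Properties.store() and returns the resulting text. */
    static String store(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // store(OutputStream) is documented to write ISO 8859-1.
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Parses properties-file text with Properties.load() and returns one value. */
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String path = "c:\\test\\new";           // the characters c:\test\new
        String stored = store("path", path);
        System.out.print(stored);                 // date comment line, then the escaped entry

        // The escaped form survives a round trip.
        System.out.println(path.equals(load(stored, "path")));

        // The unescaped form does not: load() reads backslash-t as a tab
        // and backslash-n as a newline.
        String unescaped = "path=c:\\test\\new\n";
        System.out.println("c:\test\new".equals(load(unescaped, "path")));

        // A non-ASCII Latin-1 character is written as an ASCII escape sequence.
        System.out.print(store("name", "\u00DC"));
    }
}
```

Both comparisons print true, which matches the trade-off discussed above: the escaped output is loadable but hard to read, while the readable output cannot be fed back to Properties.load().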

From serguei.spitsyn at oracle.com  Fri Mar  6 18:32:30 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Mar 2020 10:32:30 -0800
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com>
 <CAE_05uzqHJ5VUKvYgwDyaUz=gvMjDd_Xk_+SE94_Z6PRqv6yDw@mail.gmail.com>
 <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com>
 <CAE_05uzOq3t7hS-rxw5Q+dMjh4E0pZFxkXC23mRfVZVBH-aUjw@mail.gmail.com>
 <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
Message-ID: <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200306/ab8c7abf/attachment.htm>

From daniil.x.titov at oracle.com  Fri Mar  6 18:38:09 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 06 Mar 2020 10:38:09 -0800
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
Message-ID: <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>

Hi Yasumasa,

 -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side effect (it throws an exception if the parameters are incorrect) and ignores its return value might look confusing. Thus, I found it more appropriate to wrap it inside a method with a more relevant name.

 > SADebugDTest
 >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
We cannot use primitives there since these local variables are captured in lambda expression and are required to be final.
The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array.
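
To illustrate the point about captured locals, here is a minimal sketch. It is not the code from SADebugDTest; the class and method names are hypothetical. A local variable captured by a lambda must be final or effectively final, so the lambda cannot assign to a plain boolean local. A one-element array or an AtomicBoolean keeps the reference final while leaving the content mutable.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; the variable name portInUse is borrowed from the discussion.
class LambdaCaptureDemo {

    /** One-element array as a mutable holder: the reference is final, the element is not. */
    static boolean setViaArray() {
        final boolean[] portInUse = {false};
        Runnable onPortConflict = () -> portInUse[0] = true;
        onPortConflict.run();
        return portInUse[0];
    }

    /** The same idea with a wrapper type from java.util.concurrent.atomic. */
    static boolean setViaAtomic() {
        AtomicBoolean portInUse = new AtomicBoolean(false);
        Runnable onPortConflict = () -> portInUse.set(true);
        onPortConflict.run();
        return portInUse.get();
    }

    public static void main(String[] args) {
        // boolean flag = false;
        // Runnable r = () -> flag = true;   // does not compile: flag is not effectively final
        System.out.println(setViaArray());
        System.out.println(setViaAtomic());
    }
}
```

If the lambda ran on another thread without further synchronization, the AtomicBoolean variant would also give a visibility guarantee that the plain array does not.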

I will include your other suggestion in the new version of the webrev.
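
One of the other suggestions quoted below is to validate registryPort with Integer.parseInt(), like rmiPort and pid, rather than with a regex. A minimal sketch of that approach follows. The class name, the method name, the use of IllegalArgumentException and the 0..65535 range check are all assumptions made for illustration; the actual SALauncher code may report errors differently.

```java
// Hypothetical helper; not the SALauncher implementation.
class PortOptionDemo {

    /** Parses a port option value, rejecting non-numeric and out-of-range input. */
    static int parsePort(String optionName, String value) {
        int port;
        try {
            port = Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "Invalid value for " + optionName + ": " + value, e);
        }
        // Assumed range check: TCP port numbers fit in 16 bits.
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException(
                    "Port out of range for " + optionName + ": " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        System.out.println(parsePort("registryPort", "1099"));
        try {
            parsePort("registryPort", "10x9");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```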

Thanks!
Daniil

On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:

    Hi Daniil,
    
    
    - SALauncher.java
         - checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
         - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex.
         - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
    
    - SADebugDTest.java
         - Please add bug ID to @bug.
         - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
    
    
    Thanks,
    
    Yasumasa
    
    
    On 2020/03/06 10:15, Daniil Titov wrote:
    > Hi Yasumasa, Serguei and Alex,
    > 
    > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
    > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
    > last two settings could be specified using the system properties but the system properties have the following disadvantages
    > comparing to the command line options:
    >     -  It's hard to know about them: they are not listed in the tool's help.
    >     -  They have long names that are hard to remember.
    >     -  It is easy to mistype them on the command line, and you will not get any warning about it.
    > 
    > The CSR [2] was also updated and needs to be reviewed.
    > 
    > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    > 
    > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
    > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    > 
    > Thank you,
    > Daniil
    > 
    > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    > 
    >      Hi Daniil,
    >      
    >         - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments.
    >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply.
    >      
    >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
    >           But you can use same port number as RMI registry (1099).
    >           It is same as relation between jmxremote.port and jmxremote.rmi.port.
    >      
    >      
    >      Thanks,
    >      
    >      Yasumasa
    >      
    >      
    >      On 2020/02/24 13:21, Daniil Titov wrote:
    >      > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
    >      > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
    >      >
    >      > New CSR [3] was created for this change and it needs to be reviewed as well.
    >      >
    >      > Man pages for jhsdb will be updated in a separate issue.
    >      >
    >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
    >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
    >      >
    >      >                // delegate to the actual SA debug server.
    >      >   367         DebugServer.main(newArgArray.toArray(new String[0]));
    >      >
    >      > However,  sun.jvm.hotspot.DebugServer  doesn't support named options and that prevents from efficiently adding new options to the tool.
    >      > I found it more suitable to start Hotspot agent directly in  SALauncher rather than  adding a new option in  both sun.jvm.hotspot.SALauncher
    >      >   and sun.jvm.hotspot.DebugServer and  delegating the call.  With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated
    >      > but I would prefer to address it in a separate issue.
    >      >
    >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >      >                  container  and connecting  to it with the GUI debugger.
    >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >
    >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
    >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >
    >      > Thank you,
    >      > Daniil
    >      >
    >      >
    >      
    > 
    > 
    

From suenaga at oss.nttdata.com  Sat Mar  7 02:15:03 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 7 Mar 2020 11:15:03 +0900
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
Message-ID: <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>

Hi Daniil,

On 2020/03/07 3:38, Daniil Titov wrote:
> Hi Yasumasa,
> 
>   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
> I think that having a piece of code that invokes  a method  named "buildAttachArgs" with a copy of the argument map  just for its side-effect ( it throws an exception if parameters are incorrect)  and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name .

OK, but I would prefer that you leave a comment explaining it.


>   > SADebugDTest
>   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
> We cannot use primitives there since these local variables are captured in lambda expression and are required to be final.
> The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array.

Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
If you drop this error check, the test code becomes simpler.


> I will include your other suggestion in the new version of the webrev.

Sorry, I have one more comment:

>           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.

Shutdown hook is already registered in c'tor of HotSpotAgent.
It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
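
For reference, the lambda form of a shutdown hook mentioned in the earlier comment could look like the sketch below. The class name, the helper method and the thread name are hypothetical, and the cleanup action is a placeholder. As noted above, a second hook in SALauncher would be redundant if HotSpotAgent already registers one.

```java
// Hypothetical demo class; only java.lang.Runtime and java.lang.Thread are real APIs here.
class ShutdownHookDemo {

    /** Registers the cleanup action as a JVM shutdown hook and returns the hook thread. */
    static Thread registerShutdownHook(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "demo-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // The lambda replaces an anonymous Thread subclass or Runnable class.
        registerShutdownHook(() -> System.out.println("shutting down the debug server"));
        System.out.println("main finished");
    }
}
```

Runtime.removeShutdownHook() accepts the returned thread if the hook has to be withdrawn before the JVM exits.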


Thanks,

Yasumasa


> Thanks!
> Daniil
> 
> On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
> 
>      Hi Daniil,
>      
>      
>      - SALauncher.java
>           - checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
>           - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex.
>           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
>      
>      - SADebugDTest.java
>           - Please add bug ID to @bug.
>           - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
>      
>      
>      Thanks,
>      
>      Yasumasa
>      
>      
>      On 2020/03/06 10:15, Daniil Titov wrote:
>      > Hi Yasumasa, Serguei and Alex,
>      >
>      > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
>      > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
>      > last two settings could be specified using the system properties but the system properties have the following disadvantages
>      > comparing to the command line options:
>      >     -  It's hard to know about them: they are not listed in the tool's help.
>      >     -  They have long names that are hard to remember.
>      >     -  It is easy to mistype them on the command line, and you will not get any warning about it.
>      >
>      > The CSR [2] was also updated and needs to be reviewed.
>      >
>      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >
>      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
>      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >
>      > Thank you,
>      > Daniil
>      >
>      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >
>      >      Hi Daniil,
>      >
>      >         - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments.
>      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply.
>      >
>      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
>      >           But you can use same port number as RMI registry (1099).
>      >           It is same as relation between jmxremote.port and jmxremote.rmi.port.
>      >
>      >
>      >      Thanks,
>      >
>      >      Yasumasa
>      >
>      >
>      >      On 2020/02/24 13:21, Daniil Titov wrote:
>      >      > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
>      >      > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
>      >      >
>      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
>      >      >
>      >      > Man pages for jhsdb will be updated in a separate issue.
>      >      >
>      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
>      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
>      >      >
>      >      >                // delegate to the actual SA debug server.
>      >      >   367         DebugServer.main(newArgArray.toArray(new String[0]));
>      >      >
>      >      > However,  sun.jvm.hotspot.DebugServer  doesn't support named options and that prevents from efficiently adding new options to the tool.
>      >      > I found it more suitable to start Hotspot agent directly in  SALauncher rather than  adding a new option in  both sun.jvm.hotspot.SALauncher
>      >      >   and sun.jvm.hotspot.DebugServer and  delegating the call.  With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated
>      >      > but I would prefer to address it in a separate issue.
>      >      >
>      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      >                  container  and connecting  to it with the GUI debugger.
>      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >
>      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
>      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      >
>      >      > Thank you,
>      >      > Daniil
>      >      >
>      >      >
>      >
>      >
>      >
>      
> 
> 

From chris.plummer at oracle.com  Sat Mar  7 04:42:53 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 6 Mar 2020 20:42:53 -0800
Subject: RFR(XS): 8240691: serviceability/sa/ClhsdbCDSJstackPrintAll.java
 and serviceability/sa/ClhsdbCDSCore.java should be excluded with ZGC
In-Reply-To: <8c057888-5a75-c9c7-cb55-9352d07b3013@oracle.com>
References: <86878457-3386-ea0e-f23e-7ad1bdff64a6@oracle.com>
 <038c6af5-f7e4-cf93-9c22-729f1db03e05@oracle.com>
 <8c057888-5a75-c9c7-cb55-9352d07b3013@oracle.com>
Message-ID: <e8c67d46-4071-f2b3-6d3a-a1503a3bd41c@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200306/74bc04a0/attachment.htm>

From suenaga at oss.nttdata.com  Sat Mar  7 07:03:59 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 7 Mar 2020 16:03:59 +0900
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com>
 <CAE_05uzqHJ5VUKvYgwDyaUz=gvMjDd_Xk_+SE94_Z6PRqv6yDw@mail.gmail.com>
 <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com>
 <CAE_05uzOq3t7hS-rxw5Q+dMjh4E0pZFxkXC23mRfVZVBH-aUjw@mail.gmail.com>
 <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
Message-ID: <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>

Hi Chihiro,

I'm also ok with webrev.05 after updating copyright year.


Yasumasa


On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote:
> Hi Chichiro,
> 
> I'm okay with the fix.
> Could you, please, update the copyright date in src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push?
> 
> Thanks,
> Serguei
> 
> 
> On 3/6/20 07:24, Chihiro Ito wrote:
>> Hi Serguei,
>>
>> Could you review this again, please?
>>
>> Regards,
>> Chihiro
>>
>>
>> On 2020/02/27 22:11, Chihiro Ito <chiroito107 at gmail.com> wrote:
>>> Hi Ralf,
>>>
>>> Thank you for your advice.
>>>
>>> 1.
>>> The comment of serializePropertiesToByteArray in VMSupport is "The stream written to the byte array is ISO 8859-1 encoded.".
>>> But the previous implementation does not keep this. I think we need to implement encode by ISO 8859-1.
>>>
>>> 2.
>>> According to help, the feature of VM.system_properties is just "Print system properties". The users should not use this output for loading. The users use it when they want to see System Properties soon.
>>>
>>> Regards,
>>> Chihiro
>>>
>>>
>>> On 2020/02/26 18:53, Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
>>>> Hi Chihiro,
>>>>
>>>> I have two remarks:
>>>>
>>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create ISO Latin 1 String, it really only will create printable ASCII characters  (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then send to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters. But a ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test.
>>>>
>>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as:
>>>> C\:\\test\\new
>>>> And now it is:
>>>> C:\test\new
>>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters.
>>>>
>>>> Best regards,
>>>> Ralf
>>>>
>>>>
>>>> From: serviceability-dev<serviceability-dev-bounces at openjdk.java.net>  On Behalf Of Chihiro Ito
>>>> Sent: Dienstag, 25. Februar 2020 04:45
>>>> To:serguei.spitsyn at oracle.com
>>>> Cc:serviceability-dev at openjdk.java.net
>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
>>>>
>>>> Hi Serguei,
>>>>
>>>> Thanks for your review and advice.
>>>>
>>>> I modified these.
>>>> Could you review this again, please?
>>>>
>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
>>>>
>>>> Regards,
>>>> Chihiro
>>>>
> 

From serguei.spitsyn at oracle.com  Sat Mar  7 07:17:59 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Mar 2020 23:17:59 -0800
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
Message-ID: <fe5b71e2-b27c-f0b7-d946-e2ec7ca83f56@oracle.com>

Hi Alex,

It looks good to me.

Thanks,
Serguei


On 3/4/20 16:30, Alex Menkov wrote:
> Hi all,
>
> please review the fix for
> https://bugs.openjdk.java.net/browse/JDK-8240340
> webrev:
> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/
>
> changes:
> - assertThreadState method: don't re-read thread state throwing 
> exception (as we got weird error like "Thread WaitingThread is at 
> WAITING state but is expected to be in Thread.State = WAITING");
> - added proper test shutdown on error (made all threads "daemon", 
> interrupt waiting thread if CheckerThread throws exception);
> - if CheckerThread detects error, propagate the exception to main thread;
> - fixed LockFreeLogger class - it should work for logging from several 
> threads, but it doesn't. I prefer to simplify it just to keep 
> ConcurrentLinkedQueue<String>.
> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only by 
> a single thread.
>
> --alex


From serguei.spitsyn at oracle.com  Sat Mar  7 07:28:17 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Mar 2020 23:28:17 -0800
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
 <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>
 <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>
Message-ID: <2cf61bd9-da4d-2477-6604-d104118c4e03@oracle.com>

Hi David and Alex,

My understanding is that previous implementation collected logs 
separately for each thread in TLS, and at the end, merged and sorted out 
the output by log id.
So, the result is that all messages are serialized at the end.
Alex changed the implementation but the result is the same - all log 
messages are serialized.

There are two tests which use the LockFreeLogger.
Another one is: test/jdk/java/lang/Thread/ThreadStateController.java .
Does the ThreadStateController.java work okay after the fix?
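
A minimal sketch of the simplified design described above, assuming nothing beyond what the thread states: one ConcurrentLinkedQueue<String> shared by all threads, with entries kept in the order in which they were added. The class and method names are hypothetical and this is not the LockFreeLogger from the webrev.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical demo class illustrating a queue-backed, lock-free logger.
class QueueLoggerDemo {

    /** All threads append to one lock-free queue; no per-thread state is kept. */
    static final class SimpleLogger {
        private final ConcurrentLinkedQueue<String> records = new ConcurrentLinkedQueue<>();

        void log(String format, Object... params) {
            records.add(String.format(format, params));
        }

        int size() {
            return records.size();
        }

        @Override
        public String toString() {
            return String.join("", records);
        }
    }

    /** Logs from several threads and returns how many records the logger retained. */
    static int logFromThreads(int threads, int perThread) {
        SimpleLogger logger = new SimpleLogger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;   // loop variables must be copied before lambda capture
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    logger.log("thread %d message %d%n", id, j);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return logger.size();
    }

    public static void main(String[] args) {
        System.out.println(logFromThreads(4, 100));
    }
}
```

Entries from different threads are interleaved in the queue, so the combined output is serialized without a separate merge and sort step.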

Thanks,
Serguei


On 3/5/20 10:54, Alex Menkov wrote:
> Hi David,
>
> Thank you for the review.
>
> On 03/04/2020 17:50, David Holmes wrote:
>> Hi Alex,
>>
>> On 5/03/2020 10:30 am, Alex Menkov wrote:
>>> Hi all,
>>>
>>> please review the fix for
>>> https://bugs.openjdk.java.net/browse/JDK-8240340
>>> webrev:
>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ 
>>>
>>>
>>> changes:
>>> - assertThreadState method: don't re-read thread state throwing 
>>> exception (as we got weird error like "Thread WaitingThread is at 
>>> WAITING state but is expected to be in Thread.State = WAITING");
>>> - added proper test shutdown on error (made all threads "daemon", 
>>> interrupt waiting thread if CheckerThread throws exception);
>>> - if CheckerThread detects error, propagate the exception to main 
>>> thread;
>>
>> The test changes seem fine.
>>
>>> - fixed LockFreeLogger class - it should work for logging from 
>>> several threads, but it doesn't. I prefer to simplify it just to 
>>> keep ConcurrentLinkedQueue<String>.
>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only 
>>> by a single thread.
>>
>> I don't understand your changes here as you've completely changed the 
>> intended design of the logger. The original accumulates log entries 
>> per-thread and then spits them all out (though I'm not clear on the 
>> exact ordering - I don't know how to read that stream stuff). The new code 
>> just creates a single queue of log records interleaving entries from 
>> different threads. The simple logger may be all that is needed but it 
>> seems quite different to the intent of the original.
>
> Testing changes in the test I discovered that there is something wrong 
> with the logger - it printed only part of the records, so I have to 
> look at the LockFreeLogger class and I don't understand how it was 
> supposed to work.
> About ordering in cumulative log: each record has Integer which used 
> to sort log entries from all threads (i.e. records from different 
> threads are printed at the order which log() was called).
> Looking at allRecords/records stuff I don't understand how it should 
> be used. To get logs from different threads in one logger, we need 
> one instance. So we create LockFreeLogger (in main thread) and ctor 
> creates ThreadLocal record and register it in allRecords. Logging from 
> main thread works fine, but if any other thread tries to log, 1st 
> log() call creates its own ThreadLocal records (by records.get()) and 
> log records from this thread go there. But this ThreadLocal records is 
> not registered in allRecords, so this logging won't be included in 
> final log.
> Looks like we need to change log() to something like
>
> Map<Integer, String> recs = records.get();
> if (recs.isEmpty()) {
>     allRecords.add(recs);
> }
> recs.put(id, String.format(format, params));
>
> But all this stuff do exactly the same as simple ConcurrentLinkedQueue 
> (i.e. lock free ordered list).
> At least I don't see other rationale in the stuff.
>
> --alex
>
>>
>> Thanks,
>> David
>>
>>> --alex


From serguei.spitsyn at oracle.com  Sat Mar  7 07:32:33 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Mar 2020 23:32:33 -0800
Subject: RFR(S) : 8153430: [TESTBUG] jdk regression test
 javax/management/loading/MletParserLocaleTest.java reduce default timeout
In-Reply-To: <0653e5d4-893b-ce78-f0cf-5905a659bb10@oracle.com>
References: <0cee14ea-901d-4571-85b9-bbb6f01d59c1@default>
 <bf77cafb-a813-4bad-9094-d8b59b5ccb0f@default>
 <0653e5d4-893b-ce78-f0cf-5905a659bb10@oracle.com>
Message-ID: <ebf1b87d-ffe2-130b-3155-0f6ba76a89f0@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200306/68ce22eb/attachment-0001.htm>

From serguei.spitsyn at oracle.com  Sat Mar  7 07:42:33 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Mar 2020 23:42:33 -0800
Subject: RFR(XS): 8240691: serviceability/sa/ClhsdbCDSJstackPrintAll.java
 and serviceability/sa/ClhsdbCDSCore.java should be excluded with ZGC
In-Reply-To: <e8c67d46-4071-f2b3-6d3a-a1503a3bd41c@oracle.com>
References: <86878457-3386-ea0e-f23e-7ad1bdff64a6@oracle.com>
 <038c6af5-f7e4-cf93-9c22-729f1db03e05@oracle.com>
 <8c057888-5a75-c9c7-cb55-9352d07b3013@oracle.com>
 <e8c67d46-4071-f2b3-6d3a-a1503a3bd41c@oracle.com>
Message-ID: <42b7ac87-a09c-f703-f1a2-7bde6fb75f76@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200306/4e8fcb58/attachment.htm>

From serguei.spitsyn at oracle.com  Sat Mar  7 07:53:21 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Mar 2020 23:53:21 -0800
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate
 enough
In-Reply-To: <20045e23-c736-5289-866e-9df5a09101a8@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
 <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>
 <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com>
 <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com>
 <c37c0dfd-4c81-b05d-537f-abdac913e45f@oracle.com>
 <c5d54667-5f7a-a8d6-0e47-517986fbf1a5@oracle.com>
 <20045e23-c736-5289-866e-9df5a09101a8@oracle.com>
Message-ID: <80fc495b-5169-2ec8-559c-8ba8c9f4939d@oracle.com>

Hi Kevin,

This looks okay to me as well.

Thanks,
Serguei


On 3/5/20 02:38, David Holmes wrote:
> Thanks Kevin. I think this is the less risky change and achieves the 
> goal.
>
> David
>
> On 5/03/2020 8:00 pm, Kevin Walls wrote:
>> Thanks -
>>
>> I had tried some ideas in the simple fashion, and we can use %06d 
>> formatting.... OK maybe such formatting is not as "bad" as %f...
>>
>> (glibc parses the int width specified without allocation. We provide
>> the output buffer, I don't think we will cause vfprintf code to
>> alloca or malloc.)
>>
>> I can offer a second version below that uses %d only. Testing
>> alongside %f in the same line, it retains the same value and
>> position, e.g.
>>
>> Time: Thu Mar  5 08:57:50 2020 UTC elapsed time: f: 2.001065 int: 2.001065 (raw int: 1065) seconds (0d 0h 0m 2s)
>>
>> Output example from the hg diff below (not from the same run):
>>
>> Time: Thu Mar  5 09:28:01 2020 UTC elapsed time: 2.000611 seconds (0d 0h 0m 2s)
>>
>>
>> Thanks!
>> Kevin
>>
>>
>> $ hg diff
>> diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp
>> --- a/src/hotspot/share/runtime/os.cpp
>> +++ b/src/hotspot/share/runtime/os.cpp
>> @@ -1016,10 +1016,9 @@
>>     }
>>
>>     double t = os::elapsedTime();
>> -  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
>> -  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
>> -  //       before printf. We lost some precision, but who cares?
>> +  // NOTE: a crash using printf("%f",...) on Linux was historically noted here.
>>     int eltime = (int)t;  // elapsed time in seconds
>> +  int eltimeFraction = (int) ((t - eltime) * 1000000);
>>
>>     // print elapsed time in a human-readable format:
>>     int eldays = eltime / secs_per_day;
>> @@ -1029,7 +1028,7 @@
>>     int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>     int minute_secs = elmins * secs_per_min;
>>     int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs);
>> +  st->print_cr(" elapsed time: %d.%06d seconds (%dd %dh %dm %ds)", eltime, eltimeFraction, eldays, elhours, elmins, elsecs);
>>   }
>>
>>
>>
>> On 05/03/2020 00:57, David Holmes wrote:
>>> On 4/03/2020 8:44 am, Kevin Walls wrote:
>>>>
>>>> Thanks David -
>>>>
>>>> Yes there are situations where hs_err fails, and few people are 
>>>> sadder than me
>>>> when that happens 8-) , so I was thinking about how scared to be by 
>>>> the comment.
>>>>
>>>> With the safety net of the error handler for the steps of the 
>>>> hs_err file
>>>> (which works well, we see it invoked frequently), it looks 
>>>> reasonable to use
>>>> %f as we might do other slightly questionable things for a signal 
>>>> handler.
>>>>
>>>> Corrupting locale information or floating point state might 
>>>> possibly cause
>>>> problems, but if I cause a fake crash in print_date_and_time the error
>>>> handler recovers and the report continues.
>>>
>>> That is good to know.
>>>
>>>> Thinking about printing with two ints, seconds and fractions:
>>>> I don't see anything already that returns such a time in two 
>>>> components in the
>>>> JVM, so we might implement a new form of os::javaTimeNanos() or 
>>>> similar that
>>>> returns the two parts, and do that on each platform.
>>>
>>> I was thinking of something simple/crude ...
>>>
>>>> I didn't yet come up with anything to do in os::print_date_and_time()
>>>> which will take the fractional part of the double, and print just 
>>>> the fraction as an int, without using any library / %f facilities.
>>>
>>> ... just using e.g. (untested)
>>>
>>> double t = os::elapsedTime();
>>> int secs = (int) t;
>>> int micros = (int)((t - secs) * 1000000);
>>> printf("%d.%d", secs, micros);
>>>
>>> with appropriate width specifiers to get the formatting right.
>>>
>>> Cheers,
>>> David
>>>
>>>>
>>>> If you're still concerned I could revisit these or some other idea.
>>>>
>>>> Genuine laugh out loud moment for me, I backported the elapsed time 
>>>> logging from
>>>> 6u4 to 5u19 (https://bugs.openjdk.java.net/browse/JDK-6447157) (2007).
>>>> (I said before jdk5 was created, I should have said before it was 
>>>> in mercurial.)
>>>>
>>>> Thanks
>>>> Kevin
>>>>
>>>>
>>>> On 03/03/2020 01:11, David Holmes wrote:
>>>>> Hi Kevin,
>>>>>
>>>>> On 2/03/2020 8:48 pm, Kevin Walls wrote:
>>>>>> Oops, and with the bug ID in the title and JBS link:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240295
>>>>>>
>>>>>>
>>>>>> On 02/03/2020 10:47, Kevin Walls wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> (s11y and runtime opinions both relevant)
>>>>>>>
>>>>>>> A few times in the last month I've really wanted to compare the 
>>>>>>> Events logged in the hs_err file, and the time of the JVM's crash.
>>>>>>>
>>>>>>> "elapsed time" in hs_err is only accurate to one second, and has 
>>>>>>> been since before jdk5 was created.
>>>>>>>
>>>>>>> The diff below changes the format string and uses the 
>>>>>>> non-rounded time value (I don't see a need to change the other 
>>>>>>> integer arithmetic here), and we can enjoy hs_errs with detail 
>>>>>>> like:
>>>>>>>
>>>>>>> ...
>>>>>>> Time: Mon Mar  2 09:57:13 2020 UTC elapsed time: 5.490135 seconds (0d 0h 0m 5s)
>>>>>>> ...
>>>>>>>
>>>>>>> Thanks
>>>>>>> Kevin
>>>>>>>
>>>>>>>
>>>>>>> /jdk/open$ hg diff
>>>>>>> diff --git a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp
>>>>>>> --- a/src/hotspot/share/runtime/os.cpp
>>>>>>> +++ b/src/hotspot/share/runtime/os.cpp
>>>>>>> @@ -1016,9 +1016,8 @@
>>>>>>>    }
>>>>>>>
>>>>>>>    double t = os::elapsedTime();
>>>>>>> -  // NOTE: It tends to crash after a SEGV if we want to printf("%f",...) in
>>>>>>> -  //       Linux. Must be a bug in glibc ? Workaround is to round "t" to int
>>>>>>> -  //       before printf. We lost some precision, but who cares?
>>>>>>> +  // NOTE: a crash using printf("%f",...) on Linux was historically noted here
>>>>>>> +  //       (before the jdk5 repo was created).
>>>>>
>>>>> Just because it is old doesn't mean it no longer applies. printf 
>>>>> is not async-signal-safe - we know that but we try to use it 
>>>>> anyway. Maybe %f is even less async-signal-safe?
>>>>>
>>>>> This may get through testing okay but cause problems with real 
>>>>> crashes in the field.
>>>>>
>>>>> What about breaking the time up into two ints: seconds and nanos?
>>>>>
>>>>> Cheers,
>>>>> David
>>>>> -----
>>>>>
>>>>>>>    int eltime = (int)t;  // elapsed time in seconds
>>>>>>>
>>>>>>>    // print elapsed time in a human-readable format:
>>>>>>> @@ -1029,7 +1028,7 @@
>>>>>>>    int elmins = (eltime - day_secs - hour_secs) / secs_per_min;
>>>>>>>    int minute_secs = elmins * secs_per_min;
>>>>>>>    int elsecs = (eltime - day_secs - hour_secs - minute_secs);
>>>>>>> -  st->print_cr(" elapsed time: %d seconds (%dd %dh %dm %ds)", eltime, eldays, elhours, elmins, elsecs);
>>>>>>> +  st->print_cr(" elapsed time: %f seconds (%dd %dh %dm %ds)", t, eldays, elhours, elmins, elsecs);
>>>>>>>  }
>>>>>>>
>>>>>>>
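
The %d-only approach discussed in this thread (whole seconds plus a zero-padded microsecond fraction) can be sketched in Java as follows. This is only an illustration of the arithmetic, not the HotSpot code: the class and method names are hypothetical, and String.format stands in for st->print_cr. Note that the fraction is truncated, not rounded.

```java
import java.util.Locale;

final class ElapsedTimeFormat {
    // Formats elapsed seconds as "S.UUUUUU" using integer conversions only
    // (hypothetical helper; mirrors the eltime/eltimeFraction split in the diff).
    static String format(double t) {
        int secs = (int) t;                           // whole seconds, truncated
        int micros = (int) ((t - secs) * 1_000_000);  // fractional part in microseconds, truncated
        // %06d zero-pads the fraction so 976 microseconds prints as "000976".
        return String.format(Locale.ROOT, "%d.%06d", secs, micros);
    }

    public static void main(String[] args) {
        System.out.println(format(5.25));
        System.out.println(format(2.0009765625));
    }
}
```

Without the width specifier, a fraction such as 976 microseconds would print as "2.976", which is why the padding matters.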


From kevin.walls at oracle.com  Sat Mar  7 07:57:48 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Fri, 6 Mar 2020 23:57:48 -0800 (PST)
Subject: RFR(S): 8240295: hs_err elapsed time in seconds is not accurate
 enough
In-Reply-To: <80fc495b-5169-2ec8-559c-8ba8c9f4939d@oracle.com>
References: <4a3d7c8e-df9e-51de-77e4-775be1480392@oracle.com>
 <eb3880c3-27c0-b9b3-60f3-44f3f249dd66@oracle.com>
 <92ada19d-7200-6bb7-b174-9da8a166aea7@oracle.com>
 <053864a8-617f-fcac-fc26-220d161e3e55@oracle.com>
 <c37c0dfd-4c81-b05d-537f-abdac913e45f@oracle.com>
 <c5d54667-5f7a-a8d6-0e47-517986fbf1a5@oracle.com>
 <20045e23-c736-5289-866e-9df5a09101a8@oracle.com>
 <80fc495b-5169-2ec8-559c-8ba8c9f4939d@oracle.com>
Message-ID: <67d1cbc7-8102-09ac-7322-9f545089a45c@oracle.com>

Great, thanks!

On 07/03/2020 07:53, serguei.spitsyn at oracle.com wrote:
> Hi Kevin,
>
> This looks okay to me as well.
>
> Thanks,
> Serguei

From chiroito107 at gmail.com  Sat Mar  7 14:13:01 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Sat, 7 Mar 2020 23:13:01 +0900
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com>
 <CAE_05uzqHJ5VUKvYgwDyaUz=gvMjDd_Xk_+SE94_Z6PRqv6yDw@mail.gmail.com>
 <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com>
 <CAE_05uzOq3t7hS-rxw5Q+dMjh4E0pZFxkXC23mRfVZVBH-aUjw@mail.gmail.com>
 <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
 <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>
Message-ID: <CAE_05uzNAU0f8T8Gh9=pxd5U18xTDsWbxKkmbj5iSf2AhMWXYQ@mail.gmail.com>

Hi Serguei and Yasumasa,

I updated the copyright year and created the change set.

Could you sponsor this, please?

Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset

Regards,
Chihiro


On Sat, Mar 7, 2020 at 16:03 Yasumasa Suenaga <suenaga at oss.nttdata.com> wrote:


>
> Hi Chihiro,
>
> I'm also ok with webrev.05 after updating copyright year.
>
>
> Yasumasa
>
>
> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote:
> > Hi Chihiro,
> >
> > I'm okay with the fix.
> > Could you, please, update the copyright date in src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push?
> >
> > Thanks,
> > Serguei
> >
> >
> > On 3/6/20 07:24, Chihiro Ito wrote:
> >> Hi Serguei,
> >>
> >> Could you review this again, please?
> >>
> >> Regards,
> >> Chihiro
> >>
> >>
> >> On Thu, Feb 27, 2020 at 22:11 Chihiro Ito <chiroito107 at gmail.com> wrote:
> >>> Hi Ralf,
> >>>
> >>> Thank you for your advice.
> >>>
> >>> 1.
> >>> The comment on serializePropertiesToByteArray in VMSupport says "The stream written to the byte array is ISO 8859-1 encoded.".
> >>> But the previous implementation does not honor this. I think we need to implement the encoding in ISO 8859-1.
> >>>
> >>> 2.
> >>> According to the help text, the purpose of VM.system_properties is just to "Print system properties". Users should not load this output; they use it when they want a quick look at the system properties.
> >>>
> >>> Regards,
> >>> Chihiro
> >>>
> >>>
> >>> On Wed, Feb 26, 2020 at 18:53 Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
> >>>> Hi Chihiro,
> >>>>
> >>>> I have two remarks:
> >>>>
> >>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create an ISO Latin 1 String, it really will only create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then sent to jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only use ASCII characters. But an ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test.
> >>>>
> >>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as:
> >>>> C\:\\test\\new
> >>>> And now it is:
> >>>> C:\test\new
> >>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters.
> >>>>
> >>>> Best regards,
> >>>> Ralf
> >>>>
> >>>>
> >>>> From: serviceability-dev<serviceability-dev-bounces at openjdk.java.net>  On Behalf Of Chihiro Ito
> >>>> Sent: Tuesday, February 25, 2020 04:45
> >>>> To:serguei.spitsyn at oracle.com
> >>>> Cc:serviceability-dev at openjdk.java.net
> >>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
> >>>>
> >>>> Hi Serguei,
> >>>>
> >>>> Thanks for your review and advice.
> >>>>
> >>>> I modified these.
> >>>> Could you review this again, please?
> >>>>
> >>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
> >>>>
> >>>> Regards,
> >>>> Chihiro
> >>>>
> >
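
Ralf's second remark above (that unescaped backslashes cannot be read back by Properties.load()) can be reproduced with a small Java sketch. The class and helper names are hypothetical; only java.util.Properties store() and load() are real API. The path matches the example in the thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PropertiesEscapingDemo {
    // Serializes one property with Properties.store() and returns the text written.
    static String store(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null);  // writes a date comment line, then key=value
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Parses properties-file text and returns the value stored under the key.
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String path = "C:\\test\\new";  // the literal path C:\test\new

        // The escaped form written by store() round-trips through load().
        String stored = store("path", path);
        System.out.print(stored);
        System.out.println(path.equals(load(stored, "path")));

        // The unescaped form is misread: \t and \n are taken as tab and newline.
        String misread = load("path=C:\\test\\new\n", "path");
        System.out.println(path.equals(misread));
    }
}
```

The sketch only shows why the unescaped output is no longer a loadable properties file; whether that matters for a diagnostic command is the question debated in the thread.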

From chiroito107 at gmail.com  Sun Mar  8 13:05:21 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Sun, 8 Mar 2020 22:05:21 +0900
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <CAE_05uzNAU0f8T8Gh9=pxd5U18xTDsWbxKkmbj5iSf2AhMWXYQ@mail.gmail.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com>
 <CAE_05uzqHJ5VUKvYgwDyaUz=gvMjDd_Xk_+SE94_Z6PRqv6yDw@mail.gmail.com>
 <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com>
 <CAE_05uzOq3t7hS-rxw5Q+dMjh4E0pZFxkXC23mRfVZVBH-aUjw@mail.gmail.com>
 <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
 <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>
 <CAE_05uzNAU0f8T8Gh9=pxd5U18xTDsWbxKkmbj5iSf2AhMWXYQ@mail.gmail.com>
Message-ID: <CAE_05uxC4ADo-qQaOaSAFHfUZ93uufVHZu639vvvSfeJWeNy=A@mail.gmail.com>

Hi,

I'm sorry. I included "JDK-" in the changeset title. I removed it and
updated it.

Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset

Regards,
Chihiro

On Sat, Mar 7, 2020 at 23:13 Chihiro Ito <chiroito107 at gmail.com> wrote:
>
> Hi Serguei and Yasumasa,
>
> I updated the copyright year and created the change set.
>
> Could you sponsor this, please?
>
> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>
> Regards,
> Chihiro
>
>
> On Sat, Mar 7, 2020 at 16:03 Yasumasa Suenaga <suenaga at oss.nttdata.com> wrote:
>
>
> >
> > Hi Chihiro,
> >
> > I'm also ok with webrev.05 after updating copyright year.
> >
> >
> > Yasumasa
> >
> >
> > On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote:
> > > Hi Chihiro,
> > >
> > > I'm okay with the fix.
> > > Could you, please, update the copyright date in src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push?
> > >
> > > Thanks,
> > > Serguei
> > >
> > >
> > > On 3/6/20 07:24, Chihiro Ito wrote:
> > >> Hi Serguei,
> > >>
> > >> Could you review this again, please?
> > >>
> > >> Regards,
> > >> Chihiro
> > >>
> > >>
> > >> On Thu, Feb 27, 2020 at 22:11 Chihiro Ito <chiroito107 at gmail.com> wrote:
> > >>> Hi Ralf,
> > >>>
> > >>> Thank you for your advice.
> > >>>
> > >>> 1.
> > >>> The comment on serializePropertiesToByteArray in VMSupport says "The stream written to the byte array is ISO 8859-1 encoded.".
> > >>> But the previous implementation does not honor this. I think we need to implement the encoding in ISO 8859-1.
> > >>>
> > >>> 2.
> > >>> According to the help text, the purpose of VM.system_properties is just to "Print system properties". Users should not load this output; they use it when they want a quick look at the system properties.
> > >>>
> > >>> Regards,
> > >>> Chihiro
> > >>>
> > >>>
> > >>> On Wed, Feb 26, 2020 at 18:53 Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
> > >>>> Hi Chihiro,
> > >>>>
> > >>>> I have two remarks:
> > >>>>
> > >>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create an ISO Latin 1 String, it really will only create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then sent to jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only use ASCII characters. But an ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test.
> > >>>>
> > >>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as:
> > >>>> C\:\\test\\new
> > >>>> And now it is:
> > >>>> C:\test\new
> > >>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters.
> > >>>>
> > >>>> Best regards,
> > >>>> Ralf
> > >>>>
> > >>>>
> > >>>> From: serviceability-dev<serviceability-dev-bounces at openjdk.java.net>  On Behalf Of Chihiro Ito
> > >>>> Sent: Tuesday, February 25, 2020 04:45
> > >>>> To:serguei.spitsyn at oracle.com
> > >>>> Cc:serviceability-dev at openjdk.java.net
> > >>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
> > >>>>
> > >>>> Hi Serguei,
> > >>>>
> > >>>> Thanks for your review and advice.
> > >>>>
> > >>>> I modified these.
> > >>>> Could you review this again, please?
> > >>>>
> > >>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
> > >>>>
> > >>>> Regards,
> > >>>> Chihiro
> > >>>>
> > >

From david.holmes at oracle.com  Mon Mar  9 04:15:58 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 9 Mar 2020 14:15:58 +1000
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
 <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>
 <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>
Message-ID: <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com>

Hi Alex,

On 6/03/2020 4:54 am, Alex Menkov wrote:
> Hi David,
> 
> Thank you for the review.
> 
> On 03/04/2020 17:50, David Holmes wrote:
>> Hi Alex,
>>
>> On 5/03/2020 10:30 am, Alex Menkov wrote:
>>> Hi all,
>>>
>>> please review the fix for
>>> https://bugs.openjdk.java.net/browse/JDK-8240340
>>> webrev:
>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ 
>>>
>>>
>>> changes:
>>> - assertThreadState method: don't re-read thread state throwing 
>>> exception (as we got weird error like "Thread WaitingThread is at 
>>> WAITING state but is expected to be in Thread.State = WAITING");
>>> - added proper test shutdown on error (made all threads "daemon", 
>>> interrupt waiting thread if CheckerThread throws exception);
>>> - if CheckerThread detects error, propagate the exception to main 
>>> thread;
>>
>> The test changes seem fine.
>>
>>> - fixed LockFreeLogger class - it should work for logging from 
>>> several threads, but it doesn't. I prefer to simplify it just to keep 
>>> ConcurrentLinkedQueue<String>.
>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only 
>>> by a single thread.
>>
>> I don't understand your changes here as you've completely changed the 
>> intended design of the logger. The original accumulates log entries 
>> per-thread and then spits them all out (though I'm not clear on the 
>> exact ordering - I don't know how to read that stream stuff). The new code 
>> just creates a single queue of log records interleaving entries from 
>> different threads. The simple logger may be all that is needed but it 
>> seems quite different to the intent of the original.
> 
> Testing changes in the test I discovered that there is something wrong
> with the logger - it printed only part of the records, so I had to look
> at the LockFreeLogger class, and I don't understand how it was supposed
> to work.
> About ordering in the cumulative log: each record has an Integer which is
> used to sort log entries from all threads (i.e. records from different
> threads are printed in the order in which log() was called).
> Looking at the allRecords/records stuff I don't understand how it should
> be used. To get logs from different threads in one logger, we need one
> instance. So we create a LockFreeLogger (in the main thread) and the ctor
> creates the ThreadLocal records and registers them in allRecords. Logging
> from the main thread works fine, but if any other thread tries to log, the
> 1st log() call creates its own ThreadLocal records (by records.get()) and
> log records from this thread go there. But these ThreadLocal records are
> not registered in allRecords, so this logging won't be included in the
> final log.
> Looks like we need to change log() to something like
> 
> Map<Integer, String> recs = records.get();
> if (recs.isEmpty()) {
>     allRecords.add(recs);
> }
> recs.put(id, String.format(format, params));

Yep good catch - this logger was completely broken.

> But all this stuff does exactly the same as a simple ConcurrentLinkedQueue
> (i.e. a lock-free ordered list).
> At least I don't see any other rationale in the stuff.

I'm not certain of intent with the original but I'd always want to see 
log entries in chronological order - which is what we now clearly have.

Thanks,
David

> --alex
> 
>>
>> Thanks,
>> David
>>
>>> --alex
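
The simplified logger discussed above (a single ConcurrentLinkedQueue<String> shared by all threads) can be sketched as follows. This is a minimal illustration and not the code from the webrev: the class name QueueLogger, the size() accessor and the logFromThreads() helper are hypothetical.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;

final class QueueLogger {
    // One lock-free queue shared by all threads; records keep insertion order.
    private final ConcurrentLinkedQueue<String> records = new ConcurrentLinkedQueue<>();

    // Appends a formatted record; safe to call from any thread without locking.
    void log(String format, Object... params) {
        records.add(String.format(format, params));
    }

    int size() {
        return records.size();
    }

    // Returns all records in the order they were added, one per line.
    @Override
    public String toString() {
        return records.stream().collect(Collectors.joining(System.lineSeparator()));
    }

    // Hypothetical helper: logs perThread records from each of several threads.
    static QueueLogger logFromThreads(int threads, int perThread) {
        QueueLogger logger = new QueueLogger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    logger.log("thread %d record %d", id, n);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return logger;
    }

    public static void main(String[] args) {
        System.out.println(logFromThreads(4, 50).size());
    }
}
```

Because every thread appends to the same queue, no per-thread registration step exists that could be forgotten, which was the failure mode of the ThreadLocal design described above.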

From david.holmes at oracle.com  Mon Mar  9 04:19:27 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 9 Mar 2020 14:19:27 +1000
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
 <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>
 <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>
 <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com>
Message-ID: <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com>

P.S.

Forgot to note however that you need to update the documentation for the 
logger now as the mention of "per-thread logs" makes no sense now. Also 
in the spirit of not using @author, and because this is no longer the 
code created by Jaroslav, please delete the @author line.

Thanks,
David

On 9/03/2020 2:15 pm, David Holmes wrote:
> Hi Alex,
> 
> On 6/03/2020 4:54 am, Alex Menkov wrote:
>> Hi David,
>>
>> Thank you for the review.
>>
>> On 03/04/2020 17:50, David Holmes wrote:
>>> Hi Alex,
>>>
>>> On 5/03/2020 10:30 am, Alex Menkov wrote:
>>>> Hi all,
>>>>
>>>> please review the fix for
>>>> https://bugs.openjdk.java.net/browse/JDK-8240340
>>>> webrev:
>>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ 
>>>>
>>>>
>>>> changes:
>>>> - assertThreadState method: don't re-read thread state throwing 
>>>> exception (as we got weird error like "Thread WaitingThread is at 
>>>> WAITING state but is expected to be in Thread.State = WAITING");
>>>> - added proper test shutdown on error (made all threads "daemon", 
>>>> interrupt waiting thread if CheckerThread throws exception);
>>>> - if CheckerThread detects error, propagate the exception to main 
>>>> thread;
>>>
>>> The test changes seem fine.
>>>
>>>> - fixed LockFreeLogger class - it should work for logging from 
>>>> several threads, but it doesn't. I prefer to simplify it just to 
>>>> keep ConcurrentLinkedQueue<String>.
>>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only 
>>>> by a single thread.
>>>
>>> I don't understand your changes here as you've completely changed the 
>>> intended design of the logger. The original accumulates log entries 
>>> per-thread and then spits them all out (though I'm not clear on the 
>>> exact ordering - I don't know how to read that stream stuff). The new code 
>>> just creates a single queue of log records interleaving entries from 
>>> different threads. The simple logger may be all that is needed but it 
>>> seems quite different to the intent of the original.
>>
>> Testing changes in the test I discovered that there is something wrong
>> with the logger - it printed only part of the records, so I had to
>> look at the LockFreeLogger class, and I don't understand how it was
>> supposed to work.
>> About ordering in the cumulative log: each record has an Integer which is
>> used to sort log entries from all threads (i.e. records from different
>> threads are printed in the order in which log() was called).
>> Looking at the allRecords/records stuff I don't understand how it should
>> be used. To get logs from different threads in one logger, we need
>> one instance. So we create a LockFreeLogger (in the main thread) and the
>> ctor creates the ThreadLocal records and registers them in allRecords.
>> Logging from the main thread works fine, but if any other thread tries to
>> log, the 1st log() call creates its own ThreadLocal records (by
>> records.get()) and log records from this thread go there. But these
>> ThreadLocal records are not registered in allRecords, so this logging
>> won't be included in the final log.
>> Looks like we need to change log() to something like
>>
>> Map<Integer, String> recs = records.get();
>> if (recs.isEmpty()) {
>>     allRecords.add(recs);
>> }
>> recs.put(id, String.format(format, params));
> 
> Yep good catch - this logger was completely broken.
> 
>> But all this stuff does exactly the same as a simple ConcurrentLinkedQueue
>> (i.e. a lock-free ordered list).
>> At least I don't see any other rationale in the stuff.
> 
> I'm not certain of intent with the original but I'd always want to see 
> log entries in chronological order - which is what we now clearly have.
> 
> Thanks,
> David
> 
>> --alex
>>
>>>
>>> Thanks,
>>> David
>>>
>>>> --alex

From rkennke at redhat.com  Mon Mar  9 12:39:03 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 9 Mar 2020 13:39:03 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
Message-ID: <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>

Hello all,

Can I please get reviews of this change? In the meantime, we've done
more testing and also field-/torture-testing by a customer who is happy
now. :-)

Thanks,
Roman


> Hi Serguei,
> 
> Thanks for reviewing!
> 
> I updated the patch to reflect your suggestions, very good!
> It also includes a fix to allow re-connecting an agent after disconnect,
> namely moving the setup of the trackingEnv and deletedSignatureBag to
> _activate() to ensure we have those structures after re-connect.
> 
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
> 
> Let me know what you think!
> Roman
> 
>> Hi Roman,
>>
>> Thank you for taking care about this scalability issue!
>>
>> I have a couple of quick comments.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>
>> 72 /*
>> 73  * Lock to protect deletedSignatureBag
>> 74  */
>> 75 static jrawMonitorID deletedSignatureLock;
>> 76
>> 77 /*
>> 78  * A bag containing all the deleted classes' signatures. Must be accessed under
>> 79  * deletedTagLock,
>> 80  */
>> 81 struct bag* deletedSignatureBag;
>>
>> The comments contradict each other.
>> I guess the lock name at line 79 has to be deletedSignatureLock
>> instead of deletedTagLock.
>> Also, the comma at the end must be replaced with a period.
>>
>>
>> 101     // Tag not found? Ignore.
>> 102     if (klass == NULL) {
>> 103         debugMonitorExit(deletedSignatureLock);
>> 104         return;
>> 105     }
>> 106
>> 107     // Scan linked-list.
>> 108     jlong found_tag = klass->klass_tag;
>> 109     while (klass != NULL && found_tag != tag) {
>> 110         klass_ptr = &klass->next;
>> 111         klass = *klass_ptr;
>> 112         found_tag = klass->klass_tag;
>> 113     }
>> 114
>> 115     // Tag not found? Ignore.
>> 116     if (found_tag != tag) {
>> 117         debugMonitorExit(deletedSignatureLock);
>> 118         return;
>> 119     }
>>
>>
>> The code above can be simplified, so that lines 101-105 are not
>> needed anymore.
>> It can be something like this:
>>
>> // Scan linked-list.
>> while (klass != NULL && klass->klass_tag != tag) {
>>     klass_ptr = &klass->next;
>>     klass = *klass_ptr;
>> }
>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>     debugMonitorExit(deletedSignatureLock);
>>     return;
>> }
>>
>> It will take more time when I get a chance to look at the rest.
>>
>>
>> Thanks,
>> Serguei
>>
>>
>>
>>
>> On 12/21/19 13:24, Roman Kennke wrote:
>>> Here comes an update that resolves some races that happen when
>>> disconnecting an agent. In particular, we need to take the lock on
>>> basically every operation, and also need to check whether or not
>>> class-tracking is active and return an appropriate result (e.g. an empty
>>> list) when we're not.
>>>
>>> Updated webrev:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>>> So, here comes the O(1) implementation:
>>>>
>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>> set-up a listener to get notified when it is unloaded.
>>>> - Prepared classes are kept in a data structure that is a table, with
>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>> indexed by tag % slot-count, and we then simply prepend the new KlassNode*.
>>>> This is an O(1) operation.
>>>> - When we get notified of unloading a class, we look up the signature of
>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>> is then unlinked from the table and deallocated. This is ~O(1) operation
>>>> too, depending on the depth of the table. In my testcase which hammered
>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>> but not usually more. It should be ok.
>>>> - when processUnloads() gets called, we simply hand out that bag, and
>>>> allocate a new one.
>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>> signatures and KlassNode* etc when debug agent gets detached and/or
>>>> re-attached (was missing before).
>>>> - I also added locks around data-structure-manipulation (was missing
>>>> before).
>>>> - Also, I only activate this whole process when an actual listener gets
>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>> in the future?
>>>>
>>>> In my tests, the performance of class-tracking itself looks really good.
>>>> The bottleneck now is clearly the actual synthesizing of the class-unload
>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>
>>>> Updated webrev:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>
>>>> Please let me know what you think of it.
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>
>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>
>>>>> Thanks,Roman
>>>>>
>>>>>  Hi Chris,
>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>> Sure.
>>>>>>
>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>> we can generate the appropriate JDWP event.
>>>>>>
>>>>>> The current implementation does so by maintaining a table of currently
>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>> and then add new classes whenever a class gets loaded. When unloading
>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>> complexity.
>>>>>>
>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>>>>
>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>> true, and also reasonable to expect.
>>>>>>
>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>> worth the effort).
>>>>>>
>>>>>> In addition to all that, this process is only activated when there's an
>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> Issue:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>
>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>
>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>
>>>>>>>> Webrev:
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>
>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>
>>>>>>>> Eg with the testcase provided here:
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>
>>>>>>>> I am getting those numbers:
>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>
>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>
>>>>>>>> Can I please get a review?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Roman
>>>>>>>>
>>
> 

From alexey.menkov at oracle.com  Mon Mar  9 18:34:31 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 9 Mar 2020 11:34:31 -0700
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <2cf61bd9-da4d-2477-6604-d104118c4e03@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
 <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>
 <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>
 <2cf61bd9-da4d-2477-6604-d104118c4e03@oracle.com>
Message-ID: <8b3eba03-b290-677a-70b0-99595ec19eb8@oracle.com>


On 03/06/2020 23:28, serguei.spitsyn at oracle.com wrote:
> Hi David and Alex,
> 
> My understanding is that previous implementation collected logs 
> separately for each thread in TLS, and at the end, merged and sorted out 
> the output by log id.
> So, the result is that all messages are serialized at the end.
> Alex changed the implementation but the result is the same - all log 
> messages are serialized.
> 
> There are two tests which use the LockFreeLogger.
> Another one is: test/jdk/java/lang/Thread/ThreadStateController.java .
> Does the ThreadStateController.java work okay after the fix?

ThreadStateController is a utility class used only by 
ThreadMXBeanStateTest.java.
ThreadMXBeanStateTest.java is problem-listed, but I verified that 
logging works in the test.

--alex


> 
> Thanks,
> Serguei
> 
> 
> On 3/5/20 10:54, Alex Menkov wrote:
>> Hi David,
>>
>> Thank you for the review.
>>
>> On 03/04/2020 17:50, David Holmes wrote:
>>> Hi Alex,
>>>
>>> On 5/03/2020 10:30 am, Alex Menkov wrote:
>>>> Hi all,
>>>>
>>>> please review the fix for
>>>> https://bugs.openjdk.java.net/browse/JDK-8240340
>>>> webrev:
>>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ 
>>>>
>>>>
>>>> changes:
>>>> - assertThreadState method: don't re-read thread state throwing 
>>>> exception (as we got weird error like "Thread WaitingThread is at 
>>>> WAITING state but is expected to be in Thread.State = WAITING");
>>>> - added proper test shutdown on error (made all threads "daemon", 
>>>> interrupt waiting thread if CheckerThread throws exception);
>>>> - if CheckerThread detects error, propagate the exception to main 
>>>> thread;
>>>
>>> The test changes seem fine.
>>>
>>>> - fixed LockFreeLogger class - it should work for logging from 
>>>> several threads, but it doesn't. I prefer to simplify it just to 
>>>> keep ConcurrentLinkedQueue<String>.
>>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only 
>>>> by a single thread.
>>>
>>> I don't understand your changes here as you've completely changed the 
>>> intended design of the logger. The original accumulates log entries 
>>> per-thread and then spits them all out (though I'm not clear on the 
>>> exact ordering - I don't know how to read that stream stuff). The new code 
>>> just creates a single queue of log records interleaving entries from 
>>> different threads. The simple logger may be all that is needed but it 
>>> seems quite different to the intent of the original.
>>
>> While testing changes in the test I discovered that something is wrong
>> with the logger - it printed only part of the records, so I had to
>> look at the LockFreeLogger class, and I don't understand how it was
>> supposed to work.
>> About ordering in the cumulative log: each record has an Integer which is
>> used to sort log entries from all threads (i.e. records from different
>> threads are printed in the order in which log() was called).
>> Looking at the allRecords/records stuff I don't understand how it should
>> be used. To get logs from different threads in one logger, we need
>> one instance. So we create a LockFreeLogger (in the main thread) and the
>> ctor creates the ThreadLocal records and registers them in allRecords.
>> Logging from the main thread works fine, but if any other thread tries
>> to log, the 1st log() call creates its own ThreadLocal records (by
>> records.get()) and log records from this thread go there. But these
>> ThreadLocal records are not registered in allRecords, so this logging
>> won't be included in the final log.
>> Looks like we need to change log() to something like
>>
>> Map<Integer, String> recs = records.get();
>> if (recs.isEmpty()) {
>>     allRecords.add(recs);
>> }
>> recs.put(id, String.format(format, params));
>>
>> But all this stuff does exactly the same as a simple ConcurrentLinkedQueue
>> (i.e. a lock-free ordered list).
>> At least I don't see other rationale in the stuff.
>>
>> --alex
>>
>>>
>>> Thanks,
>>> David
>>>
>>>> --alex
> 

From alexey.menkov at oracle.com  Mon Mar  9 19:15:11 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 9 Mar 2020 12:15:11 -0700
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
 <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>
 <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>
 <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com>
 <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com>
Message-ID: <e96230f2-e689-509d-5c79-bb0412cddb1c@oracle.com>


Updated webrev:
http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev.02/

Changes are in LockFreeLogger comments only.
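
For readers following the thread, here is a minimal sketch of the kind of
single-queue logger discussed above. The class name SimpleQueueLogger and
its methods are assumptions made for illustration; this is not the code
in the webrev.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; not the LockFreeLogger from the webrev.
public class SimpleQueueLogger {
    // One shared, lock-free queue. Entries from all threads end up here
    // in the order in which their add() calls completed.
    private final ConcurrentLinkedQueue<String> records =
            new ConcurrentLinkedQueue<>();

    // Formats the message and appends it to the shared queue.
    public void log(String format, Object... params) {
        records.add(String.format(format, params));
    }

    // Returns all logged entries concatenated in queue order.
    @Override
    public String toString() {
        return String.join("", records);
    }
}
```

Because every thread appends to the same queue, no per-thread
registration step is needed, which is the step the original logger
was missing.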

--alex

On 03/08/2020 21:19, David Holmes wrote:
> P.S.
> 
> Forgot to note however that you need to update the documentation for the 
> logger now as the mention of "per-thread logs" makes no sense now. Also 
> in the spirit of not using @author, and because this is no longer the 
> code created by Jaroslav, please delete the @author line.
> 
> Thanks,
> David
> 
> On 9/03/2020 2:15 pm, David Holmes wrote:
>> Hi Alex,
>>
>> On 6/03/2020 4:54 am, Alex Menkov wrote:
>>> Hi David,
>>>
>>> Thank you for the review.
>>>
>>> On 03/04/2020 17:50, David Holmes wrote:
>>>> Hi Alex,
>>>>
>>>> On 5/03/2020 10:30 am, Alex Menkov wrote:
>>>>> Hi all,
>>>>>
>>>>> please review the fix for
>>>>> https://bugs.openjdk.java.net/browse/JDK-8240340
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ 
>>>>>
>>>>>
>>>>> changes:
>>>>> - assertThreadState method: don't re-read thread state throwing 
>>>>> exception (as we got weird error like "Thread WaitingThread is at 
>>>>> WAITING state but is expected to be in Thread.State = WAITING");
>>>>> - added proper test shutdown on error (made all threads "daemon", 
>>>>> interrupt waiting thread if CheckerThread throws exception);
>>>>> - if CheckerThread detects error, propagate the exception to main 
>>>>> thread;
>>>>
>>>> The test changes seem fine.
>>>>
>>>>> - fixed LockFreeLogger class - it should work for logging from 
>>>>> several threads, but it doesn't. I prefer to simplify it just to 
>>>>> keep ConcurrentLinkedQueue<String>.
>>>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but only 
>>>>> by a single thread.
>>>>
>>>> I don't understand your changes here as you've completely changed 
>>>> the intended design of the logger. The original accumulates log 
>>>> entries per-thread and then spits them all out (though I'm not clear 
>>>> on the exact ordering - I don't know how to read that stream stuff). The 
>>>> new code just creates a single queue of log records interleaving 
>>>> entries from different threads. The simple logger may be all that is 
>>>> needed but it seems quite different to the intent of the original.
>>>
>>> While testing changes in the test I discovered that something is wrong
>>> with the logger - it printed only part of the records, so I had to
>>> look at the LockFreeLogger class, and I don't understand how it was
>>> supposed to work.
>>> About ordering in the cumulative log: each record has an Integer which is
>>> used to sort log entries from all threads (i.e. records from different
>>> threads are printed in the order in which log() was called).
>>> Looking at the allRecords/records stuff I don't understand how it should
>>> be used. To get logs from different threads in one logger, we need
>>> one instance. So we create a LockFreeLogger (in the main thread) and the
>>> ctor creates the ThreadLocal records and registers them in allRecords.
>>> Logging from the main thread works fine, but if any other thread tries
>>> to log, the 1st log() call creates its own ThreadLocal records (by
>>> records.get()) and log records from this thread go there. But these
>>> ThreadLocal records are not registered in allRecords, so this logging
>>> won't be included in the final log.
>>> Looks like we need to change log() to something like
>>>
>>> Map<Integer, String> recs = records.get();
>>> if (recs.isEmpty()) {
>>>     allRecords.add(recs);
>>> }
>>> recs.put(id, String.format(format, params));
>>
>> Yep good catch - this logger was completely broken.
>>
>>> But all this stuff does exactly the same as a simple
>>> ConcurrentLinkedQueue (i.e. a lock-free ordered list).
>>> At least I don't see other rationale in the stuff.
>>
>> I'm not certain of intent with the original but I'd always want to see 
>> log entries in chronological order - which is what we now clearly have.
>>
>> Thanks,
>> David
>>
>>> --alex
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> --alex

From david.holmes at oracle.com  Mon Mar  9 23:09:21 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Mar 2020 09:09:21 +1000
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <e96230f2-e689-509d-5c79-bb0412cddb1c@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
 <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>
 <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>
 <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com>
 <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com>
 <e96230f2-e689-509d-5c79-bb0412cddb1c@oracle.com>
Message-ID: <fbe0fb0a-8b86-b46a-f9e5-4a2229069a27@oracle.com>

Looks good.

Thanks,
David

On 10/03/2020 5:15 am, Alex Menkov wrote:
> 
> Updated webrev:
> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev.02/ 
> 
> 
> Changes are in LockFreeLogger comments only.
> 
> --alex
> 
> On 03/08/2020 21:19, David Holmes wrote:
>> P.S.
>>
>> Forgot to note however that you need to update the documentation for 
>> the logger now as the mention of "per-thread logs" makes no sense now. 
>> Also in the spirit of not using @author, and because this is no longer 
>> the code created by Jaroslav, please delete the @author line.
>>
>> Thanks,
>> David
>>
>> On 9/03/2020 2:15 pm, David Holmes wrote:
>>> Hi Alex,
>>>
>>> On 6/03/2020 4:54 am, Alex Menkov wrote:
>>>> Hi David,
>>>>
>>>> Thank you for the review.
>>>>
>>>> On 03/04/2020 17:50, David Holmes wrote:
>>>>> Hi Alex,
>>>>>
>>>>> On 5/03/2020 10:30 am, Alex Menkov wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> please review the fix for
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240340
>>>>>> webrev:
>>>>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ 
>>>>>>
>>>>>>
>>>>>> changes:
>>>>>> - assertThreadState method: don't re-read thread state throwing 
>>>>>> exception (as we got weird error like "Thread WaitingThread is at 
>>>>>> WAITING state but is expected to be in Thread.State = WAITING");
>>>>>> - added proper test shutdown on error (made all threads "daemon", 
>>>>>> interrupt waiting thread if CheckerThread throws exception);
>>>>>> - if CheckerThread detects error, propagate the exception to main 
>>>>>> thread;
>>>>>
>>>>> The test changes seem fine.
>>>>>
>>>>>> - fixed LockFreeLogger class - it should work for logging from 
>>>>>> several threads, but it doesn't. I prefer to simplify it just to 
>>>>>> keep ConcurrentLinkedQueue<String>.
>>>>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but 
>>>>>> only by a single thread.
>>>>>
>>>>> I don't understand your changes here as you've completely changed 
>>>>> the intended design of the logger. The original accumulates log 
>>>>> entries per-thread and then spits them all out (though I'm not 
>>>>> clear on the exact ordering - I don't know how to read that stream 
>>>>> stuff). The new code just creates a single queue of log records 
>>>>> interleaving entries from different threads. The simple logger may 
>>>>> be all that is needed but it seems quite different to the intent of 
>>>>> the original.
>>>>
>>>> While testing changes in the test I discovered that something is wrong
>>>> with the logger - it printed only part of the records, so I had to
>>>> look at the LockFreeLogger class, and I don't understand how it was
>>>> supposed to work.
>>>> About ordering in the cumulative log: each record has an Integer which is
>>>> used to sort log entries from all threads (i.e. records from different
>>>> threads are printed in the order in which log() was called).
>>>> Looking at the allRecords/records stuff I don't understand how it should
>>>> be used. To get logs from different threads in one logger, we need
>>>> one instance. So we create a LockFreeLogger (in the main thread) and the
>>>> ctor creates the ThreadLocal records and registers them in allRecords.
>>>> Logging from the main thread works fine, but if any other thread tries
>>>> to log, the 1st log() call creates its own ThreadLocal records (by
>>>> records.get()) and log records from this thread go there. But these
>>>> ThreadLocal records are not registered in allRecords, so this logging
>>>> won't be included in the final log.
>>>> Looks like we need to change log() to something like
>>>>
>>>> Map<Integer, String> recs = records.get();
>>>> if (recs.isEmpty()) {
>>>>     allRecords.add(recs);
>>>> }
>>>> recs.put(id, String.format(format, params));
>>>
>>> Yep good catch - this logger was completely broken.
>>>
>>>> But all this stuff does exactly the same as a simple
>>>> ConcurrentLinkedQueue (i.e. a lock-free ordered list).
>>>> At least I don't see other rationale in the stuff.
>>>
>>> I'm not certain of intent with the original but I'd always want to 
>>> see log entries in chronological order - which is what we now clearly 
>>> have.
>>>
>>> Thanks,
>>> David
>>>
>>>> --alex
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> --alex

From chris.plummer at oracle.com  Tue Mar 10 02:29:53 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 9 Mar 2020 19:29:53 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they do
 not attempt to use sudo when available
Message-ID: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>

Hi,

Please help review the following:

https://bugs.openjdk.java.net/browse/JDK-8238268
http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/

I'll try to give enough background first to make it easier to understand 
the changes. On OSX you must run SA tests that attach to a live process 
as root or using sudo. For example:

    sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java

Whether running as root or under sudo, the check to allow the test to 
run is done with:

    private static boolean canAttachOSX() {
        return userName.equals("root");
    }

Any test using "@requires vm.hasSAandCanAttach" must pass this check via 
Platform.shouldSAAttach(), which for OSX returns:

        return canAttachOSX() && !isSignedOSX();

So if running as root the "@requires vm.hasSAandCanAttach" passes, 
otherwise it does not. However, using a root login to run tests is not 
very desirable, nor is issuing a "sudo make run-test" (any created file 
ends up with root ownership). Because of this, support was previously 
added for just running the attaching process using sudo, not the entire 
test. This was only done for the 20 or so tests that use ClhsdbLauncher. 
These tests use "@requires vm.hasSA", and then while running the test 
will do a "sudo" check if canAttachOSX() returns false:

        if (!Platform.shouldSAAttach()) {
            if (Platform.isOSX()) {
                if (Platform.isSignedOSX()) {
                    throw new SkippedException("SA attach not expected to work. JDK is signed.");
                } else if (SATestUtils.canAddPrivileges()) {
                    needPrivileges = true;
                }
            }
            if (!needPrivileges) {
                // Skip the test if we don't have enough permissions to attach
                // and cannot add privileges.
                throw new SkippedException(
                    "SA attach not expected to work. Insufficient privileges.");
            }
        }

So basically it does a runtime check of vm.hasSAandCanAttach, and if it 
fails then checks if running with sudo will work. This allows for either 
a passwordless sudo to be used when running clhsdb, or for the user to 
be prompted for the sudo password (note I've removed support for the 
latter with my changes).

That brings us to the CR that is being fixed. ClhsdbLauncher tests 
support sudo and will therefore run with our CI testing on OSX, but the 
25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and 
therefore are never run with our CI OSX testing. The changes in this 
webrev fix that.

There are two possible approaches to the fix. One is having the check 
for sudo be done as part of the vm.hasSAandCanAttach evaluation. The 
other approach is to do the check in the test at runtime similar to how 
ClhsdbLauncher currently does. This would mean just using "@requires 
vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach". 
I chose the latter because there is an advantage to throwing 
SkippedException rather than just silently skipping the test using 
@requires. The advantage is that mdash tells you how many tests were 
skipped, and when you hover over the reason you can see the 
SkippedException message, which will differentiate between reasons like 
the JDK was signed or there are insufficient privileges. If all the 
checking was done by the vm.hasSAandCanAttach evaluation, you would not 
know why the test wasn't run.
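
To illustrate why the runtime check gives more information, here is a
minimal sketch of such a check. Everything in it is hypothetical: the
class AttachCheck, the method needsSudoOrSkip, and the local
SkippedException stand-in are assumptions for illustration, not the
code in the webrev.

```java
// Hypothetical sketch of a runtime attach check; names are illustrative only.
public class AttachCheck {
    // Local stand-in for the test library's SkippedException, so that the
    // sketch is self-contained.
    static class SkippedException extends RuntimeException {
        SkippedException(String msg) { super(msg); }
    }

    /**
     * Returns true if the attaching process must be launched through sudo,
     * false if it can attach directly. Throws SkippedException, with a
     * message naming the reason, if attaching cannot work at all.
     */
    static boolean needsSudoOrSkip(boolean isOSX, boolean isSigned,
                                   boolean isRoot, boolean canSudo) {
        if (!isOSX) {
            return false; // other platforms are out of scope for this sketch
        }
        if (isSigned) {
            throw new SkippedException(
                "SA attach not expected to work. JDK is signed.");
        }
        if (isRoot) {
            return false; // already privileged, no sudo needed
        }
        if (canSudo) {
            return true;  // attach is possible, but only through sudo
        }
        throw new SkippedException(
            "SA attach not expected to work. Insufficient privileges.");
    }
}
```

The two distinct messages are what a results viewer can show for a
skipped test; an @requires expression that evaluates to false carries
no such reason.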

The "support" related changes made are all in the following 3 files. The 
rest of the changes are in the tests:

test/jtreg-ext/requires/VMProps.java
test/lib/jdk/test/lib/Platform.java
test/lib/jdk/test/lib/SA/SATestUtils.java

You'll notice that one change I made to the sudo support in 
SATestUtils.canAddPrivileges() is to make sudo non-interactive, which 
means no password prompt. So that means either the user does not require 
a password, or the credentials have been cached. Otherwise the sudo 
check will fail. On most platforms if you execute a sudo command, the 
credentials are cached for 5 minutes. So if your user is not set up for 
passwordless sudo, then a sudo command can be issued before running the 
tests, and will likely remain cached until the test is run. The reason 
for using passwordless is that prompting in the middle of running 
tests can be confusing (you usually walk away after launching the tests 
and miss the prompt anyway), and it avoids unnecessary delays in automated 
testing due to waiting for the password prompt to time out (it used to 
wait 1 minute).
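
As a rough sketch of what a non-interactive sudo check can look like
(this is not the SATestUtils.canAddPrivileges() code from the webrev;
the class SudoCheck, the constant CHECK_COMMAND, and the 10 second
timeout are assumptions), one can run a harmless command under
"sudo -n", which makes sudo fail instead of prompting:

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; requires Java 9 or later.
public class SudoCheck {
    // "-n" tells sudo to fail instead of prompting for a password.
    static final List<String> CHECK_COMMAND = List.of("sudo", "-n", "true");

    // Returns true only if sudo can run a command without any prompt,
    // i.e. passwordless sudo is configured or credentials are cached.
    static boolean canSudoWithoutPrompt() {
        try {
            Process p = new ProcessBuilder(CHECK_COMMAND)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            if (!p.waitFor(10, TimeUnit.SECONDS)) {
                p.destroyForcibly(); // should not happen with -n, but be safe
                return false;
            }
            return p.exitValue() == 0;
        } catch (IOException e) {
            return false; // sudo is not installed or could not be started
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Running any sudo command by hand shortly before starting the tests
should make this check pass for as long as the credentials stay cached.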

There are essentially 3 types of tests that SA Attach to a process, each 
needing a slightly different fix:

1. Tests that directly launch a jdk.hotspot.agent class, such as 
TestClassDump.java. They need to call SATestUtils.checkAttachOk() to 
verify that attaching will be possible, and then 
SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if 
needed. They also need to switch from using hasSAandCanAttach to using 
hasSA.

2. Tests that launch command line tools such as jhsdb. They need to 
call SATestUtils.checkAttachOk() to verify that attaching will be 
possible, and then SATestUtils.createProcessBuilder() to create a 
process that will be launched using sudo if necessary. They also need to 
switch from using hasSAandCanAttach to using hasSA.

3. Tests that use ClhsdbLauncher. They already use hasSA instead of 
hasSAandCanAttach, and rely on ClhsdbLauncher to check at runtime 
whether attaching will work, so for the most part these tests are 
unchanged. ClhsdbLauncher was modified to take advantage of the new 
SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk() APIs.
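
The idea common to all three cases, launching the attaching process
through sudo only when needed, can be sketched as follows. This is an
illustration under assumptions, not the SATestUtils implementation: the
class PrivilegedCommand, both method signatures, and the exact sudo
options are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real helpers live in SATestUtils and may differ.
public class PrivilegedCommand {
    // Returns the command unchanged, or prefixed with a non-interactive
    // sudo invocation when extra privileges are needed.
    static List<String> withPrivilegesIfNeeded(List<String> cmd,
                                               boolean needPrivileges) {
        List<String> result = new ArrayList<>();
        if (needPrivileges) {
            result.add("sudo");
            result.add("-n"); // never prompt for a password
            result.add("--"); // end of sudo options; the command follows
        }
        result.addAll(cmd);
        return result;
    }

    // Builds a ProcessBuilder for the (possibly sudo-prefixed) command.
    static ProcessBuilder createProcessBuilder(List<String> cmd,
                                               boolean needPrivileges) {
        return new ProcessBuilder(withPrivilegesIfNeeded(cmd, needPrivileges));
    }
}
```

Only the attaching process is wrapped, so files created by the rest of
the test keep the ownership of the user running it.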

Some tests required special handling:

test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java
test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java

- These two tests SA Attach to a core file, not to a process, so only 
  need hasSA, not hasSAandCanAttach. No other changes were needed.

test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java

- The output should never be null. If the test was skipped due to lack 
  of privileges, you would never get to this section of the test.

test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java
test/hotspot/jtreg/serviceability/sa/TestIntConstant.java
test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java
test/hotspot/jtreg/serviceability/sa/TestType.java
test/hotspot/jtreg/serviceability/sa/TestUniverse.java

- These are ClhsdbLauncher tests, so they should have been using hasSA 
  instead of hasSAandCanAttach in the first place. No other changes were 
  needed.

test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java
test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java
test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java

- These tests used to "@require mac" but seem to run fine on OSX, so I 
  removed this requirement.

test/jdk/sun/tools/jhsdb/BasicLauncherTest.java

- This test had a runtime check to not run on OSX due to not having core 
  file stack walking support. However, this test always attaches to a 
  process, not a core file, and seems to run just fine on OSX.

test/jdk/sun/tools/jstack/DeadlockDetectionTest.java

- I changed the test to throw a SkippedException if it gets the 
  unexpected error code rather than just println.

And a few other miscellaneous changes not already covered:

test/lib/jdk/test/lib/Platform.java
- Made canPtraceAttachLinux() public so it can be called from SATestUtils.
- vm.hasSAandCanAttach is now gone.

thanks,

Chris


From chris.plummer at oracle.com  Tue Mar 10 04:35:10 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 9 Mar 2020 21:35:10 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
Message-ID: <980425b9-cf93-3a5a-b10d-459c0d0d692a@oracle.com>

I'll have a look at it, although it might not be for a couple of days.

Chris

On 3/9/20 5:39 AM, Roman Kennke wrote:
> Hello all,
>
> Can I please get reviews of this change? In the meantime, we've done
> more testing and also field-/torture-testing by a customer who is happy
> now. :-)
>
> Thanks,
> Roman
>
>
>> Hi Serguei,
>>
>> Thanks for reviewing!
>>
>> I updated the patch to reflect your suggestions, very good!
>> It also includes a fix to allow re-connecting an agent after disconnect,
>> namely moving the setup of the trackingEnv and deletedSignatureBag to
>> _activate() to ensure we have those structures after re-connect.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>
>> Let me know what you think!
>> Roman
>>
>>> Hi Roman,
>>>
>>> Thank you for taking care of this scalability issue!
>>>
>>> I have a couple of quick comments.
>>>
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>
>>>  72 /*
>>>  73  * Lock to protect deletedSignatureBag
>>>  74  */
>>>  75 static jrawMonitorID deletedSignatureLock;
>>>  76
>>>  77 /*
>>>  78  * A bag containing all the deleted classes' signatures. Must be accessed under
>>>  79  * deletedTagLock,
>>>  80  */
>>>  81 struct bag* deletedSignatureBag;
>>>
>>>   The comments contradict each other.
>>>   I guess the lock name at line 79 has to be deletedSignatureLock
>>> instead of deletedTagLock.
>>>   Also, the comma at the end must be replaced with a period.
>>>
>>>
>>> 101     // Tag not found? Ignore.
>>> 102     if (klass == NULL) {
>>> 103         debugMonitorExit(deletedSignatureLock);
>>> 104         return;
>>> 105     }
>>> 106
>>> 107     // Scan linked-list.
>>> 108     jlong found_tag = klass->klass_tag;
>>> 109     while (klass != NULL && found_tag != tag) {
>>> 110         klass_ptr = &klass->next;
>>> 111         klass = *klass_ptr;
>>> 112         found_tag = klass->klass_tag;
>>> 113     }
>>> 114
>>> 115     // Tag not found? Ignore.
>>> 116     if (found_tag != tag) {
>>> 117         debugMonitorExit(deletedSignatureLock);
>>> 118         return;
>>> 119     }
>>>
>>>
>>>   The code above can be simplified, so that the lines 101-105 are not
>>> needed anymore.
>>>   It can be something like this:
>>>
>>>     // Scan linked-list.
>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>         klass_ptr = &klass->next;
>>>         klass = *klass_ptr;
>>>     }
>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>         debugMonitorExit(deletedSignatureLock);
>>>         return;
>>>     }
>>>
>>> It will take more time before I get a chance to look at the rest.
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>>
>>>
>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>> Here comes an update that resolves some races that happen when
>>>> disconnecting an agent. In particular, we need to take the lock on
>>>> basically every operation, and also need to check whether or not
>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>> list) when we're not.
>>>>
>>>> Updated webrev:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>
>>>>> So, here comes the O(1) implementation:
>>>>>
>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>> set-up a listener to get notified when it is unloaded.
>>>>> - Prepared classes are kept in a datastructure that is a table, with
>>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>>> indexed by tag % slot-count, and we then simply prepend the new KlassNode*.
>>>>> This is an O(1) operation.
>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation
>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>> but not usually more. It should be ok.
>>>>> - when processUnloads() gets called, we simply hand out that bag, and
>>>>> allocate a new one.
>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>> signatures and KlassNode* etc when debug agent gets detached and/or
>>>>> re-attached (was missing before).
>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>> before).
>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>> in the future?
>>>>>
>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>> The bottleneck now is clearly actual synthesizing the class-unload
>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>
>>>>> Updated webrev:
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>
>>>>> Please let me know what you think of it.
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>
>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>
>>>>>> Thanks, Roman
>>>>>>
>>>>>>> Hi Chris,
>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>> Sure.
>>>>>>>
>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>
>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>> prepared classes: it builds that table when classTrack is initialized,
>>>>>>> and then adds new classes whenever a class gets loaded. When unloading
>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>> complexity.
>>>>>>>
>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>>>>>
>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>> true, and also reasonable to expect.
>>>>>>>
>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>> worth the effort).
>>>>>>>
>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> Issue:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>
>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>
>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>
>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>
>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>
>>>>>>>>> I am getting those numbers:
>>>>>>>>> unpatched: no debug: 84s, with debug: 225s
>>>>>>>>> patched:   no debug: 85s, with debug: 95s
>>>>>>>>>
>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>
>>>>>>>>> Can I please get a review?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>


From serguei.spitsyn at oracle.com  Tue Mar 10 09:54:02 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 10 Mar 2020 02:54:02 -0700
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <CAE_05uxC4ADo-qQaOaSAFHfUZ93uufVHZu639vvvSfeJWeNy=A@mail.gmail.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <CAE_05uzOq3t7hS-rxw5Q+dMjh4E0pZFxkXC23mRfVZVBH-aUjw@mail.gmail.com>
 <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
 <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>
 <CAE_05uzNAU0f8T8Gh9=pxd5U18xTDsWbxKkmbj5iSf2AhMWXYQ@mail.gmail.com>
 <CAE_05uxC4ADo-qQaOaSAFHfUZ93uufVHZu639vvvSfeJWeNy=A@mail.gmail.com>
Message-ID: <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>

Hi Chihiro,

Yes, I'll sponsor it.
Thank you for the update.

Thanks,
Serguei


On 3/8/20 06:05, Chihiro Ito wrote:
> Hi,
>
> I'm sorry. I included "JDK-" in the changeset title. I removed it and
> updated it.
>
> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>
> Regards,
> Chihiro
>
> On 2020/03/07 23:13, Chihiro Ito <chiroito107 at gmail.com> wrote:
>> Hi Serguei and Yasumasa,
>>
>> I updated the copyright year and created the change set.
>>
>> Could you sponsor this, please?
>>
>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>
>> Regards,
>> Chihiro
>>
>>
>> On 2020/03/07 16:03, Yasumasa Suenaga <suenaga at oss.nttdata.com> wrote:
>>
>>
>>> Hi Chihiro,
>>>
>>> I'm also ok with webrev.05 after updating copyright year.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote:
>>>> Hi Chihiro,
>>>>
>>>> I'm okay with the fix.
>>>> Could you please update the copyright date in src/java.base/share/classes/jdk/internal/vm/VMSupport.java before push?
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 3/6/20 07:24, Chihiro Ito wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> Could you review this again, please?
>>>>>
>>>>> Regards,
>>>>> Chihiro
>>>>>
>>>>>
>>>>> On 2020/02/27 22:11, Chihiro Ito <chiroito107 at gmail.com> wrote:
>>>>>> Hi Ralf,
>>>>>>
>>>>>> Thank you for your advice.
>>>>>>
>>>>>> 1.
>>>>>> The comment on serializePropertiesToByteArray in VMSupport says "The stream written to the byte array is ISO 8859-1 encoded.".
>>>>>> But the previous implementation does not honor this. I think we need to implement ISO 8859-1 encoding.
>>>>>>
>>>>>> 2.
>>>>>> According to the help text, the purpose of VM.system_properties is just "Print system properties". Users should not use this output for loading. They use it when they want a quick look at the system properties.
>>>>>>
>>>>>> Regards,
>>>>>> Chihiro
>>>>>>
>>>>>>
>>>>>> On 2020/02/26 18:53, Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
>>>>>>> Hi Chihiro,
>>>>>>>
>>>>>>> I have two remarks:
>>>>>>>
>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create an ISO Latin 1 string, it really will only create printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then sent to jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only use ASCII characters. But an ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test.
>>>>>>>
>>>>>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as:
>>>>>>> C\:\\test\\new
>>>>>>> And now it is:
>>>>>>> C:\test\new
>>>>>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters.
>>>>>>>
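Both remarks above can be checked with a few lines against
java.util.Properties itself. The sketch below is only an illustration: the
class name PropertiesEscapeSketch and the helpers storedLine() and
loadedValue() are made up for the example and have nothing to do with
VMSupport or the webrev. It stores a single property with
Properties.store(OutputStream, String) and returns the one non-comment line,
and it loads a single line back with Properties.load(Reader).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Illustrative helpers around Properties.store()/load(); hypothetical names. */
public class PropertiesEscapeSketch {
    /** Stores one property and returns the "key=value" line that store() wrote. */
    public static String storedLine(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String text = new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
        for (String line : text.split("\\R")) {
            if (!line.isEmpty() && !line.startsWith("#")) {
                return line; // skip the date comment that store() always writes
            }
        }
        throw new IllegalStateException("no property line written");
    }

    /** Loads a single properties line and returns the value for key. */
    public static String loadedValue(String line, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Escaped form written by store(): backslash and colon are escaped.
        System.out.println(storedLine("path", "C:\\test\\new"));
        // A Latin 1 character above 0x7e is written as an ASCII unicode escape.
        System.out.println(storedLine("name", "\u00DC"));
        // Loading the unescaped form turns the t and n sequences into control characters.
        System.out.println(loadedValue("path=C:\\test\\new", "path").contains("\n"));
    }
}
```

Loading what store() wrote gives back the original value, while loading the
unescaped line yields a tab and a newline inside the path, which is the
ambiguity described in remark 2.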
>>>>>>> Best regards,
>>>>>>> Ralf
>>>>>>>
>>>>>>>
>>>>>>> From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net> On Behalf Of Chihiro Ito
>>>>>>> Sent: Tuesday, February 25, 2020 04:45
>>>>>>> To: serguei.spitsyn at oracle.com
>>>>>>> Cc: serviceability-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
>>>>>>>
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> Thanks for your review and advice.
>>>>>>>
>>>>>>> I modified these.
>>>>>>> Could you review this again, please?
>>>>>>>
>>>>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
>>>>>>>
>>>>>>> Regards,
>>>>>>> Chihiro
>>>>>>>


From kevin.walls at oracle.com  Tue Mar 10 09:58:57 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Tue, 10 Mar 2020 09:58:57 +0000
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
Message-ID: <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>

Hi Yasumasa,

The changes build OK for me in the latest jdk, and things still work.
I have not yet seen the dwarf usage in action: I've tried a couple of 
different systems and so far have not reproduced the problem, i.e. 
jstack has not failed on native frames.

I may need more recent basic libraries, will look again for somewhere 
where the problem happens and get back to you as I really want to run 
the changes.

I have mostly minor other comments which don't need a new webrev, some 
just comments for the future:

src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:

DW_CFA_nop - shouldn't this continue instead of return?
(It may "never" happen, but a nop could appear within some other 
instructions?)

DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".

We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not 
DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in 
these tables never increase by 4-byte amounts, would this mean a lot of 
code on one line. 8-)
So maybe it's never used in practice, if you think it's unnecessary no 
problem, maybe a comment, or add it for robustness.
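For reference, the advance_loc family and nop could be handled along these
lines. This is a sketch in Java, not the C++ in dwarf.cpp, and it does not
claim to show what the webrev does: the class name CfaAdvanceSketch and the
method advance() are invented for the example, it assumes little-endian
operands, and it understands only these five opcodes. The opcode values are
the ones defined by the DWARF call frame instruction encoding.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative walk over a few DWARF call frame instructions; hypothetical names. */
public class CfaAdvanceSketch {
    static final int DW_CFA_NOP = 0x00;
    static final int DW_CFA_ADVANCE_LOC1 = 0x02; // 1-byte delta operand
    static final int DW_CFA_ADVANCE_LOC2 = 0x03; // 2-byte delta operand
    static final int DW_CFA_ADVANCE_LOC4 = 0x04; // 4-byte delta operand
    static final int DW_CFA_ADVANCE_LOC = 0x40;  // high two bits; delta in low six bits

    /**
     * Applies every advance instruction in insns to startLoc and returns the
     * resulting location. Deltas are scaled by the code alignment factor.
     */
    public static long advance(byte[] insns, long startLoc, long codeAlignmentFactor) {
        ByteBuffer buf = ByteBuffer.wrap(insns).order(ByteOrder.LITTLE_ENDIAN);
        long loc = startLoc;
        while (buf.hasRemaining()) {
            int op = buf.get() & 0xff;
            if ((op & 0xc0) == DW_CFA_ADVANCE_LOC) {
                loc += (op & 0x3f) * codeAlignmentFactor;
                continue;
            }
            switch (op) {
                case DW_CFA_NOP:
                    break; // padding: skip it and keep reading
                case DW_CFA_ADVANCE_LOC1:
                    loc += (buf.get() & 0xffL) * codeAlignmentFactor;
                    break;
                case DW_CFA_ADVANCE_LOC2:
                    loc += (buf.getShort() & 0xffffL) * codeAlignmentFactor;
                    break;
                case DW_CFA_ADVANCE_LOC4:
                    loc += (buf.getInt() & 0xffffffffL) * codeAlignmentFactor;
                    break;
                default:
                    throw new IllegalArgumentException("opcode not handled in this sketch: " + op);
            }
        }
        return loc;
    }
}
```

In this sketch a nop is skipped and the loop continues, so a nop in the
middle of an instruction stream would not end processing early.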


General-purpose methods like read_leb128(), get_entry_length(), 
get_decoded_value() specifically update the _buf pointer in this 
DwarfParser.

DwarfParser::process_dwarf() moves _buf.
It calls process_cie() which reads, moves _buf and restores it to the 
original position, then we read augmentation_length from where _buf is.
I'm not sure if that's wrong, or if I just need to read again about the 
CIE/etc layout.

I don't really want to suggest making the code pass around a current 
_buf for the invocation of these general purpose methods, but just 
wanted to comment that if these get used more widely that might become 
necessary.
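To illustrate what "passing around a current _buf" could look like, here is
a small LEB128 reader where the caller owns the cursor. It is a Java sketch
under stated assumptions, not the read_leb128() from dwarf.cpp: the class
name Leb128Sketch and both method names are invented for the example, and a
java.nio.ByteBuffer position plays the role of the _buf pointer.

```java
import java.nio.ByteBuffer;

/** Illustrative LEB128 decoding with an explicit cursor; hypothetical names. */
public class Leb128Sketch {
    /** Reads an unsigned LEB128 value at buf's position and advances the position. */
    public static long readUnsigned(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        while (true) {
            int b = buf.get() & 0xff;
            result |= (long) (b & 0x7f) << shift; // low seven bits carry data
            shift += 7;
            if ((b & 0x80) == 0) {                // high bit clear: last byte
                return result;
            }
        }
    }

    /** Reads a signed LEB128 value at buf's position and advances the position. */
    public static long readSigned(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        int b;
        do {
            b = buf.get() & 0xff;
            result |= (long) (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        if (shift < 64 && (b & 0x40) != 0) {
            result |= -1L << shift;               // sign-extend from the last byte
        }
        return result;
    }
}
```

A caller that wants to peek without moving the shared cursor can pass
buf.duplicate(), which has its own position, instead of saving and
restoring a member pointer.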

Similarly in future, if this DWARF support code became used more widely, 
it might want to move to an
OS-neutral directory? It's odd to label it as Linux-specific.


src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
Thanks for changing "can_parsable" which was in the earlier version. 8-)


These are just comments to mainly say it looks good, and somebody else 
out there has read it.
I will look for a system that shows the problem, and get back to you again!

Many thanks
Kevin

On 27/02/2020 05:13, Yasumasa Suenaga wrote:
> Hi all,
>
> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 
> 8239462 changes (they updated copyright year).
> So I modified webrev (only copyright year changes) to be able to apply 
> to current jdk/jdk.
> Could you review it?
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>
> I need one more reviewer to push.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>> PING: Could you review it?
>>
>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>
>> This change has been already reviewed by Serguei.
>> I need one more reviewer to push.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>> PING: Could you review this change?
>>>
>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>
>>> I believe this change helps troubleshooters doing postmortem
>>> analysis.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>> PING: Could you review it?
>>>>
>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>
>>>> I updated the webrev. I discussed it with Serguei off-list, and I
>>>> refactored webrev.02.
>>>> It has passed tests on submit repo 
>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> Thanks for your comment!
>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as 
>>>>> Dmitry said.
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>
>>>>> This change has passed all tests on the submit repo
>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> This is a nice move in general.
>>>>>> Thank you for working on this!
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html 
>>>>>>
>>>>>>
>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>     if (libptr == 0L) { // Java frame
>>>>>>       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>       if (rbp == null) {
>>>>>>         return null;
>>>>>>       }
>>>>>>       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>     } else { // Native frame
>>>>>>       DwarfParser dwarf;
>>>>>>       try {
>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>       } catch (DebuggerException e) {
>>>>>>         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>         if (rbp == null) {
>>>>>>           return null;
>>>>>>         }
>>>>>>         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>       }
>>>>>>       dwarf.processDwarf(pc);
>>>>>>       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>                      !dwarf.isBPOffsetAvailable())
>>>>>>                         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>                         : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>                                  .addOffsetTo(dwarf.getCFAOffset());
>>>>>>       if (cfa == null) {
>>>>>>         return null;
>>>>>>       }
>>>>>>       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>     }
>>>>>>
>>>>>>
>>>>>> I'd suggest to simplify the logic by refactoring to something 
>>>>>> like below:
>>>>>>
>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>     DwarfParser dwarf = null;
>>>>>>
>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>       try {
>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>         dwarf.processDwarf(pc);
>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>                   ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>                   : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>                            .addOffsetTo(dwarf.getCFAOffset());
>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>       }
>>>>>>     }
>>>>>>     if (cfa == null) {
>>>>>>       return null;
>>>>>>     }
>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>>>>
>>>>>>
>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>
>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>
>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- 
>>>>>> nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>
>>>>>>    Extra space after '-' sign.
>>>>>>
>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, 
>>>>>> ThreadContext context) {
>>>>>>
>>>>>>    It feels like the logic has to be somehow refactored/simplified, as
>>>>>>    several typical fragments appear in slightly different contexts.
>>>>>>    But it is not easy to understand what it is.
>>>>>>    Could you please add some comments to key places explaining
>>>>>>    this logic?
>>>>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>>>>
>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>       Address nextCFA;
>>>>>>       Address nextPC;
>>>>>>
>>>>>>       nextPC = getNextPC(false);
>>>>>>       if (nextPC == null) {
>>>>>>         return null;
>>>>>>       }
>>>>>>
>>>>>>       DwarfParser nextDwarf = null;
>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>         try {
>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>         } catch (DebuggerException e) {
>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>         }
>>>>>>         nextDwarf.processDwarf(nextPC);
>>>>>>       }
>>>>>>
>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>       return (nextCFA == null) ? null
>>>>>>                                : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>     }
>>>>>>
>>>>>>   The above can be simplified if a DebuggerException can not be
>>>>>>   thrown from processDwarf(nextPC):
>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>       Address nextPC = getNextPC(false);
>>>>>>       if (nextPC == null) {
>>>>>>         return null;
>>>>>>       }
>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>       DwarfParser nextDwarf = null;
>>>>>>
>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>         try {
>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>           nextDwarf.processDwarf(nextPC);
>>>>>>         } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>         }
>>>>>>       }
>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>     }
>>>>>>
>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>       ThreadContext context = thread.getContext();
>>>>>>
>>>>>>       if (dwarf == null) { // Java frame
>>>>>>         return javaSender(context);
>>>>>>       }
>>>>>>
>>>>>>       Address nextPC = getNextPC(true);
>>>>>>       if (nextPC == null) {
>>>>>>         return null;
>>>>>>       }
>>>>>>
>>>>>>       Address nextCFA;
>>>>>>       DwarfParser nextDwarf = dwarf;
>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>         if (libptr == 0L) {
>>>>>>           // Next frame might be Java frame
>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>         }
>>>>>>         try {
>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>         } catch (DebuggerException e) {
>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>         }
>>>>>>       }
>>>>>>
>>>>>>       nextDwarf.processDwarf(nextPC);
>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>     }
>>>>>>
>>>>>>   This one can also be simplified a little:
>>>>>>
>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>       ThreadContext context = thread.getContext();
>>>>>>
>>>>>>       if (dwarf == null) { // Java frame
>>>>>>         return javaSender(context);
>>>>>>       }
>>>>>>       Address nextPC = getNextPC(true);
>>>>>>       if (nextPC == null) {
>>>>>>         return null;
>>>>>>       }
>>>>>>       DwarfParser nextDwarf = null;
>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>         if (libptr != 0L) {
>>>>>>           try {
>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>           }
>>>>>>         }
>>>>>>       }
>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>     }
>>>>>>
>>>>>> Finally, it looks like just one method could replace both
>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>
>>>>>>     private CFrame commonSender(ThreadProxy thread) {
>>>>>>       ThreadContext context = thread.getContext();
>>>>>>       Address nextPC = getNextPC(false);
>>>>>>       if (nextPC == null) {
>>>>>>         return null;
>>>>>>       }
>>>>>>       DwarfParser nextDwarf = null;
>>>>>>
>>>>>>       if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>         if (libptr != 0L) {
>>>>>>           try {
>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>           }
>>>>>>         }
>>>>>>       }
>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>     }
>>>>>>
>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in 
>>>>>>> serviceability/sa tests and
>>>>>>> all tests on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>> Could you review new webrev?
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>
>>>>>>> The diff from previous webrev is here:
>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Please review this change:
>>>>>>>>
>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>
>>>>>>>>
>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application 
>>>>>>>> Binary Interface AMD64
>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in 
>>>>>>>> .eh_frame or .debug_frame
>>>>>>>> for stack unwinding.
>>>>>>>>
>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default 
>>>>>>>> since GCC 4.6, so system
>>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>>
>>>>>>>> However, `jhsdb jstack --mixed` does not do so; it uses the base
>>>>>>>> pointer register (RBP).
>>>>>>>> So some stack frames might be missing.
>>>>>>>>
>>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] 
>>>>>>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>

From suenaga at oss.nttdata.com  Tue Mar 10 12:36:38 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 10 Mar 2020 21:36:38 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
Message-ID: <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>

Hi Kevin,

Thanks for your comment!

On 2020/03/10 18:58, Kevin Walls wrote:
> Hi Yasumasa,
> 
> The changes build OK for me in the latest jdk, and things still work.
> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
> 
> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.

You can see the problem with JShell.
Some Java frames would not be seen in mixed jstack.


> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
> 
> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
> 
> DW_CFA_nop - shouldn't this continue instead of return?
> (It may "never" happen, but a nop could appear within some other instructions?)

DW_CFA_nop is used for padding, so we can ignore it (return immediately).


> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".

I will fix it.


> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, would this mean a lot of code on one line. 8-)
> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness.

I will add DW_CFA_advance_loc4.


> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
> 
> DwarfParser::process_dwarf() moves _buf.
> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
> 
> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.

I looked at the GDB and binutils sources when creating this patch.
They seem to use similar code, because we need to evaluate DWARF instructions one by one to get the value that relates to the specified PC.


> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
> OS-neutral directory? It's odd to label it as Linux-specific.

Windows does not use DWARF at least, it uses another feature.

   https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019

I'm not sure whether other platforms (Solaris, macOS) use DWARF.
If they do, I can move the DWARF-related code to a posix directory.


> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
> Thanks for changing "can_parsable" which was in the earlier version. 8-)
> 
> 
> These are just comments to mainly say it looks good, and somebody else out there has read it.
> I will look for a system that shows the problem, and get back to you again!


Thanks,

Yasumasa


> Many thanks
> Kevin
> 
> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year).
>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk.
>> Could you review it?
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>
>> I need one more reviewer to push.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>> PING: Could you review it?
>>>
>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>
>>> This change has been already reviewed by Serguei.
>>> I need one more reviewer to push.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>> PING: Could you review this change?
>>>>
>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>
>>>> I believe this change helps troubleshooters doing postmortem analysis.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>> PING: Could you review it?
>>>>>
>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>
>>>>> I updated the webrev. I discussed it with Serguei off-list, and I refactored webrev.02.
>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>> Hi Serguei,
>>>>>>
>>>>>> Thanks for your comment!
>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>
>>>>>> This change has passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> This is a nice move in general.
>>>>>>> Thank you for working on this!
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>
>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>     if (libptr == 0L) { // Java frame
>>>>>>>       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>       if (rbp == null) {
>>>>>>>         return null;
>>>>>>>       }
>>>>>>>       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>     } else { // Native frame
>>>>>>>       DwarfParser dwarf;
>>>>>>>       try {
>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>       } catch (DebuggerException e) {
>>>>>>>         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>         if (rbp == null) {
>>>>>>>           return null;
>>>>>>>         }
>>>>>>>         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>       }
>>>>>>>       dwarf.processDwarf(pc);
>>>>>>>       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>                      !dwarf.isBPOffsetAvailable())
>>>>>>>                         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>                         : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>                                  .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>       if (cfa == null) {
>>>>>>>         return null;
>>>>>>>       }
>>>>>>>       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>     }
>>>>>>>
>>>>>>>
>>>>>>> I'd suggest simplifying the logic by refactoring to something like the following:
>>>>>>>
>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>     DwarfParser dwarf = null;
>>>>>>>
>>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>>       try {
>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>         dwarf.processDwarf(pc);
>>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>>           ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>           : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>                    .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>       }
>>>>>>>     }
>>>>>>>     if (cfa == null) {
>>>>>>>       return null;
>>>>>>>     }
>>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>
>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>
>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>
>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>
>>>>>>>    Extra space after '-' sign.
>>>>>>>
>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>
>>>>>>>    It feels like the logic has to be somehow refactored/simplified as
>>>>>>>    several typical fragments appear in slightly different contexts.
>>>>>>>    But it is not easy to understand what it is.
>>>>>>>    Could you please add some comments at key places explaining this logic?
>>>>>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>>>>>
>>>>>>> 109   private CFrame javaSender(ThreadContext context) {
>>>>>>> 110     Address nextCFA;
>>>>>>> 111     Address nextPC;
>>>>>>> 112
>>>>>>> 113     nextPC = getNextPC(false);
>>>>>>> 114     if (nextPC == null) {
>>>>>>> 115       return null;
>>>>>>> 116     }
>>>>>>> 117
>>>>>>> 118     DwarfParser nextDwarf = null;
>>>>>>> 119     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>> 120     if (libptr != 0L) { // Native frame
>>>>>>> 121       try {
>>>>>>> 122         nextDwarf = new DwarfParser(libptr);
>>>>>>> 123       } catch (DebuggerException e) {
>>>>>>> 124         nextCFA = getNextCFA(null, context);
>>>>>>> 125         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>> 126       }
>>>>>>> 127       nextDwarf.processDwarf(nextPC);
>>>>>>> 128     }
>>>>>>> 129
>>>>>>> 130     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>> 131     return (nextCFA == null) ? null
>>>>>>> 132                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>> 133   }
>>>>>>>
>>>>>>>   The above can be simplified if a DebuggerException cannot be thrown from processDwarf(nextPC):
>>>>>>>
>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>       if (nextPC == null) {
>>>>>>>         return null;
>>>>>>>       }
>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>
>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>         try {
>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>           nextDwarf.processDwarf(nextPC);
>>>>>>>         } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>         }
>>>>>>>       }
>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>     }
>>>>>>>
>>>>>>> 135   public CFrame sender(ThreadProxy thread) {
>>>>>>> 136     ThreadContext context = thread.getContext();
>>>>>>> 137
>>>>>>> 138     if (dwarf == null) { // Java frame
>>>>>>> 139       return javaSender(context);
>>>>>>> 140     }
>>>>>>> 141
>>>>>>> 142     Address nextPC = getNextPC(true);
>>>>>>> 143     if (nextPC == null) {
>>>>>>> 144       return null;
>>>>>>> 145     }
>>>>>>> 146
>>>>>>> 147     Address nextCFA;
>>>>>>> 148     DwarfParser nextDwarf = dwarf;
>>>>>>> 149     if (!dwarf.isIn(nextPC)) {
>>>>>>> 150       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>> 151       if (libptr == 0L) {
>>>>>>> 152         // Next frame might be Java frame
>>>>>>> 153         nextCFA = getNextCFA(null, context);
>>>>>>> 154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>> 155       }
>>>>>>> 156       try {
>>>>>>> 157         nextDwarf = new DwarfParser(libptr);
>>>>>>> 158       } catch (DebuggerException e) {
>>>>>>> 159         nextCFA = getNextCFA(null, context);
>>>>>>> 160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>> 161       }
>>>>>>> 162     }
>>>>>>> 163
>>>>>>> 164     nextDwarf.processDwarf(nextPC);
>>>>>>> 165     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>> 166     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>> 167   }
>>>>>>>
>>>>>>>   This one can also be simplified a little:
>>>>>>>
>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>
>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>         return javaSender(context);
>>>>>>>       }
>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>       if (nextPC == null) {
>>>>>>>         return null;
>>>>>>>       }
>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>         if (libptr != 0L) {
>>>>>>>           try {
>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>           }
>>>>>>>         }
>>>>>>>       }
>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>     }
>>>>>>>
>>>>>>> Finally, it looks like just one method could replace both
>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>
>>>>>>>     private CFrame commonSender(ThreadProxy thread) {
>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>       if (nextPC == null) {
>>>>>>>         return null;
>>>>>>>       }
>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>
>>>>>>>       if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>         if (libptr != 0L) {
>>>>>>>           try {
>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>           }
>>>>>>>         }
>>>>>>>       }
>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>     }
>>>>>>>
>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and
>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>> Could you review new webrev?
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>
>>>>>>>> The diff from previous webrev is here:
>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review this change:
>>>>>>>>>
>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>> for stack unwinding.
>>>>>>>>>
>>>>>>>>> As JDK-8022183 says, omit-frame-pointer has been enabled by default since GCC 4.6,
>>>>>>>>> so system libraries (e.g. libc) might be compiled with this option.
>>>>>>>>>
>>>>>>>>> However, `jhsdb jstack --mixed` does not use DWARF; it relies on the base pointer
>>>>>>>>> register (RBP), so some stack frames may be missing from its output.
>>>>>>>>>
>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>
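
Editorial sketch: the CFA selection rule in the LinuxCDebugger snippet quoted above (use RBP directly when the CFA register is RBP and no base pointer offset is available, otherwise take the CFA register plus the CFA offset) can be modeled in isolation. This is only a model under stated assumptions: CfaSketch, computeCfa and the register map are hypothetical names, the register numbers are the DWARF numbers for x86-64 rather than the AMD64ThreadContext constants, and none of this is the Serviceability Agent API.

```java
import java.util.Map;

/** Illustrative model only; not the Serviceability Agent API. */
final class CfaSketch {
    // DWARF register numbers for x86-64 (System V ABI); assumed for this model.
    static final int RBP = 6;
    static final int RSP = 7;

    /**
     * Mirrors the ternary in the quoted snippet. Returns null when the
     * register value needed for the computation is unknown.
     */
    static Long computeCfa(Map<Integer, Long> regs, int cfaRegister,
                           long cfaOffset, boolean bpOffsetAvailable) {
        if (cfaRegister == RBP && !bpOffsetAvailable) {
            // Frame-pointer style frame: RBP itself is used as the CFA.
            return regs.get(RBP);
        }
        Long base = regs.get(cfaRegister);
        return (base == null) ? null : base + cfaOffset;
    }

    public static void main(String[] args) {
        Map<Integer, Long> regs = Map.of(RBP, 0x7000L, RSP, 0x6ff0L);
        // CFA described relative to RSP, as in code built with omit-frame-pointer.
        System.out.println(Long.toHexString(computeCfa(regs, RSP, 16, false)));
        // CFA register is RBP and no base pointer offset is recorded.
        System.out.println(Long.toHexString(computeCfa(regs, RBP, 16, false)));
    }
}
```

Unlike the quoted code, the model checks the base register for null before adding the offset; whether the real code needs that guard depends on what getRegisterAsAddress can return.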

From serguei.spitsyn at oracle.com  Tue Mar 10 22:55:53 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 10 Mar 2020 15:55:53 -0700
Subject: RFR: JDK-8240340: java/lang/management/ThreadMXBean/Locks.java is
 buggy
In-Reply-To: <e96230f2-e689-509d-5c79-bb0412cddb1c@oracle.com>
References: <a5f700fb-4955-aaed-0312-367d4a7d2460@oracle.com>
 <4c9b6f1b-1308-9087-ef4c-140eaf103b0f@oracle.com>
 <9a3d8f8c-5c46-aa64-ba22-b1b2bf1836ee@oracle.com>
 <8d0f40d7-4202-0179-d130-1366c77e5c05@oracle.com>
 <1082f12c-102f-fd63-f8a4-7b623944ee03@oracle.com>
 <e96230f2-e689-509d-5c79-bb0412cddb1c@oracle.com>
Message-ID: <7c4b037c-384d-2137-f42d-9f31390c9f15@oracle.com>

Hi Alex,

The update looks good.

Thanks,
Serguei


On 3/9/20 12:15, Alex Menkov wrote:
>
> Updated webrev:
> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev.02/ 
>
>
> Changes are in LockFreeLogger comments only.
>
> --alex
>
> On 03/08/2020 21:19, David Holmes wrote:
>> P.S.
>>
>> Forgot to note however that you need to update the documentation for 
>> the logger now as the mention of "per-thread logs" makes no sense 
>> now. Also in the spirit of not using @author, and because this is no 
>> longer the code created by Jaroslav, please delete the @author line.
>>
>> Thanks,
>> David
>>
>> On 9/03/2020 2:15 pm, David Holmes wrote:
>>> Hi Alex,
>>>
>>> On 6/03/2020 4:54 am, Alex Menkov wrote:
>>>> Hi David,
>>>>
>>>> Thank you for the review.
>>>>
>>>> On 03/04/2020 17:50, David Holmes wrote:
>>>>> Hi Alex,
>>>>>
>>>>> On 5/03/2020 10:30 am, Alex Menkov wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> please review the fix for
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240340
>>>>>> webrev:
>>>>>> http://cr.openjdk.java.net/~amenkov/jdk15/ThreadMXBean_Locks_test/webrev/ 
>>>>>>
>>>>>>
>>>>>> changes:
>>>>>> - assertThreadState method: don't re-read the thread state when throwing
>>>>>> the exception (as we got a weird error like "Thread WaitingThread is at
>>>>>> WAITING state but is expected to be in Thread.State = WAITING");
>>>>>> - added proper test shutdown on error (made all threads "daemon", 
>>>>>> interrupt waiting thread if CheckerThread throws exception);
>>>>>> - if CheckerThread detects error, propagate the exception to main 
>>>>>> thread;
>>>>>
>>>>> The test changes seem fine.
>>>>>
>>>>>> - fixed LockFreeLogger class - it should work for logging from 
>>>>>> several threads, but it doesn't. I prefer to simplify it just to 
>>>>>> keep ConcurrentLinkedQueue<String>.
>>>>>> LockFreeLogger is also used by ThreadMXBeanStateTest test, but 
>>>>>> only by a single thread.
>>>>>
>>>>> I don't understand your changes here as you've completely changed 
>>>>> the intended design of the logger. The original accumulates log 
>>>>> entries per-thread and then spits them all out (though I'm not 
>>>>> clear on the exact ordering - I don't know how to read that stream 
>>>>> stuff). The new code just creates a single queue of log records 
>>>>> interleaving entries from different threads. The simple logger may 
>>>>> be all that is needed but it seems quite different to the intent 
>>>>> of the original.
>>>>
>>>> While testing my changes to the test, I discovered that something is
>>>> wrong with the logger - it printed only part of the records, so I
>>>> had to look at the LockFreeLogger class, and I don't understand how
>>>> it was supposed to work.
>>>> About ordering in the cumulative log: each record has an Integer that
>>>> is used to sort log entries from all threads (i.e. records from
>>>> different threads are printed in the order in which log() was called).
>>>> Looking at the allRecords/records stuff, I don't understand how it
>>>> should be used. To get logs from different threads in one logger,
>>>> we need one instance. So we create a LockFreeLogger (in the main thread),
>>>> and the ctor creates the ThreadLocal records and registers them in allRecords.
>>>> Logging from the main thread works fine, but if any other thread tries
>>>> to log, the 1st log() call creates its own ThreadLocal records (by
>>>> records.get()) and log records from this thread go there. But these
>>>> ThreadLocal records are not registered in allRecords, so this
>>>> logging won't be included in the final log.
>>>> It looks like we need to change log() to something like
>>>>
>>>> Map<Integer, String> recs = records.get();
>>>> if (recs.isEmpty()) {
>>>>     allRecords.add(recs);
>>>> }
>>>> recs.put(id, String.format(format, params));
>>>
>>> Yep good catch - this logger was completely broken.
>>>
>>>> But all this stuff does exactly the same as a simple
>>>> ConcurrentLinkedQueue (i.e. a lock-free ordered list).
>>>> At least I don't see any other rationale for it.
>>>
>>> I'm not certain of intent with the original but I'd always want to 
>>> see log entries in chronological order - which is what we now 
>>> clearly have.
>>>
>>> Thanks,
>>> David
>>>
>>>> --alex
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> --alex
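
Editorial sketch: the simplification discussed in this thread (replace the per-thread record maps with a single ConcurrentLinkedQueue<String>) can be illustrated roughly as follows. QueueLoggerSketch and its methods are hypothetical names chosen for this example; this is not the code from the webrev, only a minimal model of the idea, assuming entries are formatted with String.format as in the quoted log() fragment.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical model of a queue-backed logger; not the test library class. */
final class QueueLoggerSketch {
    // One shared lock-free queue: entries from all threads are kept
    // in the order in which they were added.
    private final ConcurrentLinkedQueue<String> records = new ConcurrentLinkedQueue<>();

    void log(String format, Object... params) {
        records.add(String.format(format, params));
    }

    int size() {
        return records.size();
    }

    @Override
    public String toString() {
        return String.join("", records);
    }

    /** Logs from several threads into one logger and returns the entry count. */
    static int demo(int threads, int perThread) {
        QueueLoggerSketch logger = new QueueLoggerSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    logger.log("thread %d entry %d%n", id, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return logger.size();
    }

    public static void main(String[] args) {
        System.out.println("entries logged: " + demo(4, 100));
    }
}
```

Because every thread appends to the same queue, no per-thread registration step exists that a second thread could miss, which is the failure mode described for the original allRecords/records design.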


From kevin.walls at oracle.com  Tue Mar 10 23:53:21 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Tue, 10 Mar 2020 16:53:21 -0700 (PDT)
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
Message-ID: <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>

Hi -

In testing I wasn't seeing any of the Dwarf code triggered.

With LIBSAPROC_DEBUG set I'm getting the "Could not find executable 
section in" for lots of / maybe all the libraries...

src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c

    if (fill_instr_info(newlib)) {
      if (!read_eh_frame(ph, newlib)) {

fill_instr_info is failing, and we never get to read_eh_frame().

output like:

libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
libsaproc DEBUG: Could not find executable section in 
/lib/x86_64-linux-gnu/libnss_nis-2.27.so

(similar for all libraries).

fill_instr_info fails if:

  if ((lib->exec_start == 0L) || (lib->exec_end == 0L))

...but isn't exec_start relative to the library address? It's the value 
of ph->p_vaddr and it is often zero.

I added some booleans and did:

185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
186         lib->exec_start = ph->p_vaddr;
187         found_start = true;
188       }

(similarly for end) and only failed if:

201   if (!found_start || !found_end) {
202     return false;

...and now it's better.  I go from:

----------------- 3306 -----------------
0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d

to:

----------------- 31127 -----------------
0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
0x00007fa2857a74ed      ContinueInNewThread + 0x4d
0x00007fa2857a8c49      JLI_Launch + 0x1529
0x000055af1b78db1c      main + 0x11c


Thanks
Kevin


On 10/03/2020 12:36, Yasumasa Suenaga wrote:

> Hi Kevin,
>
> Thanks for your comment!
>
> On 2020/03/10 18:58, Kevin Walls wrote:
>> Hi Yasumasa ,
>>
>> The changes build OK for me in the latest jdk, and things still work.
>> I have not yet seen the dwarf usage in action: I've tried a couple of 
>> different systems and so far have not reproduced the problem, i.e. 
>> jstack has not failed on native frames.
>>
>> I may need more recent basic libraries, will look again for somewhere 
>> where the problem happens and get back to you as I really want to run 
>> the changes.
>
> You can see the problem with JShell.
> Some Java frames would not be seen in mixed jstack.
>
>
>> I have mostly minor other comments which don't need a new webrev, 
>> some just comments for the future:
>>
>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>
>> DW_CFA_nop - shouldn't this continue instead of return?
>> (It may "never" happen, but a nop could appear within some other 
>> instructions?)
>
> DW_CFA_nop is used for padding, so we can ignore it (return immediately).
>
>
>> DW_CFA_remember_state: a minor typo in the comment, 
>> "DW_CFA_remenber_state".
>
> I will fix it.
>
>
>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not 
>> DW_CFA_advance_loc4.  I thought that was odd, but maybe addresses in 
>> these tables never increase by 4-byte amounts, would this mean a lot 
>> of code on one line. 8-)
>> So maybe it's never used in practice, if you think it's unnecessary 
>> no problem, maybe a comment, or add it for robustness.
>
> I will add DW_CFA_advance_loc4.
>
>
>> General-purpose methods like read_leb128(), get_entry_length(), 
>> get_decoded_value() specifically update the _buf pointer in this 
>> DwarfParser.
>>
>> DwarfParser::process_dwarf() moves _buf.
>> It calls process_cie() which reads, moves _buf and restores it to the 
>> original position, then we read augmentation_length from where _buf is.
>> I'm not sure if that's wrong, or if I just need to read again about 
>> the CIE/etc layout.
>>
>> I don't really want to suggest making the code pass around a current 
>> _buf for the invocation of these general purpose methods, but just 
>> wanted to comment that if these get used more widely that might 
>> become necessary.
>
> I referred to the GDB and binutils sources when creating this patch.
> They seem to do similar processing, because we need to evaluate the DWARF
> instructions one by one to get the values that relate to the specified PC.
>
>
>> Similarly in future, if this DWARF support code became used more 
>> widely, it might want to move to an
>> OS-neutral directory?  It's odd to label it as Linux-specific.
>
> At least Windows does not use DWARF; it uses another mechanism.
>
> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>
> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
> If DWARF is used on them, I can move the DWARF-related code to a posix
> directory.
>
>
>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>
>>
>> These are just comments to mainly say it looks good, and somebody 
>> else out there has read it.
>> I will look for a system that shows the problem, and get back to you 
>> again!
>
>
> Thanks,
>
> Yasumasa
>
>
>> Many thanks
>> Kevin
>>
>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 
>>> 8239462 changes (they updated copyright year).
>>> So I modified webrev (only copyright year changes) to be able to 
>>> apply to current jdk/jdk.
>>> Could you review it?
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>
>>> I need one more reviewer to push.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>> PING: Could you review it?
>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>> ?? webrev: 
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>
>>>> This change has been already reviewed by Serguei.
>>>> I need one more reviewer to push.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>> PING: Could you review this change?
>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>    webrev: 
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>
>>>>> I believe this change helps troubleshooters with postmortem
>>>>> analysis.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>> PING: Could you review it?
>>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>    webrev: 
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>
>>>>>> I updated the webrev. I discussed it with Serguei off-list and
>>>>>> refactored webrev.02.
>>>>>> It has passed tests on submit repo 
>>>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> Thanks for your comment!
>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as 
>>>>>>> Dmitry said.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>
>>>>>>> This change has been passed all tests on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> This is nice move in general.
>>>>>>>> Thank you for working on this!
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html 
>>>>>>>>
>>>>>>>>
>>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 
>>>>>>>> 0L) { // Java frame 98 Address rbp = 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if 
>>>>>>>> (rbp == null) { 100 return null; 101 } 102 return new 
>>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native 
>>>>>>>> frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new 
>>>>>>>> DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 
>>>>>>>> Address rbp = 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if 
>>>>>>>> (rbp == null) { 110 return null; 111 } 112 return new 
>>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 
>>>>>>>> dwarf.processDwarf(pc); 115 Address cfa = 
>>>>>>>> ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 
>>>>>>>> !dwarf.isBPOffsetAvailable()) 117 ? 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : 
>>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) 119 
>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 
>>>>>>>> return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, 
>>>>>>>> pc, dwarf); 124 }
>>>>>>>>
>>>>>>>>
>>>>>>>> I'd suggest to simplify the logic by refactoring to something 
>>>>>>>> like below:
>>>>>>>>
>>>>>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>> ?????????? Address cfa = 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java 
>>>>>>>> frame
>>>>>>>> ?????????? DwarfParser dwarf = null;
>>>>>>>>
>>>>>>>> ?????????? if (libptr != 0L) { // Native frame
>>>>>>>> ???????????? try {
>>>>>>>> ?????????????? dwarf = new DwarfParser(libptr);
>>>>>>>> ?????????????? dwarf.processDwarf(pc);
>>>>>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == 
>>>>>>>> AMD64ThreadContext.RBP) &&
>>>>>>>> !dwarf.isBPOffsetAvailable())
>>>>>>>> ???????????????????????????????? ? 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>> ???????????????????????????????? : 
>>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>> .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>
>>>>>>>> ??????????? } catch (DebuggerException e) { // bail out to Java 
>>>>>>>> frame case
>>>>>>>> ??????????? }
>>>>>>>> ????????? }
>>>>>>>> ????????? if (cfa == null) {
>>>>>>>> ??????????? return null;
>>>>>>>> ????????? }
>>>>>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>>>>>>
>>>>>>>>
>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>
>>>>>>>> ?? Better to rename 'ofs' => 'offs'.
>>>>>>>>
>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- 
>>>>>>>> nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>
>>>>>>>> ?? Extra space after '-' sign.
>>>>>>>>
>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, 
>>>>>>>> ThreadContext context) {
>>>>>>>>
>>>>>>>> ?? It feels like the logic has to be somehow 
>>>>>>>> refactored/simplified as
>>>>>>>> ?? several typical fragments appears in slightly different 
>>>>>>>> contexts.
>>>>>>>> ?? But it is not easy to understand what it is.
>>>>>>>> ?? Could you, please, add some comments to key places 
>>>>>>>> explaining this logic.
>>>>>>>> ?? Then I'll check if it is possible to make it a little bit 
>>>>>>>> simpler.
>>>>>>>>
>>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 
>>>>>>>> Address nextCFA; 111 Address nextPC; 112 113 nextPC = 
>>>>>>>> getNextPC(false); 114 if (nextPC == null) { 115 return null; 
>>>>>>>> 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = 
>>>>>>>> dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // 
>>>>>>>> Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 
>>>>>>>> 123 } catch (DebuggerException e) { 124 nextCFA = 
>>>>>>>> getNextCFA(null, context); 125 return (nextCFA == null) ? null 
>>>>>>>> : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 
>>>>>>>> nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = 
>>>>>>>> getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? 
>>>>>>>> null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, 
>>>>>>>> nextDwarf); 133 }
>>>>>>>>
>>>>>>>> ??The above can be simplified if a DebuggerException can not be 
>>>>>>>> thrown from processDwarf(nextPC):
>>>>>>>> ????? private CFrame javaSender(ThreadContext context) {
>>>>>>>> ??????? Address nextPC = getNextPC(false);
>>>>>>>> ??????? if (nextPC == null) {
>>>>>>>> ????????? return null;
>>>>>>>> ??????? }
>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>> ??????? DwarfParser nextDwarf = null;
>>>>>>>>
>>>>>>>> ??????? if (libptr != 0L) { // Native frame
>>>>>>>> ????????? try {
>>>>>>>> ??????????? nextDwarf = new DwarfParser(libptr);
>>>>>>>> ??????????? nextDwarf.processDwarf(nextPC);
>>>>>>>> ????????? } catch (DebuggerException e) { // Bail out to Java 
>>>>>>>> frame
>>>>>>>> ????????? }
>>>>>>>> ??????? }
>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>> ??????? return (nextCFA == null) ? null : new 
>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>> ????? }
>>>>>>>>
>>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 
>>>>>>>> ThreadContext context = thread.getContext(); 137 138 if (dwarf 
>>>>>>>> == null) { // Java frame 139 return javaSender(context); 140 } 
>>>>>>>> 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == 
>>>>>>>> null) { 144 return null; 145 } 146 147 Address nextCFA; 148 
>>>>>>>> DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 
>>>>>>>> 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if 
>>>>>>>> (libptr == 0L) { 152 // Next frame might be Java frame 153 
>>>>>>>> nextCFA = getNextCFA(null, context); 154 return (nextCFA == 
>>>>>>>> null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, 
>>>>>>>> null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 
>>>>>>>> 158 } catch (DebuggerException e) { 159 nextCFA = 
>>>>>>>> getNextCFA(null, context); 160 return (nextCFA == null) ? null 
>>>>>>>> : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 
>>>>>>>> 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = 
>>>>>>>> getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? 
>>>>>>>> null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 
>>>>>>>> 167 }
>>>>>>>>
>>>>>>>> ??This one can be also simplified a little:
>>>>>>>>
>>>>>>>> ????? public CFrame sender(ThreadProxy thread) {
>>>>>>>> ??????? ThreadContext context = thread.getContext();
>>>>>>>>
>>>>>>>> ??????? if (dwarf == null) { // Java frame
>>>>>>>> ????????? return javaSender(context);
>>>>>>>> ??????? }
>>>>>>>> ??????? Address nextPC = getNextPC(true);
>>>>>>>> ??????? if (nextPC == null) {
>>>>>>>> ????????? return null;
>>>>>>>> ??????? }
>>>>>>>> ??????? DwarfParser nextDwarf = null;
>>>>>>>> ??????? if (!dwarf.isIn(nextPC)) {
>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>> ????????? if (libptr != 0L) {
>>>>>>>> ??????????? try {
>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr);
>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC);
>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java 
>>>>>>>> frame
>>>>>>>> ??????????? }
>>>>>>>> ????????? }
>>>>>>>> ??????? }
>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>> ??????? return (nextCFA == null) ? null : new 
>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>> ????? }
>>>>>>>>
>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>
>>>>>>>> ????? private CFrame commonSender(ThreadProxy thread) {
>>>>>>>> ??????? ThreadContext context = thread.getContext();
>>>>>>>> ??????? Address nextPC = getNextPC(false);
>>>>>>>> ??????? if (nextPC == null) {
>>>>>>>> ????????? return null;
>>>>>>>> ??????? }
>>>>>>>> ??????? DwarfParser nextDwarf = null;
>>>>>>>>
>>>>>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>> ????????? if (libptr != 0L) {
>>>>>>>> ??????????? try {
>>>>>>>> ????????????? nextDwarf = new DwarfParser(libptr);
>>>>>>>> ????????????? nextDwarf.processDwarf(nextPC);
>>>>>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java 
>>>>>>>> frame
>>>>>>>> ??????????? }
>>>>>>>> ????????? }
>>>>>>>> ??????? }
>>>>>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>> ??????? return (nextCFA == null) ? null : new 
>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>> ????? }
>>>>>>>>
>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in 
>>>>>>>>> serviceability/sa tests and
>>>>>>>>> all tests on submit repo 
>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>> Could you review new webrev?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>
>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Please review this change:
>>>>>>>>>>
>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V 
>>>>>>>>>> Application Binary Interface AMD64
>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF 
>>>>>>>>>> in .eh_frame or .debug_frame
>>>>>>>>>> for stack unwinding.
>>>>>>>>>>
>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default 
>>>>>>>>>> since GCC 4.6, so system
>>>>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>>>>
>>>>>>>>>> However, `jhsdb jstack --mixed` does not do so; it uses the base
>>>>>>>>>> pointer register (RBP), so it might miss stack frames.
>>>>>>>>>>
>>>>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>
>>>>>>>>

From serguei.spitsyn at oracle.com  Wed Mar 11 01:07:51 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 10 Mar 2020 18:07:51 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
Message-ID: <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200310/55e44304/attachment.htm>

From chris.plummer at oracle.com  Wed Mar 11 01:57:01 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 10 Mar 2020 18:57:01 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
Message-ID: <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200310/543a888f/attachment-0001.htm>

From suenaga at oss.nttdata.com  Wed Mar 11 02:07:48 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Mar 2020 11:07:48 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
Message-ID: <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>

Hi Kevin,

I guess the first program header in the libraries on your machine has the exec flag set (you can check it with `readelf -l`).
So I tweaked my patch in a new webrev: the initial values of exec_start and exec_end are now set to -1.
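
To illustrate why the initial value matters, here is a minimal sketch in C. It is not the code from the webrev: the `segment` and `exec_range` types, the `NOT_FOUND` macro, and `find_exec_range()` are hypothetical names made up for this example, and the real code reads ELF program headers instead. The point it shows is that with -1 as the "not found yet" marker, an executable segment whose vaddr is 0 is still accepted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one ELF program header (illustration only). */
typedef struct {
  uintptr_t vaddr;      /* like p_vaddr: often 0 for the first segment */
  uintptr_t memsz;      /* like p_memsz */
  bool      executable; /* like (p_flags & PF_X) != 0 */
} segment;

/* Hypothetical result holder; not the real library structure in libsaproc. */
typedef struct {
  uintptr_t exec_start;
  uintptr_t exec_end;
} exec_range;

/* -1 is used as the "not found yet" marker, so that 0 stays a valid address. */
#define NOT_FOUND ((uintptr_t)-1)

/* Records the lowest start and the highest end of all executable segments.
 * Returns false only if no executable segment was seen at all. */
bool find_exec_range(const segment *segs, size_t count, exec_range *out) {
  out->exec_start = NOT_FOUND;
  out->exec_end = NOT_FOUND;
  for (size_t i = 0; i < count; i++) {
    if (!segs[i].executable) {
      continue;
    }
    uintptr_t start = segs[i].vaddr;
    uintptr_t end = segs[i].vaddr + segs[i].memsz;
    if (out->exec_start == NOT_FOUND || start < out->exec_start) {
      out->exec_start = start;
    }
    if (out->exec_end == NOT_FOUND || end > out->exec_end) {
      out->exec_end = end;
    }
  }
  return out->exec_start != NOT_FOUND && out->exec_end != NOT_FOUND;
}
```

With a first segment like the one in your debug output (vaddr = 0x0, memsz = 0xaba4) marked executable, this sketch reports the range 0x0 to 0xaba4 instead of failing.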

   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/

This webrev also contains the fixes for your comments (the typo and DW_CFA_advance_loc4).
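
For reference, the advance_loc family can be decoded roughly as in the sketch below. This is only an illustration under stated assumptions, not the code in dwarf.cpp: `decode_advance_loc()` and its parameters are hypothetical names, operands are assumed to be little-endian (as on AMD64), and the opcode values are the ones given in the DWARF standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as defined by the DWARF standard. */
enum {
  DW_CFA_advance_loc  = 0x40, /* high 2 bits; the delta is in the low 6 bits */
  DW_CFA_advance_loc1 = 0x02, /* followed by a 1-byte delta */
  DW_CFA_advance_loc2 = 0x03, /* followed by a 2-byte delta */
  DW_CFA_advance_loc4 = 0x04  /* followed by a 4-byte delta */
};

/* Hypothetical helper: decodes one advance_loc style instruction at buf and
 * adds (delta * code_align) to *loc.
 * Returns the number of bytes consumed, or 0 for any other opcode. */
size_t decode_advance_loc(const unsigned char *buf, uint64_t code_align,
                          uint64_t *loc) {
  unsigned char op = buf[0];
  uint64_t delta;
  size_t used;

  if ((op & 0xc0) == DW_CFA_advance_loc) {
    delta = op & 0x3f;
    used = 1;
  } else if (op == DW_CFA_advance_loc1) {
    delta = buf[1];
    used = 2;
  } else if (op == DW_CFA_advance_loc2) {
    delta = (uint64_t)buf[1] | ((uint64_t)buf[2] << 8);
    used = 3;
  } else if (op == DW_CFA_advance_loc4) {
    delta = (uint64_t)buf[1] | ((uint64_t)buf[2] << 8) |
            ((uint64_t)buf[3] << 16) | ((uint64_t)buf[4] << 24);
    used = 5;
  } else {
    return 0; /* not an advance_loc instruction */
  }
  *loc += delta * code_align;
  return used;
}
```

The 4-byte form only differs from _loc1 and _loc2 in the operand width, so handling it costs one more branch.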


Thanks,

Yasumasa


On 2020/03/11 8:53, Kevin Walls wrote:
> Hi -
> 
> In testing I wasn't seeing any of the Dwarf code triggered.
> 
> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries...
> 
> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
> 
>     if (fill_instr_info(newlib)) {
>       if (!read_eh_frame(ph, newlib)) {
> 
> fill_instr_info is failing, and we never get to read_eh_frame().
> 
> output like:
> 
> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so
> 
> (similar for all libraries).
> 
> fill_instr_info() fails if:
> 
>    if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
> 
> ...but isn't exec_start relative to the library address? It's the value of ph->p_vaddr and it is often zero.
> 
> I added some booleans and did:
> 
> 185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
> 186         lib->exec_start = ph->p_vaddr;
> 187         found_start = true;
> 188       }
> 
> (similarly for end) and only failed if:
> 
> 201   if (!found_start || !found_end) {
> 202     return false;
> 
> ...and now it's better.  I go from:
> 
> ----------------- 3306 -----------------
> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
> 
> to:
> 
> ----------------- 31127 -----------------
> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
> 0x00007fa2857a8c49      JLI_Launch + 0x1529
> 0x000055af1b78db1c      main + 0x11c
> 
> 
> Thanks
> Kevin
> 
> 
> 
> 
> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
> 
>> Hi Kevin,
>>
>> Thanks for your comment!
>>
>> On 2020/03/10 18:58, Kevin Walls wrote:
>>> Hi Yasumasa ,
>>>
>>> The changes build OK for me in the latest jdk, and things still work.
>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
>>>
>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.
>>
>> You can see the problem with JShell.
>> Some Java frames would not be seen in mixed jstack.
>>
>>
>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
>>>
>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>
>>> DW_CFA_nop - shouldn't this continue instead of return?
>>> (It may "never" happen, but a nop could appear within some other instructions?)
>>
>> DW_CFA_nop is used for padding, so we can ignore (return immediately) it.
>>
>>
>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".
>>
>> I will fix it.
>>
>>
>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4.  I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, as that would mean a lot of code on one line. 8-)
>>> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness.
>>
>> I will add DW_CFA_advance_loc4.
>>
>>
>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>>>
>>> DwarfParser::process_dwarf() moves _buf.
>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>>>
>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.
>>
>> I referred to the GDB and binutils sources when creating this patch.
>> They seem to do similar processing, because we need to evaluate DWARF instructions one by one to get the values that relate to the specified PC.
>>
>>
>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
>>> OS-neutral directory?  It's odd to label it as Linux-specific.
>>
>> At least Windows does not use DWARF; it uses another mechanism.
>>
>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>> If they do, I can move the DWARF-related code to a posix directory.
>>
>>
>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>>
>>>
>>> These are just comments to mainly say it looks good, and somebody else out there has read it.
>>> I will look for a system that shows the problem, and get back to you again!
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> Many thanks
>>> Kevin
>>>
>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year).
>>>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk.
>>>> Could you review it?
>>>>
>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>
>>>> I need one more reviewer to push.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>> PING: Could you review it?
>>>>>
>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>
>>>>> This change has been already reviewed by Serguei.
>>>>> I need one more reviewer to push.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>> PING: Could you review this change?
>>>>>>
>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>
>>>>>> I believe this change helps troubleshooters with postmortem analysis.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>> PING: Could you review it?
>>>>>>>
>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>
>>>>>>> I updated the webrev. I discussed it with Serguei off list and refactored webrev.02.
>>>>>>> It has passed the tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> Thanks for your comment!
>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new webrev.
>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry pointed out.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>
>>>>>>>> This change has passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> This is nice move in general.
>>>>>>>>> Thank you for working on this!
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>>>
>>>>>>>>>  96     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>  97     if (libptr == 0L) { // Java frame
>>>>>>>>>  98       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>  99       if (rbp == null) {
>>>>>>>>> 100         return null;
>>>>>>>>> 101       }
>>>>>>>>> 102       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>> 103     } else { // Native frame
>>>>>>>>> 104       DwarfParser dwarf;
>>>>>>>>> 105       try {
>>>>>>>>> 106         dwarf = new DwarfParser(libptr);
>>>>>>>>> 107       } catch (DebuggerException e) {
>>>>>>>>> 108         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>> 109         if (rbp == null) {
>>>>>>>>> 110           return null;
>>>>>>>>> 111         }
>>>>>>>>> 112         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>> 113       }
>>>>>>>>> 114       dwarf.processDwarf(pc);
>>>>>>>>> 115       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>> 116                      !dwarf.isBPOffsetAvailable())
>>>>>>>>> 117                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>> 118                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>> 119                                .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>> 120       if (cfa == null) {
>>>>>>>>> 121         return null;
>>>>>>>>> 122       }
>>>>>>>>> 123       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>> 124     }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'd suggest to simplify the logic by refactoring to something like below:
>>>>>>>>>
>>>>>>>>>            long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>            Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>            DwarfParser dwarf = null;
>>>>>>>>>
>>>>>>>>>            if (libptr != 0L) { // Native frame
>>>>>>>>>              try {
>>>>>>>>>                dwarf = new DwarfParser(libptr);
>>>>>>>>>                dwarf.processDwarf(pc);
>>>>>>>>>                cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>                       !dwarf.isBPOffsetAvailable())
>>>>>>>>>                          ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>                          : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>                                   .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>              } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>              }
>>>>>>>>>            }
>>>>>>>>>            if (cfa == null) {
>>>>>>>>>              return null;
>>>>>>>>>            }
>>>>>>>>>            return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>
>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>
>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>
>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>
>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>
>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>
>>>>>>>>>    It feels like the logic has to be somehow refactored/simplified as
>>>>>>>>>    several typical fragments appears in slightly different contexts.
>>>>>>>>>    But it is not easy to understand what it is.
>>>>>>>>>    Could you, please, add some comments to key places explaining this logic.
>>>>>>>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>>>>>>>
>>>>>>>>> 109     private CFrame javaSender(ThreadContext context) {
>>>>>>>>> 110       Address nextCFA;
>>>>>>>>> 111       Address nextPC;
>>>>>>>>> 112
>>>>>>>>> 113       nextPC = getNextPC(false);
>>>>>>>>> 114       if (nextPC == null) {
>>>>>>>>> 115         return null;
>>>>>>>>> 116       }
>>>>>>>>> 117
>>>>>>>>> 118       DwarfParser nextDwarf = null;
>>>>>>>>> 119       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>> 120       if (libptr != 0L) { // Native frame
>>>>>>>>> 121         try {
>>>>>>>>> 122           nextDwarf = new DwarfParser(libptr);
>>>>>>>>> 123         } catch (DebuggerException e) {
>>>>>>>>> 124           nextCFA = getNextCFA(null, context);
>>>>>>>>> 125           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>> 126         }
>>>>>>>>> 127         nextDwarf.processDwarf(nextPC);
>>>>>>>>> 128       }
>>>>>>>>> 129
>>>>>>>>> 130       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>> 131       return (nextCFA == null) ? null
>>>>>>>>> 132                                : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>> 133     }
>>>>>>>>>
>>>>>>>>>   The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC):
>>>>>>>>>       private CFrame javaSender(ThreadContext context) {
>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>           return null;
>>>>>>>>>         }
>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>
>>>>>>>>>         if (libptr != 0L) { // Native frame
>>>>>>>>>           try {
>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>           }
>>>>>>>>>         }
>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> 135     public CFrame sender(ThreadProxy thread) {
>>>>>>>>> 136       ThreadContext context = thread.getContext();
>>>>>>>>> 137
>>>>>>>>> 138       if (dwarf == null) { // Java frame
>>>>>>>>> 139         return javaSender(context);
>>>>>>>>> 140       }
>>>>>>>>> 141
>>>>>>>>> 142       Address nextPC = getNextPC(true);
>>>>>>>>> 143       if (nextPC == null) {
>>>>>>>>> 144         return null;
>>>>>>>>> 145       }
>>>>>>>>> 146
>>>>>>>>> 147       Address nextCFA;
>>>>>>>>> 148       DwarfParser nextDwarf = dwarf;
>>>>>>>>> 149       if (!dwarf.isIn(nextPC)) {
>>>>>>>>> 150         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>> 151         if (libptr == 0L) {
>>>>>>>>> 152           // Next frame might be Java frame
>>>>>>>>> 153           nextCFA = getNextCFA(null, context);
>>>>>>>>> 154           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>> 155         }
>>>>>>>>> 156         try {
>>>>>>>>> 157           nextDwarf = new DwarfParser(libptr);
>>>>>>>>> 158         } catch (DebuggerException e) {
>>>>>>>>> 159           nextCFA = getNextCFA(null, context);
>>>>>>>>> 160           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>> 161         }
>>>>>>>>> 162       }
>>>>>>>>> 163
>>>>>>>>> 164       nextDwarf.processDwarf(nextPC);
>>>>>>>>> 165       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>> 166       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>> 167     }
>>>>>>>>>
>>>>>>>>>   This one can also be simplified a little:
>>>>>>>>>
>>>>>>>>>       public CFrame sender(ThreadProxy thread) {
>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>
>>>>>>>>>         if (dwarf == null) { // Java frame
>>>>>>>>>           return javaSender(context);
>>>>>>>>>         }
>>>>>>>>>         Address nextPC = getNextPC(true);
>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>           return null;
>>>>>>>>>         }
>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>         if (!dwarf.isIn(nextPC)) {
>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>             try {
>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>             }
>>>>>>>>>           }
>>>>>>>>>         }
>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>>
>>>>>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>           return null;
>>>>>>>>>         }
>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>
>>>>>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>             try {
>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>             }
>>>>>>>>>           }
>>>>>>>>>         }
>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and
>>>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>> Could you review new webrev?
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>
>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Please review this change:
>>>>>>>>>>>
>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>
>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>
>>>>>>>>>>> However, `jhsdb jstack --mixed` does not do so; it uses the base pointer register (RBP),
>>>>>>>>>>> so it might miss stack frames.
>>>>>>>>>>>
>>>>>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>
>>>>>>>>>

From serguei.spitsyn at oracle.com  Wed Mar 11 02:25:44 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 10 Mar 2020 19:25:44 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
Message-ID: <7c6ae898-f4f1-f353-99a6-47d00162bda9@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200310/520997ac/attachment-0001.htm>

From suenaga at oss.nttdata.com  Wed Mar 11 05:52:16 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Mar 2020 14:52:16 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
 <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>
Message-ID: <e8a16c8e-ff00-6985-dc3d-9450c6b489df@oss.nttdata.com>

Hi Kevin,

I saw 2 errors on the submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).

   The last change on the submit repo is here:
     http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/

Could you share the details of these failures on the submit repo?


Thanks,

Yasumasa


On 2020/03/11 11:07, Yasumasa Suenaga wrote:
> Hi Kevin,
> 
> I guess first program header in the libraries which are on your machine has exec flag (you can check it with `readelf -l`).
> So I tweaked my patch (initial value of exec_start and exec_end set to -1) in new webrev.
> 
>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
> 
> This webrev contains the fix for your comment (typo and DW_CFA_advance_loc4).
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2020/03/11 8:53, Kevin Walls wrote:
>> Hi -
>>
>> In testing I wasn't seeing any of the Dwarf code triggered.
>>
>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries...
>>
>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>
>>     if (fill_instr_info(newlib)) {
>>       if (!read_eh_frame(ph, newlib)) {
>>
>> fill_instr_info is failing, and we never get to read_eh_frame().
>>
>> output like:
>>
>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>
>> (similar for all libraries).
>>
>> fill_instr_info() fails if:
>>
>>    if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>
>> ...but isn't exec_start relative to the library address? It's the value of ph->p_vaddr and it is often zero.
>>
>> I added some booleans and did:
>>
>> 185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>> 186         lib->exec_start = ph->p_vaddr;
>> 187         found_start = true;
>> 188       }
>>
>> (similarly for end) and only failed if:
>>
>> 201   if (!found_start || !found_end) {
>> 202     return false;
>>
>> ...and now it's better.  I go from:
>>
>> ----------------- 3306 -----------------
>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>
>> to:
>>
>> ----------------- 31127 -----------------
>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>> 0x000055af1b78db1c      main + 0x11c
>>
>>
>> Thanks
>> Kevin
>>
>>
>>
>>
>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>
>>> Hi Kevin,
>>>
>>> Thanks for your comment!
>>>
>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>> Hi Yasumasa ,
>>>>
>>>> The changes build OK for me in the latest jdk, and things still work.
>>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
>>>>
>>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.
>>>
>>> You can see the problem with JShell.
>>> Some Java frames would not be seen in mixed jstack.
>>>
>>>
>>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
>>>>
>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>
>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>> (It may "never" happen, but a nop could appear within some other instructions?)
>>>
>>> DW_CFA_nop is used for padding, so we can ignore (return immediately) it.
>>>
>>>
>>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".
>>>
>>> I will fix it.
>>>
>>>
>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4.  I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, as that would mean a lot of code on one line. 8-)
>>>> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness.
>>>
>>> I will add DW_CFA_advance_loc4.
>>>
>>>
>>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>>>>
>>>> DwarfParser::process_dwarf() moves _buf.
>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
>>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>>>>
>>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.
>>>
>>> I referred to the GDB and binutils sources when creating this patch.
>>> They seem to do similar processing, because we need to evaluate DWARF instructions one by one to get the values that relate to the specified PC.
>>>
>>>
>>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
>>>> OS-neutral directory?  It's odd to label it as Linux-specific.
>>>
>>> At least Windows does not use DWARF; it uses another mechanism.
>>>
>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>> If they do, I can move the DWARF-related code to a posix directory.
>>>
>>>
>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>>>
>>>>
>>>> These are just comments to mainly say it looks good, and somebody else out there has read it.
>>>> I will look for a system that shows the problem, and get back to you again!
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>> Many thanks
>>>> Kevin
>>>>
>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year).
>>>>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk.
>>>>> Could you review it?
>>>>>
>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>
>>>>> I need one more reviewer to push.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>> PING: Could you review it?
>>>>>>
>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>
>>>>>> This change has been already reviewed by Serguei.
>>>>>> I need one more reviewer to push.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>> PING: Could you review this change?
>>>>>>>
>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>
>>>>>>> I believe this change helps troubleshooters with postmortem analysis.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>> PING: Could you review it?
>>>>>>>>
>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>
>>>>>>>> I updated the webrev. I discussed it with Serguei off list and refactored webrev.02.
>>>>>>>> It has passed the tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>> Hi Serguei,
>>>>>>>>>
>>>>>>>>> Thanks for your comment!
>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new webrev.
>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry pointed out.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>
>>>>>>>>> This change has passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> This is a nice move in general.
>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>>>>
>>>>>>>>>> Lines 96-124:
>>>>>>>>>>
>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>     if (libptr == 0L) { // Java frame
>>>>>>>>>>       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>       if (rbp == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>     } else { // Native frame
>>>>>>>>>>       DwarfParser dwarf;
>>>>>>>>>>       try {
>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>       } catch (DebuggerException e) {
>>>>>>>>>>         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>         if (rbp == null) {
>>>>>>>>>>           return null;
>>>>>>>>>>         }
>>>>>>>>>>         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>       }
>>>>>>>>>>       dwarf.processDwarf(pc);
>>>>>>>>>>       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>                      !dwarf.isBPOffsetAvailable())
>>>>>>>>>>           ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>           : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>                    .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>       if (cfa == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'd suggest simplifying the logic by refactoring it to something like this:
>>>>>>>>>>
>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>>     DwarfParser dwarf = null;
>>>>>>>>>>
>>>>>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>>>>>       try {
>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>         dwarf.processDwarf(pc);
>>>>>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>>>>>             ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>             : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>                      .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>       }
>>>>>>>>>>     }
>>>>>>>>>>     if (cfa == null) {
>>>>>>>>>>       return null;
>>>>>>>>>>     }
>>>>>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>>
>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>
>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>
>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>
>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>
>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>
>>>>>>>>>>    It feels like the logic should be refactored/simplified somehow, as
>>>>>>>>>>    several typical fragments appear in slightly different contexts.
>>>>>>>>>>    But it is not easy to understand what the logic is.
>>>>>>>>>>    Could you please add some comments at key places explaining this logic?
>>>>>>>>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>>>>>>>>
>>>>>>>>>> Lines 109-133:
>>>>>>>>>>
>>>>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>       Address nextCFA;
>>>>>>>>>>       Address nextPC;
>>>>>>>>>>
>>>>>>>>>>       nextPC = getNextPC(false);
>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>
>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>>>>         try {
>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>         } catch (DebuggerException e) {
>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>         }
>>>>>>>>>>         nextDwarf.processDwarf(nextPC);
>>>>>>>>>>       }
>>>>>>>>>>
>>>>>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>       return (nextCFA == null) ? null
>>>>>>>>>>                                : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>   The above can be simplified if a DebuggerException cannot be thrown from processDwarf(nextPC):
>>>>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>
>>>>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>>>>         try {
>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>           nextDwarf.processDwarf(nextPC);
>>>>>>>>>>         } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>         }
>>>>>>>>>>       }
>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> Lines 135-167:
>>>>>>>>>>
>>>>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>
>>>>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>>>>         return javaSender(context);
>>>>>>>>>>       }
>>>>>>>>>>
>>>>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>
>>>>>>>>>>       Address nextCFA;
>>>>>>>>>>       DwarfParser nextDwarf = dwarf;
>>>>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>         if (libptr == 0L) {
>>>>>>>>>>           // Next frame might be Java frame
>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>         }
>>>>>>>>>>         try {
>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>         } catch (DebuggerException e) {
>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>         }
>>>>>>>>>>       }
>>>>>>>>>>
>>>>>>>>>>       nextDwarf.processDwarf(nextPC);
>>>>>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>   This one can also be simplified a little:
>>>>>>>>>>
>>>>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>
>>>>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>>>>         return javaSender(context);
>>>>>>>>>>       }
>>>>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>           try {
>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>           }
>>>>>>>>>>         }
>>>>>>>>>>       }
>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>>>
>>>>>>>>>>     private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>         return null;
>>>>>>>>>>       }
>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>
>>>>>>>>>>       if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>           try {
>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>           }
>>>>>>>>>>         }
>>>>>>>>>>       }
>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I refactored LinuxAMD64CFrame.java. It works fine in the serviceability/sa tests and
>>>>>>>>>>> all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>> Could you review the new webrev?
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> The diff from the previous webrev is here:
>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>
>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> According to 2.7 "Stack Unwind Algorithm" in the System V Application Binary Interface AMD64
>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use the DWARF information in .eh_frame or
>>>>>>>>>>>> .debug_frame for stack unwinding.
>>>>>>>>>>>>
>>>>>>>>>>>> As JDK-8022183 notes, omit-frame-pointer has been enabled by default since GCC 4.6, so system
>>>>>>>>>>>> libraries (e.g. libc) might be compiled with this option.
>>>>>>>>>>>>
>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not use DWARF; it relies on the base pointer register (RBP),
>>>>>>>>>>>> so some stack frames might be missing.
>>>>>>>>>>>>
>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
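To illustrate the point above about RBP versus DWARF, here is a minimal Java sketch of a single unwind step driven by a CFA rule. It is only a toy model: the class `UnwindSketch`, the `Rule` type, the register and stack maps, and all addresses are hypothetical and are not taken from the SA sources. It assumes the rule for the current PC is "CFA = RSP + 0x20, return address at CFA - 8", which is the kind of information .eh_frame provides when the frame pointer is omitted.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of one unwind step; not the actual SA implementation. */
final class UnwindSketch {
    /** Unwind rule as it might be recovered from .eh_frame for a given PC. */
    static final class Rule {
        final String cfaRegister;       // register the CFA is computed from
        final long cfaOffset;           // offset added to that register
        final long returnAddressOffset; // where the return address lives, relative to the CFA

        Rule(String cfaRegister, long cfaOffset, long returnAddressOffset) {
            this.cfaRegister = cfaRegister;
            this.cfaOffset = cfaOffset;
            this.returnAddressOffset = returnAddressOffset;
        }
    }

    /** CFA = value of the rule's register + the rule's offset. */
    static long cfa(Map<String, Long> registers, Rule rule) {
        return registers.get(rule.cfaRegister) + rule.cfaOffset;
    }

    /** The caller's PC is read from a fixed offset relative to the CFA. */
    static long callerPc(Map<Long, Long> stack, long cfa, Rule rule) {
        return stack.get(cfa + rule.returnAddressOffset);
    }

    public static void main(String[] args) {
        Map<String, Long> regs = new HashMap<>();
        regs.put("rsp", 0x7000L);
        regs.put("rbp", 0x1234L); // reused as a general-purpose register, so useless for unwinding

        Map<Long, Long> stack = new HashMap<>();
        stack.put(0x7018L, 0x401000L); // return address pushed by the caller

        Rule rule = new Rule("rsp", 0x20, -8); // assumed rule for the current PC
        long cfa = cfa(regs, rule);
        System.out.printf("CFA=0x%x callerPC=0x%x%n", cfa, callerPc(stack, cfa, rule));
    }
}
```

In this model an RBP-based walk would start from 0x1234, which is not a stack address at all, while the rule-based step still finds the caller.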
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>
>>>>>>>>>>

From david.holmes at oracle.com  Wed Mar 11 05:59:29 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 11 Mar 2020 15:59:29 +1000
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <e8a16c8e-ff00-6985-dc3d-9450c6b489df@oss.nttdata.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
 <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>
 <e8a16c8e-ff00-6985-dc3d-9450c6b489df@oss.nttdata.com>
Message-ID: <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>

Hi Yasumasa,

Partial hs_err info below.

David
-----

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800
#
# JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 
15-internal+0-2020-03-11-0447267.suenaga.source)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, 
tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
#
# Core dump will be written. Default location: Core dumps may be 
processed with "/opt/core.sh %p" (or dumping to 
/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798)
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

---------------  S U M M A R Y ------------

Command Line: 
-Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar 
-Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug 
-Xms8m -Djdk.module.main=jdk.hotspot.agent 
jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770

Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 
0m 3s)

---------------  T H R E A D  ---------------

Current thread (0x00007fdf5c032000):  JavaThread "main" 
[_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)]

Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000],  sp=0x00007fdf63b9d190, 
free space=1020k
Native frames: (J=compiled Java code, A=aot compiled Java code, 
j=interpreted, Vv=VM code, C=native code)
C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 
jdk.hotspot.agent at 15-internal
j 
sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 
jdk.hotspot.agent at 15-internal
j 
sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 
jdk.hotspot.agent at 15-internal
j 
sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 
jdk.hotspot.agent at 15-internal
j 
sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 
jdk.hotspot.agent at 15-internal
v  ~StubRoutines::call_stub
V  [libjvm.so+0xc2291c]  JavaCalls::call_helper(JavaValue*, methodHandle 
const&, JavaCallArguments*, Thread*)+0x6ac
V  [libjvm.so+0xd31970]  jni_invoke_static(JNIEnv_*, JavaValue*, 
_jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) 
[clone .isra.140] [clone .constprop.263]+0x370
V  [libjvm.so+0xd36202]  jni_CallStaticVoidMethod+0x222
C  [libjli.so+0x4bed]  JavaMain+0xbcd
C  [libjli.so+0x80a9]  ThreadJavaMain+0x9

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 
jdk.hotspot.agent at 15-internal
j 
sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 
jdk.hotspot.agent at 15-internal
j 
sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 
jdk.hotspot.agent at 15-internal
j 
sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 
jdk.hotspot.agent at 15-internal
j 
sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 
jdk.hotspot.agent at 15-internal
j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 
jdk.hotspot.agent at 15-internal
v  ~StubRoutines::call_stub

siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 
0x00007fded5076b79

Register to memory mapping:

RAX=0x00007f7e4dfe3229 is an unknown value
RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 
de 7f 00 00
RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 
2f 6c 69 62
RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00
RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 
0x00007fdf5c032000
RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 
0x00007fdf5c032000
RSI=0x0000000000000004 is an unknown value
RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 
de 7f 00 00
R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 
00 00 00 00
R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 
78 10 01
R10=0x00000000ffffffff is an unknown value
R11=0x000000000100527a is an unknown value
R12=0x00007fded5076b79 is an unknown value
R13=0x00007f7da2f8e68a is an unknown value
R14=0x00007f7dbdf62b1d is an unknown value
R15=0x00007fdf5c032000 is a thread


Registers:
RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, 
RDX=0x00007fded4076b85
RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, 
RDI=0x00007fdf5c4d7080
R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, 
R11=0x000000000100527a
R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, 
R15=0x00007fdf5c032000
RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, 
CSGSFS=0x002b000000000033, ERR=0x0000000000000004
   TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00007fdf63b9d190)
0x00007fdf63b9d190:   00007fdf209d0980 0000000000000000
0x00007fdf63b9d1a0:   00007fdf209d0980 00007fdf63b9d258
0x00007fdf63b9d1b0:   00007fdf63b9d228 00007fdf44778dbe
0x00007fdf63b9d1c0:   000000000146c380 00007fdf5c032000

Instructions: (pc=0x00007fdf2000e87c)
0x00007fdf2000e77c:   89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00
0x00007fdf2000e78c:   00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80
0x00007fdf2000e79c:   78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42
0x00007fdf2000e7ac:   01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48
0x00007fdf2000e7bc:   89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43
0x00007fdf2000e7cc:   3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43
0x00007fdf2000e7dc:   30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff
0x00007fdf2000e7ec:   ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e
0x00007fdf2000e7fc:   41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2
0x00007fdf2000e80c:   ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90
0x00007fdf2000e81c:   0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76
0x00007fdf2000e82c:   1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48
0x00007fdf2000e83c:   83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9
0x00007fdf2000e84c:   31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56
0x00007fdf2000e85c:   41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b
0x00007fdf2000e86c:   a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08
0x00007fdf2000e87c:   41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0
0x00007fdf2000e88c:   75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d
0x00007fdf2000e89c:   8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc
0x00007fdf2000e8ac:   48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03
0x00007fdf2000e8bc:   b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c
0x00007fdf2000e8cc:   39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8
0x00007fdf2000e8dc:   90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01
0x00007fdf2000e8ec:   0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48
0x00007fdf2000e8fc:   83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48
0x00007fdf2000e90c:   09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea
0x00007fdf2000e91c:   4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b
0x00007fdf2000e92c:   31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00
0x00007fdf2000e93c:   00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00
0x00007fdf2000e94c:   00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff
0x00007fdf2000e95c:   48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89
0x00007fdf2000e96c:   8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff


On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote:
> Hi Kevin,
> 
> I saw 2 errors on submit repo 
> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
> So I tweaked my patch, but I saw the crash again 
> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).
> 
>    The last change on the submit repo is here:
>      http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/
> 
> Can you share the details from the submit repo?
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2020/03/11 11:07, Yasumasa Suenaga wrote:
>> Hi Kevin,
>>
>> I guess the first program header in the libraries on your 
>> machine has the exec flag (you can check it with `readelf -l`).
>> So I tweaked my patch (the initial values of exec_start and exec_end are 
>> now set to -1) in a new webrev.
>>
>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
>>
>> This webrev also contains the fixes for your comments (the typo and 
>> DW_CFA_advance_loc4).
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/11 8:53, Kevin Walls wrote:
>>> Hi -
>>>
>>> In testing I wasn't seeing any of the Dwarf code triggered.
>>>
>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable 
>>> section in" for lots of / maybe all the libraries...
>>>
>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>>
>>>     if (fill_instr_info(newlib)) {
>>>       if (!read_eh_frame(ph, newlib)) {
>>>
>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>
>>> output like:
>>>
>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>> libsaproc DEBUG: Could not find executable section in 
>>> /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>
>>> (similar for all libraries).
>>>
>>> fill_instr_info fails if:
>>>
>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>
>>> ...but isn't exec_start relative to the library address? It's the 
>>> value of ph->p_vaddr and it is often zero.
>>>
>>> I added some booleans and did:
>>>
>>> 185       if ((lib->exec_start == 0L) || (lib->exec_start > 
>>> ph->p_vaddr)) {
>>> 186         lib->exec_start = ph->p_vaddr;
>>> 187         found_start = true;
>>> 188       }
>>>
>>> (similarly for end) and only failed if:
>>>
>>> 201   if (!found_start || !found_end) {
>>> 202     return false;
>>>
>>> ...and now it's better. I go from:
>>>
>>> ----------------- 3306 -----------------
>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>
>>> to:
>>>
>>> ----------------- 31127 -----------------
>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>> 0x000055af1b78db1c      main + 0x11c
>>>
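The exec_start problem described above (zero is a valid p_vaddr, so it cannot also mean "not found") can be modelled in a few lines. This is a rough Java sketch, not the libsaproc C code: `ExecRangeSketch` and `Phdr` are hypothetical names, and the program header is reduced to the three fields needed here. It follows the "found" flag idea; using -1 as the initial value is another way to get the same effect.

```java
import java.util.List;

/** Hypothetical model of the executable-range scan; not the libsaproc code. */
final class ExecRangeSketch {
    /** Simplified program header: only the fields this sketch needs. */
    static final class Phdr {
        final long vaddr;
        final long memsz;
        final boolean executable;

        Phdr(long vaddr, long memsz, boolean executable) {
            this.vaddr = vaddr;
            this.memsz = memsz;
            this.executable = executable;
        }
    }

    /** Returns {start, end} covering all executable segments, or null if there is none. */
    static long[] findExecRange(List<Phdr> headers) {
        boolean found = false; // separate flag, because a start address of 0 is legitimate
        long start = 0;
        long end = 0;
        for (Phdr ph : headers) {
            if (!ph.executable) {
                continue;
            }
            long segEnd = ph.vaddr + ph.memsz;
            if (!found || ph.vaddr < start) {
                start = ph.vaddr;
            }
            if (!found || segEnd > end) {
                end = segEnd;
            }
            found = true;
        }
        return found ? new long[] {start, end} : null;
    }

    public static void main(String[] args) {
        // Values taken from the debug output above: vaddr = 0x0, memsz = 0xaba4.
        long[] range = findExecRange(List.of(new Phdr(0x0L, 0xaba4L, true)));
        System.out.printf("exec range: 0x%x - 0x%x%n", range[0], range[1]);
    }
}
```

With the header from the debug output, the range starts at 0 and is still reported as found, which is the case the original zero check rejected.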
>>>
>>> Thanks
>>> Kevin
>>>
>>>
>>>
>>>
>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>
>>>> Hi Kevin,
>>>>
>>>> Thanks for your comment!
>>>>
>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>> Hi Yasumasa ,
>>>>>
>>>>> The changes build OK for me in the latest jdk, and things still work.
>>>>> I have not yet seen the dwarf usage in action: I've tried a couple 
>>>>> of different systems and so far have not reproduced the problem, 
>>>>> i.e. jstack has not failed on native frames.
>>>>>
>>>>> I may need more recent basic libraries, will look again for 
>>>>> somewhere where the problem happens and get back to you as I really 
>>>>> want to run the changes.
>>>>
>>>> You can see the problem with JShell.
>>>> Some Java frames would not be seen in mixed jstack.
>>>>
>>>>
>>>>> I have mostly minor other comments which don't need a new webrev, 
>>>>> some just comments for the future:
>>>>>
>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>
>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>> (It may "never" happen, but a nop could appear within some other 
>>>>> instructions?)
>>>>
>>>> DW_CFA_nop is used for padding, so we can ignore it (return 
>>>> immediately).
>>>>
>>>>
>>>>> DW_CFA_remember_state: a minor typo in the comment, 
>>>>> "DW_CFA_remenber_state".
>>>>
>>>> I will fix it.
>>>>
>>>>
>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not 
>>>>> DW_CFA_advance_loc4. I thought that was odd, but maybe addresses 
>>>>> in these tables never increase by 4-byte amounts, as that would mean a 
>>>>> lot of code on one line. 8-)
>>>>> So maybe it's never used in practice. If you think it's unnecessary, 
>>>>> no problem; maybe add a comment, or add it for robustness.
>>>>
>>>> I will add DW_CFA_advance_loc4.
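For reference, the four advance_loc encodings differ only in where the delta is stored. The Java sketch below decodes them from a byte buffer. The opcode values are the standard DWARF ones; everything else (`AdvanceLocSketch`, the `advance` helper, treating all other opcodes as "no change") is an assumption made for illustration and does not mirror dwarf.cpp.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical decoder for the DW_CFA_advance_loc family; not the dwarf.cpp code. */
final class AdvanceLocSketch {
    static final int DW_CFA_ADVANCE_LOC  = 0x40; // high two bits 01, delta in the low six bits
    static final int DW_CFA_ADVANCE_LOC1 = 0x02; // 1-byte delta operand
    static final int DW_CFA_ADVANCE_LOC2 = 0x03; // 2-byte delta operand
    static final int DW_CFA_ADVANCE_LOC4 = 0x04; // 4-byte delta operand

    /**
     * Consumes one instruction at the buffer's position and returns the new location.
     * Opcodes this sketch does not model (including DW_CFA_nop) leave the location unchanged.
     */
    static long advance(ByteBuffer buf, long loc, long codeAlign) {
        int op = buf.get() & 0xff;
        if ((op & 0xc0) == DW_CFA_ADVANCE_LOC) {
            return loc + (op & 0x3f) * codeAlign;
        }
        switch (op) {
            case DW_CFA_ADVANCE_LOC1:
                return loc + (buf.get() & 0xffL) * codeAlign;
            case DW_CFA_ADVANCE_LOC2:
                return loc + (buf.getShort() & 0xffffL) * codeAlign;
            case DW_CFA_ADVANCE_LOC4:
                return loc + (buf.getInt() & 0xffffffffL) * codeAlign;
            default:
                return loc;
        }
    }

    public static void main(String[] args) {
        byte[] insns = {0x44, 0x02, 0x10, 0x03, 0x00, 0x01, 0x04, 0x00, 0x00, 0x01, 0x00};
        ByteBuffer buf = ByteBuffer.wrap(insns).order(ByteOrder.LITTLE_ENDIAN);
        long loc = 0;
        while (buf.hasRemaining()) {
            loc = advance(buf, loc, 1);
        }
        System.out.println("final location offset: " + loc);
    }
}
```

The buffer must be little-endian for x86-64 data; a real parser would also handle the remaining CFA opcodes instead of skipping them.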
>>>>
>>>>
>>>>> General-purpose methods like read_leb128(), get_entry_length(), 
>>>>> get_decoded_value() specifically update the _buf pointer in this 
>>>>> DwarfParser.
>>>>>
>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>> It calls process_cie() which reads, moves _buf and restores it to 
>>>>> the original position, then we read augmentation_length from where 
>>>>> _buf is.
>>>>> I'm not sure if that's wrong, or if I just need to read again about 
>>>>> the CIE/etc layout.
>>>>>
>>>>> I don't really want to suggest making the code pass around a 
>>>>> current _buf for the invocation of these general purpose methods, 
>>>>> but just wanted to comment that if these get used more widely that 
>>>>> might become necessary.
>>>>
>>>> I referred to the GDB and binutils sources when creating this patch.
>>>> They seem to process the data in a similar way, because the DWARF 
>>>> instructions have to be evaluated one by one to get the values for the 
>>>> specified PC.
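On the point about general-purpose readers moving _buf: one alternative is to pass the cursor in explicitly. The Java sketch below does that for unsigned LEB128 and uses `ByteBuffer` as the cursor. `Leb128Sketch`, `readUleb128` and `peekUleb128` are hypothetical names chosen for this sketch; it does not describe how dwarf.cpp, GDB or binutils are structured.

```java
import java.nio.ByteBuffer;

/** Hypothetical ULEB128 reader with an explicit cursor; not the dwarf.cpp code. */
final class Leb128Sketch {
    /** Decodes one unsigned LEB128 value and advances the buffer's position past it. */
    static long readUleb128(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        while (true) {
            int b = buf.get() & 0xff;
            result |= (long) (b & 0x7f) << shift; // low seven bits carry the payload
            shift += 7;
            if ((b & 0x80) == 0) {                // high bit clear marks the last byte
                return result;
            }
        }
    }

    /** Decodes a value without moving the caller's cursor by reading from a duplicate view. */
    static long peekUleb128(ByteBuffer buf) {
        return readUleb128(buf.duplicate());
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {(byte) 0xE5, (byte) 0x8E, 0x26, 0x01});
        System.out.println("peek: " + peekUleb128(buf) + ", position " + buf.position());
        System.out.println("read: " + readUleb128(buf) + ", position " + buf.position());
    }
}
```

With the cursor passed in, a helper that needs to look ahead (as process_cie() does when it saves and restores _buf) can work on a duplicate and leave the caller's position alone.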
>>>>
>>>>
>>>>> Similarly in future, if this DWARF support code became used more 
>>>>> widely, it might want to move to an
>>>>> OS-neutral directory? It's odd to label it as Linux-specific.
>>>>
>>>> At least Windows does not use DWARF; it uses a different mechanism:
>>>>
>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>
>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>> If they do, I can move the DWARF-related code to the posix 
>>>> directory.
>>>>
>>>>
>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>> Thanks for changing "can_parsable" which was in the earlier 
>>>>> version. 8-)
>>>>>
>>>>>
>>>>> These are just comments to mainly say it looks good, and somebody 
>>>>> else out there has read it.
>>>>> I will look for a system that shows the problem, and get back to 
>>>>> you again!
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> Many thanks
>>>>> Kevin
>>>>>
>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> webrev.03 no longer applies to the current jdk/jdk because of the 8239224 and 
>>>>>> 8239462 changes (they updated the copyright year).
>>>>>> So I updated the webrev (copyright year changes only) so that it 
>>>>>> applies to the current jdk/jdk.
>>>>>> Could you review it?
>>>>>>
>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>
>>>>>> I need one more reviewer to push.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>> PING: Could you review it?
>>>>>>>
>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>    webrev: 
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>
>>>>>>> This change has already been reviewed by Serguei.
>>>>>>> I need one more reviewer to push.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>> PING: Could you review this change?
>>>>>>>>
>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>    webrev: 
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>
>>>>>>>> I believe this change will help troubleshooters with 
>>>>>>>> postmortem analysis.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>> PING: Could you review it?
>>>>>>>>>
>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>    webrev: 
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>
>>>>>>>>> I updated the webrev. I discussed it with Serguei off-list and 
>>>>>>>>> refactored webrev.02.
>>>>>>>>> It has passed the tests on the submit repo 
>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi Serguei,
>>>>>>>>>>
>>>>>>>>>> Thanks for your comment!
>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new webrev.
>>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as
>>>>>>>>>> Dmitry said.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>
>>>>>>>>>> This change has passed all tests on the submit repo
>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>
>>>>>>>>>>> This is a nice move in general.
>>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>     if (libptr == 0L) { // Java frame
>>>>>>>>>>>       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>       if (rbp == null) {
>>>>>>>>>>>         return null;
>>>>>>>>>>>       }
>>>>>>>>>>>       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>     } else { // Native frame
>>>>>>>>>>>       DwarfParser dwarf;
>>>>>>>>>>>       try {
>>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>       } catch (DebuggerException e) {
>>>>>>>>>>>         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>         if (rbp == null) {
>>>>>>>>>>>           return null;
>>>>>>>>>>>         }
>>>>>>>>>>>         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>       }
>>>>>>>>>>>       dwarf.processDwarf(pc);
>>>>>>>>>>>       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>                      !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>           ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>           : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>                    .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>       if (cfa == null) {
>>>>>>>>>>>         return null;
>>>>>>>>>>>       }
>>>>>>>>>>>       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I'd suggest simplifying the logic by refactoring it to
>>>>>>>>>>> something like this:
>>>>>>>>>>>
>>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>>>     DwarfParser dwarf = null;
>>>>>>>>>>>
>>>>>>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>>>>>>       try {
>>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>         dwarf.processDwarf(pc);
>>>>>>>>>>>         // Assign to the existing 'cfa'; redeclaring it here would not compile.
>>>>>>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>             ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>             : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>                      .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>>       }
>>>>>>>>>>>     }
>>>>>>>>>>>     if (cfa == null) {
>>>>>>>>>>>       return null;
>>>>>>>>>>>     }
>>>>>>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>>
>>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>>
>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>>
>>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>>
>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>>
>>>>>>>>>>>    It feels like the logic should be refactored/simplified somehow, as
>>>>>>>>>>>    several typical fragments appear in slightly different contexts,
>>>>>>>>>>>    but it is not easy to understand what the logic is.
>>>>>>>>>>>    Could you please add some comments at key places explaining it?
>>>>>>>>>>>    Then I'll check whether it is possible to make it a little bit
>>>>>>>>>>>    simpler.
>>>>>>>>>>>
>>>>>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>       Address nextCFA;
>>>>>>>>>>>       Address nextPC;
>>>>>>>>>>>
>>>>>>>>>>>       nextPC = getNextPC(false);
>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>         return null;
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>>>>>         try {
>>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>         } catch (DebuggerException e) {
>>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>         }
>>>>>>>>>>>         nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>       return (nextCFA == null) ? null
>>>>>>>>>>>                                : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>>   The above can be simplified if a DebuggerException cannot
>>>>>>>>>>>   be thrown from processDwarf(nextPC):
>>>>>>>>>>>
>>>>>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>         return null;
>>>>>>>>>>>       }
>>>>>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>
>>>>>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>>>>>         try {
>>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>           nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>         } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>         }
>>>>>>>>>>>       }
>>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>>
>>>>>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>>>>>         return javaSender(context);
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>         return null;
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>>       Address nextCFA;
>>>>>>>>>>>       DwarfParser nextDwarf = dwarf;
>>>>>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>         if (libptr == 0L) {
>>>>>>>>>>>           // Next frame might be Java frame
>>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>         }
>>>>>>>>>>>         try {
>>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>         } catch (DebuggerException e) {
>>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>         }
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>>       nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>>   This one can also be simplified a little:
>>>>>>>>>>>
>>>>>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>>
>>>>>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>>>>>         return javaSender(context);
>>>>>>>>>>>       }
>>>>>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>         return null;
>>>>>>>>>>>       }
>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>>           try {
>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>           }
>>>>>>>>>>>         }
>>>>>>>>>>>       }
>>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>> Finally, it looks like a single method could replace both
>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>>>>
>>>>>>>>>>>     private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>         return null;
>>>>>>>>>>>       }
>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>
>>>>>>>>>>>       if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>>         // libptr is declared only once, inside this block.
>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>>           try {
>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>           }
>>>>>>>>>>>         }
>>>>>>>>>>>       }
>>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java. It passes the
>>>>>>>>>>>> serviceability/sa tests and
>>>>>>>>>>>> all tests on the submit repo
>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>> Could you review the new webrev?
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>
>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in the System V
>>>>>>>>>>>>> Application Binary Interface AMD64
>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use the DWARF
>>>>>>>>>>>>> information in .eh_frame or .debug_frame
>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As noted in JDK-8022183, -fomit-frame-pointer has been enabled by
>>>>>>>>>>>>> default since GCC 4.6, so system
>>>>>>>>>>>>> libraries (e.g. libc) might be compiled with it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not use DWARF; it relies on
>>>>>>>>>>>>> the base pointer register (RBP),
>>>>>>>>>>>>> so some stack frames might be missing from its output.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>>>>>
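
The difference between the two unwinding strategies mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the SA implementation: the Memory map, the Frame struct and both unwind_* functions are hypothetical names, and target memory is modelled as a simple address-to-word map.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy model of target memory: address -> 8-byte word stored there (hypothetical).
using Memory = std::map<uint64_t, uint64_t>;

struct Frame {
  uint64_t cfa;            // canonical frame address
  uint64_t return_address; // PC to continue unwinding from
};

// Frame-pointer walk. Only meaningful when the function keeps RBP as a
// frame pointer. Assumed layout: [rbp] = caller's RBP, [rbp + 8] = return address.
Frame unwind_with_rbp(const Memory& mem, uint64_t rbp) {
  return Frame{rbp + 16, mem.at(rbp + 8)};
}

// CFI-based walk. CFA = value of the CFA register + offset (both would come
// from .eh_frame), and the return address is stored at CFA + ra_offset
// (typically -8 on x86-64).
Frame unwind_with_cfi(const Memory& mem, uint64_t cfa_register_value,
                      int64_t cfa_offset, int64_t ra_offset) {
  const uint64_t cfa = cfa_register_value + cfa_offset;
  return Frame{cfa, mem.at(cfa + ra_offset)};
}
```

When a library is built without a frame pointer, RBP is just another general-purpose register, so the first function reads an unrelated word, while the second one still finds the return address as long as the CFI rule for the current PC is known.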
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>

From suenaga at oss.nttdata.com  Wed Mar 11 06:03:59 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Mar 2020 15:03:59 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
 <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>
 <e8a16c8e-ff00-6985-dc3d-9450c6b489df@oss.nttdata.com>
 <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
Message-ID: <75c8f8ac-4557-03ab-77eb-f2383aa2b5f1@oss.nttdata.com>

Thanks David!

Can you share the native backtrace?
(Did /opt/core.sh collect it?)


Yasumasa


On 2020/03/11 14:59, David Holmes wrote:
> Hi Yasumasa,
> 
> Partial hs_err info below.
> 
> David
> -----
> 
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800
> #
> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-11-0447267.suenaga.source)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
> # Problematic frame:
> # C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
> #
> # Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798)
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> 
> ---------------  S U M M A R Y ------------
> 
> Command Line: 
> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar 
> -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug -Xms8m -Djdk.module.main=jdk.hotspot.agent jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770
> 
> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 0m 3s)
> 
> ---------------  T H R E A D  ---------------
> 
> Current thread (0x00007fdf5c032000):  JavaThread "main" [_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)]
> 
> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000],  sp=0x00007fdf63b9d190, free space=1020k
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0xc2291c]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac
> V  [libjvm.so+0xd31970]  jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370
> V  [libjvm.so+0xd36202]  jni_CallStaticVoidMethod+0x222
> C  [libjli.so+0x4bed]  JavaMain+0xbcd
> C  [libjli.so+0x80a9]  ThreadJavaMain+0x9
> 
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
> v  ~StubRoutines::call_stub
> 
> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79
> 
> Register to memory mapping:
> 
> RAX=0x00007f7e4dfe3229 is an unknown value
> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62
> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00
> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000
> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000
> RSI=0x0000000000000004 is an unknown value
> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00
> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01
> R10=0x00000000ffffffff is an unknown value
> R11=0x000000000100527a is an unknown value
> R12=0x00007fded5076b79 is an unknown value
> R13=0x00007f7da2f8e68a is an unknown value
> R14=0x00007f7dbdf62b1d is an unknown value
> R15=0x00007fdf5c032000 is a thread
> 
> 
> Registers:
> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85
> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080
> R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a
> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000
> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
>    TRAPNO=0x000000000000000e
> 
> Top of Stack: (sp=0x00007fdf63b9d190)
> 0x00007fdf63b9d190:   00007fdf209d0980 0000000000000000
> 0x00007fdf63b9d1a0:   00007fdf209d0980 00007fdf63b9d258
> 0x00007fdf63b9d1b0:   00007fdf63b9d228 00007fdf44778dbe
> 0x00007fdf63b9d1c0:   000000000146c380 00007fdf5c032000
> 
> Instructions: (pc=0x00007fdf2000e87c)
> 0x00007fdf2000e77c:   89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00
> 0x00007fdf2000e78c:   00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80
> 0x00007fdf2000e79c:   78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42
> 0x00007fdf2000e7ac:   01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48
> 0x00007fdf2000e7bc:   89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43
> 0x00007fdf2000e7cc:   3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43
> 0x00007fdf2000e7dc:   30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff
> 0x00007fdf2000e7ec:   ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e
> 0x00007fdf2000e7fc:   41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2
> 0x00007fdf2000e80c:   ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90
> 0x00007fdf2000e81c:   0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76
> 0x00007fdf2000e82c:   1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48
> 0x00007fdf2000e83c:   83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9
> 0x00007fdf2000e84c:   31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56
> 0x00007fdf2000e85c:   41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b
> 0x00007fdf2000e86c:   a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08
> 0x00007fdf2000e87c:   41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0
> 0x00007fdf2000e88c:   75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d
> 0x00007fdf2000e89c:   8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc
> 0x00007fdf2000e8ac:   48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03
> 0x00007fdf2000e8bc:   b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c
> 0x00007fdf2000e8cc:   39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8
> 0x00007fdf2000e8dc:   90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01
> 0x00007fdf2000e8ec:   0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48
> 0x00007fdf2000e8fc:   83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48
> 0x00007fdf2000e90c:   09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea
> 0x00007fdf2000e91c:   4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b
> 0x00007fdf2000e92c:   31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00
> 0x00007fdf2000e93c:   00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00
> 0x00007fdf2000e94c:   00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff
> 0x00007fdf2000e95c:   48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89
> 0x00007fdf2000e96c:   8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff
> 
> 
> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote:
>> Hi Kevin,
>>
>> I saw 2 errors on submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
>> So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).
>>
>>    The last change on the submit repo is here:
>>      http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/
>>
>> Can you share details on submit repo?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/11 11:07, Yasumasa Suenaga wrote:
>>> Hi Kevin,
>>>
>>> I guess the first program header in the libraries on your machine has the exec flag set (you can check with `readelf -l`).
>>> So I tweaked my patch in the new webrev: exec_start and exec_end are now initialized to -1.
>>>
>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
>>>
>>> This webrev also contains the fixes for your comments (the typo and DW_CFA_advance_loc4).
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/11 8:53, Kevin Walls wrote:
>>>> Hi -
>>>>
>>>> In testing I wasn't seeing any of the Dwarf code triggered.
>>>>
>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries...
>>>>
>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>>>
>>>>     if (fill_instr_info(newlib)) {
>>>>       if (!read_eh_frame(ph, newlib)) {
>>>>
>>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>>
>>>> output like:
>>>>
>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>>> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>>
>>>> (similar for all libraries).
>>>>
>>>> fill_instr_info fails if:
>>>>
>>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>>
>>>> ...but isn't exec_start relative to the library address? It's the value of ph->p_vaddr and it is often zero.
>>>>
>>>> I added some booleans and did:
>>>>
>>>>       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>>         lib->exec_start = ph->p_vaddr;
>>>>         found_start = true;
>>>>       }
>>>>
>>>> (similarly for end) and only failed if:
>>>>
>>>>   if (!found_start || !found_end) {
>>>>     return false;
>>>>
>>>> ...and now it's better.  I go from:
>>>>
>>>> ----------------- 3306 -----------------
>>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>
>>>> to:
>>>>
>>>> ----------------- 31127 -----------------
>>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>>> 0x000055af1b78db1c      main + 0x11c
>>>>
>>>>
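
The sentinel issue described above (0 is a legitimate p_vaddr, so it cannot also mean "not found yet") can be sketched as below. This is a minimal C++ illustration of the found-flag idea, not the libproc_impl.c code: Segment, ExecRange and find_exec_range() are hypothetical names standing in for the ELF program headers and lib_info fields.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for an ELF program header.
struct Segment {
  uint64_t vaddr;      // p_vaddr: often 0 for the first segment of a shared object
  uint64_t memsz;      // p_memsz
  bool     executable; // true when PF_X is set in p_flags
};

struct ExecRange {
  uint64_t start = 0;
  uint64_t end = 0;
  bool     found = false; // explicit flag, so start == 0 remains a valid result
};

// Returns the lowest start and the highest end over all executable segments.
ExecRange find_exec_range(const std::vector<Segment>& segments) {
  ExecRange range;
  for (const Segment& seg : segments) {
    if (!seg.executable) {
      continue;
    }
    const uint64_t seg_end = seg.vaddr + seg.memsz;
    if (!range.found || seg.vaddr < range.start) {
      range.start = seg.vaddr;
    }
    if (!range.found || seg_end > range.end) {
      range.end = seg_end;
    }
    range.found = true;
  }
  return range;
}
```

With the values from the debug output above (vaddr = 0x0, memsz = 0xaba4), a check of `start == 0` would reject the library, whereas the flag reports the range as found.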
>>>> Thanks
>>>> Kevin
>>>>
>>>>
>>>>
>>>>
>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>
>>>>> Hi Kevin,
>>>>>
>>>>> Thanks for your comment!
>>>>>
>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>> Hi Yasumasa ,
>>>>>>
>>>>>> The changes build OK for me in the latest jdk, and things still work.
>>>>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
>>>>>>
>>>>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.
>>>>>
>>>>> You can see the problem with JShell.
>>>>> Some Java frames would not be seen in mixed jstack.
>>>>>
>>>>>
>>>>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
>>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>
>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>> (It may "never" happen, but a nop could appear within some other instructions?)
>>>>>
>>>>> DW_CFA_nop is used for padding, so we can ignore it (return immediately).
>>>>>
>>>>>
>>>>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".
>>>>>
>>>>> I will fix it.
>>>>>
>>>>>
>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4.  I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts; that would mean a lot of code on one line. 8-)
>>>>>> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness.
>>>>>
>>>>> I will add DW_CFA_advance_loc4.
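
For reference, the four advance_loc forms differ only in how the delta is encoded. The sketch below is an illustration, not the dwarf.cpp code: the opcode values are the ones defined by the DWARF call frame encoding, while read_le() and apply_advance_loc() are hypothetical helper names, and a little-endian target (x86-64) is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Opcode values from the DWARF call frame information encoding.
constexpr uint8_t DW_CFA_advance_loc  = 0x40; // high two bits; delta in the low six bits
constexpr uint8_t DW_CFA_advance_loc1 = 0x02; // 1-byte delta operand
constexpr uint8_t DW_CFA_advance_loc2 = 0x03; // 2-byte delta operand
constexpr uint8_t DW_CFA_advance_loc4 = 0x04; // 4-byte delta operand

// Reads an n-byte little-endian unsigned value (x86-64 target assumed).
static uint64_t read_le(const uint8_t* p, size_t n) {
  uint64_t value = 0;
  for (size_t i = 0; i < n; i++) {
    value |= static_cast<uint64_t>(p[i]) << (8 * i);
  }
  return value;
}

// If the instruction at buf is one of the advance_loc forms, adds
// delta * code_alignment_factor to *loc and returns the number of bytes
// consumed. Returns 0 for any other opcode.
size_t apply_advance_loc(const uint8_t* buf, uint64_t code_alignment_factor,
                         uint64_t* loc) {
  const uint8_t op = buf[0];
  if ((op & 0xc0) == DW_CFA_advance_loc) {
    *loc += (op & 0x3f) * code_alignment_factor;
    return 1;
  }
  size_t operand_size;
  switch (op) {
    case DW_CFA_advance_loc1: operand_size = 1; break;
    case DW_CFA_advance_loc2: operand_size = 2; break;
    case DW_CFA_advance_loc4: operand_size = 4; break;
    default: return 0;
  }
  *loc += read_le(buf + 1, operand_size) * code_alignment_factor;
  return 1 + operand_size;
}
```

The 4-byte form is only needed when one row of the table covers more than 65535 code-alignment units, which is rare but possible.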
>>>>>
>>>>>
>>>>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>>>>>>
>>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
>>>>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>>>>>>
>>>>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.
>>>>>
>>>>> I referred to the GDB and binutils sources when creating this patch.
>>>>> They seem to process the data in a similar way, because the DWARF instructions have to be evaluated one by one to get the values that apply to the specified PC.
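
The alternative raised in the comment above, passing the read position explicitly instead of advancing a shared _buf member, could look roughly like this for the LEB128 helpers. It is a sketch under assumptions: read_uleb128() and read_sleb128() are illustrative names rather than the functions in dwarf.cpp, and the input is assumed to be well formed (no bounds checking).

```cpp
#include <cassert>
#include <cstdint>

// Decodes an unsigned LEB128 value at *cursor and advances *cursor past it.
uint64_t read_uleb128(const uint8_t** cursor) {
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *(*cursor)++;
    result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80); // high bit set means more bytes follow
  return result;
}

// Decodes a signed LEB128 value at *cursor and advances *cursor past it.
int64_t read_sleb128(const uint8_t** cursor) {
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *(*cursor)++;
    result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  if (shift < 64 && (byte & 0x40)) {
    result |= ~static_cast<uint64_t>(0) << shift; // sign-extend from the last group
  }
  return static_cast<int64_t>(result);
}
```

Because the caller owns the cursor, a helper such as a CIE reader can work on a local copy and leave the caller's position untouched, so no save-and-restore of a member is needed.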
>>>>>
>>>>>
>>>>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
>>>>>> OS-neutral directory?  It's odd to label it as Linux-specific.
>>>>>
>>>>> At least Windows does not use DWARF; it uses a different mechanism:
>>>>>
>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>> If they do, I can move the DWARF-related code to a posix directory.
>>>>>
>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>>>>>
>>>>>>
>>>>>> These are just comments to mainly say it looks good, and somebody else out there has read it.
>>>>>> I will look for a system that shows the problem, and get back to you again!
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> Many thanks
>>>>>> Kevin
>>>>>>
>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> webrev.03 cannot be applied to the current jdk/jdk because of the 8239224 and 8239462 changes (they updated the copyright year).
>>>>>>> So I modified the webrev (copyright year changes only) so that it applies to the current jdk/jdk.
>>>>>>> Could you review it?
>>>>>>>
>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>>
>>>>>>> I need one more reviewer to push.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>>> PING: Could you review it?
>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>
>>>>>>>> This change has been already reviewed by Serguei.
>>>>>>>> I need one more reviewer to push.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>>> PING: Could you review this change?
>>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>
>>>>>>>>> I believe this change helps troubleshooters with postmortem analysis.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>
>>>>>>>>>> I updated the webrev. I discussed it with Serguei off-list and refactored webrev.02.
>>>>>>>>>> It has passed the tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your comment!
>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new webrev.
>>>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry said.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> This change has passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> This is a nice move in general.
>>>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>>>>>>
>>>>>>>>>>>>   96     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>   97     if (libptr == 0L) { // Java frame
>>>>>>>>>>>>   98       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>   99       if (rbp == null) {
>>>>>>>>>>>>  100         return null;
>>>>>>>>>>>>  101       }
>>>>>>>>>>>>  102       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>  103     } else { // Native frame
>>>>>>>>>>>>  104       DwarfParser dwarf;
>>>>>>>>>>>>  105       try {
>>>>>>>>>>>>  106         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>  107       } catch (DebuggerException e) {
>>>>>>>>>>>>  108         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>  109         if (rbp == null) {
>>>>>>>>>>>>  110           return null;
>>>>>>>>>>>>  111         }
>>>>>>>>>>>>  112         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>  113       }
>>>>>>>>>>>>  114       dwarf.processDwarf(pc);
>>>>>>>>>>>>  115       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>  116                      !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>  117                         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>  118                         : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>  119                                  .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>  120       if (cfa == null) {
>>>>>>>>>>>>  121         return null;
>>>>>>>>>>>>  122       }
>>>>>>>>>>>>  123       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>  124     }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I'd suggest simplifying the logic by refactoring to something like the following:
>>>>>>>>>>>>
>>>>>>>>>>>>            long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>            Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>>>>            DwarfParser dwarf = null;
>>>>>>>>>>>>
>>>>>>>>>>>>            if (libptr != 0L) { // Native frame
>>>>>>>>>>>>              try {
>>>>>>>>>>>>                dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>                dwarf.processDwarf(pc);
>>>>>>>>>>>>                cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>                       !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>                          ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>                          : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>                                   .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>
>>>>>>>>>>>>              } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>>>              }
>>>>>>>>>>>>            }
>>>>>>>>>>>>            if (cfa == null) {
>>>>>>>>>>>>              return null;
>>>>>>>>>>>>            }
>>>>>>>>>>>>            return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>>>>
>>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>>>
>>>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>>>
>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>>>
>>>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>>>
>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>>>
>>>>>>>>>>>>    It feels like the logic should be refactored/simplified somehow, as
>>>>>>>>>>>>    several typical fragments appear in slightly different contexts,
>>>>>>>>>>>>    but it is not easy to understand what the logic is.
>>>>>>>>>>>>    Could you please add some comments at key places explaining this logic?
>>>>>>>>>>>>    Then I'll check whether it is possible to make it a little simpler.
>>>>>>>>>>>>
>>>>>>>>>>>>  109   private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>  110     Address nextCFA;
>>>>>>>>>>>>  111     Address nextPC;
>>>>>>>>>>>>  112
>>>>>>>>>>>>  113     nextPC = getNextPC(false);
>>>>>>>>>>>>  114     if (nextPC == null) {
>>>>>>>>>>>>  115       return null;
>>>>>>>>>>>>  116     }
>>>>>>>>>>>>  117
>>>>>>>>>>>>  118     DwarfParser nextDwarf = null;
>>>>>>>>>>>>  119     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>  120     if (libptr != 0L) { // Native frame
>>>>>>>>>>>>  121       try {
>>>>>>>>>>>>  122         nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>  123       } catch (DebuggerException e) {
>>>>>>>>>>>>  124         nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>  125         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>  126       }
>>>>>>>>>>>>  127       nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>  128     }
>>>>>>>>>>>>  129
>>>>>>>>>>>>  130     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>  131     return (nextCFA == null) ? null
>>>>>>>>>>>>  132                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>  133   }
>>>>>>>>>>>>
>>>>>>>>>>>>   The above can be simplified if a DebuggerException cannot be thrown from processDwarf(nextPC):
>>>>>>>>>>>>       private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>           return null;
>>>>>>>>>>>>         }
>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>
>>>>>>>>>>>>         if (libptr != 0L) { // Native frame
>>>>>>>>>>>>           try {
>>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>           }
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>>  135   public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>  136     ThreadContext context = thread.getContext();
>>>>>>>>>>>>  137
>>>>>>>>>>>>  138     if (dwarf == null) { // Java frame
>>>>>>>>>>>>  139       return javaSender(context);
>>>>>>>>>>>>  140     }
>>>>>>>>>>>>  141
>>>>>>>>>>>>  142     Address nextPC = getNextPC(true);
>>>>>>>>>>>>  143     if (nextPC == null) {
>>>>>>>>>>>>  144       return null;
>>>>>>>>>>>>  145     }
>>>>>>>>>>>>  146
>>>>>>>>>>>>  147     Address nextCFA;
>>>>>>>>>>>>  148     DwarfParser nextDwarf = dwarf;
>>>>>>>>>>>>  149     if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>  150       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>  151       if (libptr == 0L) {
>>>>>>>>>>>>  152         // Next frame might be Java frame
>>>>>>>>>>>>  153         nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>  154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>  155       }
>>>>>>>>>>>>  156       try {
>>>>>>>>>>>>  157         nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>  158       } catch (DebuggerException e) {
>>>>>>>>>>>>  159         nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>  160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>  161       }
>>>>>>>>>>>>  162     }
>>>>>>>>>>>>  163
>>>>>>>>>>>>  164     nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>  165     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>  166     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>  167   }
>>>>>>>>>>>>
>>>>>>>>>>>>   This one can also be simplified a little:
>>>>>>>>>>>>
>>>>>>>>>>>>       public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>
>>>>>>>>>>>>         if (dwarf == null) { // Java frame
>>>>>>>>>>>>           return javaSender(context);
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextPC = getNextPC(true);
>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>           return null;
>>>>>>>>>>>>         }
>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>         if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>             try {
>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>             }
>>>>>>>>>>>>           }
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>>>>>
>>>>>>>>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>           return null;
>>>>>>>>>>>>         }
>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>
>>>>>>>>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>             try {
>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>             }
>>>>>>>>>>>>           }
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java. It works fine in the serviceability/sa tests and
>>>>>>>>>>>>> in all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>> Could you review the new webrev?
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> According to section 2.7, "Stack Unwind Algorithm", of the System V Application Binary Interface AMD64
>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As JDK-8022183 says, omit-frame-pointer has been enabled by default since GCC 4.6, so system
>>>>>>>>>>>>>> libraries (e.g. libc) might be compiled with this option.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not use DWARF; it relies on the base pointer register (RBP),
>>>>>>>>>>>>>> so some stack frames might be missing from its output.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>>>>>>
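The difference between RBP-based and DWARF-based unwinding described above can be illustrated with a small sketch. It is only a model under stated assumptions: the types and functions (regs_t, unwind_rule_t, cfa_of, return_address) are hypothetical and are not the libsaproc implementation, and the stack is simulated by an array. The rule "CFA = register + offset, return address at CFA - 8" is the kind of result the DWARF call frame instructions yield for a given PC on AMD64.

```c
#include <stdint.h>

/* Hypothetical, simplified model; not the libsaproc data structures. */
typedef struct {
  uint64_t rsp;  /* stack pointer of the current frame */
  uint64_t rbp;  /* base pointer; arbitrary if the frame pointer is omitted */
} regs_t;

typedef struct {
  int     cfa_from_rsp;  /* 1: CFA = RSP + offset, 0: CFA = RBP + offset */
  int64_t cfa_offset;
  int64_t ra_offset;     /* return address is stored at CFA + ra_offset */
} unwind_rule_t;

/* Reads a 64-bit slot from a simulated stack whose first slot is at 'base'. */
static uint64_t read_slot(const uint64_t *stack, uint64_t base, uint64_t addr) {
  return stack[(addr - base) / 8];
}

/* Canonical frame address according to the unwind rule. */
uint64_t cfa_of(const regs_t *r, const unwind_rule_t *rule) {
  uint64_t reg = rule->cfa_from_rsp ? r->rsp : r->rbp;
  return reg + (uint64_t)rule->cfa_offset;
}

/* Return address of the current frame, located relative to the CFA. */
uint64_t return_address(const uint64_t *stack, uint64_t base,
                        const regs_t *r, const unwind_rule_t *rule) {
  return read_slot(stack, base, cfa_of(r, rule) + (uint64_t)rule->ra_offset);
}
```

With a rule based on RSP, the caller can be found even when RBP holds an unrelated value, which is the case that RBP-only unwinding cannot handle.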
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>

From david.holmes at oracle.com  Wed Mar 11 06:20:52 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 11 Mar 2020 16:20:52 +1000
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <75c8f8ac-4557-03ab-77eb-f2383aa2b5f1@oss.nttdata.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
 <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>
 <e8a16c8e-ff00-6985-dc3d-9450c6b489df@oss.nttdata.com>
 <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
 <75c8f8ac-4557-03ab-77eb-f2383aa2b5f1@oss.nttdata.com>
Message-ID: <89d854f1-6c28-3976-ba3f-33e3e8cb6012@oracle.com>

On 11/03/2020 4:03 pm, Yasumasa Suenaga wrote:
> Thanks David!
> 
> Can you share native backtrace?
> (Did /opt/core.sh collect it?)

There is a core file but I can't process it, sorry.

David
-----

> 
> Yasumasa
> 
> 
> On 2020/03/11 14:59, David Holmes wrote:
>> Hi Yasumasa,
>>
>> Partial hs_err info below.
>>
>> David
>> -----
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800
>> #
>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 
>> 15-internal+0-2020-03-11-0447267.suenaga.source)
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>> 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, 
>> tiered, compressed oops, g1 gc, linux-amd64)
>> # Problematic frame:
>> # C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
>> #
>> # Core dump will be written. Default location: Core dumps may be 
>> processed with "/opt/core.sh %p" (or dumping to 
>> /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798) 
>>
>> #
>> # If you would like to submit a bug report, please visit:
>> #   https://bugreport.java.com/bugreport/crash.jsp
>> # The crash happened outside the Java Virtual Machine in native code.
>> # See problematic frame for where to report the bug.
>> #
>>
>> ---------------  S U M M A R Y ------------
>>
>> Command Line: 
>> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar 
>> -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug 
>> -Xms8m -Djdk.module.main=jdk.hotspot.agent 
>> jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770
>>
>> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 
>> 0h 0m 3s)
>>
>> ---------------  T H R E A D  ---------------
>>
>> Current thread (0x00007fdf5c032000):? JavaThread "main" 
>> [_thread_in_native, id=29800, 
>> stack(0x00007fdf63a9e000,0x00007fdf63b9f000)]
>>
>> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000],  
>> sp=0x00007fdf63b9d190, free space=1020k
>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0xc2291c]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac
>> V  [libjvm.so+0xd31970]  jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370
>> V  [libjvm.so+0xd36202]  jni_CallStaticVoidMethod+0x222
>> C  [libjli.so+0x4bed]  JavaMain+0xbcd
>> C  [libjli.so+0x80a9]  ThreadJavaMain+0x9
>>
>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
>> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
>> v  ~StubRoutines::call_stub
>>
>> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 
>> 0x00007fded5076b79
>>
>> Register to memory mapping:
>>
>> RAX=0x00007f7e4dfe3229 is an unknown value
>> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 
>> d4 de 7f 00 00
>> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 
>> 72 2f 6c 69 62
>> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00
>> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 
>> 0x00007fdf5c032000
>> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 
>> 0x00007fdf5c032000
>> RSI=0x0000000000000004 is an unknown value
>> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 
>> d4 de 7f 00 00
>> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 
>> 00 00 00 00 00
>> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 
>> 01 78 10 01
>> R10=0x00000000ffffffff is an unknown value
>> R11=0x000000000100527a is an unknown value
>> R12=0x00007fded5076b79 is an unknown value
>> R13=0x00007f7da2f8e68a is an unknown value
>> R14=0x00007f7dbdf62b1d is an unknown value
>> R15=0x00007fdf5c032000 is a thread
>>
>>
>> Registers:
>> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, 
>> RCX=0x00007fded4072380, RDX=0x00007fded4076b85
>> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, 
>> RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080
>> R8 =0x000000000146c380, R9 =0x00007fded4076b79, 
>> R10=0x00000000ffffffff, R11=0x000000000100527a
>> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, 
>> R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000
>> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, 
>> CSGSFS=0x002b000000000033, ERR=0x0000000000000004
>>   TRAPNO=0x000000000000000e
>>
>> Top of Stack: (sp=0x00007fdf63b9d190)
>> 0x00007fdf63b9d190:   00007fdf209d0980 0000000000000000
>> 0x00007fdf63b9d1a0:   00007fdf209d0980 00007fdf63b9d258
>> 0x00007fdf63b9d1b0:   00007fdf63b9d228 00007fdf44778dbe
>> 0x00007fdf63b9d1c0:   000000000146c380 00007fdf5c032000
>>
>> Instructions: (pc=0x00007fdf2000e87c)
>> 0x00007fdf2000e77c:   89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00
>> 0x00007fdf2000e78c:   00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80
>> 0x00007fdf2000e79c:   78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42
>> 0x00007fdf2000e7ac:   01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48
>> 0x00007fdf2000e7bc:   89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43
>> 0x00007fdf2000e7cc:   3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43
>> 0x00007fdf2000e7dc:   30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff
>> 0x00007fdf2000e7ec:   ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e
>> 0x00007fdf2000e7fc:   41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2
>> 0x00007fdf2000e80c:   ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90
>> 0x00007fdf2000e81c:   0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76
>> 0x00007fdf2000e82c:   1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48
>> 0x00007fdf2000e83c:   83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9
>> 0x00007fdf2000e84c:   31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56
>> 0x00007fdf2000e85c:   41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b
>> 0x00007fdf2000e86c:   a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08
>> 0x00007fdf2000e87c:   41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0
>> 0x00007fdf2000e88c:   75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d
>> 0x00007fdf2000e89c:   8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc
>> 0x00007fdf2000e8ac:   48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03
>> 0x00007fdf2000e8bc:   b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c
>> 0x00007fdf2000e8cc:   39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8
>> 0x00007fdf2000e8dc:   90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01
>> 0x00007fdf2000e8ec:   0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48
>> 0x00007fdf2000e8fc:   83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48
>> 0x00007fdf2000e90c:   09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea
>> 0x00007fdf2000e91c:   4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b
>> 0x00007fdf2000e92c:   31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00
>> 0x00007fdf2000e93c:   00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00
>> 0x00007fdf2000e94c:   00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff
>> 0x00007fdf2000e95c:   48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89
>> 0x00007fdf2000e96c:   8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff
>>
>>
>> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote:
>>> Hi Kevin,
>>>
>>> I saw 2 errors on the submit repo 
>>> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
>>> So I tweaked my patch, but I saw the crash again 
>>> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).
>>>
>>>   The last change on the submit repo is here:
>>>     http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/
>>>
>>> Can you share the details of the failure on the submit repo?
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/11 11:07, Yasumasa Suenaga wrote:
>>>> Hi Kevin,
>>>>
>>>> I guess the first program header in the libraries on your machine 
>>>> has the exec flag (you can check it with `readelf -l`).
>>>> So I tweaked my patch in a new webrev: the initial value of 
>>>> exec_start and exec_end is now -1.
>>>>
>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
>>>>
>>>> This webrev also contains the fixes for your comments (the typo and 
>>>> DW_CFA_advance_loc4).
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/11 8:53, Kevin Walls wrote:
>>>>> Hi -
>>>>>
>>>>> In testing I wasn't seeing any of the Dwarf code triggered.
>>>>>
>>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable 
>>>>> section in" for lots of / maybe all the libraries...
>>>>>
>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>>>>
>>>>>     if (fill_instr_info(newlib)) {
>>>>>       if (!read_eh_frame(ph, newlib)) {
>>>>>
>>>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>>>
>>>>> output like:
>>>>>
>>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>>>> libsaproc DEBUG: Could not find executable section in 
>>>>> /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>>>
>>>>> (similar for all libraries).
>>>>>
>>>>> fill_instr_info fails if:
>>>>>
>>>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>>>
>>>>> ...but isn't exec_start relative to the library address? It's the 
>>>>> value of ph->p_vaddr and it is often zero.
>>>>>
>>>>> I added some booleans and did:
>>>>>
>>>>>       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>>>         lib->exec_start = ph->p_vaddr;
>>>>>         found_start = true;
>>>>>       }
>>>>>
>>>>> (similarly for end) and only failed if:
>>>>>
>>>>>   if (!found_start || !found_end) {
>>>>>     return false;
>>>>>
>>>>> ...and now it's better. I go from:
>>>>>
>>>>> ----------------- 3306 -----------------
>>>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>>
>>>>> to:
>>>>>
>>>>> ----------------- 31127 -----------------
>>>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>>>> 0x000055af1b78db1c      main + 0x11c
>>>>>
>>>>>
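The point about zero being a legitimate p_vaddr can be shown with a small sketch that tracks "found" explicitly instead of using 0 as a sentinel. This is an illustration only: segment_t and find_exec_range are hypothetical names, and the struct is a stand-in for the ELF program header fields, not the code in libproc_impl.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ELF program header; only the fields used here. */
typedef struct {
  bool     executable;  /* corresponds to PF_X being set in p_flags */
  uint64_t vaddr;       /* p_vaddr, relative to the library load address */
  uint64_t memsz;       /* p_memsz */
} segment_t;

/* Computes the range [*start, *end) covering all executable segments.
   Returns false only if there is no executable segment at all, so a
   segment that starts at vaddr 0 is handled like any other. */
bool find_exec_range(const segment_t *segs, size_t n,
                     uint64_t *start, uint64_t *end) {
  bool found = false;
  for (size_t i = 0; i < n; i++) {
    if (!segs[i].executable) {
      continue;
    }
    uint64_t s = segs[i].vaddr;
    uint64_t e = segs[i].vaddr + segs[i].memsz;
    if (!found || s < *start) {
      *start = s;
    }
    if (!found || e > *end) {
      *end = e;
    }
    found = true;
  }
  return found;
}
```

For the segment from the debug output above (vaddr = 0x0, memsz = 0xaba4, assumed executable here), this reports success with the range starting at 0 rather than treating the library as having no executable section.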
>>>>> Thanks
>>>>> Kevin
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>>
>>>>>> Hi Kevin,
>>>>>>
>>>>>> Thanks for your comment!
>>>>>>
>>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>>> Hi Yasumasa ,
>>>>>>>
>>>>>>> The changes build OK for me in the latest jdk, and things still 
>>>>>>> work.
>>>>>>> I have not yet seen the dwarf usage in action: I've tried a 
>>>>>>> couple of different systems and so far have not reproduced the 
>>>>>>> problem, i.e. jstack has not failed on native frames.
>>>>>>>
>>>>>>> I may need more recent basic libraries, will look again for 
>>>>>>> somewhere where the problem happens and get back to you as I 
>>>>>>> really want to run the changes.
>>>>>>
>>>>>> You can see the problem with JShell.
>>>>>> Some Java frames would not be seen in mixed jstack.
>>>>>>
>>>>>>
>>>>>>> I have mostly minor other comments which don't need a new webrev, 
>>>>>>> some just comments for the future:
>>>>>>>
>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>>
>>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>>> (It may "never" happen, but a nop could appear within some other 
>>>>>>> instructions?)
>>>>>>
>>>>>> DW_CFA_nop is used for padding, so we can ignore it (return 
>>>>>> immediately).
>>>>>>
>>>>>>
>>>>>>> DW_CFA_remember_state: a minor typo in the comment, 
>>>>>>> "DW_CFA_remenber_state".
>>>>>>
>>>>>> I will fix it.
>>>>>>
>>>>>>
>>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not 
>>>>>>> DW_CFA_advance_loc4.  I thought that was odd, but maybe addresses 
>>>>>>> in these tables never increase by 4-byte amounts; that would mean 
>>>>>>> a lot of code on one line. 8-)
>>>>>>> So maybe it's never used in practice. If you think it's 
>>>>>>> unnecessary, no problem; maybe add a comment, or add it for robustness.
>>>>>>
>>>>>> I will add DW_CFA_advance_loc4.
>>>>>>
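For reference, the advance_loc family discussed above can be sketched as follows. This is only an illustrative sketch, not the code in dwarf.cpp: the helper name `decode_advance_loc` and its signature are hypothetical, and it assumes little-endian operands, as on AMD64. The opcode values are the ones defined for DWARF call frame instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values from the DWARF call frame instruction encoding. */
enum {
  DW_CFA_nop          = 0x00,
  DW_CFA_advance_loc1 = 0x02,
  DW_CFA_advance_loc2 = 0x03,
  DW_CFA_advance_loc4 = 0x04,
  DW_CFA_advance_loc  = 0x40  /* high two bits; low six bits hold the delta */
};

/*
 * Hypothetical helper: decode the instruction at buf if it is one of the
 * advance_loc variants. On success, stores delta * code_align in *delta and
 * returns the number of bytes consumed. Returns 0 for any other opcode or
 * for a truncated operand.
 */
static size_t decode_advance_loc(const uint8_t *buf, size_t len,
                                 uint64_t code_align, uint64_t *delta) {
  assert(buf != NULL && delta != NULL);
  if (len == 0) {
    return 0;
  }

  uint8_t op = buf[0];
  if ((op & 0xc0) == DW_CFA_advance_loc) {
    /* Compact form: the delta is packed into the opcode byte itself. */
    *delta = (uint64_t)(op & 0x3f) * code_align;
    return 1;
  }

  size_t operand_size;
  switch (op) {
    case DW_CFA_advance_loc1: operand_size = 1; break;
    case DW_CFA_advance_loc2: operand_size = 2; break;
    case DW_CFA_advance_loc4: operand_size = 4; break;
    default: return 0;  /* not an advance_loc instruction */
  }
  if (len < 1 + operand_size) {
    return 0;  /* operand would run past the end of the buffer */
  }

  /* Assemble the operand byte by byte (little-endian assumption). */
  uint64_t raw = 0;
  for (size_t i = 0; i < operand_size; i++) {
    raw |= (uint64_t)buf[1 + i] << (8 * i);
  }
  *delta = raw * code_align;
  return 1 + operand_size;
}
```

The three wide forms differ only in operand width, so handling DW_CFA_advance_loc4 costs one extra case even if compilers rarely emit it.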
>>>>>>
>>>>>>> General-purpose methods like read_leb128(), get_entry_length(), 
>>>>>>> get_decoded_value() specifically update the _buf pointer in this 
>>>>>>> DwarfParser.
>>>>>>>
>>>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>>>> It calls process_cie() which reads, moves _buf and restores it to 
>>>>>>> the original position, then we read augmentation_length from 
>>>>>>> where _buf is.
>>>>>>> I'm not sure if that's wrong, or if I just need to read again 
>>>>>>> about the CIE/etc layout.
>>>>>>>
>>>>>>> I don't really want to suggest making the code pass around a 
>>>>>>> current _buf for the invocation of these general purpose methods, 
>>>>>>> but just wanted to comment that if these get used more widely 
>>>>>>> that might become necessary.
>>>>>>
>>>>>> I referred to the GDB and binutils sources when creating this patch.
>>>>>> They seem to use similar code, because we need to evaluate 
>>>>>> DWARF instructions one by one to get the value which relates to 
>>>>>> the specified PC.
>>>>>>
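To illustrate the alternative mentioned above (passing the current position explicitly instead of updating a shared _buf member), here is a minimal sketch of an unsigned LEB128 reader. The function name `read_uleb128` and its cursor/end parameters are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: read one unsigned LEB128 value starting at *cursor
 * and advance *cursor past it. Reading stops at end, so a truncated value
 * cannot run off the buffer.
 */
static uint64_t read_uleb128(const uint8_t **cursor, const uint8_t *end) {
  assert(cursor != NULL && *cursor != NULL);
  uint64_t result = 0;
  unsigned shift = 0;

  while (*cursor < end) {
    uint8_t byte = *(*cursor)++;
    if (shift < 64) {
      /* Each byte contributes its low seven bits, least significant first. */
      result |= (uint64_t)(byte & 0x7f) << shift;
    }
    shift += 7;
    if ((byte & 0x80) == 0) {
      break;  /* a clear high bit marks the last byte of the value */
    }
  }
  return result;
}
```

Because the caller owns the cursor, the same helper could be used on any buffer without saving and restoring parser state.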
>>>>>>
>>>>>>> Similarly in future, if this DWARF support code became used more 
>>>>>>> widely, it might want to move to an
>>>>>>> OS-neutral directory?  It's odd to label it as Linux-specific.
>>>>>>
>>>>>> Windows, at least, does not use DWARF; it uses a different mechanism.
>>>>>>
>>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>>>
>>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>>> If they do, I can move the DWARF-related code to the posix 
>>>>>> directory.
>>>>>>
>>>>>>
>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>>> Thanks for changing "can_parsable" which was in the earlier 
>>>>>>> version. 8-)
>>>>>>>
>>>>>>>
>>>>>>> These are just comments to mainly say it looks good, and somebody 
>>>>>>> else out there has read it.
>>>>>>> I will look for a system that shows the problem, and get back to 
>>>>>>> you again!
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> Many thanks
>>>>>>> Kevin
>>>>>>>
>>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 
>>>>>>>> and 8239462 changes (they updated copyright year).
>>>>>>>> So I modified webrev (only copyright year changes) to be able to 
>>>>>>>> apply to current jdk/jdk.
>>>>>>>> Could you review it?
>>>>>>>>
>>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>>>
>>>>>>>> I need one more reviewer to push.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>>>> PING: Could you review it?
>>>>>>>>>
>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>    webrev: 
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>
>>>>>>>>> This change has been already reviewed by Serguei.
>>>>>>>>> I need one more reviewer to push.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>>>> PING: Could you review this change?
>>>>>>>>>>
>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>    webrev: 
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>
>>>>>>>>>> I believe this change helps troubleshooters with 
>>>>>>>>>> postmortem analysis.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>>
>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>    webrev: 
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>>
>>>>>>>>>>> I updated the webrev. I discussed it with Serguei off-list, and I 
>>>>>>>>>>> refactored webrev.02.
>>>>>>>>>>> It has passed tests on submit repo 
>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your comment!
>>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new webrev.
>>>>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as 
>>>>>>>>>>>> Dmitry said.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is a nice move in general.
>>>>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  96     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>  97     if (libptr == 0L) { // Java frame
>>>>>>>>>>>>>  98       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>>  99       if (rbp == null) {
>>>>>>>>>>>>> 100         return null;
>>>>>>>>>>>>> 101       }
>>>>>>>>>>>>> 102       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>> 103     } else { // Native frame
>>>>>>>>>>>>> 104       DwarfParser dwarf;
>>>>>>>>>>>>> 105       try {
>>>>>>>>>>>>> 106         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>> 107       } catch (DebuggerException e) {
>>>>>>>>>>>>> 108         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>> 109         if (rbp == null) {
>>>>>>>>>>>>> 110           return null;
>>>>>>>>>>>>> 111         }
>>>>>>>>>>>>> 112         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>> 113       }
>>>>>>>>>>>>> 114       dwarf.processDwarf(pc);
>>>>>>>>>>>>> 115       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>> 116                      !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>> 117         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>> 118         : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>> 119                  .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>> 120       if (cfa == null) {
>>>>>>>>>>>>> 121         return null;
>>>>>>>>>>>>> 122       }
>>>>>>>>>>>>> 123       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>> 124     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd suggest to simplify the logic by refactoring to 
>>>>>>>>>>>>> something like below:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>>>>>     DwarfParser dwarf = null;
>>>>>>>>>>>>>
>>>>>>>>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>       try {
>>>>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>         dwarf.processDwarf(pc);
>>>>>>>>>>>>>         // Assign to the cfa declared above; redeclaring it here would not compile.
>>>>>>>>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>>           ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>>           : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>>                    .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     if (cfa == null) {
>>>>>>>>>>>>>       return null;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>>>>
>>>>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- 
>>>>>>>>>>>>> nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>>>>
>>>>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, 
>>>>>>>>>>>>> ThreadContext context) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>    It feels like the logic should be 
>>>>>>>>>>>>>    refactored/simplified somehow, as
>>>>>>>>>>>>>    several typical fragments appear in slightly different 
>>>>>>>>>>>>>    contexts.
>>>>>>>>>>>>>    But it is not easy to understand what it does.
>>>>>>>>>>>>>    Could you please add some comments at key places 
>>>>>>>>>>>>>    explaining this logic?
>>>>>>>>>>>>>    Then I'll check whether it is possible to make it a little 
>>>>>>>>>>>>>    bit simpler.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 109   private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>> 110     Address nextCFA;
>>>>>>>>>>>>> 111     Address nextPC;
>>>>>>>>>>>>> 112
>>>>>>>>>>>>> 113     nextPC = getNextPC(false);
>>>>>>>>>>>>> 114     if (nextPC == null) {
>>>>>>>>>>>>> 115       return null;
>>>>>>>>>>>>> 116     }
>>>>>>>>>>>>> 117
>>>>>>>>>>>>> 118     DwarfParser nextDwarf = null;
>>>>>>>>>>>>> 119     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>> 120     if (libptr != 0L) { // Native frame
>>>>>>>>>>>>> 121       try {
>>>>>>>>>>>>> 122         nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>> 123       } catch (DebuggerException e) {
>>>>>>>>>>>>> 124         nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>> 125         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>> 126       }
>>>>>>>>>>>>> 127       nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>> 128     }
>>>>>>>>>>>>> 129
>>>>>>>>>>>>> 130     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>> 131     return (nextCFA == null) ? null
>>>>>>>>>>>>> 132       : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>> 133   }
>>>>>>>>>>>>>
>>>>>>>>>>>>>   The above can be simplified if a DebuggerException cannot 
>>>>>>>>>>>>>   be thrown from processDwarf(nextPC):
>>>>>>>>>>>>>       private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>>
>>>>>>>>>>>>>         if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>           try {
>>>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>           }
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>> 135   public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>> 136     ThreadContext context = thread.getContext();
>>>>>>>>>>>>> 137
>>>>>>>>>>>>> 138     if (dwarf == null) { // Java frame
>>>>>>>>>>>>> 139       return javaSender(context);
>>>>>>>>>>>>> 140     }
>>>>>>>>>>>>> 141
>>>>>>>>>>>>> 142     Address nextPC = getNextPC(true);
>>>>>>>>>>>>> 143     if (nextPC == null) {
>>>>>>>>>>>>> 144       return null;
>>>>>>>>>>>>> 145     }
>>>>>>>>>>>>> 146
>>>>>>>>>>>>> 147     Address nextCFA;
>>>>>>>>>>>>> 148     DwarfParser nextDwarf = dwarf;
>>>>>>>>>>>>> 149     if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>> 150       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>> 151       if (libptr == 0L) {
>>>>>>>>>>>>> 152         // Next frame might be Java frame
>>>>>>>>>>>>> 153         nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>> 154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>> 155       }
>>>>>>>>>>>>> 156       try {
>>>>>>>>>>>>> 157         nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>> 158       } catch (DebuggerException e) {
>>>>>>>>>>>>> 159         nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>> 160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>> 161       }
>>>>>>>>>>>>> 162     }
>>>>>>>>>>>>> 163
>>>>>>>>>>>>> 164     nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>> 165     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>> 166     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>> 167   }
>>>>>>>>>>>>>
>>>>>>>>>>>>>   This one can also be simplified a little:
>>>>>>>>>>>>>
>>>>>>>>>>>>>       public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>>
>>>>>>>>>>>>>         if (dwarf == null) { // Java frame
>>>>>>>>>>>>>           return javaSender(context);
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         Address nextPC = getNextPC(true);
>>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>>         if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>>             try {
>>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>             }
>>>>>>>>>>>>>           }
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext 
>>>>>>>>>>>>> context):
>>>>>>>>>>>>>
>>>>>>>>>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>>
>>>>>>>>>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>>             try {
>>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>             }
>>>>>>>>>>>>>           }
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in 
>>>>>>>>>>>>>> serviceability/sa tests and
>>>>>>>>>>>>>> all tests on submit repo 
>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>>> Could you review new webrev?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>>    webrev: 
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> According to section 2.7, "Stack Unwind Algorithm", of the System V 
>>>>>>>>>>>>>>> Application Binary Interface AMD64
>>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use 
>>>>>>>>>>>>>>> the DWARF data in .eh_frame or .debug_frame
>>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer has been enabled by 
>>>>>>>>>>>>>>> default since GCC 4.6, so system
>>>>>>>>>>>>>>> libraries (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not use DWARF; it relies on 
>>>>>>>>>>>>>>> the base pointer register (RBP).
>>>>>>>>>>>>>>> So some stack frames might be missing from its output.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] 
>>>>>>>>>>>>>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>

From suenaga at oss.nttdata.com  Wed Mar 11 06:48:26 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Mar 2020 15:48:26 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <89d854f1-6c28-3976-ba3f-33e3e8cb6012@oracle.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
 <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>
 <e8a16c8e-ff00-6985-dc3d-9450c6b489df@oss.nttdata.com>
 <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
 <75c8f8ac-4557-03ab-77eb-f2383aa2b5f1@oss.nttdata.com>
 <89d854f1-6c28-3976-ba3f-33e3e8cb6012@oracle.com>
Message-ID: <cc22ae57-2052-da58-ad9a-355dac4ac2af@oss.nttdata.com>

On 2020/03/11 15:20, David Holmes wrote:
> On 11/03/2020 4:03 pm, Yasumasa Suenaga wrote:
>> Thanks David!
>>
>> Can you share native backtrace?
>> (Did /opt/core.sh collect it?)
> 
> There is a core file but I can't process it, sorry.

Can you share the entire hs_err log and the libsaproc.so from this test?
I cannot reproduce the crash on my laptop.


Yasumasa


> David
> -----
> 
>>
>> Yasumasa
>>
>>
>> On 2020/03/11 14:59, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> Partial hs_err info below.
>>>
>>> David
>>> -----
>>>
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> #  SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800
>>> #
>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-11-0447267.suenaga.source)
>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>> # Problematic frame:
>>> # C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
>>> #
>>> # Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798)
>>> #
>>> # If you would like to submit a bug report, please visit:
>>> #   https://bugreport.java.com/bugreport/crash.jsp
>>> # The crash happened outside the Java Virtual Machine in native code.
>>> # See problematic frame for where to report the bug.
>>> #
>>>
>>> ---------------  S U M M A R Y ------------
>>>
>>> Command Line: 
>>> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar 
>>> -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug -Xms8m -Djdk.module.main=jdk.hotspot.agent jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770
>>>
>>> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 0m 3s)
>>>
>>> ---------------  T H R E A D  ---------------
>>>
>>> Current thread (0x00007fdf5c032000):  JavaThread "main" [_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)]
>>>
>>> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000], sp=0x00007fdf63b9d190, free space=1020k
>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>> C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
>>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0xc2291c]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac
>>> V  [libjvm.so+0xd31970]  jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370
>>> V  [libjvm.so+0xd36202]  jni_CallStaticVoidMethod+0x222
>>> C  [libjli.so+0x4bed]  JavaMain+0xbcd
>>> C  [libjli.so+0x80a9]  ThreadJavaMain+0x9
>>>
>>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
>>> v  ~StubRoutines::call_stub
>>>
>>> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79
>>>
>>> Register to memory mapping:
>>>
>>> RAX=0x00007f7e4dfe3229 is an unknown value
>>> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
>>> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62
>>> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00
>>> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000
>>> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000
>>> RSI=0x0000000000000004 is an unknown value
>>> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
>>> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00
>>> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01
>>> R10=0x00000000ffffffff is an unknown value
>>> R11=0x000000000100527a is an unknown value
>>> R12=0x00007fded5076b79 is an unknown value
>>> R13=0x00007f7da2f8e68a is an unknown value
>>> R14=0x00007f7dbdf62b1d is an unknown value
>>> R15=0x00007fdf5c032000 is a thread
>>>
>>>
>>> Registers:
>>> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85
>>> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080
>>> R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a
>>> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000
>>> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
>>>    TRAPNO=0x000000000000000e
>>>
>>> Top of Stack: (sp=0x00007fdf63b9d190)
>>> 0x00007fdf63b9d190:   00007fdf209d0980 0000000000000000
>>> 0x00007fdf63b9d1a0:   00007fdf209d0980 00007fdf63b9d258
>>> 0x00007fdf63b9d1b0:   00007fdf63b9d228 00007fdf44778dbe
>>> 0x00007fdf63b9d1c0:   000000000146c380 00007fdf5c032000
>>>
>>> Instructions: (pc=0x00007fdf2000e87c)
>>> 0x00007fdf2000e77c:   89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00
>>> 0x00007fdf2000e78c:   00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80
>>> 0x00007fdf2000e79c:   78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42
>>> 0x00007fdf2000e7ac:   01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48
>>> 0x00007fdf2000e7bc:   89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43
>>> 0x00007fdf2000e7cc:   3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43
>>> 0x00007fdf2000e7dc:   30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff
>>> 0x00007fdf2000e7ec:   ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e
>>> 0x00007fdf2000e7fc:   41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2
>>> 0x00007fdf2000e80c:   ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90
>>> 0x00007fdf2000e81c:   0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76
>>> 0x00007fdf2000e82c:   1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48
>>> 0x00007fdf2000e83c:   83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9
>>> 0x00007fdf2000e84c:   31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56
>>> 0x00007fdf2000e85c:   41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b
>>> 0x00007fdf2000e86c:   a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08
>>> 0x00007fdf2000e87c:   41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0
>>> 0x00007fdf2000e88c:   75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d
>>> 0x00007fdf2000e89c:   8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc
>>> 0x00007fdf2000e8ac:   48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03
>>> 0x00007fdf2000e8bc:   b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c
>>> 0x00007fdf2000e8cc:   39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8
>>> 0x00007fdf2000e8dc:   90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01
>>> 0x00007fdf2000e8ec:   0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48
>>> 0x00007fdf2000e8fc:   83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48
>>> 0x00007fdf2000e90c:   09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea
>>> 0x00007fdf2000e91c:   4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b
>>> 0x00007fdf2000e92c:   31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00
>>> 0x00007fdf2000e93c:   00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00
>>> 0x00007fdf2000e94c:   00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff
>>> 0x00007fdf2000e95c:   48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89
>>> 0x00007fdf2000e96c:   8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff
>>>
>>>
>>> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote:
>>>> Hi Kevin,
>>>>
>>>> I saw 2 errors on the submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
>>>> So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).
>>>>
>>>> The last change on the submit repo is here:
>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/
>>>>
>>>> Can you share the details from the submit repo?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/11 11:07, Yasumasa Suenaga wrote:
>>>>> Hi Kevin,
>>>>>
>>>>> I guess the first program header in the libraries on your machine has the exec flag set (you can check it with `readelf -l`).
>>>>> So I tweaked my patch in a new webrev (the initial values of exec_start and exec_end are now set to -1).
>>>>>
>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
>>>>>
>>>>> This webrev also contains the fixes for your comments (the typo and DW_CFA_advance_loc4).
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/11 8:53, Kevin Walls wrote:
>>>>>> Hi -
>>>>>>
>>>>>> In testing I wasn't seeing any of the Dwarf code triggered.
>>>>>>
>>>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries...
>>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>>>>>
>>>>>>     if (fill_instr_info(newlib)) {
>>>>>>       if (!read_eh_frame(ph, newlib)) {
>>>>>>
>>>>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>>>>
>>>>>> output like:
>>>>>>
>>>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>>>>> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>>>>
>>>>>> (similar for all libraries).
>>>>>>
>>>>>> fill_instr_info fails if:
>>>>>>
>>>>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>>>>
>>>>>> ...but isn't exec_start relative to the library address? It's the value of ph->p_vaddr and it is often zero.
>>>>>>
>>>>>> I added some booleans and did:
>>>>>>
>>>>>>       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>>>>         lib->exec_start = ph->p_vaddr;
>>>>>>         found_start = true;
>>>>>>       }
>>>>>>
>>>>>> (similarly for end) and only failed if:
>>>>>>
>>>>>>   if (!found_start || !found_end) {
>>>>>>     return false;
>>>>>>
>>>>>> ...and now it's better. I go from:
>>>>>>
>>>>>> ----------------- 3306 -----------------
>>>>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>>>
>>>>>> to:
>>>>>>
>>>>>> ----------------- 31127 -----------------
>>>>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>>>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>>>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>>>>> 0x000055af1b78db1c      main + 0x11c
>>>>>>
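The sentinel problem described above can be shown in isolation. The following is only a sketch under stated assumptions, not the libsaproc code: `seg_t`, `exec_range_t` and `find_exec_range` are hypothetical names, and the `executable` field stands in for a program header that has the execute flag set. It records whether any executable segment was seen in a separate flag, so a segment whose vaddr is 0 is not mistaken for "not found".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few program-header fields needed here. */
typedef struct {
  uint64_t vaddr;       /* segment start, relative to the library base */
  uint64_t memsz;       /* segment size in memory */
  bool     executable;  /* true if the segment has the execute flag */
} seg_t;

typedef struct {
  uint64_t start;
  uint64_t end;
} exec_range_t;

/* Computes the lowest start and highest end over all executable segments.
 * Returns false only when no executable segment exists; a start of 0 is
 * a perfectly valid result. */
bool find_exec_range(const seg_t *segs, size_t n, exec_range_t *out) {
  assert(out != NULL);
  bool found = false;
  for (size_t i = 0; i < n; i++) {
    if (!segs[i].executable) {
      continue;
    }
    uint64_t start = segs[i].vaddr;
    uint64_t end = segs[i].vaddr + segs[i].memsz;
    if (!found || start < out->start) out->start = start;
    if (!found || end > out->end) out->end = end;
    found = true;
  }
  return found;
}
```

With the values from the debug output above (vaddr = 0x0, memsz = 0xaba4), this reports the range 0x0..0xaba4 instead of failing.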
>>>>>>
>>>>>> Thanks
>>>>>> Kevin
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>>>
>>>>>>> Hi Kevin,
>>>>>>>
>>>>>>> Thanks for your comment!
>>>>>>>
>>>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>>>> Hi Yasumasa ,
>>>>>>>>
>>>>>>>> The changes build OK for me in the latest jdk, and things still work.
>>>>>>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
>>>>>>>>
>>>>>>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.
>>>>>>>
>>>>>>> You can reproduce the problem with JShell.
>>>>>>> Some Java frames are missing from the mixed-mode jstack output.
>>>>>>>
>>>>>>>
>>>>>>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
>>>>>>>>
>>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>>>
>>>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>>>> (It may "never" happen, but a nop could appear within some other instructions?)
>>>>>>>
>>>>>>> DW_CFA_nop is used for padding, so we can ignore it (return immediately).
>>>>>>>
>>>>>>>
>>>>>>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".
>>>>>>>
>>>>>>> I will fix it.
>>>>>>>
>>>>>>>
>>>>>>>> We handle DW_CFA_advance_loc, _loc1 and _loc2, but not DW_CFA_advance_loc4.  I thought that was odd, but maybe addresses in these tables never increase by amounts that need 4 bytes; that would mean a lot of code on one line. 8-)
>>>>>>>> So maybe it's never used in practice. If you think it's unnecessary, no problem; maybe add a comment, or add it for robustness.
>>>>>>>
>>>>>>> I will add DW_CFA_advance_loc4.
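For reference, the four advance_loc encodings differ only in where the delta is stored. The sketch below is not taken from dwarf.cpp: `decode_advance_loc` and `read_le` are hypothetical helpers, and it assumes little-endian operands as on x86-64. The opcode values are the ones defined by the DWARF specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
  DW_CFA_advance_loc  = 0x40,  /* high two bits; low six bits hold the delta */
  DW_CFA_nop          = 0x00,
  DW_CFA_advance_loc1 = 0x02,  /* 1-byte delta operand */
  DW_CFA_advance_loc2 = 0x03,  /* 2-byte delta operand */
  DW_CFA_advance_loc4 = 0x04   /* 4-byte delta operand */
};

/* Reads an n-byte little-endian unsigned value and advances the cursor. */
static uint64_t read_le(const uint8_t **p, int n) {
  uint64_t v = 0;
  for (int i = 0; i < n; i++) {
    v |= (uint64_t)(*p)[i] << (8 * i);
  }
  *p += n;
  return v;
}

/* If the instruction at *p belongs to the advance_loc family, stores the
 * location delta (already scaled by code_align) in *delta, moves *p past
 * the instruction and returns true. Otherwise leaves *p alone and returns
 * false. */
bool decode_advance_loc(const uint8_t **p, const uint8_t *end,
                        uint64_t code_align, uint64_t *delta) {
  assert(p != NULL && *p != NULL && delta != NULL);
  if (*p >= end) {
    return false;
  }
  uint8_t op = **p;
  if ((op & 0xc0) == DW_CFA_advance_loc) {
    *delta = (uint64_t)(op & 0x3f) * code_align;
    *p += 1;
    return true;
  }
  int width;
  switch (op) {
    case DW_CFA_advance_loc1: width = 1; break;
    case DW_CFA_advance_loc2: width = 2; break;
    case DW_CFA_advance_loc4: width = 4; break;
    default: return false;  /* DW_CFA_nop and everything else */
  }
  if ((size_t)(end - *p) < (size_t)width + 1) {
    return false;  /* truncated operand */
  }
  *p += 1;
  *delta = read_le(p, width) * code_align;
  return true;
}
```

Handling the 4-byte form costs one extra case, which is why adding it for robustness is cheap even if compilers rarely emit it.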
>>>>>>>
>>>>>>>
>>>>>>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>>>>>>>>
>>>>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
>>>>>>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>>>>>>>>
>>>>>>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.
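To illustrate the alternative mentioned above (passing the current position explicitly instead of updating a member such as _buf), here is a minimal sketch of LEB128 readers that take a caller-owned cursor. The names `read_uleb128` and `read_sleb128` and their signatures are assumptions for illustration; they are not the functions in the webrev.

```c
#include <assert.h>
#include <stdint.h>

/* Decodes an unsigned LEB128 value starting at *cursor and advances
 * *cursor past it. Decoding stops at `end` if the input is truncated. */
uint64_t read_uleb128(const uint8_t **cursor, const uint8_t *end) {
  assert(cursor != 0 && *cursor != 0);
  uint64_t result = 0;
  unsigned shift = 0;
  while (*cursor < end) {
    uint8_t byte = *(*cursor)++;
    if (shift < 64) {
      result |= (uint64_t)(byte & 0x7f) << shift;
    }
    shift += 7;
    if ((byte & 0x80) == 0) {
      break;  /* continuation bit clear: last byte */
    }
  }
  return result;
}

/* Signed variant: same loop, then sign-extend from bit 6 of the last byte. */
int64_t read_sleb128(const uint8_t **cursor, const uint8_t *end) {
  assert(cursor != 0 && *cursor != 0);
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte = 0;
  while (*cursor < end) {
    byte = *(*cursor)++;
    if (shift < 64) {
      result |= (uint64_t)(byte & 0x7f) << shift;
    }
    shift += 7;
    if ((byte & 0x80) == 0) {
      break;
    }
  }
  if (shift < 64 && (byte & 0x40)) {
    result |= ~(uint64_t)0 << shift;
  }
  return (int64_t)result;
}
```

Because the cursor belongs to the caller, a routine that needs to peek at a CIE and then continue from the original position can simply use a second local pointer rather than saving and restoring shared state.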
>>>>>>>
>>>>>>> I referred to the GDB and binutils sources when creating this patch.
>>>>>>> They seem to process the data in a similar way, because we need to evaluate the DWARF instructions one by one to get the values that relate to the specified PC.
>>>>>>>
>>>>>>>
>>>>>>>> Similarly, in future, if this DWARF support code becomes more widely used, it might want to move to an
>>>>>>>> OS-neutral directory?  It's odd to label it as Linux-specific.
>>>>>>>
>>>>>>> Windows, at least, does not use DWARF; it uses another mechanism.
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>>>> If they do, I can move the DWARF-related code to the posix directory.
>>>>>>>
>>>>>>>
>>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>>>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>>>>>>>
>>>>>>>>
>>>>>>>> These are just comments to mainly say it looks good, and somebody else out there has read it.
>>>>>>>> I will look for a system that shows the problem, and get back to you again!
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>> Many thanks
>>>>>>>> Kevin
>>>>>>>>
>>>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> webrev.03 cannot be applied to the current jdk/jdk because of the 8239224 and 8239462 changes (they updated the copyright year).
>>>>>>>>> So I modified the webrev (copyright year changes only) so that it applies to the current jdk/jdk.
>>>>>>>>> Could you review it?
>>>>>>>>>
>>>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>>>>
>>>>>>>>> I need one more reviewer to push.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>
>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>
>>>>>>>>>> This change has been already reviewed by Serguei.
>>>>>>>>>> I need one more reviewer to push.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>>>>> PING: Could you review this change?
>>>>>>>>>>>
>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>>
>>>>>>>>>>> I believe this change helps troubleshooters with postmortem analysis.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>>>
>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>>>
>>>>>>>>>>>> I updated the webrev. I discussed it with Serguei off-list and refactored webrev.02.
>>>>>>>>>>>> It has passed the tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for your comment!
>>>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in a new webrev.
>>>>>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry said.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>>>>
>>>>>>>>>>>>> This change has passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is a nice move in general.
>>>>>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>>     if (libptr == 0L) { // Java frame
>>>>>>>>>>>>>>       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>>>       if (rbp == null) {
>>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>>>     } else { // Native frame
>>>>>>>>>>>>>>       DwarfParser dwarf;
>>>>>>>>>>>>>>       try {
>>>>>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>       } catch (DebuggerException e) {
>>>>>>>>>>>>>>         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>>>         if (rbp == null) {
>>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>       dwarf.processDwarf(pc);
>>>>>>>>>>>>>>       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>>>                      !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>>>                         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>>>                         : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>>>                                  .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>>>       if (cfa == null) {
>>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'd suggest simplifying the logic by refactoring it to something like the below:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>>>>>>     DwarfParser dwarf = null;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>>       try {
>>>>>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>         dwarf.processDwarf(pc);
>>>>>>>>>>>>>>         // assign to the existing cfa; do not declare a second variable
>>>>>>>>>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>>>                   ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>>>                   : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>>>                            .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     if (cfa == null) {
>>>>>>>>>>>>>>       return null;
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    It feels like the logic should be refactored/simplified somehow, as
>>>>>>>>>>>>>>    several similar fragments appear in slightly different contexts.
>>>>>>>>>>>>>>    But it is not easy to understand what it does.
>>>>>>>>>>>>>>    Could you please add some comments at key places explaining this logic?
>>>>>>>>>>>>>>    Then I'll check whether it is possible to make it a little simpler.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>>>       Address nextCFA;
>>>>>>>>>>>>>>       Address nextPC;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       nextPC = getNextPC(false);
>>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>         } catch (DebuggerException e) {
>>>>>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>>       return (nextCFA == null) ? null
>>>>>>>>>>>>>>                                : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   The above can be simplified if a DebuggerException cannot be thrown from processDwarf(nextPC):
>>>>>>>>>>>>>>       private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>>           try {
>>>>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>>           }
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>>>>>>>>         return javaSender(context);
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       Address nextCFA;
>>>>>>>>>>>>>>       DwarfParser nextDwarf = dwarf;
>>>>>>>>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>>         if (libptr == 0L) {
>>>>>>>>>>>>>>           // Next frame might be Java frame
>>>>>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>         } catch (DebuggerException e) {
>>>>>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   This one can also be simplified a little:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         if (dwarf == null) { // Java frame
>>>>>>>>>>>>>>           return javaSender(context);
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         Address nextPC = getNextPC(true);
>>>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>>>         if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>>>             try {
>>>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>           }
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>>>             try {
>>>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>           }
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java. It works fine in the serviceability/sa tests and
>>>>>>>>>>>>>>> in all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>>>> Could you review the new webrev?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> According to 2.7 "Stack Unwind Algorithm" in the System V Application Binary Interface AMD64
>>>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As JDK-8022183 says, omit-frame-pointer has been enabled by default since GCC 4.6, so system
>>>>>>>>>>>>>>>> libraries (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not use DWARF; it relies on the base pointer register (RBP).
>>>>>>>>>>>>>>>> So stack frames might be missing from its output.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
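As a rough illustration of why DWARF helps here: once the call frame information has been evaluated for a PC, the unwinder knows a CFA rule (a register plus an offset) and where the return address sits relative to the CFA, so it does not need RBP to be a frame pointer. The sketch below assumes x86-64 DWARF register numbering (RBP = 6, RSP = 7); `unwind_row_t`, `compute_cfa` and `return_address_slot` are hypothetical names for illustration, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* DWARF register numbers on x86-64 (only the ones used here). */
enum { DWARF_REG_RBP = 6, DWARF_REG_RSP = 7, DWARF_REG_COUNT = 17 };

/* Hypothetical result of evaluating the CFI for one PC. */
typedef struct {
  int     cfa_reg;     /* register the CFA is computed from */
  int64_t cfa_offset;  /* CFA = value(cfa_reg) + cfa_offset */
  int64_t ra_offset;   /* return address is stored at CFA + ra_offset */
} unwind_row_t;

/* Canonical frame address for the current frame. */
uint64_t compute_cfa(const unwind_row_t *row,
                     const uint64_t regs[DWARF_REG_COUNT]) {
  assert(row->cfa_reg >= 0 && row->cfa_reg < DWARF_REG_COUNT);
  return regs[row->cfa_reg] + (uint64_t)row->cfa_offset;
}

/* Address of the stack slot that holds the caller's PC. */
uint64_t return_address_slot(const unwind_row_t *row, uint64_t cfa) {
  return cfa + (uint64_t)row->ra_offset;
}
```

For a function compiled without a frame pointer, a row such as `{DWARF_REG_RSP, 24, -8}` describes a frame whose CFA is RSP + 24 with the return address 8 bytes below it; an RBP-based walk has no way to recover that.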
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>

From suenaga at oss.nttdata.com  Wed Mar 11 09:49:12 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Mar 2020 18:49:12 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
 <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>
 <e8a16c8e-ff00-6985-dc3d-9450c6b489df@oss.nttdata.com>
 <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
Message-ID: <bfd14ee6-61b1-35e4-786c-242f65d00b7a@oss.nttdata.com>

Hi,

Thanks, David and Ioi, for sharing the status.
I've fixed the problem in a new webrev (mach5-one-ysuenaga-JDK-8234624-6-20200311-0827-9367344):

   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.06/

Diff from webrev.05 is here:

   http://hg.openjdk.java.net/jdk/submit/rev/e3d12785f087


Thanks,

Yasumasa


On 2020/03/11 14:59, David Holmes wrote:
> Hi Yasumasa,
> 
> Partial hs_err info below.
> 
> David
> -----
> 
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800
> #
> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-11-0447267.suenaga.source)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
> # Problematic frame:
> # C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
> #
> # Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798)
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> 
> ---------------  S U M M A R Y ------------
> 
> Command Line: 
> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar 
> -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug -Xms8m -Djdk.module.main=jdk.hotspot.agent jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770
> 
> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 0m 3s)
> 
> ---------------  T H R E A D  ---------------
> 
> Current thread (0x00007fdf5c032000):  JavaThread "main" [_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)]
> 
> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000],  sp=0x00007fdf63b9d190, free space=1020k
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0xc2291c]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac
> V  [libjvm.so+0xd31970]  jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370
> V  [libjvm.so+0xd36202]  jni_CallStaticVoidMethod+0x222
> C  [libjli.so+0x4bed]  JavaMain+0xbcd
> C  [libjli.so+0x80a9]  ThreadJavaMain+0x9
> 
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
> v  ~StubRoutines::call_stub
> 
> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79
> 
> Register to memory mapping:
> 
> RAX=0x00007f7e4dfe3229 is an unknown value
> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62
> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00
> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000
> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000
> RSI=0x0000000000000004 is an unknown value
> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00
> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01
> R10=0x00000000ffffffff is an unknown value
> R11=0x000000000100527a is an unknown value
> R12=0x00007fded5076b79 is an unknown value
> R13=0x00007f7da2f8e68a is an unknown value
> R14=0x00007f7dbdf62b1d is an unknown value
> R15=0x00007fdf5c032000 is a thread
> 
> 
> Registers:
> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85
> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080
> R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a
> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000
> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
>   TRAPNO=0x000000000000000e
> 
> Top of Stack: (sp=0x00007fdf63b9d190)
> 0x00007fdf63b9d190:   00007fdf209d0980 0000000000000000
> 0x00007fdf63b9d1a0:   00007fdf209d0980 00007fdf63b9d258
> 0x00007fdf63b9d1b0:   00007fdf63b9d228 00007fdf44778dbe
> 0x00007fdf63b9d1c0:   000000000146c380 00007fdf5c032000
> 
> Instructions: (pc=0x00007fdf2000e87c)
> 0x00007fdf2000e77c:   89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00
> 0x00007fdf2000e78c:   00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80
> 0x00007fdf2000e79c:   78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42
> 0x00007fdf2000e7ac:   01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48
> 0x00007fdf2000e7bc:   89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43
> 0x00007fdf2000e7cc:   3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43
> 0x00007fdf2000e7dc:   30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff
> 0x00007fdf2000e7ec:   ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e
> 0x00007fdf2000e7fc:   41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2
> 0x00007fdf2000e80c:   ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90
> 0x00007fdf2000e81c:   0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76
> 0x00007fdf2000e82c:   1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48
> 0x00007fdf2000e83c:   83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9
> 0x00007fdf2000e84c:   31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56
> 0x00007fdf2000e85c:   41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b
> 0x00007fdf2000e86c:   a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08
> 0x00007fdf2000e87c:   41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0
> 0x00007fdf2000e88c:   75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d
> 0x00007fdf2000e89c:   8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc
> 0x00007fdf2000e8ac:   48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03
> 0x00007fdf2000e8bc:   b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c
> 0x00007fdf2000e8cc:   39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8
> 0x00007fdf2000e8dc:   90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01
> 0x00007fdf2000e8ec:   0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48
> 0x00007fdf2000e8fc:   83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48
> 0x00007fdf2000e90c:   09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea
> 0x00007fdf2000e91c:   4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b
> 0x00007fdf2000e92c:   31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00
> 0x00007fdf2000e93c:   00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00
> 0x00007fdf2000e94c:   00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff
> 0x00007fdf2000e95c:   48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89
> 0x00007fdf2000e96c:   8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff
> 
> 
> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote:
>> Hi Kevin,
>>
>> I saw 2 errors on submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
>> So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).
>>
>>    Last change on submit repo is here:
>>      http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/
>>
>> Can you share details on submit repo?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/11 11:07, Yasumasa Suenaga wrote:
>>> Hi Kevin,
>>>
>>> I guess the first program header in the libraries on your machine has the exec flag set (you can check it with `readelf -l`).
>>> So I tweaked my patch in a new webrev: the initial values of exec_start and exec_end are now set to -1.
>>>
>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
>>>
>>> This webrev also contains the fixes for your comments (typo and DW_CFA_advance_loc4).
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/11 8:53, Kevin Walls wrote:
>>>> Hi -
>>>>
>>>> In testing I wasn't seeing any of the Dwarf code triggered.
>>>>
>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries...
>>>>
>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>>>
>>>>     if (fill_instr_info(newlib)) {
>>>>       if (!read_eh_frame(ph, newlib)) {
>>>>
>>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>>
>>>> output like:
>>>>
>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>>> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>>
>>>> (similar for all libraries).
>>>>
>>>> fill_instr_info fails if:
>>>>
>>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>>
>>>> ...but isn't exec_start relative to the library address? It's the value of ph->vaddr and it is often zero.
>>>>
>>>> I added some booleans and did:
>>>>
>>>> 185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>> 186         lib->exec_start = ph->p_vaddr;
>>>> 187         found_start = true;
>>>> 188       }
>>>>
>>>> (similarly for end) and only failed if:
>>>>
>>>> 201   if (!found_start || !found_end) {
>>>> 202     return false;
>>>>
>>>> ...and now it's better. I go from:
>>>>
>>>> ----------------- 3306 -----------------
>>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>
>>>> to:
>>>>
>>>> ----------------- 31127 -----------------
>>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>>> 0x000055af1b78db1c      main + 0x11c
>>>>
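
The failure described above comes from using 0 as the "not found" marker for exec_start, although 0 is a legitimate p_vaddr. A minimal sketch of the alternative, tracking "found" separately from the address, could look like this. It is not the code from any of the webrevs: segment_t, exec_range_t and find_exec_range() are hypothetical names, and the `executable` field merely stands in for a PF_X test on a real ELF program header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ELF program header. */
typedef struct {
  bool      executable;  /* stands in for (p_flags & PF_X) */
  uintptr_t vaddr;       /* stands in for p_vaddr */
  uintptr_t memsz;       /* stands in for p_memsz */
} segment_t;

typedef struct {
  uintptr_t exec_start;
  uintptr_t exec_end;
} exec_range_t;

/*
 * Computes the smallest range covering all executable segments.
 * Returns false only when no executable segment exists, so a start
 * address of 0 is still reported as a valid result.
 */
bool find_exec_range(const segment_t *segs, size_t count, exec_range_t *out) {
  assert(out != NULL);
  bool found = false;
  for (size_t i = 0; i < count; i++) {
    if (!segs[i].executable) {
      continue;
    }
    uintptr_t start = segs[i].vaddr;
    uintptr_t end = segs[i].vaddr + segs[i].memsz;
    if (!found || start < out->exec_start) {
      out->exec_start = start;
    }
    if (!found || end > out->exec_end) {
      out->exec_end = end;
    }
    found = true;
  }
  return found;
}
```

If the segment from the debug output above (vaddr = 0x0, memsz = 0xaba4) is assumed to be executable, this yields the range 0x0 to 0xaba4 instead of reporting a missing executable section.
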
>>>>
>>>> Thanks
>>>> Kevin
>>>>
>>>>
>>>>
>>>>
>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>
>>>>> Hi Kevin,
>>>>>
>>>>> Thanks for your comment!
>>>>>
>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>> Hi Yasumasa ,
>>>>>>
>>>>>> The changes build OK for me in the latest jdk, and things still work.
>>>>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
>>>>>>
>>>>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.
>>>>>
>>>>> You can see the problem with JShell.
>>>>> Some Java frames would not be seen in mixed jstack.
>>>>>
>>>>>
>>>>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
>>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>
>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>> (It may "never" happen, but a nop could appear within some other instructions?)
>>>>>
>>>>> DW_CFA_nop is used for padding, so we can ignore it (return immediately).
>>>>>
>>>>>
>>>>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".
>>>>>
>>>>> I will fix it.
>>>>>
>>>>>
>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts; that would mean a lot of code on one line. 8-)
>>>>>> So maybe it's never used in practice. If you think it's unnecessary, no problem: maybe add a comment, or add it for robustness.
>>>>>
>>>>> I will add DW_CFA_advance_loc4.
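
For reference, the fixed-width advance forms differ only in operand size, so covering DW_CFA_advance_loc4 next to the others costs very little. The following is a sketch, not the dwarf.cpp code: decode_advance_loc() is a hypothetical helper, the opcode values are the ones defined by the DWARF specification, and operands are assumed to be little-endian as on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
  DW_CFA_nop          = 0x00,
  DW_CFA_advance_loc1 = 0x02,  /* 1-byte delta operand */
  DW_CFA_advance_loc2 = 0x03,  /* 2-byte delta operand */
  DW_CFA_advance_loc4 = 0x04,  /* 4-byte delta operand */
  DW_CFA_advance_loc  = 0x40   /* high two bits; delta in the low six bits */
};

/*
 * If the instruction at p is one of the advance_loc forms, stores the
 * location delta (already scaled by code_align) in *delta and returns the
 * number of bytes consumed. Returns 0 for any other opcode or when the
 * operand would run past the `avail` bytes that are readable.
 */
size_t decode_advance_loc(const uint8_t *p, size_t avail,
                          uint64_t code_align, uint64_t *delta) {
  assert(p != NULL && delta != NULL);
  if (avail == 0) {
    return 0;
  }
  uint8_t op = p[0];
  if ((op & 0xc0) == DW_CFA_advance_loc) {
    *delta = (uint64_t)(op & 0x3f) * code_align;
    return 1;
  }
  size_t width;
  switch (op) {
    case DW_CFA_advance_loc1: width = 1; break;
    case DW_CFA_advance_loc2: width = 2; break;
    case DW_CFA_advance_loc4: width = 4; break;
    default: return 0;  /* not an advance_loc instruction */
  }
  if (avail < 1 + width) {
    return 0;
  }
  uint64_t raw = 0;
  for (size_t i = 0; i < width; i++) {
    raw |= (uint64_t)p[1 + i] << (8 * i);  /* little-endian assumption */
  }
  *delta = raw * code_align;
  return 1 + width;
}
```

The 4-byte form is only needed when two consecutive rows of the unwind table are more than 65535 code-alignment units apart, which is why it rarely shows up.
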
>>>>>
>>>>>
>>>>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>>>>>>
>>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
>>>>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>>>>>>
>>>>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.
>>>>>
>>>>> I looked at the GDB and binutils sources when creating this patch.
>>>>> They seem to process it in a similar way, because the DWARF instructions have to be evaluated one by one to get the value that relates to the specified PC.
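
To illustrate the point about passing the current position around instead of updating a shared _buf member, here is a small sketch of an unsigned LEB128 reader that takes an explicit cursor. It is not the read_leb128() from the patch: read_uleb128() and its parameters are hypothetical, and the bounds check against `end` is an addition made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decodes one unsigned LEB128 value starting at *cursor without reading
 * at or beyond `end`. On success stores the value, advances *cursor past
 * the encoded bytes and returns true. On a truncated encoding returns
 * false and leaves *cursor untouched.
 */
bool read_uleb128(const uint8_t **cursor, const uint8_t *end, uint64_t *value) {
  assert(cursor != NULL && *cursor != NULL && value != NULL);
  const uint8_t *p = *cursor;
  uint64_t result = 0;
  unsigned shift = 0;
  while (p < end) {
    uint8_t byte = *p++;
    if (shift < 64) {
      result |= (uint64_t)(byte & 0x7f) << shift;  /* low 7 bits carry data */
    }
    shift += 7;
    if ((byte & 0x80) == 0) {  /* high bit clear marks the last byte */
      *cursor = p;
      *value = result;
      return true;
    }
  }
  return false;  /* ran off the buffer before the terminating byte */
}
```

Because the caller owns the cursor, a routine such as the CIE processing can read ahead with a local copy and simply discard it, rather than saving and restoring a member pointer.
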
>>>>>
>>>>>
>>>>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
>>>>>> OS-neutral directory? It's odd to label it as Linux-specific.
>>>>>
>>>>> Windows at least does not use DWARF; it uses a different mechanism:
>>>>>
>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>> If they do, I can move the DWARF-related code to a posix directory.
>>>>>
>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>>>>>
>>>>>>
>>>>>> These are just comments to mainly say it looks good, and somebody else out there has read it.
>>>>>> I will look for a system that shows the problem, and get back to you again!
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> Many thanks
>>>>>> Kevin
>>>>>>
>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year).
>>>>>>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk.
>>>>>>> Could you review it?
>>>>>>>
>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>>
>>>>>>> I need one more reviewer to push.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>>> PING: Could you review it?
>>>>>>>>
>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>
>>>>>>>> This change has already been reviewed by Serguei.
>>>>>>>> I need one more reviewer to push.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>>> PING: Could you review this change?
>>>>>>>>>
>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>
>>>>>>>>> I believe this change helps troubleshooters with postmortem analysis.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>
>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>
>>>>>>>>>> I updated the webrev: after discussing with Serguei off-list, I refactored webrev.02.
>>>>>>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your comment!
>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new webrev.
>>>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry pointed out.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> This is a nice move in general.
>>>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>>>>>>
>>>>>>>>>>>>  96     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>  97     if (libptr == 0L) { // Java frame
>>>>>>>>>>>>  98       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>  99       if (rbp == null) {
>>>>>>>>>>>> 100         return null;
>>>>>>>>>>>> 101       }
>>>>>>>>>>>> 102       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>> 103     } else { // Native frame
>>>>>>>>>>>> 104       DwarfParser dwarf;
>>>>>>>>>>>> 105       try {
>>>>>>>>>>>> 106         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>> 107       } catch (DebuggerException e) {
>>>>>>>>>>>> 108         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>> 109         if (rbp == null) {
>>>>>>>>>>>> 110           return null;
>>>>>>>>>>>> 111         }
>>>>>>>>>>>> 112         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>> 113       }
>>>>>>>>>>>> 114       dwarf.processDwarf(pc);
>>>>>>>>>>>> 115       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>> 116                      !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>> 117         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>> 118         : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>> 119                  .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>> 120       if (cfa == null) {
>>>>>>>>>>>> 121         return null;
>>>>>>>>>>>> 122       }
>>>>>>>>>>>> 123       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>> 124     }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I'd suggest simplifying the logic by refactoring to something like below:
>>>>>>>>>>>>
>>>>>>>>>>>>            long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>            Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>>>>            DwarfParser dwarf = null;
>>>>>>>>>>>>
>>>>>>>>>>>>            if (libptr != 0L) { // Native frame
>>>>>>>>>>>>              try {
>>>>>>>>>>>>                dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>                dwarf.processDwarf(pc);
>>>>>>>>>>>>                cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>                       !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>                  ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>                  : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>                           .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>              } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>>>              }
>>>>>>>>>>>>            }
>>>>>>>>>>>>            if (cfa == null) {
>>>>>>>>>>>>              return null;
>>>>>>>>>>>>            }
>>>>>>>>>>>>            return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>>>>
>>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>>>
>>>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>>>
>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>>>
>>>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>>>
>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>>>
>>>>>>>>>>>>    It feels like the logic should be refactored/simplified somehow, as
>>>>>>>>>>>>    several typical fragments appear in slightly different contexts.
>>>>>>>>>>>>    But it is not easy to understand what it is doing.
>>>>>>>>>>>>    Could you please add some comments at key places explaining this logic?
>>>>>>>>>>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>>>>>>>>>>
>>>>>>>>>>>> 109   private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>> 110     Address nextCFA;
>>>>>>>>>>>> 111     Address nextPC;
>>>>>>>>>>>> 112
>>>>>>>>>>>> 113     nextPC = getNextPC(false);
>>>>>>>>>>>> 114     if (nextPC == null) {
>>>>>>>>>>>> 115       return null;
>>>>>>>>>>>> 116     }
>>>>>>>>>>>> 117
>>>>>>>>>>>> 118     DwarfParser nextDwarf = null;
>>>>>>>>>>>> 119     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>> 120     if (libptr != 0L) { // Native frame
>>>>>>>>>>>> 121       try {
>>>>>>>>>>>> 122         nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>> 123       } catch (DebuggerException e) {
>>>>>>>>>>>> 124         nextCFA = getNextCFA(null, context);
>>>>>>>>>>>> 125         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>> 126       }
>>>>>>>>>>>> 127       nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>> 128     }
>>>>>>>>>>>> 129
>>>>>>>>>>>> 130     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>> 131     return (nextCFA == null) ? null
>>>>>>>>>>>> 132                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>> 133   }
>>>>>>>>>>>>
>>>>>>>>>>>>   The above can be simplified if a DebuggerException cannot be thrown from processDwarf(nextPC):
>>>>>>>>>>>>       private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>           return null;
>>>>>>>>>>>>         }
>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>
>>>>>>>>>>>>         if (libptr != 0L) { // Native frame
>>>>>>>>>>>>           try {
>>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>           }
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>> 135   public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>> 136     ThreadContext context = thread.getContext();
>>>>>>>>>>>> 137
>>>>>>>>>>>> 138     if (dwarf == null) { // Java frame
>>>>>>>>>>>> 139       return javaSender(context);
>>>>>>>>>>>> 140     }
>>>>>>>>>>>> 141
>>>>>>>>>>>> 142     Address nextPC = getNextPC(true);
>>>>>>>>>>>> 143     if (nextPC == null) {
>>>>>>>>>>>> 144       return null;
>>>>>>>>>>>> 145     }
>>>>>>>>>>>> 146
>>>>>>>>>>>> 147     Address nextCFA;
>>>>>>>>>>>> 148     DwarfParser nextDwarf = dwarf;
>>>>>>>>>>>> 149     if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>> 150       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>> 151       if (libptr == 0L) {
>>>>>>>>>>>> 152         // Next frame might be Java frame
>>>>>>>>>>>> 153         nextCFA = getNextCFA(null, context);
>>>>>>>>>>>> 154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>> 155       }
>>>>>>>>>>>> 156       try {
>>>>>>>>>>>> 157         nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>> 158       } catch (DebuggerException e) {
>>>>>>>>>>>> 159         nextCFA = getNextCFA(null, context);
>>>>>>>>>>>> 160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>> 161       }
>>>>>>>>>>>> 162     }
>>>>>>>>>>>> 163
>>>>>>>>>>>> 164     nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>> 165     nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>> 166     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>> 167   }
>>>>>>>>>>>>
>>>>>>>>>>>>   This one can also be simplified a little:
>>>>>>>>>>>>
>>>>>>>>>>>>       public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>
>>>>>>>>>>>>         if (dwarf == null) { // Java frame
>>>>>>>>>>>>           return javaSender(context);
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextPC = getNextPC(true);
>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>           return null;
>>>>>>>>>>>>         }
>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>         if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>             try {
>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>             }
>>>>>>>>>>>>           }
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>>>>>
>>>>>>>>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>           return null;
>>>>>>>>>>>>         }
>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>
>>>>>>>>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>             try {
>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>             }
>>>>>>>>>>>>           }
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and
>>>>>>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>> Could you review new webrev?
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer has been enabled by default since GCC 4.6, so system
>>>>>>>>>>>>>> libraries (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not use DWARF; it relies on the base pointer register (RBP).
>>>>>>>>>>>>>> So some stack frames might be missing from its output.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>

From kevin.walls at oracle.com  Wed Mar 11 15:31:31 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Wed, 11 Mar 2020 15:31:31 +0000
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <bfd14ee6-61b1-35e4-786c-242f65d00b7a@oss.nttdata.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
 <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>
 <e8a16c8e-ff00-6985-dc3d-9450c6b489df@oss.nttdata.com>
 <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
 <bfd14ee6-61b1-35e4-786c-242f65d00b7a@oss.nttdata.com>
Message-ID: <aa7b4d1b-7daf-2d3a-c6b1-9be34d2b1f07@oracle.com>

Hi -

OK great, it checks for a zero-length dwarf entry.

I did a rebuild and tested it locally; it works here, so all looks good 
to me.
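
For anyone reading the archive later: the zero-length check matters because a record with length 0 acts as the terminator of .eh_frame, so a walker that ignores it keeps reading whatever bytes follow. A minimal sketch of such a walk is below. It is not the code from webrev.06: count_eh_frame_entries() is a hypothetical helper, lengths are read in host byte order, and the bounds checks are my own addition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Counts the CIE/FDE records in an .eh_frame-style buffer. Walking stops
 * at the zero-length terminator, or when a record would extend past the
 * end of the buffer (truncated or corrupt data).
 */
size_t count_eh_frame_entries(const uint8_t *buf, size_t size) {
  assert(buf != NULL || size == 0);
  size_t pos = 0;
  size_t count = 0;
  while (size - pos >= 4) {
    uint32_t len32;
    memcpy(&len32, buf + pos, sizeof len32);  /* host byte order assumed */
    pos += 4;
    if (len32 == 0) {
      break;  /* zero-length entry: end of the section */
    }
    uint64_t length = len32;
    if (len32 == 0xffffffffu) {  /* escape value: 64-bit length follows */
      if (size - pos < 8) {
        break;
      }
      memcpy(&length, buf + pos, sizeof length);
      pos += 8;
    }
    if (length > size - pos) {
      break;  /* record claims more bytes than are available */
    }
    pos += (size_t)length;
    count++;
  }
  return count;
}
```
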

We may in future want to work on 
src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/amd64/BsdAMD64CFrame.java 
to use Dwarf similarly, that's why I mentioned the platform-neutral 
directory name, but I have no issue with that happening in the future.

Thanks
Kevin


On 11/03/2020 09:49, Yasumasa Suenaga wrote:
> Hi,
>
> Thanks David and Ioi for sharing the status.
> I've fixed the problem in a new webrev 
> (mach5-one-ysuenaga-JDK-8234624-6-20200311-0827-9367344):
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.06/
>
> Diff from webrev.05 is here:
>
>   http://hg.openjdk.java.net/jdk/submit/rev/e3d12785f087
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/03/11 14:59, David Holmes wrote:
>> Hi Yasumasa,
>>
>> Partial hs_err info below.
>>
>> David
>> -----
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800
>> #
>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug 
>> build 15-internal+0-2020-03-11-0447267.suenaga.source)
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>> 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, 
>> tiered, compressed oops, g1 gc, linux-amd64)
>> # Problematic frame:
>> # C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned 
>> long)+0x2c
>> #
>> # Core dump will be written. Default location: Core dumps may be 
>> processed with "/opt/core.sh %p" (or dumping to 
>> /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798)
>> #
>> # If you would like to submit a bug report, please visit:
>> # 
>> https://urldefense.com/v3/__https://bugreport.java.com/bugreport/crash.jsp__;!!GqivPVa7Brio!OghfqRRRHbAZloG3aVJ244OPPTcCQOwYIl_vm6vU_toLb9qFzTUirVBEHn2tfDp26A$ 
>> # The crash happened outside the Java Virtual Machine in native code.
>> # See problematic frame for where to report the bug.
>> #
>>
>> ---------------  S U M M A R Y ------------
>>
>> Command Line: 
>> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar 
>> -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug 
>> -Xms8m -Djdk.module.main=jdk.hotspot.agent 
>> jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770
>>
>> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 
>> 0h 0m 3s)
>>
>> ---------------  T H R E A D  ---------------
>>
>> Current thread (0x00007fdf5c032000):  JavaThread "main" 
>> [_thread_in_native, id=29800, 
>> stack(0x00007fdf63a9e000,0x00007fdf63b9f000)]
>>
>> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000], 
>> sp=0x00007fdf63b9d190, free space=1020k
>> Native frames: (J=compiled Java code, A=aot compiled Java code, 
>> j=interpreted, Vv=VM code, C=native code)
>> C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
>> j 
>> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 
>> jdk.hotspot.agent at 15-internal
>> j 
>> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 
>> jdk.hotspot.agent at 15-internal
>> j 
>> sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 
>> jdk.hotspot.agent at 15-internal
>> j 
>> sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 
>> jdk.hotspot.agent at 15-internal
>> j 
>> sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 
>> jdk.hotspot.agent at 15-internal
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0xc2291c]  JavaCalls::call_helper(JavaValue*, 
>> methodHandle const&, JavaCallArguments*, Thread*)+0x6ac
>> V  [libjvm.so+0xd31970]  jni_invoke_static(JNIEnv_*, JavaValue*, 
>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) 
>> [clone .isra.140] [clone .constprop.263]+0x370
>> V  [libjvm.so+0xd36202]  jni_CallStaticVoidMethod+0x222
>> C  [libjli.so+0x4bed]  JavaMain+0xbcd
>> C  [libjli.so+0x80a9]  ThreadJavaMain+0x9
>>
>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>> j 
>> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 
>> jdk.hotspot.agent at 15-internal
>> j 
>> sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 
>> jdk.hotspot.agent at 15-internal
>> j 
>> sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 
>> jdk.hotspot.agent at 15-internal
>> j 
>> sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 
>> jdk.hotspot.agent at 15-internal
>> j 
>> sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 
>> jdk.hotspot.agent at 15-internal
>> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 
>> jdk.hotspot.agent at 15-internal
>> v  ~StubRoutines::call_stub
>>
>> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79
>>
>> Register to memory mapping:
>>
>> RAX=0x00007f7e4dfe3229 is an unknown value
>> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
>> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62
>> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00
>> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000
>> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000
>> RSI=0x0000000000000004 is an unknown value
>> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
>> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00
>> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01
>> R10=0x00000000ffffffff is an unknown value
>> R11=0x000000000100527a is an unknown value
>> R12=0x00007fded5076b79 is an unknown value
>> R13=0x00007f7da2f8e68a is an unknown value
>> R14=0x00007f7dbdf62b1d is an unknown value
>> R15=0x00007fdf5c032000 is a thread
>>
>>
>> Registers:
>> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85
>> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080
>> R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a
>> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000
>> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
>>   TRAPNO=0x000000000000000e
>>
>> Top of Stack: (sp=0x00007fdf63b9d190)
>> 0x00007fdf63b9d190:   00007fdf209d0980 0000000000000000
>> 0x00007fdf63b9d1a0:   00007fdf209d0980 00007fdf63b9d258
>> 0x00007fdf63b9d1b0:   00007fdf63b9d228 00007fdf44778dbe
>> 0x00007fdf63b9d1c0:   000000000146c380 00007fdf5c032000
>>
>> Instructions: (pc=0x00007fdf2000e87c)
>> 0x00007fdf2000e77c:   89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00
>> 0x00007fdf2000e78c:   00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80
>> 0x00007fdf2000e79c:   78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42
>> 0x00007fdf2000e7ac:   01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48
>> 0x00007fdf2000e7bc:   89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43
>> 0x00007fdf2000e7cc:   3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43
>> 0x00007fdf2000e7dc:   30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff
>> 0x00007fdf2000e7ec:   ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e
>> 0x00007fdf2000e7fc:   41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2
>> 0x00007fdf2000e80c:   ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90
>> 0x00007fdf2000e81c:   0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76
>> 0x00007fdf2000e82c:   1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48
>> 0x00007fdf2000e83c:   83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9
>> 0x00007fdf2000e84c:   31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56
>> 0x00007fdf2000e85c:   41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b
>> 0x00007fdf2000e86c:   a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08
>> 0x00007fdf2000e87c:   41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0
>> 0x00007fdf2000e88c:   75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d
>> 0x00007fdf2000e89c:   8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc
>> 0x00007fdf2000e8ac:   48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03
>> 0x00007fdf2000e8bc:   b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c
>> 0x00007fdf2000e8cc:   39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8
>> 0x00007fdf2000e8dc:   90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01
>> 0x00007fdf2000e8ec:   0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48
>> 0x00007fdf2000e8fc:   83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48
>> 0x00007fdf2000e90c:   09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea
>> 0x00007fdf2000e91c:   4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b
>> 0x00007fdf2000e92c:   31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00
>> 0x00007fdf2000e93c:   00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00
>> 0x00007fdf2000e94c:   00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff
>> 0x00007fdf2000e95c:   48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89
>> 0x00007fdf2000e96c:   8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff
>>
>>
>> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote:
>>> Hi Kevin,
>>>
>>> I saw 2 errors on submit repo 
>>> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
>>> So I tweaked my patch, but I saw the crash again 
>>> (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).
>>>
>>>    Last change on submit repo is here:
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/
>>>
>>> Can you share details on submit repo?
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/11 11:07, Yasumasa Suenaga wrote:
>>>> Hi Kevin,
>>>>
>>>> I guess the first program header in the libraries on your machine 
>>>> has the exec flag set (you can check this with `readelf -l`).
>>>> So in the new webrev I changed the initial value of exec_start and 
>>>> exec_end to -1.
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
>>>>
>>>> This webrev also contains the fixes for your comments (the typo and 
>>>> DW_CFA_advance_loc4).
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/11 8:53, Kevin Walls wrote:
>>>>> Hi -
>>>>>
>>>>> In testing I wasn't seeing any of the Dwarf code triggered.
>>>>>
>>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find 
>>>>> executable section in" for lots of / maybe all the libraries...
>>>>>
>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>>>>
>>>>>     if (fill_instr_info(newlib)) {
>>>>>       if (!read_eh_frame(ph, newlib)) {
>>>>>
>>>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>>>
>>>>> output like:
>>>>>
>>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>>>> libsaproc DEBUG: Could not find executable section in 
>>>>> /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>>>
>>>>> (similar for all libraries).
>>>>>
>>>>> fill_instr_info fails if:
>>>>>
>>>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>>>
>>>>> ...but isn't exec_start relative to the library address? It's the 
>>>>> value of ph->p_vaddr, and it is often zero.
>>>>>
>>>>> I added some booleans and did:
>>>>>
>>>>>       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>>>         lib->exec_start = ph->p_vaddr;
>>>>>         found_start = true;
>>>>>       }
>>>>>
>>>>> (similarly for end) and only failed if:
>>>>>
>>>>>   if (!found_start || !found_end) {
>>>>>     return false;
>>>>>
>>>>> ...and now it's better.  I go from:
>>>>>
>>>>> ----------------- 3306 -----------------
>>>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>>
>>>>> to:
>>>>>
>>>>> ----------------- 31127 -----------------
>>>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>>>> 0x000055af1b78db1c      main + 0x11c
>>>>>
>>>>>
>>>>> Thanks
>>>>> Kevin
>>>>>
>>>>>
>>>>>
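The failure Kevin describes above comes from using 0 as a "not found" marker even though 0 is a valid segment address relative to the library base. A minimal sketch of the flag-based alternative follows. The names segment_t, exec_range_t and find_exec_range are hypothetical stand-ins chosen for illustration; this is not the libsaproc code, and a real implementation would walk the ELF program headers instead of a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ELF program header. */
typedef struct {
  uint64_t vaddr;  /* segment start, relative to the library base */
  uint64_t memsz;  /* segment size in memory */
  bool     exec;   /* true if the segment is executable */
} segment_t;

/* Hypothetical result type: [exec_start, exec_end) over all exec segments. */
typedef struct {
  uint64_t exec_start;
  uint64_t exec_end;
} exec_range_t;

/*
 * Scans the segments and records the lowest start and highest end of the
 * executable ones. "Found" is tracked with a separate flag, so a segment
 * that legitimately starts at vaddr 0 is not mistaken for "nothing found".
 * Returns false if there is no executable segment at all.
 */
bool find_exec_range(const segment_t *segs, size_t n, exec_range_t *out) {
  assert(out != NULL);
  bool found = false;
  for (size_t i = 0; i < n; i++) {
    if (!segs[i].exec) {
      continue;
    }
    uint64_t start = segs[i].vaddr;
    uint64_t end = segs[i].vaddr + segs[i].memsz;
    if (!found || start < out->exec_start) {
      out->exec_start = start;
    }
    if (!found || end > out->exec_end) {
      out->exec_end = end;
    }
    found = true;
  }
  return found;
}
```

With the values from the debug output above (vaddr = 0x0, memsz = 0xaba4), this sketch reports a range starting at 0 instead of failing.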
>>>>>
>>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>>
>>>>>> Hi Kevin,
>>>>>>
>>>>>> Thanks for your comment!
>>>>>>
>>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>>> Hi Yasumasa ,
>>>>>>>
>>>>>>> The changes build OK for me in the latest jdk, and things still 
>>>>>>> work.
>>>>>>> I have not yet seen the dwarf usage in action: I've tried a 
>>>>>>> couple of different systems and so far have not reproduced the 
>>>>>>> problem, i.e. jstack has not failed on native frames.
>>>>>>>
>>>>>>> I may need more recent basic libraries, will look again for 
>>>>>>> somewhere where the problem happens and get back to you as I 
>>>>>>> really want to run the changes.
>>>>>>
>>>>>> You can see the problem with JShell.
>>>>>> Some Java frames would not be seen in mixed jstack.
>>>>>>
>>>>>>
>>>>>>> I have mostly minor other comments which don't need a new 
>>>>>>> webrev, some just comments for the future:
>>>>>>>
>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>>
>>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>>> (It may "never" happen, but a nop could appear within some other 
>>>>>>> instructions?)
>>>>>>
>>>>>> DW_CFA_nop is used for padding, so we can ignore it (return 
>>>>>> immediately).
>>>>>>
>>>>>>
>>>>>>> DW_CFA_remember_state: a minor typo in the comment, 
>>>>>>> "DW_CFA_remenber_state".
>>>>>>
>>>>>> I will fix it.
>>>>>>
>>>>>>
>>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not 
>>>>>>> DW_CFA_advance_loc4.  I thought that was odd, but maybe 
>>>>>>> addresses in these tables never increase by 4-byte amounts; 
>>>>>>> that would mean a lot of code on one line. 8-)
>>>>>>> So maybe it's never used in practice. If you think it's 
>>>>>>> unnecessary, no problem; maybe add a comment, or add it for robustness.
>>>>>>
>>>>>> I will add DW_CFA_advance_loc4.
>>>>>>
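For reference, the four advance_loc encodings discussed above differ only in where the delta is stored: in the low six bits of the opcode byte, or in a 1-, 2- or 4-byte operand. The sketch below decodes all four. The opcode values are the ones defined by the DWARF specification; the function names read_le and decode_advance_loc are hypothetical, the operand is assumed to be little-endian (as on x86-64), and none of this is taken from dwarf.cpp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
  DW_CFA_advance_loc1 = 0x02,
  DW_CFA_advance_loc2 = 0x03,
  DW_CFA_advance_loc4 = 0x04,
  DW_CFA_advance_loc  = 0x40  /* encoded in the high two bits of the opcode */
};

/* Reads an n-byte little-endian unsigned value (n <= 8). */
uint64_t read_le(const uint8_t *p, size_t n) {
  uint64_t v = 0;
  for (size_t i = 0; i < n; i++) {
    v |= (uint64_t)p[i] << (8 * i);
  }
  return v;
}

/*
 * If buf starts with one of the advance_loc instructions, stores the
 * location delta (already scaled by code_align) in *delta and returns the
 * number of bytes consumed. Returns 0 for any other opcode or if the
 * operand is truncated.
 */
size_t decode_advance_loc(const uint8_t *buf, size_t len,
                          uint64_t code_align, uint64_t *delta) {
  assert(delta != NULL);
  if (len == 0) {
    return 0;
  }
  uint8_t op = buf[0];
  if ((op & 0xc0) == DW_CFA_advance_loc) {
    /* Delta lives in the low six bits of the opcode itself. */
    *delta = (uint64_t)(op & 0x3f) * code_align;
    return 1;
  }
  size_t operand;
  switch (op) {
    case DW_CFA_advance_loc1: operand = 1; break;
    case DW_CFA_advance_loc2: operand = 2; break;
    case DW_CFA_advance_loc4: operand = 4; break;
    default: return 0;
  }
  if (len < 1 + operand) {
    return 0;
  }
  *delta = read_le(buf + 1, operand) * code_align;
  return 1 + operand;
}
```

Handling the 4-byte form costs one extra case label, which is the robustness argument made above.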
>>>>>>
>>>>>>> General-purpose methods like read_leb128(), get_entry_length(), 
>>>>>>> get_decoded_value() specifically update the _buf pointer in this 
>>>>>>> DwarfParser.
>>>>>>>
>>>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>>>> It calls process_cie() which reads, moves _buf and restores it 
>>>>>>> to the original position, then we read augmentation_length from 
>>>>>>> where _buf is.
>>>>>>> I'm not sure if that's wrong, or if I just need to read again 
>>>>>>> about the CIE/etc layout.
>>>>>>>
>>>>>>> I don't really want to suggest making the code pass around a 
>>>>>>> current _buf for the invocation of these general purpose 
>>>>>>> methods, but just wanted to comment that if these get used more 
>>>>>>> widely that might become necessary.
>>>>>>
>>>>>> I referred to the GDB and binutils sources when creating this patch.
>>>>>> They seem to process the data in a similar way, because we need to 
>>>>>> evaluate DWARF instructions one by one to get the values for the 
>>>>>> specified PC.
>>>>>>
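Kevin's remark about read_leb128() moving the parser's _buf can be illustrated with a decoder that takes the buffer explicitly and reports how many bytes it consumed, leaving the caller to advance its own cursor. This is only a sketch of that alternative design under stated assumptions: read_uleb128 is a hypothetical name, it handles the unsigned form only, and it is not the DwarfParser code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decodes an unsigned LEB128 value from buf. Each byte contributes its low
 * seven bits, least significant group first; a clear high bit marks the
 * last byte. Stores the result in *value and returns the number of bytes
 * consumed, or 0 if the buffer ends before the value is complete.
 * No parser state is modified.
 */
size_t read_uleb128(const uint8_t *buf, size_t len, uint64_t *value) {
  assert(value != NULL);
  uint64_t result = 0;
  unsigned shift = 0;
  for (size_t i = 0; i < len; i++) {
    uint8_t byte = buf[i];
    if (shift < 64) {
      result |= (uint64_t)(byte & 0x7f) << shift;
    }
    shift += 7;
    if ((byte & 0x80) == 0) {
      *value = result;
      return i + 1;
    }
  }
  return 0;
}
```

A caller would write `cursor += read_uleb128(cursor, end - cursor, &v);` after checking for a zero return, which keeps the position bookkeeping visible at each call site.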
>>>>>>
>>>>>>> Similarly in future, if this DWARF support code became used more 
>>>>>>> widely, it might want to move to an
>>>>>>> OS-neutral directory?  It's odd to label it as Linux-specific.
>>>>>>
>>>>>> At least Windows does not use DWARF; it uses a different mechanism:
>>>>>>
>>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>>>
>>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>>> If they do, I can move the DWARF-related code to a posix 
>>>>>> directory.
>>>>>>
>>>>>>
>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>>> Thanks for changing "can_parsable" which was in the earlier 
>>>>>>> version. 8-)
>>>>>>>
>>>>>>>
>>>>>>> These are just comments to mainly say it looks good, and 
>>>>>>> somebody else out there has read it.
>>>>>>> I will look for a system that shows the problem, and get back to 
>>>>>>> you again!
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> Many thanks
>>>>>>> Kevin
>>>>>>>
>>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> webrev.03 can no longer be applied to the current jdk/jdk because 
>>>>>>>> of the changes for 8239224 and 8239462 (they updated the copyright year).
>>>>>>>> So I modified the webrev (copyright year changes only) so that it 
>>>>>>>> applies to the current jdk/jdk.
>>>>>>>> Could you review it?
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>>>
>>>>>>>> I need one more reviewer to push.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>>>> PING: Could you review it?
>>>>>>>>>
>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>
>>>>>>>>> This change has been already reviewed by Serguei.
>>>>>>>>> I need one more reviewer to push.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>>>> PING: Could you review this change?
>>>>>>>>>>
>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>
>>>>>>>>>> I believe this change will help troubleshooters with 
>>>>>>>>>> postmortem analysis.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>>
>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>>
>>>>>>>>>>> I updated the webrev. I discussed it with Serguei off-list and 
>>>>>>>>>>> refactored webrev.02.
>>>>>>>>>>> It has passed the tests on the submit repo 
>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your comment!
>>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new 
>>>>>>>>>>>> webrev.
>>>>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, 
>>>>>>>>>>>> as Dmitry suggested.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>> This change has passed all tests on the submit repo 
>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is a nice move in general.
>>>>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>     if (libptr == 0L) { // Java frame
>>>>>>>>>>>>>       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>>       if (rbp == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>>     } else { // Native frame
>>>>>>>>>>>>>       DwarfParser dwarf;
>>>>>>>>>>>>>       try {
>>>>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>       } catch (DebuggerException e) {
>>>>>>>>>>>>>         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>>         if (rbp == null) {
>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       dwarf.processDwarf(pc);
>>>>>>>>>>>>>       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>>                      !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>>                                .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>>       if (cfa == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd suggest simplifying the logic by refactoring it to 
>>>>>>>>>>>>> something like the following:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>     // Java frame case (default)
>>>>>>>>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>>     DwarfParser dwarf = null;
>>>>>>>>>>>>>
>>>>>>>>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>       try {
>>>>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>         dwarf.processDwarf(pc);
>>>>>>>>>>>>>         // Overrides the RBP-based default above.
>>>>>>>>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>>                 ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>>                 : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>>                          .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     if (cfa == null) {
>>>>>>>>>>>>>       return null;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>>>>
>>>>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>>>>
>>>>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>    It feels like the logic should be refactored/simplified 
>>>>>>>>>>>>>    somehow, as several typical fragments appear in slightly 
>>>>>>>>>>>>>    different contexts.
>>>>>>>>>>>>>    But it is not easy to understand what the logic is.
>>>>>>>>>>>>>    Could you please add some comments in key places 
>>>>>>>>>>>>>    explaining this logic?
>>>>>>>>>>>>>    Then I'll check whether it is possible to make it a little 
>>>>>>>>>>>>>    bit simpler.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>>       Address nextCFA;
>>>>>>>>>>>>>       Address nextPC;
>>>>>>>>>>>>>
>>>>>>>>>>>>>       nextPC = getNextPC(false);
>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>         } catch (DebuggerException e) {
>>>>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>       return (nextCFA == null) ? null
>>>>>>>>>>>>>                                : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>   The above can be simplified if a DebuggerException can 
>>>>>>>>>>>>>   not be thrown from processDwarf(nextPC):
>>>>>>>>>>>>>
>>>>>>>>>>>>>     private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>>>
>>>>>>>>>>>>>       if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>           nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>         } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>>>>
>>>>>>>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>>>>>>>         return javaSender(context);
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>>       Address nextCFA;
>>>>>>>>>>>>>       DwarfParser nextDwarf = dwarf;
>>>>>>>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>         if (libptr == 0L) {
>>>>>>>>>>>>>           // Next frame might be Java frame
>>>>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>           nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>         } catch (DebuggerException e) {
>>>>>>>>>>>>>           nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>>           return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>>       nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>       nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>   This one can be also simplified a little:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>>>>
>>>>>>>>>>>>>       if (dwarf == null) { // Java frame
>>>>>>>>>>>>>         return javaSender(context);
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       Address nextPC = getNextPC(true);
>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>>>       if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>>>>           try {
>>>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>           }
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext 
>>>>>>>>>>>>> context):
>>>>>>>>>>>>>
>>>>>>>>>>>>>     private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>>>>       ThreadContext context = thread.getContext();
>>>>>>>>>>>>>       Address nextPC = getNextPC(false);
>>>>>>>>>>>>>       if (nextPC == null) {
>>>>>>>>>>>>>         return null;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       DwarfParser nextDwarf = null;
>>>>>>>>>>>>>
>>>>>>>>>>>>>       if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>         if (libptr != 0L) {
>>>>>>>>>>>>>           try {
>>>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>           }
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>       Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java. It works fine in the 
>>>>>>>>>>>>>> serviceability/sa tests and in
>>>>>>>>>>>>>> all tests on the submit repo 
>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>>> Could you review the new webrev?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V 
>>>>>>>>>>>>>>> Application Binary Interface AMD64
>>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use 
>>>>>>>>>>>>>>> DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As JDK-8022183 says, omit-frame-pointer has been enabled by 
>>>>>>>>>>>>>>> default since GCC 4.6, so system libraries (e.g. libc) 
>>>>>>>>>>>>>>> might be compiled without a frame pointer.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not use DWARF; it relies 
>>>>>>>>>>>>>>> on the base pointer register (RBP).
>>>>>>>>>>>>>>> So some stack frames might be missing from its output.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] 
>>>>>>>>>>>>>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
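The DWARF-based unwinding step that this thread keeps returning to (a CFA register plus a CFA offset, and a return address stored at an offset from the CFA) can be sketched in a few lines. The struct unwind_rule_t and both function names are hypothetical and only model the arithmetic; they are not the SA classes. The register numbers 6 (RBP) and 7 (RSP) are the DWARF numbers assigned by the System V AMD64 ABI, and the example rule (CFA = RSP + 8, return address at CFA - 8) is the typical state at function entry, assumed here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* DWARF register numbers from the System V AMD64 ABI (subset). */
enum { DWARF_REG_RBP = 6, DWARF_REG_RSP = 7, DWARF_REG_COUNT = 17 };

/* Hypothetical summary of the CFI rules in effect at one PC. */
typedef struct {
  int     cfa_reg;     /* DWARF register number the CFA is based on */
  int64_t cfa_offset;  /* CFA = value(cfa_reg) + cfa_offset */
  int64_t ra_offset;   /* the return address is stored at CFA + ra_offset */
} unwind_rule_t;

/* Computes the canonical frame address from the current register values. */
uint64_t compute_cfa(const uint64_t regs[DWARF_REG_COUNT],
                     const unwind_rule_t *rule) {
  assert(rule->cfa_reg >= 0 && rule->cfa_reg < DWARF_REG_COUNT);
  return regs[rule->cfa_reg] + (uint64_t)rule->cfa_offset;
}

/* Returns the stack address that holds the caller's return address. */
uint64_t return_address_slot(uint64_t cfa, const unwind_rule_t *rule) {
  return cfa + (uint64_t)rule->ra_offset;
}
```

Because the rule names its own base register, this works for code built with omit-frame-pointer, where RBP holds an arbitrary value and an RBP-only walk loses frames.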


From serguei.spitsyn at oracle.com  Wed Mar 11 19:35:44 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Mar 2020 12:35:44 -0700
Subject: RFR: 8240881: several tests are failing due to encoding failures
Message-ID: <af01b322-cfdc-d76f-cfbb-59a365472bd2@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200311/b703493b/attachment.htm>

From daniel.daugherty at oracle.com  Wed Mar 11 19:49:08 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 11 Mar 2020 15:49:08 -0400
Subject: RFR: 8240881: several tests are failing due to encoding failures
In-Reply-To: <af01b322-cfdc-d76f-cfbb-59a365472bd2@oracle.com>
References: <af01b322-cfdc-d76f-cfbb-59a365472bd2@oracle.com>
Message-ID: <b6e8224e-484b-664d-2851-b982f378b6ab@oracle.com>

On 3/11/20 3:35 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix of:
> https://bugs.openjdk.java.net/browse/JDK-8240881
>
> Webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/8240881-encoding-antidelta/

Thumbs up! This is a trivial changeset so you may push with a
single (R)eviewer.

Your anti-delta matches mine and I compared mine to the parent of
the push for JDK-8222489.

My Mach5 Tier[56] job set is not quite finished, but I haven't seen
any signs of the failures yet.

Dan


>
>
> Summary:
>   JDK-8240881 is a regression caused by the fix of: 
>   https://bugs.openjdk.java.net/browse/JDK-8222489
>      Webrev: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>      Changeset: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>
>   The suggested fix for JDK-8240881 is an anti-delta of JDK-8222489.
>   As a reviewer and sponsor, I apologize for this regression.
>   The impact of the change turned out to be bigger than expected.
>
> Testing:
>   The mac
>
> Thanks,
> Serguei


From ioi.lam at oracle.com  Wed Mar 11 19:49:55 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 11 Mar 2020 12:49:55 -0700
Subject: RFR: 8240881: several tests are failing due to encoding failures
In-Reply-To: <af01b322-cfdc-d76f-cfbb-59a365472bd2@oracle.com>
References: <af01b322-cfdc-d76f-cfbb-59a365472bd2@oracle.com>
Message-ID: <681add21-344f-fc59-4784-5b1c0f0bb851@oracle.com>

Looks good to me.

Thanks
- Ioi

On 3/11/20 12:35 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix of:
> https://bugs.openjdk.java.net/browse/JDK-8240881
>
> Webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/8240881-encoding-antidelta/
>
>
> Summary:
>   JDK-8240881 is a regression caused by the fix of: 
>   https://bugs.openjdk.java.net/browse/JDK-8222489
>      Webrev: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>      Changeset: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>
>   The suggested fix for JDK-8240881 is an anti-delta of JDK-8222489.
>   As a reviewer and sponsor, I apologize for this regression.
>   The impact of the change turned out to be bigger than expected.
>
> Testing:
>   The mac
>
> Thanks,
> Serguei


From serguei.spitsyn at oracle.com  Wed Mar 11 19:52:38 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Mar 2020 12:52:38 -0700
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
 <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>
 <CAE_05uzNAU0f8T8Gh9=pxd5U18xTDsWbxKkmbj5iSf2AhMWXYQ@mail.gmail.com>
 <CAE_05uxC4ADo-qQaOaSAFHfUZ93uufVHZu639vvvSfeJWeNy=A@mail.gmail.com>
 <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>
Message-ID: <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com>

Hi Chihiro,

I've tested and pushed your fix, but the impact of the fix was underestimated.
The fix caused several regressions and the following bug was filed:
   https://bugs.openjdk.java.net/browse/JDK-8240881

Now, I'm working on backing out the fix for JDK-8222489 with an anti-delta.
You can find and review my RFR posted on the serviceability-dev mailing 
list:
   RFR: 8240881: several tests are failing due to encoding failures

You can file another bug as a replacement for JDK-8222489.
I will help you with the information about the test regressions it caused.

Thanks,
Serguei


On 3/10/20 02:54, serguei.spitsyn at oracle.com wrote:
> Hi Chihiro,
>
> Yes, I'll sponsor it.
> Thank you for the update.
>
> Thanks,
> Serguei
>
>
> On 3/8/20 06:05, Chihiro Ito wrote:
>> Hi,
>>
>> I'm sorry. I included "JDK-" in the changeset title. I removed it and
>> updated it.
>>
>> Change set : 
>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>
>> Regards,
>> Chihiro
>>
>> On Sat, Mar 7, 2020 at 23:13, Chihiro Ito <chiroito107 at gmail.com> wrote:
>>> Hi Serguei and Yasumasa,
>>>
>>> I updated the copyright year and created the change set.
>>>
>>> Could you sponsor this, please?
>>>
>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>>> Change set : 
>>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>>
>>> Regards,
>>> Chihiro
>>>
>>>
>>> On Sat, Mar 7, 2020 at 16:03, Yasumasa Suenaga <suenaga at oss.nttdata.com> wrote:
>>>
>>>
>>>> Hi Chihiro,
>>>>
>>>> I'm also ok with webrev.05 after updating copyright year.
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Chihiro,
>>>>>
>>>>> I'm okay with the fix.
>>>>> Could you please update the copyright date in 
>>>>> src/java.base/share/classes/jdk/internal/vm/VMSupport.java before 
>>>>> the push?
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 3/6/20 07:24, Chihiro Ito wrote:
>>>>>> Hi Serguei,
>>>>>>
>>>>>> Could you review this again, please?
>>>>>>
>>>>>> Regards,
>>>>>> Chihiro
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 27, 2020 at 22:11, Chihiro Ito <chiroito107 at gmail.com> wrote:
>>>>>>> Hi Ralf,
>>>>>>>
>>>>>>> Thank you for your advice.
>>>>>>>
>>>>>>> 1.
>>>>>>> The comment on serializePropertiesToByteArray in VMSupport says 
>>>>>>> "The stream written to the byte array is ISO 8859-1 encoded.".
>>>>>>> But the previous implementation does not honor this. I think we 
>>>>>>> need to implement the encoding as ISO 8859-1.
>>>>>>>
>>>>>>> 2.
>>>>>>> According to the help text, the feature of VM.system_properties is 
>>>>>>> just "Print system properties". Users should not use this output 
>>>>>>> for loading. They use it when they want to look at the system 
>>>>>>> properties quickly.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Chihiro
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Feb 26, 2020 at 18:53, Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
>>>>>>>> Hi Chihiro,
>>>>>>>>
>>>>>>>> I have two remarks:
>>>>>>>>
>>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work 
>>>>>>>> with the code. While the Properties.store() method claims to 
>>>>>>>> create an ISO Latin 1 string, it really only creates printable 
>>>>>>>> ASCII characters (apart from the comment, but it is ASCII too 
>>>>>>>> in this case). See Properties.saveConvert, where the char is 
>>>>>>>> checked for < 0x20 or > 0x7e and then printed as \uxxxx. This 
>>>>>>>> is important, since the bytes of the ByteArrayOutputStream are 
>>>>>>>> then sent to jcmd. And jcmd expects UTF-8 encoded strings, 
>>>>>>>> which is OK if we only use ASCII characters. But an ISO Latin 1 
>>>>>>>> character >= 0x80 will break the encoding. Just try using 
>>>>>>>> \u00DC in your test.
>>>>>>>>
>>>>>>>> 2. Your change makes it impossible to load the output with 
>>>>>>>> properties.load(). The old output could be loaded, since it was 
>>>>>>>> a valid properties file. But yours is not. For example, 
>>>>>>>> consider the filename c:\test\new. Formerly it would be encoded 
>>>>>>>> as:
>>>>>>>> C\:\\test\\new
>>>>>>>> And now it is:
>>>>>>>> C:\test\new
>>>>>>>> But the properties code would see "\n" as the newline character 
>>>>>>>> in your encoding. In fact you cannot differentiate between \n, 
>>>>>>>> \t, \f and \r originally being one or two characters.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Ralf
>>>>>>>>
>>>>>>>>
>>>>>>>> From: 
>>>>>>>> serviceability-dev<serviceability-dev-bounces at openjdk.java.net> 
>>>>>>>> On Behalf Of Chihiro Ito
>>>>>>>> Sent: Tuesday, 25 February 2020 04:45
>>>>>>>> To: serguei.spitsyn at oracle.com
>>>>>>>> Cc: serviceability-dev at openjdk.java.net
>>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives 
>>>>>>>> unusable paths on Windows
>>>>>>>>
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> Thanks for your review and advice.
>>>>>>>>
>>>>>>>> I modified these.
>>>>>>>> Could you review this again, please?
>>>>>>>>
>>>>>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Chihiro
>>>>>>>>
>
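
The round-trip problem from Ralf's second remark in the quoted thread can be reproduced with a small standalone program. This is only a sketch using the public java.util.Properties API; the class name PropertiesEscapingDemo and its helper methods are made up for illustration and are not part of the webrev or of VMSupport.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not JDK code.
public class PropertiesEscapingDemo {

    // Serialize a single key/value pair with Properties.store() and return the text.
    public static String store(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Parse properties-format text and return the value stored under key.
    public static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String path = "C:\\test\\new"; // the 11 characters C:\test\new

        // Escaped form as written by store(): ':' and each backslash are escaped.
        String stored = store("path", path);
        System.out.println(stored);
        System.out.println("escaped form round-trips:   " + path.equals(load(stored, "path")));

        // Unescaped form: load() turns backslash-t and backslash-n into tab and newline.
        String reloaded = load("path=" + path + "\n", "path");
        System.out.println("unescaped form round-trips: " + path.equals(reloaded));
    }
}
```

With the escaped form the value should survive the round trip, while the unescaped form should come back with a tab and a newline in place of the two-character sequences, which is the ambiguity the remark points out.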


From daniel.daugherty at oracle.com  Wed Mar 11 19:58:11 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 11 Mar 2020 15:58:11 -0400
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
 <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>
 <CAE_05uzNAU0f8T8Gh9=pxd5U18xTDsWbxKkmbj5iSf2AhMWXYQ@mail.gmail.com>
 <CAE_05uxC4ADo-qQaOaSAFHfUZ93uufVHZu639vvvSfeJWeNy=A@mail.gmail.com>
 <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>
 <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com>
Message-ID: <319bc269-a35b-210e-faca-d0cae54e0d51@oracle.com>

The replacement bug should be filed with this description:

    [REDO] 8222489 jcmd VM.system_properties gives unusable paths on Windows

and should be linked to the original bug also.

Dan


On 3/11/20 3:52 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chihiro,
>
> I've tested and pushed your fix, but its impact was underestimated.
> The fix caused several regressions, and the following bug was filed:
>   https://bugs.openjdk.java.net/browse/JDK-8240881
>
> I'm now working on backing out the fix for JDK-8222489 with an anti-delta.
> You can find and review my RFR posted on the serviceability-dev 
> mailing list:
>   RFR: 8240881: several tests are failing due to encoding failures
>
> You can file another bug as a replacement for JDK-8222489.
> I will help you with information about the test regressions it caused.
>
> Thanks,
> Serguei
>
>
> On 3/10/20 02:54, serguei.spitsyn at oracle.com wrote:
>> Hi Chihiro,
>>
>> Yes, I'll sponsor it.
>> Thank you for the update.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 3/8/20 06:05, Chihiro Ito wrote:
>>> Hi,
>>>
>>> I'm sorry. I included "JDK-" in the changeset title. I removed it and
>>> updated it.
>>>
>>> Change set : 
>>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>>
>>> Regards,
>>> Chihiro
>>>
>>> On Sat, Mar 7, 2020 at 23:13, Chihiro Ito <chiroito107 at gmail.com> wrote:
>>>> Hi Serguei and Yasumasa,
>>>>
>>>> I updated the copyright year and created the change set.
>>>>
>>>> Could you sponsor this, please?
>>>>
>>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>>>> Change set : 
>>>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>>>
>>>> Regards,
>>>> Chihiro
>>>>
>>>>
>>>> On Sat, Mar 7, 2020 at 16:03, Yasumasa Suenaga <suenaga at oss.nttdata.com> wrote:
>>>>
>>>>
>>>>> Hi Chihiro,
>>>>>
>>>>> I'm also ok with webrev.05 after updating copyright year.
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Chihiro,
>>>>>>
>>>>>> I'm okay with the fix.
>>>>>> Could you please update the copyright date in 
>>>>>> src/java.base/share/classes/jdk/internal/vm/VMSupport.java before 
>>>>>> the push?
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 3/6/20 07:24, Chihiro Ito wrote:
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> Could you review this again, please?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Chihiro
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 27, 2020 at 22:11, Chihiro Ito <chiroito107 at gmail.com> wrote:
>>>>>>>> Hi Ralf,
>>>>>>>>
>>>>>>>> Thank you for your advice.
>>>>>>>>
>>>>>>>> 1.
>>>>>>>> The comment on serializePropertiesToByteArray in VMSupport says 
>>>>>>>> "The stream written to the byte array is ISO 8859-1 encoded.".
>>>>>>>> But the previous implementation does not honor this. I think we 
>>>>>>>> need to implement the encoding as ISO 8859-1.
>>>>>>>>
>>>>>>>> 2.
>>>>>>>> According to the help text, the feature of VM.system_properties is 
>>>>>>>> just "Print system properties". Users should not use this output 
>>>>>>>> for loading. They use it when they want to look at the system 
>>>>>>>> properties quickly.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Chihiro
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 26, 2020 at 18:53, Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
>>>>>>>>> Hi Chihiro,
>>>>>>>>>
>>>>>>>>> I have two remarks:
>>>>>>>>>
>>>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work 
>>>>>>>>> with the code. While the Properties.store() method claims to 
>>>>>>>>> create an ISO Latin 1 string, it really only creates 
>>>>>>>>> printable ASCII characters (apart from the comment, but it is 
>>>>>>>>> ASCII too in this case). See Properties.saveConvert, where the 
>>>>>>>>> char is checked for < 0x20 or > 0x7e and then printed as 
>>>>>>>>> \uxxxx. This is important, since the bytes of the 
>>>>>>>>> ByteArrayOutputStream are then sent to jcmd. And jcmd 
>>>>>>>>> expects UTF-8 encoded strings, which is OK if we only use 
>>>>>>>>> ASCII characters. But an ISO Latin 1 character >= 0x80 will 
>>>>>>>>> break the encoding. Just try using \u00DC in your test.
>>>>>>>>>
>>>>>>>>> 2. Your change makes it impossible to load the output with 
>>>>>>>>> properties.load(). The old output could be loaded, since it 
>>>>>>>>> was a valid properties file. But yours is not. For example, 
>>>>>>>>> consider the filename c:\test\new. Formerly it would be 
>>>>>>>>> encoded as:
>>>>>>>>> C\:\\test\\new
>>>>>>>>> And now it is:
>>>>>>>>> C:\test\new
>>>>>>>>> But the properties code would see "\n" as the newline 
>>>>>>>>> character in your encoding. In fact you cannot differentiate 
>>>>>>>>> between \n, \t, \f and \r originally being one or two characters.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Ralf
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: 
>>>>>>>>> serviceability-dev<serviceability-dev-bounces at openjdk.java.net> 
>>>>>>>>> On Behalf Of Chihiro Ito
>>>>>>>>> Sent: Tuesday, 25 February 2020 04:45
>>>>>>>>> To: serguei.spitsyn at oracle.com
>>>>>>>>> Cc: serviceability-dev at openjdk.java.net
>>>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives 
>>>>>>>>> unusable paths on Windows
>>>>>>>>>
>>>>>>>>> Hi Serguei,
>>>>>>>>>
>>>>>>>>> Thanks for your review and advice.
>>>>>>>>>
>>>>>>>>> I modified these.
>>>>>>>>> Could you review this again, please?
>>>>>>>>>
>>>>>>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Chihiro
>>>>>>>>>
>>
>
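
The encoding concern in Ralf's first remark in the quoted thread (an ISO Latin 1 byte >= 0x80 is not valid UTF-8 on its own, while the ASCII escape form is) can be checked with the standard java.nio.charset API. The following is a sketch under that assumption only; the class Latin1VersusUtf8Demo and the helper isValidUtf8 are hypothetical names, and nothing here models how jcmd itself reads the bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not JDK code.
public class Latin1VersusUtf8Demo {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            // A fresh decoder reports malformed input instead of replacing it.
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String raw = "\u00DC";      // U with diaeresis, the character from the remark
        String escaped = "\\u00DC"; // six ASCII characters, the escape form store() writes

        byte[] rawLatin1 = raw.getBytes(StandardCharsets.ISO_8859_1);         // the single byte 0xDC
        byte[] escapedLatin1 = escaped.getBytes(StandardCharsets.ISO_8859_1); // ASCII bytes only

        System.out.println("raw Latin 1 byte is valid UTF-8: " + isValidUtf8(rawLatin1));
        System.out.println("escaped form is valid UTF-8:     " + isValidUtf8(escapedLatin1));
    }
}
```

The lone byte 0xDC is a UTF-8 lead byte with no continuation byte, so a strict decoder rejects it; the escaped form stays within ASCII and decodes the same way in both encodings.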


From serguei.spitsyn at oracle.com  Wed Mar 11 19:58:34 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Mar 2020 12:58:34 -0700
Subject: RFR: 8240881: several tests are failing due to encoding failures
In-Reply-To: <b6e8224e-484b-664d-2851-b982f378b6ab@oracle.com>
References: <af01b322-cfdc-d76f-cfbb-59a365472bd2@oracle.com>
 <b6e8224e-484b-664d-2851-b982f378b6ab@oracle.com>
Message-ID: <e5d52c91-7472-0ebd-eef4-8fac3996dd1a@oracle.com>

Hi Dan,

Thank you for filing the bug and for the review!
My Mach5 job was submitted later, so your job comes in handy - thanks!
I'll push the fix.

Thanks,
Serguei


On 3/11/20 12:49, Daniel D. Daugherty wrote:
> On 3/11/20 3:35 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix of:
>> https://bugs.openjdk.java.net/browse/JDK-8240881
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/8240881-encoding-antidelta/
>
> Thumbs up! This is a trivial changeset so you may push with a
> single (R)eviewer.
>
> Your anti-delta matches mine and I compared mine to the parent of
> the push for JDK-8222489.
>
> My Mach5 Tier[56] job set is not quite finished, but I haven't seen
> any signs of the failures yet.
>
> Dan
>
>
>>
>>
>> Summary:
>>   JDK-8240881 is a regression caused by the fix of: 
>> https://bugs.openjdk.java.net/browse/JDK-8222489
>>      Webrev: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>>      Changeset: 
>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>
>>   The suggested fix is the JDK-8240881 anti-delta.
>>   As a reviewer and sponsor, I apologize for this regression.
>>   The impact of the change turned out to be bigger than expected.
>>
>> Testing:
>>   The mac
>>
>> Thanks,
>> Serguei
>


From serguei.spitsyn at oracle.com  Wed Mar 11 20:13:44 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Mar 2020 13:13:44 -0700
Subject: RFR: 8240881: several tests are failing due to encoding failures
In-Reply-To: <681add21-344f-fc59-4784-5b1c0f0bb851@oracle.com>
References: <af01b322-cfdc-d76f-cfbb-59a365472bd2@oracle.com>
 <681add21-344f-fc59-4784-5b1c0f0bb851@oracle.com>
Message-ID: <2ac3d137-a695-3e43-b44e-499f71e7ec49@oracle.com>

Thanks, Ioi!
Serguei


On 3/11/20 12:49, Ioi Lam wrote:
> Looks good to me.
>
> Thanks
> - Ioi
>
> On 3/11/20 12:35 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix of:
>> https://bugs.openjdk.java.net/browse/JDK-8240881
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/8240881-encoding-antidelta/
>>
>>
>> Summary:
>>   JDK-8240881 is a regression caused by the fix of: 
>> https://bugs.openjdk.java.net/browse/JDK-8222489
>>      Webrev: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>>      Changeset: 
>> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>
>>   The suggested fix is the JDK-8240881 anti-delta.
>>   As a reviewer and sponsor, I apologize for this regression.
>>   The impact of the change turned out to be bigger than expected.
>>
>> Testing:
>>   The mac
>>
>> Thanks,
>> Serguei
>


From alexey.menkov at oracle.com  Wed Mar 11 22:29:16 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 11 Mar 2020 15:29:16 -0700
Subject: RFR(XS): JDK-8217441: Failure of ::realloc() should be handled
 correctly in sawindbg.cpp
Message-ID: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com>

Hi all,

please review a small (and I'd say trivial) fix for
https://bugs.openjdk.java.net/browse/JDK-8217441
webrev:
http://cr.openjdk.java.net/~amenkov/jdk15/WinDbg_realloc/webrev/

from realloc() spec:
On failure, returns a null pointer. The original pointer ptr remains 
valid and may need to be deallocated with free() or realloc().

--alex

From chris.plummer at oracle.com  Wed Mar 11 23:04:14 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 11 Mar 2020 16:04:14 -0700
Subject: RFR(XS): JDK-8217441: Failure of ::realloc() should be handled
 correctly in sawindbg.cpp
In-Reply-To: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com>
References: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com>
Message-ID: <2a1fab61-6075-6391-29ba-4e7de45b0930@oracle.com>

Looks good.

Chris

On 3/11/20 3:29 PM, Alex Menkov wrote:
> Hi all,
>
> please review a small (and I'd say trivial) fix for
> https://bugs.openjdk.java.net/browse/JDK-8217441
> webrev:
> http://cr.openjdk.java.net/~amenkov/jdk15/WinDbg_realloc/webrev/
>
> from realloc() spec:
> On failure, returns a null pointer. The original pointer ptr remains 
> valid and may need to be deallocated with free() or realloc().
>
> --alex


From serguei.spitsyn at oracle.com  Wed Mar 11 23:04:50 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Mar 2020 23:04:50 +0000 (UTC)
Subject: RFR(XS): JDK-8217441: Failure of ::realloc() should be handled
 correctly in sawindbg.cpp
In-Reply-To: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com>
References: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com>
Message-ID: <c53c9d25-ee4c-216c-b156-362ca0463026@oracle.com>

Hi Alex,

It looks good to me.
It seems, returning S_FALSE from SAOutputCallbacks::Output should be 
okay as the same is done when nullptr is returned from malloc.

Thanks,
Serguei


On 3/11/20 15:29, Alex Menkov wrote:
> Hi all,
>
> please review a small (and I'd say trivial) fix for
> https://bugs.openjdk.java.net/browse/JDK-8217441
> webrev:
> http://cr.openjdk.java.net/~amenkov/jdk15/WinDbg_realloc/webrev/
>
> from realloc() spec:
> On failure, returns a null pointer. The original pointer ptr remains 
> valid and may need to be deallocated with free() or realloc().
>
> --alex


From alexey.menkov at oracle.com  Wed Mar 11 23:32:19 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 11 Mar 2020 16:32:19 -0700
Subject: RFR(XS): JDK-8217441: Failure of ::realloc() should be handled
 correctly in sawindbg.cpp
In-Reply-To: <c53c9d25-ee4c-216c-b156-362ca0463026@oracle.com>
References: <513ed37a-172d-2700-ad8d-9dcf322ed7cc@oracle.com>
 <c53c9d25-ee4c-216c-b156-362ca0463026@oracle.com>
Message-ID: <146669f5-49d7-99a9-4e9f-5ba47f91c69d@oracle.com>


On 03/11/2020 16:04, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
> 
> It looks good to me.
> It seems, returning S_FALSE from SAOutputCallbacks::Output should be 
> okay as the same is done when nullptr is returned from malloc.

According to MSDN, the returned value is ignored by the debugger engine.
But I kept it as it was.

--alex

> 
> Thanks,
> Serguei
> 
> 
> On 3/11/20 15:29, Alex Menkov wrote:
>> Hi all,
>>
>> please review a small (and I'd say trivial) fix for
>> https://bugs.openjdk.java.net/browse/JDK-8217441
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk15/WinDbg_realloc/webrev/
>>
>> from realloc() spec:
>> On failure, returns a null pointer. The original pointer ptr remains 
>> valid and may need to be deallocated with free() or realloc().
>>
>> --alex
> 

From suenaga at oss.nttdata.com  Thu Mar 12 00:10:10 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 12 Mar 2020 09:10:10 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <aa7b4d1b-7daf-2d3a-c6b1-9be34d2b1f07@oracle.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
 <c25bb60f-4e37-58db-b40f-2afce2dbf82f@oss.nttdata.com>
 <b2d223d2-47db-3119-b579-e48ca3b50469@oss.nttdata.com>
 <c9ac396c-f14b-3313-b3aa-912dd7dca482@oracle.com>
 <24a2dcae-fba9-c6ed-657e-a1d54584bbc3@oss.nttdata.com>
 <01c2a8c1-0e04-4e42-161a-80d6bdaa4ffa@oracle.com>
 <ab3a496e-4f19-922b-7418-1fca058a14c1@oss.nttdata.com>
 <e8a16c8e-ff00-6985-dc3d-9450c6b489df@oss.nttdata.com>
 <26681640-d8c8-5b64-58cc-36f8a854d101@oracle.com>
 <bfd14ee6-61b1-35e4-786c-242f65d00b7a@oss.nttdata.com>
 <aa7b4d1b-7daf-2d3a-c6b1-9be34d2b1f07@oracle.com>
Message-ID: <d6f07d16-07ab-8439-600f-9a0757bb140e@oss.nttdata.com>

On 2020/03/12 0:31, Kevin Walls wrote:
> Hi -
> 
> OK great, it checks for a zero-length dwarf entry.
> 
> I did a rebuild and tested it locally; it works here, so all looks good to me.

Thanks Kevin!


> We may in future want to work on src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/amd64/BsdAMD64CFrame.java to use Dwarf similarly, that's why I mentioned the platform-neutral directory name, but I have no issue with that happening in the future.

I do not have a Mac, so I cannot work on it...
Of course, if you (or other serviceability folks) work on it, I will help.


Yasumasa


> Thanks
> Kevin
> 
> 
> On 11/03/2020 09:49, Yasumasa Suenaga wrote:
>> Hi,
>>
>> Thanks David and Ioi for sharing the status.
>> I've fixed the problem in a new webrev (mach5-one-ysuenaga-JDK-8234624-6-20200311-0827-9367344):
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.06/
>>
>> Diff from webrev.05 is here:
>>
>>   http://hg.openjdk.java.net/jdk/submit/rev/e3d12785f087
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/11 14:59, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> Partial hs_err info below.
>>>
>>> David
>>> -----
>>>
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> #  SIGSEGV (0xb) at pc=0x00007fdf2000e87c, pid=29798, tid=29800
>>> #
>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-11-0447267.suenaga.source)
>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-11-0447267.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>> # Problematic frame:
>>> # C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
>>> #
>>> # Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/scratch/0/core.29798)
>>> #
>>> # If you would like to submit a bug report, please visit:
>>> #   https://bugreport.java.com/bugreport/crash.jsp
>>> # The crash happened outside the Java Virtual Machine in native code.
>>> # See problematic frame for where to report the bug.
>>> #
>>>
>>> ---------------  S U M M A R Y ------------
>>>
>>> Command Line: 
>>> -Denv.class.path=/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/serviceability/sa/TestJhsdbJstackMixed.d:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/hotspot/jtreg/serviceability/sa:/opt/mach5/mesos/work_dir/slaves/90726e33-be99-4e27-9d68-25dad266ef13-S5982/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/5f116aa5-2fc4-44e1-8808-1284111132ed/runs/908d4a0c-e2df-4273-bb74-b899888c3b6a/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/0/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/src.full/open/test/lib:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/javatest.jar:/opt/mach5/mesos/work_dir/jib-master/install/jtreg/5.0/b01/bundles/jtreg_bin-5.0.zip/jtreg/lib/jtreg.jar 
>>> -Dapplication.home=/opt/mach5/mesos/work_dir/jib-master/install/2020-03-11-0447267.suenaga.source/linux-x64-debug.jdk/jdk-15/fastdebug -Xms8m -Djdk.module.main=jdk.hotspot.agent jdk.hotspot.agent/sun.jvm.hotspot.SALauncher jstack --mixed --pid 29770
>>>
>>> Time: Wed Mar 11 05:20:57 2020 UTC elapsed time: 3.927809 seconds (0d 0h 0m 3s)
>>>
>>> ---------------  T H R E A D  ---------------
>>>
>>> Current thread (0x00007fdf5c032000):  JavaThread "main" [_thread_in_native, id=29800, stack(0x00007fdf63a9e000,0x00007fdf63b9f000)]
>>>
>>> Stack: [0x00007fdf63a9e000,0x00007fdf63b9f000], sp=0x00007fdf63b9d190, free space=1020k
>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>> C  [libsaproc.so+0x487c]  DwarfParser::process_dwarf(unsigned long)+0x2c
>>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
>>> v  ~StubRoutines::call_stub
>>> V  [libjvm.so+0xc2291c]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6ac
>>> V  [libjvm.so+0xd31970]  jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.140] [clone .constprop.263]+0x370
>>> V  [libjvm.so+0xd36202]  jni_CallStaticVoidMethod+0x222
>>> C  [libjli.so+0x4bed]  JavaMain+0xbcd
>>> C  [libjli.so+0x80a9]  ThreadJavaMain+0x9
>>>
>>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf0(J)V+0 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.amd64.DwarfParser.processDwarf(Lsun/jvm/hotspot/debugger/Address;)V+7 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.amd64.LinuxAMD64CFrame.getTopFrame(Lsun/jvm/hotspot/debugger/linux/LinuxDebugger;Lsun/jvm/hotspot/debugger/Address;Lsun/jvm/hotspot/debugger/ThreadContext;)Lsun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame;+38 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.debugger.linux.LinuxCDebugger.topFrameForThread(Lsun/jvm/hotspot/debugger/ThreadProxy;)Lsun/jvm/hotspot/debugger/cdbg/CFrame;+116 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;Lsun/jvm/hotspot/debugger/Debugger;)V+125 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run(Ljava/io/PrintStream;)V+11 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.PStack.run()V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.JStack.run()V+55 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.startInternal()V+87 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.start([Ljava/lang/String;)I+359 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.Tool.execute([Ljava/lang/String;)V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.tools.JStack.runWithArgs([Ljava/lang/String;)V+99 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher.runJSTACK([Ljava/lang/String;)V+58 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher$$Lambda$3.accept(Ljava/lang/Object;)V+4 jdk.hotspot.agent@15-internal
>>> j  sun.jvm.hotspot.SALauncher.main([Ljava/lang/String;)V+158 jdk.hotspot.agent@15-internal
>>> v  ~StubRoutines::call_stub
>>>
>>> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007fded5076b79
>>>
>>> Register to memory mapping:
>>>
>>> RAX=0x00007f7e4dfe3229 is an unknown value
>>> RBX=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
>>> RCX=0x00007fded4072380 points into unknown readable memory: 2f 75 73 72 2f 6c 69 62
>>> RDX=0x00007fded4076b85 points into unknown readable memory: 01 00 00
>>> RSP=0x00007fdf63b9d190 is pointing into the stack for thread: 0x00007fdf5c032000
>>> RBP=0x00007fdf63b9d1b0 is pointing into the stack for thread: 0x00007fdf5c032000
>>> RSI=0x0000000000000004 is an unknown value
>>> RDI=0x00007fdf5c4d7080 points into unknown readable memory: 80 23 07 d4 de 7f 00 00
>>> R8 =0x000000000146c380 points into unknown readable memory: 02 00 00 00 00 00 00 00
>>> R9 =0x00007fded4076b79 points into unknown readable memory: 7a 52 00 01 78 10 01
>>> R10=0x00000000ffffffff is an unknown value
>>> R11=0x000000000100527a is an unknown value
>>> R12=0x00007fded5076b79 is an unknown value
>>> R13=0x00007f7da2f8e68a is an unknown value
>>> R14=0x00007f7dbdf62b1d is an unknown value
>>> R15=0x00007fdf5c032000 is a thread
>>>
>>>
>>> Registers:
>>> RAX=0x00007f7e4dfe3229, RBX=0x00007fdf5c4d7080, RCX=0x00007fded4072380, RDX=0x00007fded4076b85
>>> RSP=0x00007fdf63b9d190, RBP=0x00007fdf63b9d1b0, RSI=0x0000000000000004, RDI=0x00007fdf5c4d7080
>>> R8 =0x000000000146c380, R9 =0x00007fded4076b79, R10=0x00000000ffffffff, R11=0x000000000100527a
>>> R12=0x00007fded5076b79, R13=0x00007f7da2f8e68a, R14=0x00007f7dbdf62b1d, R15=0x00007fdf5c032000
>>> RIP=0x00007fdf2000e87c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
>>>   TRAPNO=0x000000000000000e
>>>
>>> Top of Stack: (sp=0x00007fdf63b9d190)
>>> 0x00007fdf63b9d190:   00007fdf209d0980 0000000000000000
>>> 0x00007fdf63b9d1a0:   00007fdf209d0980 00007fdf63b9d258
>>> 0x00007fdf63b9d1b0:   00007fdf63b9d228 00007fdf44778dbe
>>> 0x00007fdf63b9d1c0:   000000000146c380 00007fdf5c032000
>>>
>>> Instructions: (pc=0x00007fdf2000e87c)
>>> 0x00007fdf2000e77c:   89 43 18 4d 85 f6 75 0f eb 2a 66 2e 0f 1f 84 00
>>> 0x00007fdf2000e78c:   00 00 00 00 48 89 c2 48 8d 42 01 48 89 43 08 80
>>> 0x00007fdf2000e79c:   78 ff 00 78 ef 48 8d 42 02 48 89 43 08 0f b6 42
>>> 0x00007fdf2000e7ac:   01 88 43 10 48 c7 43 28 00 00 00 00 4c 89 e1 48
>>> 0x00007fdf2000e7bc:   89 df 31 f6 48 b8 07 00 00 00 10 00 00 00 c6 43
>>> 0x00007fdf2000e7cc:   3c 00 48 c7 c2 ff ff ff ff 48 89 43 14 48 c7 43
>>> 0x00007fdf2000e7dc:   30 00 00 00 00 c7 43 38 00 00 00 00 e8 13 fb ff
>>> 0x00007fdf2000e7ec:   ff 4c 89 6b 08 48 83 c4 18 5b 41 5c 41 5d 41 5e
>>> 0x00007fdf2000e7fc:   41 5f 5d c3 83 e7 40 0f 84 63 ff ff ff 48 c7 c2
>>> 0x00007fdf2000e80c:   ff ff ff ff 48 d3 e2 49 09 d0 e9 51 ff ff ff 90
>>> 0x00007fdf2000e81c:   0f 1f 40 00 0f b6 47 10 83 e0 07 3c 02 74 0a 76
>>> 0x00007fdf2000e82c:   1b 3c 03 74 04 3c 04 75 17 48 8b 57 08 8b 02 48
>>> 0x00007fdf2000e83c:   83 c2 04 48 89 57 08 c3 0f 1f 40 00 84 c0 74 e9
>>> 0x00007fdf2000e84c:   31 c0 c3 90 55 41 ba ff ff ff ff 48 89 e5 41 56
>>> 0x00007fdf2000e85c:   41 55 49 89 f5 41 54 53 48 8b 07 48 89 fb 4c 8b
>>> 0x00007fdf2000e86c:   a0 28 11 00 00 eb 09 0f 1f 44 00 00 4c 89 63 08
>>> 0x00007fdf2000e87c:   41 8b 04 24 4d 8d 4c 24 04 4c 89 4b 08 4c 39 d0
>>> 0x00007fdf2000e88c:   75 0a 49 8b 44 24 04 4d 8d 4c 24 0c 45 8b 19 4d
>>> 0x00007fdf2000e89c:   8d 24 01 49 8d 41 04 48 89 43 08 45 85 db 74 cc
>>> 0x00007fdf2000e8ac:   48 89 df e8 8c f9 ff ff 48 8b 13 41 89 c6 4c 03
>>> 0x00007fdf2000e8bc:   b2 18 11 00 00 e8 5a ff ff ff 89 c0 4c 01 f0 4c
>>> 0x00007fdf2000e8cc:   39 e8 76 a8 4d 39 ee 77 a3 44 89 da 4c 89 ce e8
>>> 0x00007fdf2000e8dc:   90 fd ff ff 48 8b 43 08 31 c9 31 ff 48 83 c0 01
>>> 0x00007fdf2000e8ec:   0f 1f 40 00 48 89 43 08 0f b6 70 ff 49 89 c0 48
>>> 0x00007fdf2000e8fc:   83 c0 01 48 89 f2 83 e2 7f 48 d3 e2 83 c1 07 48
>>> 0x00007fdf2000e90c:   09 d7 40 84 f6 78 dd 4c 01 c7 4c 89 e1 4c 89 ea
>>> 0x00007fdf2000e91c:   4c 89 f6 48 89 7b 08 48 89 df e8 d5 f9 ff ff 5b
>>> 0x00007fdf2000e92c:   31 c0 41 5c 41 5d 41 5e 5d c3 66 2e 0f 1f 84 00
>>> 0x00007fdf2000e93c:   00 00 00 00 55 48 89 e5 41 54 53 48 81 ec d0 00
>>> 0x00007fdf2000e94c:   00 00 48 89 b5 48 ff ff ff 48 89 95 50 ff ff ff
>>> 0x00007fdf2000e95c:   48 89 8d 58 ff ff ff 4c 89 85 60 ff ff ff 4c 89
>>> 0x00007fdf2000e96c:   8d 68 ff ff ff 84 c0 74 23 0f 29 85 70 ff ff ff
>>>
>>>
>>> On 11/03/2020 3:52 pm, Yasumasa Suenaga wrote:
>>>> Hi Kevin,
>>>>
>>>> I saw 2 errors on submit repo (mach5-one-ysuenaga-JDK-8234624-5-20200311-0209-9358475).
>>>> So I tweaked my patch, but I saw the crash again (mach5-one-ysuenaga-JDK-8234624-5-20200311-0448-9361448).
>>>>
>>>> The last change on the submit repo is here:
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05-2/
>>>>
>>>> Can you share details on submit repo?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/11 11:07, Yasumasa Suenaga wrote:
>>>>> Hi Kevin,
>>>>>
>>>>> I guess the first program header in the libraries on your machine has the exec flag (you can check it with `readelf -l`).
>>>>> So I tweaked my patch in a new webrev: the initial values of exec_start and exec_end are now set to -1.
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.05/
>>>>>
>>>>> This webrev also contains the fixes for your comments (the typo and DW_CFA_advance_loc4).
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/11 8:53, Kevin Walls wrote:
>>>>>> Hi -
>>>>>>
>>>>>> In testing I wasn't seeing any of the Dwarf code triggered.
>>>>>>
>>>>>> With LIBSAPROC_DEBUG set I'm getting the "Could not find executable section in" for lots of / maybe all the libraries...
>>>>>>
>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c
>>>>>>
>>>>>>     if (fill_instr_info(newlib)) {
>>>>>>       if (!read_eh_frame(ph, newlib)) {
>>>>>>
>>>>>> fill_instr_info is failing, and we never get to read_eh_frame().
>>>>>>
>>>>>> output like:
>>>>>>
>>>>>> libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
>>>>>> libsaproc DEBUG: Could not find executable section in /lib/x86_64-linux-gnu/libnss_nis-2.27.so
>>>>>>
>>>>>> (similar for all libraries).
>>>>>>
>>>>>> fill_instr_info fails if:
>>>>>>
>>>>>>   if ((lib->exec_start == 0L) || (lib->exec_end == 0L))
>>>>>>
>>>>>> ...but isn't exec_start relative to the library address? It's the value of ph->vaddr and it is often zero.
>>>>>>
>>>>>> I added some booleans and did:
>>>>>>
>>>>>> 185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
>>>>>> 186         lib->exec_start = ph->p_vaddr;
>>>>>> 187         found_start = true;
>>>>>> 188       }
>>>>>>
>>>>>> (similarly for end) and only failed if:
>>>>>>
>>>>>> 201   if (!found_start || !found_end) {
>>>>>> 202     return false;
>>>>>>
>>>>>> ...and now it's better.  I go from:
>>>>>>
>>>>>> ----------------- 3306 -----------------
>>>>>> 0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>>>
>>>>>> to:
>>>>>>
>>>>>> ----------------- 31127 -----------------
>>>>>> 0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
>>>>>> 0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
>>>>>> 0x00007fa2857a74ed      ContinueInNewThread + 0x4d
>>>>>> 0x00007fa2857a8c49      JLI_Launch + 0x1529
>>>>>> 0x000055af1b78db1c      main + 0x11c
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Kevin
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/03/2020 12:36, Yasumasa Suenaga wrote:
>>>>>>
>>>>>>> Hi Kevin,
>>>>>>>
>>>>>>> Thanks for your comment!
>>>>>>>
>>>>>>> On 2020/03/10 18:58, Kevin Walls wrote:
>>>>>>>> Hi Yasumasa ,
>>>>>>>>
>>>>>>>> The changes build OK for me in the latest jdk, and things still work.
>>>>>>>> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
>>>>>>>>
>>>>>>>> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.
>>>>>>>
>>>>>>> You can see the problem with JShell.
>>>>>>> Some Java frames would not be seen in mixed jstack.
>>>>>>>
>>>>>>>
>>>>>>>> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
>>>>>>>>
>>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>>>>>>>
>>>>>>>> DW_CFA_nop - shouldn't this continue instead of return?
>>>>>>>> (It may "never" happen, but a nop could appear within some other instructions?)
>>>>>>>
>>>>>>> DW_CFA_nop is used for padding, so we can ignore (return immediately) it.
>>>>>>>
>>>>>>>
>>>>>>>> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".
>>>>>>>
>>>>>>> I will fix it.
>>>>>>>
>>>>>>>
>>>>>>>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4.  I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts; that would mean a lot of code on one line. 8-)
>>>>>>>> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness.
>>>>>>>
>>>>>>> I will add DW_CFA_advance_loc4.
>>>>>>>
>>>>>>>
>>>>>>>> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>>>>>>>>
>>>>>>>> DwarfParser::process_dwarf() moves _buf.
>>>>>>>> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
>>>>>>>> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>>>>>>>>
>>>>>>>> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.
>>>>>>>
>>>>>>> I looked at the GDB and binutils sources when creating this patch.
>>>>>>> They seem to use similar code, because we need to evaluate the DWARF instructions one by one to get the value that relates to the specified PC.
>>>>>>>
>>>>>>>
>>>>>>>> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
>>>>>>>> OS-neutral directory?  It's odd to label it as Linux-specific.
>>>>>>>
>>>>>>> Windows, at least, does not use DWARF; it uses another mechanism.
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
>>>>>>> I'm not sure whether other platforms (Solaris, macOS) use DWARF.
>>>>>>> If DWARF is used on them, I can move the DWARF-related code to a posix directory.
>>>>>>>
>>>>>>>
>>>>>>>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>>>>>>>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>>>>>>>
>>>>>>>>
>>>>>>>> These are just comments to mainly say it looks good, and somebody else out there has read it.
>>>>>>>> I will look for a system that shows the problem, and get back to you again!
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>> Many thanks
>>>>>>>> Kevin
>>>>>>>>
>>>>>>>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year).
>>>>>>>>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk.
>>>>>>>>> Could you review it?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>>>>>>>
>>>>>>>>> I need one more reviewer to push.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>
>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>
>>>>>>>>>> This change has been already reviewed by Serguei.
>>>>>>>>>> I need one more reviewer to push.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>>>>>>>> PING: Could you review this change?
>>>>>>>>>>>
>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>>
>>>>>>>>>>> I believe this change helps troubleshooters with postmortem analysis.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>>>>>>>> PING: Could you review it?
>>>>>>>>>>>>
>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>>>>>>>
>>>>>>>>>>>> I updated the webrev. I discussed it with Serguei off-list and refactored webrev.02.
>>>>>>>>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for your comment!
>>>>>>>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>>>>>>>>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry said.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>>>>>>>
>>>>>>>>>>>>> This change has passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is a nice move in general.
>>>>>>>>>>>>>> Thank you for working on this!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  96 long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>>  97 if (libptr == 0L) { // Java frame
>>>>>>>>>>>>>>  98   Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>>>  99   if (rbp == null) {
>>>>>>>>>>>>>> 100     return null;
>>>>>>>>>>>>>> 101   }
>>>>>>>>>>>>>> 102   return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>>> 103 } else { // Native frame
>>>>>>>>>>>>>> 104   DwarfParser dwarf;
>>>>>>>>>>>>>> 105   try {
>>>>>>>>>>>>>> 106     dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>> 107   } catch (DebuggerException e) {
>>>>>>>>>>>>>> 108     Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>>>>> 109     if (rbp == null) {
>>>>>>>>>>>>>> 110       return null;
>>>>>>>>>>>>>> 111     }
>>>>>>>>>>>>>> 112     return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>>>>>>>>>> 113   }
>>>>>>>>>>>>>> 114   dwarf.processDwarf(pc);
>>>>>>>>>>>>>> 115   Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>>> 116                  !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>>> 117       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>>> 118       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>>> 119                .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>>> 120   if (cfa == null) {
>>>>>>>>>>>>>> 121     return null;
>>>>>>>>>>>>>> 122   }
>>>>>>>>>>>>>> 123   return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>>> 124 }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'd suggest simplifying the logic by refactoring to something like the below:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>>>>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>>>>>>>>>     DwarfParser dwarf = null;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>>       try {
>>>>>>>>>>>>>>         dwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>         dwarf.processDwarf(pc);
>>>>>>>>>>>>>>         // Assign to the cfa declared above rather than redeclaring it
>>>>>>>>>>>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>>>>>>>>>                !dwarf.isBPOffsetAvailable())
>>>>>>>>>>>>>>             ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>>>>>             : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>>>>>                      .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     if (cfa == null) {
>>>>>>>>>>>>>>       return null;
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    Extra space after '-' sign.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    It feels like the logic should be refactored/simplified somehow, as
>>>>>>>>>>>>>>    several typical fragments appear in slightly different contexts.
>>>>>>>>>>>>>>    But it is not easy to understand what it does.
>>>>>>>>>>>>>>    Could you please add some comments at key places explaining this logic?
>>>>>>>>>>>>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 109 private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>>> 110   Address nextCFA;
>>>>>>>>>>>>>> 111   Address nextPC;
>>>>>>>>>>>>>> 112
>>>>>>>>>>>>>> 113   nextPC = getNextPC(false);
>>>>>>>>>>>>>> 114   if (nextPC == null) {
>>>>>>>>>>>>>> 115     return null;
>>>>>>>>>>>>>> 116   }
>>>>>>>>>>>>>> 117
>>>>>>>>>>>>>> 118   DwarfParser nextDwarf = null;
>>>>>>>>>>>>>> 119   long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>> 120   if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>> 121     try {
>>>>>>>>>>>>>> 122       nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>> 123     } catch (DebuggerException e) {
>>>>>>>>>>>>>> 124       nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>>> 125       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>>> 126     }
>>>>>>>>>>>>>> 127     nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>> 128   }
>>>>>>>>>>>>>> 129
>>>>>>>>>>>>>> 130   nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>> 131   return (nextCFA == null) ? null
>>>>>>>>>>>>>> 132                            : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>> 133 }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The above can be simplified if a DebuggerException cannot be thrown from processDwarf(nextPC):
>>>>>>>>>>>>>>       private CFrame javaSender(ThreadContext context) {
>>>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         if (libptr != 0L) { // Native frame
>>>>>>>>>>>>>>           try {
>>>>>>>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>>           }
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 135 public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>>> 136   ThreadContext context = thread.getContext();
>>>>>>>>>>>>>> 137
>>>>>>>>>>>>>> 138   if (dwarf == null) { // Java frame
>>>>>>>>>>>>>> 139     return javaSender(context);
>>>>>>>>>>>>>> 140   }
>>>>>>>>>>>>>> 141
>>>>>>>>>>>>>> 142   Address nextPC = getNextPC(true);
>>>>>>>>>>>>>> 143   if (nextPC == null) {
>>>>>>>>>>>>>> 144     return null;
>>>>>>>>>>>>>> 145   }
>>>>>>>>>>>>>> 146
>>>>>>>>>>>>>> 147   Address nextCFA;
>>>>>>>>>>>>>> 148   DwarfParser nextDwarf = dwarf;
>>>>>>>>>>>>>> 149   if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>> 150     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>> 151     if (libptr == 0L) {
>>>>>>>>>>>>>> 152       // Next frame might be Java frame
>>>>>>>>>>>>>> 153       nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>>> 154       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>>> 155     }
>>>>>>>>>>>>>> 156     try {
>>>>>>>>>>>>>> 157       nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>> 158     } catch (DebuggerException e) {
>>>>>>>>>>>>>> 159       nextCFA = getNextCFA(null, context);
>>>>>>>>>>>>>> 160       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>>>>>>>>>> 161     }
>>>>>>>>>>>>>> 162   }
>>>>>>>>>>>>>> 163
>>>>>>>>>>>>>> 164   nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>> 165   nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>> 166   return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>> 167 }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This one can also be simplified a little:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       public CFrame sender(ThreadProxy thread) {
>>>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         if (dwarf == null) { // Java frame
>>>>>>>>>>>>>>           return javaSender(context);
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         Address nextPC = getNextPC(true);
>>>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>>>         if (!dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>>>             try {
>>>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>           }
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>>>>>>>         if (nextPC == null) {
>>>>>>>>>>>>>>           return null;
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>>>>>>>           // libptr is declared once, inside the only block that uses it
>>>>>>>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>>>>>           if (libptr != 0L) {
>>>>>>>>>>>>>>             try {
>>>>>>>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>           }
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>>>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and
>>>>>>>>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>>>>>>>> Could you review new webrev?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>>>>>>>>> for stack unwinding.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>>>>>>>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> However, `jhsdb jstack --mixed` does not do so; it uses the base pointer register (RBP).
>>>>>>>>>>>>>>>> So some stack frames might be missing from its output.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
> 

From suenaga at oss.nttdata.com  Thu Mar 12 01:32:58 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 12 Mar 2020 10:32:58 +0900
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
 <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>
 <CAE_05uzNAU0f8T8Gh9=pxd5U18xTDsWbxKkmbj5iSf2AhMWXYQ@mail.gmail.com>
 <CAE_05uxC4ADo-qQaOaSAFHfUZ93uufVHZu639vvvSfeJWeNy=A@mail.gmail.com>
 <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>
 <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com>
Message-ID: <e99ae9d3-d214-bac6-a12f-85619b7679ea@oss.nttdata.com>

Hi,

AFAICS the failing tests listed in JBS (JDK-8240881) seem to be caused by HotSpotVirtualMachine::getSystemProperties.
It loads the result of PrintSystemPropertiesDCmd via Properties::load, so the output has to comply with the Properties format.
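
To illustrate why the escaping matters, here is a minimal sketch (not the code from the webrev; the class PropertiesEscapeDemo and its store/load helpers are made-up names for illustration only). It compares a value written by Properties::store with the same value written raw, using the path from Ralf's example:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class, not part of the JDK or of the patch under review.
public class PropertiesEscapeDemo {

    /** Writes one key/value pair in the escaped format produced by Properties.store. */
    public static String store(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Parses the text with Properties.load and returns the value stored under key. */
    public static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String path = "C:\\test\\new"; // the characters C:\test\new

        // Escaped output: backslashes and ':' are escaped, so load() restores the path.
        String escaped = store("user.dir", path);
        System.out.println("escaped round trip ok: " + path.equals(load(escaped, "user.dir")));

        // Raw output: load() treats \t and \n as escape sequences, so the path changes.
        String raw = "user.dir=" + path + "\n";
        System.out.println("raw round trip ok: " + path.equals(load(raw, "user.dir")));
    }
}
```

With the raw form, Properties::load reads the `\t` and `\n` in the path as a tab and a newline, so the loaded value no longer matches the original path.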

OTOH this is not described in the spec (neither in JDK-7120511, which introduced VM.system_properties, nor in the help message).

Thus I think we need to add a new option to VM.system_properties for showing raw values (e.g. -raw).
If we do so, we need a CSR.

What do you think?


Yasumasa


On 2020/03/12 4:52, serguei.spitsyn at oracle.com wrote:
> Hi Chihiro,
> 
> I've tested and pushed your fix, but the impact of the fix was underestimated.
> The fix caused several regressions and the following bug was filed:
>    https://bugs.openjdk.java.net/browse/JDK-8240881
> 
> Now I'm working on backing out the fix for JDK-8222489 with an anti-delta.
> You can find and review my RFR posted on the serviceability-dev mailing list:
>    RFR: 8240881: several tests are failing due to encoding failures
> 
> You can file another bug as a replacement of JDK-8222489.
> I will help you with the information about test regressions caused by it.
> 
> Thanks,
> Serguei
> 
> 
> On 3/10/20 02:54, serguei.spitsyn at oracle.com wrote:
>> Hi Chihiro,
>>
>> Yes, I'll sponsor it.
>> Thank you for the update.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 3/8/20 06:05, Chihiro Ito wrote:
>>> Hi,
>>>
>>> I'm sorry. I included "JDK-" in the changeset title. I removed it and
>>> updated it.
>>>
>>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>>
>>> Regards,
>>> Chihiro
>>>
>>> On Sat, Mar 7, 2020 at 23:13, Chihiro Ito <chiroito107 at gmail.com> wrote:
>>>> Hi Serguei and Yasumasa,
>>>>
>>>> I updated the copyright year and created the change set.
>>>>
>>>> Could you sponsor this, please?
>>>>
>>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
>>>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
>>>>
>>>> Regards,
>>>> Chihiro
>>>>
>>>>
>>>> On Sat, Mar 7, 2020 at 16:03, Yasumasa Suenaga <suenaga at oss.nttdata.com> wrote:
>>>>
>>>>
>>>>> Hi Chihiro,
>>>>>
>>>>> I'm also ok with webrev.05 after updating copyright year.
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Chihiro,
>>>>>>
>>>>>> I'm okay with the fix.
>>>>>> Could you please update the copyright date in src/java.base/share/classes/jdk/internal/vm/VMSupport.java before the push?
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 3/6/20 07:24, Chihiro Ito wrote:
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> Could you review this again, please?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Chihiro
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 27, 2020 at 22:11, Chihiro Ito <chiroito107 at gmail.com> wrote:
>>>>>>>> Hi Ralf,
>>>>>>>>
>>>>>>>> Thank you for your advice.
>>>>>>>>
>>>>>>>> 1.
>>>>>>>> The comment on serializePropertiesToByteArray in VMSupport says "The stream written to the byte array is ISO 8859-1 encoded.".
>>>>>>>> But the previous implementation does not honor this. I think we need to encode the output as ISO 8859-1.
>>>>>>>>
>>>>>>>> 2.
>>>>>>>> According to the help, the feature of VM.system_properties is just "Print system properties". Users should not use this output for loading. They use it when they want a quick look at the system properties.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Chihiro
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 26, 2020 at 18:53, Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
>>>>>>>>> Hi Chihiro,
>>>>>>>>>
>>>>>>>>> I have two remarks:
>>>>>>>>>
>>>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create an ISO Latin 1 string, it really only creates printable ASCII characters (apart from the comment, but that is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then sent to jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only use ASCII characters. But an ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test.
>>>>>>>>>
>>>>>>>>> 2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename c:\test\new. Formerly it would be encoded as:
>>>>>>>>> C\:\\test\\new
>>>>>>>>> And now it is:
>>>>>>>>> C:\test\new
>>>>>>>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Ralf
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net> On Behalf Of Chihiro Ito
>>>>>>>>> Sent: Tuesday, February 25, 2020 04:45
>>>>>>>>> To: serguei.spitsyn at oracle.com
>>>>>>>>> Cc: serviceability-dev at openjdk.java.net
>>>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
>>>>>>>>>
>>>>>>>>> Hi Serguei,
>>>>>>>>>
>>>>>>>>> Thanks for your review and advice.
>>>>>>>>>
>>>>>>>>> I modified these.
>>>>>>>>> Could you review this again, please?
>>>>>>>>>
>>>>>>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Chihiro
>>>>>>>>>
>>
> 

From chiroito107 at gmail.com  Thu Mar 12 03:15:40 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Thu, 12 Mar 2020 12:15:40 +0900
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <e99ae9d3-d214-bac6-a12f-85619b7679ea@oss.nttdata.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
 <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>
 <CAE_05uzNAU0f8T8Gh9=pxd5U18xTDsWbxKkmbj5iSf2AhMWXYQ@mail.gmail.com>
 <CAE_05uxC4ADo-qQaOaSAFHfUZ93uufVHZu639vvvSfeJWeNy=A@mail.gmail.com>
 <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>
 <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com>
 <e99ae9d3-d214-bac6-a12f-85619b7679ea@oss.nttdata.com>
Message-ID: <CAE_05uxC2BkvEEw+8+yxcdyUYDbJ+UHH7zt7stTo0Os+XLsC5g@mail.gmail.com>

Hi Serguei,

I could not reproduce these test failures in my environment.
I would like to see more detailed information about the bug, especially
the environment variables.
Could you share the test results with me, please?

Regards,
Chihiro

On Thu, Mar 12, 2020 at 10:32, Yasumasa Suenaga <suenaga at oss.nttdata.com> wrote:
>
> Hi,
>
> AFAICS the failing tests listed in JBS (JDK-8240881) seem to be caused by HotSpotVirtualMachine::getSystemProperties.
> It loads the result of PrintSystemPropertiesDCmd via Properties::load, so the output has to comply with the Properties format.
>
> OTOH this is not described in the spec (neither in JDK-7120511, which introduced VM.system_properties, nor in the help message).
>
> Thus I think we need to add a new option to VM.system_properties for showing raw values (e.g. -raw).
> If we do so, we need a CSR.
>
> What do you think?
>
>
> Yasumasa
>
>
> On 2020/03/12 4:52, serguei.spitsyn at oracle.com wrote:
> > Hi Chihiro,
> >
> > I've tested and pushed your fix, but the impact of the fix was underestimated.
> > The fix caused several regressions and the following bug was filed:
> >    https://bugs.openjdk.java.net/browse/JDK-8240881
> >
> > Now I'm working on backing out the fix for JDK-8222489 with an anti-delta.
> > You can find and review my RFR posted on the serviceability-dev mailing list:
> >    RFR: 8240881: several tests are failing due to encoding failures
> >
> > You can file another bug as a replacement of JDK-8222489.
> > I will help you with the information about test regressions caused by it.
> >
> > Thanks,
> > Serguei
> >
> >
> > On 3/10/20 02:54, serguei.spitsyn at oracle.com wrote:
> >> Hi Chihiro,
> >>
> >> Yes, I'll sponsor it.
> >> Thank you for the update.
> >>
> >> Thanks,
> >> Serguei
> >>
> >>
> >> On 3/8/20 06:05, Chihiro Ito wrote:
> >>> Hi,
> >>>
> >>> I'm sorry. I included "JDK-" in the changeset title. I removed it and
> >>> updated it.
> >>>
> >>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
> >>>
> >>> Regards,
> >>> Chihiro
> >>>
> >>> On Sat, 7 Mar 2020 at 23:13, Chihiro Ito <chiroito107 at gmail.com> wrote:
> >>>> Hi Serguei and Yasumasa,
> >>>>
> >>>> I updated the copyright year and created the change set.
> >>>>
> >>>> Could you sponsor this, please?
> >>>>
> >>>> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/
> >>>> Change set : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.06/changeset
> >>>>
> >>>> Regards,
> >>>> Chihiro
> >>>>
> >>>>
> >>>> On Sat, 7 Mar 2020 at 16:03, Yasumasa Suenaga <suenaga at oss.nttdata.com> wrote:
> >>>>
> >>>>
> >>>>> Hi Chihiro,
> >>>>>
> >>>>> I'm also ok with webrev.05 after updating copyright year.
> >>>>>
> >>>>>
> >>>>> Yasumasa
> >>>>>
> >>>>>
> >>>>> On 2020/03/07 3:32, serguei.spitsyn at oracle.com wrote:
> >>>>>> Hi Chihiro,
> >>>>>>
> >>>>>> I'm okay with the fix.
> >>>>>> Could you please update the copyright date in src/java.base/share/classes/jdk/internal/vm/VMSupport.java before the push?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Serguei
> >>>>>>
> >>>>>>
> >>>>>> On 3/6/20 07:24, Chihiro Ito wrote:
> >>>>>>> Hi Serguei,
> >>>>>>>
> >>>>>>> Could you review this again, please?
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Chihiro
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, 27 Feb 2020 at 22:11, Chihiro Ito <chiroito107 at gmail.com> wrote:
> >>>>>>>> Hi Ralf,
> >>>>>>>>
> >>>>>>>> Thank you for your advice.
> >>>>>>>>
> >>>>>>>> 1.
> >>>>>>>> The comment on serializePropertiesToByteArray in VMSupport says "The stream written to the byte array is ISO 8859-1 encoded.".
> >>>>>>>> But the previous implementation does not honor this. I think we need to implement the encoding as ISO 8859-1.
> >>>>>>>>
> >>>>>>>> 2.
> >>>>>>>> According to the help text, the feature of VM.system_properties is just "Print system properties". Users should not use this output for loading. They use it when they want to take a quick look at the system properties.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Chihiro
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, 26 Feb 2020 at 18:53, Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
> >>>>>>>>> Hi Chihiro,
> >>>>>>>>>
> >>>>>>>>> I have two remarks:
> >>>>>>>>>
> >>>>>>>>> 1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create an ISO Latin 1 string, it really only creates printable ASCII characters (apart from the comment, but that is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then sent to jcmd. And jcmd expects UTF-8 encoded strings, which is OK as long as we only use ASCII characters. But an ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test.
> >>>>>>>>>
> >>>>>>>>> 2. Your change makes it impossible to load the output with Properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename C:\test\new. Formerly it would be encoded as:
> >>>>>>>>> C\:\\test\\new
> >>>>>>>>> And now it is:
> >>>>>>>>> C:\test\new
> >>>>>>>>> But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters.
> >>>>>>>>>
> >>>>>>>>> Best regards,
> >>>>>>>>> Ralf
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> From: serviceability-dev<serviceability-dev-bounces at openjdk.java.net> On Behalf Of Chihiro Ito
> >>>>>>>>> Sent: Tuesday, 25 February 2020 04:45
> >>>>>>>>> To: serguei.spitsyn at oracle.com
> >>>>>>>>> Cc: serviceability-dev at openjdk.java.net
> >>>>>>>>> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
> >>>>>>>>>
> >>>>>>>>> Hi Serguei,
> >>>>>>>>>
> >>>>>>>>> Thanks for your review and advice.
> >>>>>>>>>
> >>>>>>>>> I modified these.
> >>>>>>>>> Could you review this again, please?
> >>>>>>>>>
> >>>>>>>>> Webrev :http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Chihiro
> >>>>>>>>>
> >>
> >

From serguei.spitsyn at oracle.com  Thu Mar 12 06:20:08 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Mar 2020 23:20:08 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
Message-ID: <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200311/0343ab0b/attachment-0001.htm>

From chris.plummer at oracle.com  Thu Mar 12 07:03:57 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Mar 2020 00:03:57 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
Message-ID: <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200312/0a466f20/attachment.htm>

From serguei.spitsyn at oracle.com  Thu Mar 12 07:06:39 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 12 Mar 2020 00:06:39 -0700 (PDT)
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
Message-ID: <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200312/d3690c99/attachment-0001.htm>

From egor.ushakov at jetbrains.com  Thu Mar 12 10:12:49 2020
From: egor.ushakov at jetbrains.com (Egor Ushakov)
Date: Thu, 12 Mar 2020 13:12:49 +0300
Subject: invokeMethod's result gced immediately
Message-ID: <9a3aeb36-e859-5ca5-27b2-4781b0cbf137@jetbrains.com>

Hi all,

it seems that the result of the invokeMethod could be gced immediately, 
which is quite strange.
Currently we have to do:
invoke + disableCollection
new(Array)Instance + disableCollection
(String)mirrorOf + disableCollection
in a loop until succeeded, to allow something like foo().boo().zoo() to 
evaluate successfully.
Is there a way to automatically disable collection for newly created 
objects from jdi?
Maybe there's a bug about this?
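
The workaround described above (create, then disableCollection, repeated
until it succeeds) can be sketched roughly as follows. This is only an
illustration, not code from any debugger: the class PinnedReferences and the
method createPinned are hypothetical names, and the sketch assumes that
ObjectReference.disableCollection() reports an already collected object by
throwing ObjectCollectedException.

```java
import com.sun.jdi.ObjectCollectedException;
import com.sun.jdi.ObjectReference;
import java.util.concurrent.Callable;

/** Hypothetical helper: retries "create + disableCollection" until the mirror is pinned. */
public final class PinnedReferences {
    private PinnedReferences() {}

    /**
     * Calls the factory (e.g. an invokeMethod or mirrorOf call), then tries to
     * pin the result. If the target VM collected the object in between, the
     * whole step is repeated, up to maxAttempts times.
     */
    public static ObjectReference createPinned(Callable<ObjectReference> factory,
                                               int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            ObjectReference ref;
            try {
                ref = factory.call();
            } catch (RuntimeException e) {
                throw e;
            } catch (Exception e) {
                // Checked JDI exceptions (e.g. from invokeMethod) are wrapped.
                throw new IllegalStateException("creating the mirror failed", e);
            }
            if (ref == null) {
                return null; // null or void result: nothing to pin
            }
            try {
                ref.disableCollection();
                return ref; // pinned; caller should call enableCollection() when done
            } catch (ObjectCollectedException collected) {
                // Assumption: the object was collected before it could be pinned.
                // Fall through and create it again.
            }
        }
        throw new IllegalStateException(
                "object was collected " + maxAttempts + " times in a row");
    }
}
```

A call could look like
createPinned(() -> (ObjectReference) receiver.invokeMethod(thread, method, args, 0), 5),
where receiver, thread, method and args come from the debugger session. Note
that repeating the factory also repeats any side effects of the invoked method.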

Thanks!

-- 
Egor Ushakov
Software Developer
JetBrains
http://www.jetbrains.com
The Drive to Develop


From chiroito107 at gmail.com  Thu Mar 12 14:32:17 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Thu, 12 Mar 2020 23:32:17 +0900
Subject: PING: Re: RFR: JDK-8222489: jcmd VM.system_properties gives
 unusable paths on Windows
In-Reply-To: <CAE_05uxC2BkvEEw+8+yxcdyUYDbJ+UHH7zt7stTo0Os+XLsC5g@mail.gmail.com>
References: <CAE_05uymp-AR6x5MLOZp7ATiOb4+O3Ev2bvscH2nmvHs79LAeA@mail.gmail.com>
 <CAE_05uwi8trdTc6W3dVepNshy3VbOPQGJd3gW6gkc9hYoXBTrg@mail.gmail.com>
 <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
 <CAE_05uxvGwtgsM4NSH5ipC_gOT+f_fjhdQ+xpHoX7DSp7yntEA@mail.gmail.com>
 <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
 <CAE_05uzT8PedhscdgqOQ_0Ek3uKB2mHdZSGJLOH4WyE02Jv+qg@mail.gmail.com>
 <AM0PR02MB450007C62B0392584C378ECF9FEA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAE_05uwttqkP06KStTEdQY_bbNTra4OHpQfszEcg5q_gbCxyrQ@mail.gmail.com>
 <CAE_05uz93m96Ez+zRv8h2j6DyDW9xPeanm1wNCo9r+kG+BoOug@mail.gmail.com>
 <31599c2a-6947-44a0-0f91-cb240ddaf1ed@oracle.com>
 <d7637b4f-5c9a-436d-f11b-468cc43aa8d3@oss.nttdata.com>
 <CAE_05uzNAU0f8T8Gh9=pxd5U18xTDsWbxKkmbj5iSf2AhMWXYQ@mail.gmail.com>
 <CAE_05uxC4ADo-qQaOaSAFHfUZ93uufVHZu639vvvSfeJWeNy=A@mail.gmail.com>
 <3e535ec6-573d-901e-e4c0-4e2174c80f15@oracle.com>
 <4d29cbef-aa4e-31bc-8562-16930eeed219@oracle.com>
 <e99ae9d3-d214-bac6-a12f-85619b7679ea@oss.nttdata.com>
 <CAE_05uxC2BkvEEw+8+yxcdyUYDbJ+UHH7zt7stTo0Os+XLsC5g@mail.gmail.com>
Message-ID: <CAE_05uzqkJKkN3bV9QdGvt0nNByc3SVGyxnK+ooMHUw6fCgsBg@mail.gmail.com>

Hi,

I agree with Yasumasa's idea. The current implementation seems to be
required for the JVM to read and write the properties. According to the
test results, it is difficult to change this. However, as noted in JBS,
we also need human-readable output.
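
The trade-off can be illustrated with a small stand-alone sketch. It is not
the jcmd or VMSupport code: the class PropertiesEscapingDemo and its helper
methods are made up for this example, and only java.util.Properties is used.
It shows that the escaped form written by Properties.store() survives
Properties.load(), while the raw, human-readable form does not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Hypothetical demo: escaped vs. raw output of a Windows path property. */
public class PropertiesEscapingDemo {

    /** Serializes a single property with Properties.store(), i.e. in escaped form. */
    public static String storeEscaped(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Parses properties-file text with Properties.load() and returns one value. */
    public static String loadValue(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String path = "C:\\test\\new"; // the value C:\test\new

        // Escaped form: contains the line path=C\:\\test\\new and loads back unchanged.
        String escaped = storeEscaped("path", path);
        System.out.print(escaped);
        System.out.println("escaped round trip ok: " + path.equals(loadValue(escaped, "path")));

        // Raw form: readable, but load() treats \t and \n as tab and newline.
        String raw = "path=" + path + "\n";
        System.out.println("raw round trip ok: " + path.equals(loadValue(raw, "path")));
    }
}
```

This is why a separate option for raw output would leave the default,
loadable format untouched.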

Regards,
Chihiro


From martin.doerr at sap.com  Thu Mar 12 16:28:29 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 12 Mar 2020 16:28:29 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>

Hi Richard,


I managed to find time for an (almost) complete review of webrev.4. (I'll review the tests separately.)

First of all, the change seems to be of pretty good quality given its significant complexity. I couldn't find any real bugs. But I'd like to propose minor improvements.
I'm convinced that it's mature because we did substantial testing.

I like the new functionality for object deoptimization. It can possibly be reused for future escape analysis based optimizations. So I appreciate having it available in the code base.
In addition to that, your change makes the JVMTI implementation better integrated into the VM.


Now to the details:


src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters. Ok.


src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability. Ok.


src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!


src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.


src/hotspot/share/code/nmethod.cpp
Nice cleanup!


src/hotspot/share/code/pcDesc.hpp
Additional parameters. Ok.


src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters. Ok.


src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.


src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters. Ok.


src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMCI capabilities. Nice!


src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes. Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest factoring out something like has_not_global_escape and has_arg_escape. So the code could look like this:
      SafePointNode* sfn = sfn_worklist.at(next);
      sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
      if (sfn->is_CallJava()) {
        CallJavaNode* call = sfn->as_CallJava();
        call->set_arg_escape(has_arg_escape(call));
      }
This would also allow us to get rid of the found_..._escape_in_args variables, making the loops more readable.

It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.


src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations. Ok.


src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters. Ok.


src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!


src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread. Good.


src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.

VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.


src/hotspot/share/prims/jvmtiTagMap.cpp
Escape barriers + deoptimize objects for all threads. Ok.


src/hotspot/share/prims/whitebox.cpp
Added WB_IsFrameDeoptimized to API. Ok.


src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals, here.
First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
Comments are sufficient to understand why things are done as they are implemented.

BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
Anyway, looks correct, too.

Typo in comment: "regularily" => "regularly"

Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or such kind of issues.

EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().

You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.

I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.

Typo in comment: "we must only deoptimize" => "we only have to deoptimize"

"bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.

I'll get back to suspend flags, later.

There are weird cases regarding _self_deoptimization_in_progress.
Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.

I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.

Change in thread_added:
I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
Your version makes new threads come up with the suspend flag set. That looks correct, too. The advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
For now, I'm ok with your version.

I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).

Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
Maybe adding suffixes would help a little bit, but I can also live with what you have.
Implementation looks correct to me.


src/hotspot/share/runtime/deoptimization.hpp
Escape barriers and object deoptimization functions.
Typo in comment: "helt" => "held"


src/hotspot/share/runtime/globals.hpp
Addition of develop flag DeoptimizeObjectsALotInterval. Ok.


src/hotspot/share/runtime/interfaceSupport.cpp
InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.


src/hotspot/share/runtime/interfaceSupport.inline.hpp
Addition of deoptimizeAllObjects. Ok.


src/hotspot/share/runtime/mutexLocker.cpp
src/hotspot/share/runtime/mutexLocker.hpp
Addition of EscapeBarrier_lock. Ok.


src/hotspot/share/runtime/objectMonitor.cpp
Make recursion count relock aware. Ok.


src/hotspot/share/runtime/stackValue.hpp
Better reinitialization in StackValue. Good.


src/hotspot/share/runtime/thread.cpp
src/hotspot/share/runtime/thread.hpp
src/hotspot/share/runtime/thread.inline.hpp
wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.

In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as a temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.

You can use MutexLocker with Thread*.

JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.


src/hotspot/share/runtime/vframe.cpp
Added support for entry frame to new_vframe. Ok.


src/hotspot/share/runtime/vframe_hp.cpp
src/hotspot/share/runtime/vframe_hp.hpp

I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).

jvmtiDeferredLocalVariableSet::update_monitors:
Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.


src/hotspot/share/utilities/macros.hpp
Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.


test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
New test. Will review separately.


test/jdk/TEST.ROOT
Addition of vm.jvmci as required property. Ok.


test/jdk/com/sun/jdi/EATests.java
test/jdk/com/sun/jdi/EATestsJVMCI.java
New test. Will review separately.


test/lib/sun/hotspot/WhiteBox.java
Added isFrameDeoptimized to API. Ok.


That was it. Best regards,
Martin


> -----Original Message-----
> From: hotspot-compiler-dev <hotspot-compiler-dev-
> bounces at openjdk.java.net> On Behalf Of Reingruber, Richard
> Sent: Tuesday, 3 March 2020 21:23
> To: 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better
> Performance in the Presence of JVMTI Agents
> 
> Hi Robbin,
> 
> > > I understand that Robbin proposed to replace the usage of
> > > _suspend_flag with handshakes. Apparently, async handshakes
> > > are needed to do so. We have been waiting a while for removal
> > > of the _suspend_flag / introduction of async handshakes [2].
> > > What is the status here?
> 
> > I have an old prototype which I would like to continue to work on.
> > So do not assume asynch handshakes will make 15.
> > Even if it would, I think there is a lot more investigative work to remove
> > _suspend_flag.
> 
> Let us know, if we can be of any help to you and be it only testing.
> 
> > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> 
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Will do.
> 
> > Does JvmtiDeferredUpdates really need to be in thread.hpp? Can't it be in its own
> > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> You are right. It shouldn't be declared in thread.hpp. I will look into that.
> 
> > Note that we also think we may have a bug in deopt:
> > https://bugs.openjdk.java.net/browse/JDK-8238237
> 
> > I think it would be best, if possible, to push after that is resolved.
> 
> Sure.
> 
> > Not even nearly a full review :)
> 
> I know :)
> 
> Anyways, thanks a lot,
> Richard.
> 
> 
> -----Original Message-----
> From: Robbin Ehn <robbin.ehn at oracle.com>
> Sent: Monday, March 2, 2020 11:17 AM
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard
> <richard.reingruber at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi,
> 
> On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> > Hi,
> >
> > I had a look at the progress of this change. Nothing
> > happened since Richard posted his update using more
> > handshakes [1].
> > But we (SAP) would appreciate a lot if this change could
> > be successfully reviewed and pushed.
> >
> > I think there is basic understanding that this
> > change is helpful. It fixes a number of issues with JVMTI,
> > and will deliver the same performance benefits as EA
> > does in current production mode for debugging scenarios.
> >
> > This is important for us as we run our VMs prepared
> > for debugging in production mode.
> >
> > I understand that Robbin proposed to replace the usage of
> > _suspend_flag with handshakes. Apparently, async handshakes
> > are needed to do so. We have been waiting a while for removal
> > of the _suspend_flag / introduction of async handshakes [2].
> > What is the status here?
> 
> I have an old prototype which I would like to continue to work on.
> So do not assume asynch handshakes will make 15.
> Even if it would, I think there is a lot more investigative work to remove
> _suspend_flag.
> 
> >
> > I think we should no longer wait, but proceed with
> > this change. We will look into removing the usage of
> > suspend_flag introduced here once it is possible to implement
> > it with handshakes.
> 
> Yes, sure.
> 
> >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> 
> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Does JvmtiDeferredUpdates really need to be in thread.hpp? Can't it be in its own
> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> Note that we also think we may have a bug in deopt:
> https://bugs.openjdk.java.net/browse/JDK-8238237
> 
> I think it would be best, if possible, to push after that is resolved.
> 
> Not even nearly a full review :)
> 
> Thanks, Robbin
> 
> 
> >> Incremental:
> >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
> >>
> >> I was not able to eliminate the additional suspend flag now. I'll take care of this
> >> as soon as the existing suspend-resume-mechanism is reworked.
> >>
> >> Testing:
> >>
> >> Nightly tests @SAP:
> >>
> >>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
> >>    Suite, SAP specific tests
> >>    with fastdebug and release builds on all platforms
> >>
> >>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel
> >>    for 24h
> >>
> >> Thanks, Richard.
> >>
> >>
> >> More details on the changes:
> >>
> >> * Hide DeoptimizeObjectsALotThread from external view.
> >>
> >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
> >>    It used to be _safepoint_check_sometimes, which will be eliminated sooner or
> >>    later.
> >>    I added explicit thread state changes with ThreadBlockInVM to code paths
> >>    where we can wait() on EscapeBarrier_lock to become safepoint safe.
> >>
> >> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads
> >>    instead of vm operation VM_ThreadSuspendAllForObjDeopt.
> >>
> >> * Removed uses of Threads_lock. When adding a new thread we suspend it iff
> >>    EA optimizations are being reverted. In the previous version we were
> >>    waiting on Threads_lock while EA optimizations were reverted.
> >>    See EscapeBarrier::thread_added().
> >>
> >> * Made tests require Xmixed compilation mode.
> >>
> >> * Made tests agnostic regarding tiered compilation.
> >>    I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or
> >>    disabled.
> >>
> >> * Exercising EATests.java as well with stress test options
> >>    DeoptimizeObjectsALot*
> >>    Due to the non-deterministic deoptimizations some tests need to be skipped.
> >>    We do this to prevent bit-rot of the stress test code.
> >>
> >> * Executing EATests.java as well with graal if available. Driver for this is
> >>    EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide
> >>    all the new debug info (namely not_global_escape_in_scope and arg_escape in
> >>    scopeDesc.hpp).
> >>    And graal does not yet support the JVMTI operations force early return and
> >>    pop frame.
> >>
> >> * Removed tracing from new jdi tests in EATests.java. Too much trace output
> >>    before the debugging connection is established can cause deadlock because
> >>    output buffers fill up.
> >>    (See https://bugs.openjdk.java.net/browse/JDK-8173304)
> >>
> >> * Many copyright year changes and smaller clean-up changes of testing code
> >>    (trailing white-space and the like).
> >>
> >>
> >> -----Original Message-----
> >> From: David Holmes <david.holmes at oracle.com>
> >> Sent: Donnerstag, 19. Dezember 2019 03:12
> >> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >> serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
> >> hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov
> >> (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
> >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
> >> the Presence of JVMTI Agents
> >>
> >> Hi Richard,
> >>
> >> I think my issue is with the way EliminateNestedLocks works so I'm going
> >> to look into that more deeply.
> >>
> >> Thanks for the explanations.
> >>
> >> David
> >>
> >> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> >>> Hi David,
> >>>
> >>>     > >    > Some further queries/concerns:
> >>>     > >    >
> >>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
> >>>     > >    >
> >>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
> >>>     > >    >
> >>>     > >    > !   _recursions = save      // restore the old recursion count
> >>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
> >>>     > >    > increased by the deferred relock count
> >>>     > >    >
> >>>     > >    > what is the "deferred relock count"? I gather it relates to
> >>>     > >    >
> >>>     > >    > "The code was extended to be able to deoptimize objects of a frame that
> >>>     > >    > is not the top frame and to let another thread than the owning thread do
> >>>     > >    > it."
> >>>     > >
> >>>     > > Yes, these relate. Currently EA based optimizations are reverted, when a
> >>>     > > compiled frame is replaced with corresponding interpreter frames. Part of
> >>>     > > this is relocking objects with eliminated locking. New with the enhancement
> >>>     > > is that we do this also just before object references are acquired through
> >>>     > > JVMTI. In this case we deoptimize also the owning compiled frame C and we
> >>>     > > register deoptimized objects as deferred updates. When control returns to C
> >>>     > > it gets deoptimized, we notice that objects are already deoptimized
> >>>     > > (reallocated and relocked), so we don't do it again (relocking twice would
> >>>     > > be incorrect of course). Deferred updates are copied into the new
> >>>     > > interpreter frames.
> >>>     > >
> >>>     > > Problem: relocking is not possible if the target thread T is waiting on the
> >>>     > > monitor that needs to be relocked. This happens only with non-local objects
> >>>     > > with EliminateNestedLocks. Instead relocking is deferred until T owns the
> >>>     > > monitor again. This is what the piece of code above does.
> >>>     >
> >>>     >  Sorry I need some more detail here. How can you wait() on an object
> >>>     >  monitor if the object allocation and/or locking was optimised away? And
> >>>     >  what is a "non-local object" in this context? Isn't EA restricted to
> >>>     >  thread-confined objects?
> >>>
> >>> "Non-local object" is an object that escapes its thread. The issue I'm addressing
> >>> with the changes in ObjectMonitor::wait is almost unrelated to EA. It is caused by
> >>> EliminateNestedLocks, where C2 eliminates recursive locking of an already owned
> >>> lock. The lock-owning object exists on the heap, it is locked, and you can call
> >>> wait() on it.
> >>>
> >>> EliminateLocks is the C2 option that controls lock elimination based on EA.
> >>> Both optimizations have in common that objects with eliminated locking need to
> >>> be relocked when deoptimizing a frame, i.e. when replacing a compiled frame with
> >>> equivalent interpreter frames. Deoptimization::relock_objects does that job for
> >>> /all/ eliminated locks in scope. /All/ can be a mix of eliminated nested locks
> >>> and locks of not-escaping objects.
> >>>
> >>> New with the enhancement: I call relock_objects earlier, just before objects
> >>> potentially escape. But then later, when the owning compiled frame gets
> >>> deoptimized, I must not do it again:
> >>>
> >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> >>>
> >>>    if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> >>>        && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> >>>      bool unused;
> >>>      eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> >>>    }
> >>>
> >>> Now when calling relock_objects early it is quite possible that I have to relock
> >>> an object the target thread currently waits for. Obviously I cannot relock in
> >>> this case; instead I chose to introduce relock_count_after_wait to JavaThread.
> >>>
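
For illustration, the bookkeeping described above can be modelled in a few
lines of Java. This is only a toy sketch, not HotSpot code: the class
MonitorModel and all of its members are hypothetical names, and the field
recursions merely mimics ObjectMonitor::_recursions (where 0 means "locked
once").

```java
// Toy model of the deferred relock count; NOT HotSpot code.
// All names are hypothetical and chosen only for this illustration.
final class MonitorModel {
    private int recursions;            // mimics ObjectMonitor::_recursions (0 == locked once)
    private int relockCountAfterWait;  // mimics the per-thread deferred relock count

    /** wait() starts: remember the recursion count and give up the monitor. */
    int beginWait() {
        int save = recursions;
        recursions = 0;
        return save;
    }

    /** Another thread reverts optimizations while the owner waits: defer the relocks. */
    void deferRelock(int eliminatedNestedLocks) {
        relockCountAfterWait += eliminatedNestedLocks;
    }

    /** wait() ends: restore the old count, increased by the deferred relock count. */
    void endWait(int save) {
        recursions = save + getAndResetRelockCountAfterWait();
    }

    private int getAndResetRelockCountAfterWait() {
        int count = relockCountAfterWait;
        relockCountAfterWait = 0;
        return count;
    }

    int recursions() {
        return recursions;
    }

    /** One real lock plus the given number of eliminated nested locks, then a wait(). */
    static int simulate(int eliminatedNestedLocks) {
        MonitorModel m = new MonitorModel();   // outer lock taken: recursions == 0
        int save = m.beginWait();              // owner thread waits on the monitor
        m.deferRelock(eliminatedNestedLocks);  // relocking requested while it waits
        m.endWait(save);                       // monitor re-acquired after the wait
        return m.recursions();
    }
}
```

With two eliminated nested locks, MonitorModel.simulate(2) leaves the
recursion count at 2, i.e. the monitor is held three times, which matches
three nested synchronized blocks in the source.
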
> >>>     >  Is it just that some of the locking gets optimized away e.g.
> >>>     >
> >>>     >  synchronized (obj) {
> >>>     >    synchronized (obj) {
> >>>     >      synchronized (obj) {
> >>>     >        obj.wait();
> >>>     >      }
> >>>     >    }
> >>>     >  }
> >>>     >
> >>>     >  If this is reduced to a form as-if it were a single lock of the monitor
> >>>     >  (due to EA) and the wait() triggers a JVM TI event which leads to the
> >>>     >  escape of "obj" then we need to reconstruct the true lock state, and so
> >>>     >  when the wait() internally unblocks and reacquires the monitor it has to
> >>>     >  set the true recursion count to 3, not the 1 that it appeared to be when
> >>>     >  wait() was initially called. Is that the scenario?
> >>>
> >>> Kind of... except that the locking is not eliminated due to EA and there is no
> >>> JVM TI event triggered by wait.
> >>>
> >>> Add
> >>>
> >>> LocalObject l1 = new LocalObject();
> >>>
> >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This
> >>> triggers the code in question.
> >>>
> >>> Note that relocking/reallocating is transactional: if it is done, then it is done
> >>> for /all/ objects in scope, and at most once. It wouldn't be quite so easy to split
> >>> this into relocking of nested and of EA-based eliminated locks.
> >>>
> >>>     >  If so I find this truly awful. Anyone using wait() in a realistic form
> >>>     >  requires a notification and so the object cannot be thread confined. In
> >>>
> >>> It is not thread confined.
> >>>
> >>>     >  which case I would strongly argue that upon hitting the wait() the deopt
> >>>     >  should occur unconditionally and so the lock state is correct before we
> >>>     >  wait and so we don't need to mess with the recursion count internally
> >>>     >  when we reacquire the monitor.
> >>>     >
> >>>     > >
> >>>     > >    > which I don't like the sound of at all when it comes to ObjectMonitor
> >>>     > >    > state. So I'd like to understand in detail exactly what is going on here
> >>>     > >    > and why.  This is a very intrusive change that seems to badly break
> >>>     > >    > encapsulation and impacts future changes to ObjectMonitor that are under
> >>>     > >    > investigation.
> >>>     > >
> >>>     > > I would not regard this as breaking encapsulation. Certainly not badly.
> >>>     > >
> >>>     > > I've added a property relock_count_after_wait to JavaThread. The property is
> >>>     > > well encapsulated. Future ObjectMonitor implementations have to deal with
> >>>     > > recursion too. They are free in choosing a way to do that as long as that
> >>>     > > property is taken into account. This is hardly a limitation.
> >>>     >
> >>>     >  I do think this badly breaks encapsulation as you have to add a callout
> >>>     >  from the guts of the ObjectMonitor code to reach into the thread to get
> >>>     >  this lock count adjustment. I understand why you have had to do this but
> >>>     >  I would much rather see a change to the EA optimisation strategy so that
> >>>     >  this is not needed.
> >>>     >
> >>>     > > Note also that the property is a straightforward extension of the existing
> >>>     > > concept of deferred local updates. It is embedded into the structure holding
> >>>     > > them. So not even the footprint of a JavaThread is enlarged if no deferred
> >>>     > > updates are generated.
> >>>     >
> >>>     > [...]
> >>>     >
> >>>     > >
> >>>     > > I'm actually duplicating the existing external suspend mechanism, because a
> >>>     > > thread can be suspended at most once. And hey, I don't like that either! But
> >>>     > > it seems not unlikely that the duplicate can be removed together with the
> >>>     > > original and the new type of handshakes that will be used for thread suspend
> >>>     > > can be used for object deoptimization too. See today's discussion in
> >>>     > > JDK-8227745 [2].
> >>>     >
> >>>     >  I hope that discussion bears some fruit, at the moment it seems not to
> >>>     >  be possible to use handshakes here. :(
> >>>     >
> >>>     >  The external suspend mechanism is a royal pain in the proverbial that we
> >>>     >  have to carefully live with. The idea that we're duplicating that for
> >>>     >  use in another fringe area of functionality does not thrill me at all.
> >>>     >
> >>>     >  To be clear, I understand the problem that exists and that you wish to
> >>>     >  solve, but for the runtime parts I balk at the complexity cost of
> >>>     >  solving it.
> >>>
> >>> I know it's complex, but it is far from rocket science.
> >>>
> >>> Also I find it hard to imagine another fix for JDK-8233915 besides changing
> >>> the JVM TI specification.
> >>>
> >>> Thanks, Richard.
> >>>
> >>> -----Original Message-----
> >>> From: David Holmes <david.holmes at oracle.com>
> >>> Sent: Dienstag, 17. Dezember 2019 08:03
> >>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>> serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
> >>> hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov
> >>> (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> >>> in the Presence of JVMTI Agents
> >>>
> >>> <resend as my mailer crashed during last send>
> >>>
> >>> David
> >>>
> >>> On 17/12/2019 4:57 pm, David Holmes wrote:
> >>>> Hi Richard,
> >>>>
> >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
> >>>>> Hi David,
> >>>>>
> >>>>>      > Some further queries/concerns:
> >>>>>      >
> >>>>>      > src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>      >
> >>>>>      > Can you please explain the changes to ObjectMonitor::wait:
> >>>>>      >
> >>>>>      > !   _recursions = save      // restore the old recursion count
> >>>>>      > !                 + jt->get_and_reset_relock_count_after_wait(); //
> >>>>>      > increased by the deferred relock count
> >>>>>      >
> >>>>>      > what is the "deferred relock count"? I gather it relates to
> >>>>>      >
> >>>>>      > "The code was extended to be able to deoptimize objects of a frame that
> >>>>>      > is not the top frame and to let another thread than the owning thread do
> >>>>>      > it."
> >>>>>
> >>>>> Yes, these relate. Currently EA based optimizations are reverted, when
> >>>>> a compiled frame is replaced with corresponding interpreter frames. Part
> >>>>> of this is relocking objects with eliminated locking. New with the
> >>>>> enhancement is that we do this also just before object references are
> >>>>> acquired through JVMTI. In this case we deoptimize also the owning
> >>>>> compiled frame C and we register deoptimized objects as deferred updates.
> >>>>> When control returns to C it gets deoptimized, we notice that objects are
> >>>>> already deoptimized (reallocated and relocked), so we don't do it again
> >>>>> (relocking twice would be incorrect of course). Deferred updates are
> >>>>> copied into the new interpreter frames.
> >>>>>
> >>>>> Problem: relocking is not possible if the target thread T is waiting on
> >>>>> the monitor that needs to be relocked. This happens only with non-local
> >>>>> objects with EliminateNestedLocks. Instead relocking is deferred until T
> >>>>> owns the monitor again. This is what the piece of code above does.
> >>>>
> >>>> Sorry I need some more detail here. How can you wait() on an object
> >>>> monitor if the object allocation and/or locking was optimised away? And
> >>>> what is a "non-local object" in this context? Isn't EA restricted to
> >>>> thread-confined objects?
> >>>>
> >>>> Is it just that some of the locking gets optimized away e.g.
> >>>>
> >>>> synchronized (obj) {
> >>>>   synchronized (obj) {
> >>>>     synchronized (obj) {
> >>>>       obj.wait();
> >>>>     }
> >>>>   }
> >>>> }
> >>>>
> >>>> If this is reduced to a form as-if it were a single lock of the monitor
> >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
> >>>> escape of "obj" then we need to reconstruct the true lock state, and so
> >>>> when the wait() internally unblocks and reacquires the monitor it has to
> >>>> set the true recursion count to 3, not the 1 that it appeared to be when
> >>>> wait() was initially called. Is that the scenario?
> >>>>
> >>>> If so I find this truly awful. Anyone using wait() in a realistic form
> >>>> requires a notification and so the object cannot be thread confined. In
> >>>> which case I would strongly argue that upon hitting the wait() the deopt
> >>>> should occur unconditionally and so the lock state is correct before we
> >>>> wait and so we don't need to mess with the recursion count internally
> >>>> when we reacquire the monitor.
> >>>>
> >>>>>
> >>>>>      > which I don't like the sound of at all when it comes to ObjectMonitor
> >>>>>      > state. So I'd like to understand in detail exactly what is going on here
> >>>>>      > and why.  This is a very intrusive change that seems to badly break
> >>>>>      > encapsulation and impacts future changes to ObjectMonitor that are under
> >>>>>      > investigation.
> >>>>>
> >>>>> I would not regard this as breaking encapsulation. Certainly not badly.
> >>>>>
> >>>>> I've added a property relock_count_after_wait to JavaThread. The property
> >>>>> is well encapsulated. Future ObjectMonitor implementations have to deal
> >>>>> with recursion too. They are free in choosing a way to do that as long as
> >>>>> that property is taken into account. This is hardly a limitation.
> >>>>
> >>>> I do think this badly breaks encapsulation as you have to add a callout
> >>>> from the guts of the ObjectMonitor code to reach into the thread to get
> >>>> this lock count adjustment. I understand why you have had to do this but
> >>>> I would much rather see a change to the EA optimisation strategy so that
> >>>> this is not needed.
> >>>>
> >>>>> Note also that the property is a straightforward extension of the existing
> >>>>> concept of deferred local updates. It is embedded into the structure
> >>>>> holding them. So not even the footprint of a JavaThread is enlarged if no
> >>>>> deferred updates are generated.
> >>>>>
> >>>>>      > ---
> >>>>>      >
> >>>>>      > src/hotspot/share/runtime/thread.cpp
> >>>>>      >
> >>>>>      > Can you please explain why JavaThread::wait_for_object_deoptimization
> >>>>>      > has to be handcrafted in this way rather than using proper transitions.
> >>>>>      >
> >>>>>
> >>>>> I wrote wait_for_object_deoptimization taking
> >>>>> JavaThread::java_suspend_self_with_safepoint_check
> >>>>> as template. So in short: for the same reasons :)
> >>>>>
> >>>>> Threads reach both methods as part of thread state transitions,
> >>>>> therefore special handling is
> >>>>> required to change thread state on top of ongoing transitions.
> >>>>>
> >>>>>      > We got rid of "deopt suspend" some time ago and it is disturbing to see
> >>>>>      > it being added back (effectively). This seems like it may be something
> >>>>>      > that handshakes could be used for.
> >>>>>
> >>>>> Deopt suspend used to be something rather different with a similar
> >>>>> name[1]. It is not being added back.
> >>>>
> >>>> I stand corrected. Despite comments in the code to the contrary
> >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
> >>>> cleanup in this area 13 years ago :)
> >>>>
> >>>>>
> >>>>> I'm actually duplicating the existing external suspend mechanism, because
> >>>>> a thread can be suspended at most once. And hey, I don't like that either!
> >>>>> But it seems not unlikely that the duplicate can be removed together with
> >>>>> the original and the new type of handshakes that will be used for thread
> >>>>> suspend can be used for object deoptimization too. See today's discussion
> >>>>> in JDK-8227745 [2].
> >>>>
> >>>> I hope that discussion bears some fruit, at the moment it seems not to
> >>>> be possible to use handshakes here. :(
> >>>>
> >>>> The external suspend mechanism is a royal pain in the proverbial that we
> >>>> have to carefully live with. The idea that we're duplicating that for
> >>>> use in another fringe area of functionality does not thrill me at all.
> >>>>
> >>>> To be clear, I understand the problem that exists and that you wish to
> >>>> solve, but for the runtime parts I balk at the complexity cost of
> >>>> solving it.
> >>>>
> >>>> Thanks,
> >>>> David
> >>>> -----
> >>>>
> >>>>> Thanks, Richard.
> >>>>>
> >>>>> [1] Deopt suspend was something like an async. handshake for architectures
> >>>>>     with register windows, where patching the return pc for deoptimization
> >>>>>     of a compiled frame was racy if the owner thread was in native code.
> >>>>>     Instead a "deopt" suspend flag was set, on which the thread patched its
> >>>>>     own frame upon return from native. So no thread was suspended. It got
> >>>>>     its name only from the name of the flags.
> >>>>>
> >>>>> [2] Discussion about using handshakes to sync. with the target thread:
> >>>>>
> >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>> Sent: Freitag, 13. Dezember 2019 00:56
> >>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>>>> serviceability-dev at openjdk.java.net;
> >>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>> Performance in the Presence of JVMTI Agents
> >>>>>
> >>>>> Hi Richard,
> >>>>>
> >>>>> Some further queries/concerns:
> >>>>>
> >>>>> src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>
> >>>>> Can you please explain the changes to ObjectMonitor::wait:
> >>>>>
> >>>>> !   _recursions = save      // restore the old recursion count
> >>>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
> >>>>> increased by the deferred relock count
> >>>>>
> >>>>> what is the "deferred relock count"? I gather it relates to
> >>>>>
> >>>>> "The code was extended to be able to deoptimize objects of a frame that
> >>>>> is not the top frame and to let another thread than the owning thread do
> >>>>> it."
> >>>>>
> >>>>> which I don't like the sound of at all when it comes to ObjectMonitor
> >>>>> state. So I'd like to understand in detail exactly what is going on here
> >>>>> and why.  This is a very intrusive change that seems to badly break
> >>>>> encapsulation and impacts future changes to ObjectMonitor that are under
> >>>>> investigation.
> >>>>>
> >>>>> ---
> >>>>>
> >>>>> src/hotspot/share/runtime/thread.cpp
> >>>>>
> >>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
> >>>>> has to be handcrafted in this way rather than using proper transitions.
> >>>>>
> >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
> >>>>> it being added back (effectively). This seems like it may be something
> >>>>> that handshakes could be used for.
> >>>>>
> >>>>> Thanks,
> >>>>> David
> >>>>> -----
> >>>>>
> >>>>> On 12/12/2019 7:02 am, David Holmes wrote:
> >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> >>>>>>> Hi David,
> >>>>>>>
> >>>>>>>       > Most of the details here are in areas I can comment on in detail, but I
> >>>>>>>       > did take an initial general look at things.
> >>>>>>>
> >>>>>>> Thanks for taking the time!
> >>>>>>
> >>>>>> Apologies the above should read:
> >>>>>>
> >>>>>> "Most of the details here are in areas I *can't* comment on in detail
> >>>>>> ..."
> >>>>>>
> >>>>>> David
> >>>>>>
> >>>>>>>       > The only thing that jumped out at me is that I think the
> >>>>>>>       > DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>       >
> >>>>>>>       > +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Yes, it should. Will add the method like above.
> >>>>>>>
> >>>>>>>       > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
> >>>>>>>       > active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
> >>>>>>> workload. I will add a minimal test
> >>>>>>> to keep it fresh.
> >>>>>>>
> >>>>>>>       > Also on the tests I don't understand your @requires clause:
> >>>>>>>       >
> >>>>>>>       >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>>       > (vm.opt.TieredCompilation != true))
> >>>>>>>       >
> >>>>>>>       > This seems to require that TieredCompilation is disabled, but tiered is
> >>>>>>>       > our normal mode of operation. ??
> >>>>>>>       >
> >>>>>>>
> >>>>>>> I removed the clause. I guess I wanted to target the tests towards the
> >>>>>>> code they are supposed to test, and it's easier to analyze failures w/o
> >>>>>>> tiered compilation and with just one compiler thread.
> >>>>>>>
> >>>>>>> Additionally I will make use of
> >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Richard.
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
> >>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>>>>>> serviceability-dev at openjdk.java.net;
> >>>>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>>>> Performance in the Presence of JVMTI Agents
> >>>>>>>
> >>>>>>> Hi Richard,
> >>>>>>>
> >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I would like to get reviews please for
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> >>>>>>>>
> >>>>>>>> Corresponding RFE:
> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
> >>>>>>>>
> >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> >>>>>>>>
> >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
> >>>>>>>> issues (thanks!). In addition the change is being tested at SAP since I
> >>>>>>>> posted the first RFR some months ago.
> >>>>>>>>
> >>>>>>>> The intention of this enhancement is to benefit performance wise from
> >>>>>>>> escape analysis even if JVMTI agents request capabilities that allow them
> >>>>>>>> to access local variable values. E.g. if you start-up with
> >>>>>>>> -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape
> >>>>>>>> analysis is disabled right from the beginning, well before a debugger
> >>>>>>>> attaches -- if ever one should do so. With the enhancement, escape
> >>>>>>>> analysis will remain enabled until and after a debugger attaches. EA based
> >>>>>>>> optimizations are reverted just before an agent acquires the reference to
> >>>>>>>> an object. In the JBS item you'll find more details.
> >>>>>>>
> >>>>>>> Most of the details here are in areas I can comment on in detail, but I
> >>>>>>> did take an initial general look at things.
> >>>>>>>
> >>>>>>> The only thing that jumped out at me is that I think the
> >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>
> >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
> >>>>>>> Without
> >>>>>>> active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> Also on the tests I don't understand your @requires clause:
> >>>>>>>
> >>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>> (vm.opt.TieredCompilation != true))
> >>>>>>>
> >>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
> >>>>>>> our normal mode of operation. ??
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Richard.
> >>>>>>>>
> >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>

From chris.plummer at oracle.com  Thu Mar 12 16:53:49 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Mar 2020 09:53:49 -0700
Subject: invokeMethod's result gced immediately
In-Reply-To: <9a3aeb36-e859-5ca5-27b2-4781b0cbf137@jetbrains.com>
References: <9a3aeb36-e859-5ca5-27b2-4781b0cbf137@jetbrains.com>
Message-ID: <7ee41f75-106e-2f35-0369-3256193aab5d@oracle.com>

Hi Egor,

This stems from the JDWP spec. If you look at the 
ObjectReference.DisableCollection command:

https://docs.oracle.com/javase/10/docs/specs/jdwp/jdwp-protocol.html#JDWP_ObjectReference_DisableCollection

"By default all objects in back-end replies may be collected at any time 
the target VM is running. A call to this command guarantees that the 
object will not be collected."

Yes, it can be annoying that by default collection is not already 
disabled on any returned ObjectReference. We've had some tests with 
intermittent failures because they were buggy in this regard. I think 
there are two reasons it is done this way, both mentioned in the above 
JDWP section. The first is:

"Note that while the target VM is suspended, no garbage collection will 
occur because all threads are suspended. The typical examination of 
variables, fields, and arrays during the suspension is safe without 
explicitly disabling garbage collection."

So it's quite common not to need to DisableCollection on the object. 
Second is:

"This method should be used sparingly, as it alters the pattern of 
garbage collection in the target VM and, consequently, may result in 
application behavior under the debugger that differs from its 
non-debugged behavior."

So it looks like there's good reason to limit using DisableCollection.

It's a bit unclear to me how your foo().boo().zoo() example is 
implemented. Are you under a SUSPEND_ALL when you do the invoke? If so, 
all threads should still be suspended after the invoke completes, so you 
should be able to call DisableCollection without having to worry about 
it failing due to the object already being collected. This should set 
you up to use the objectref in the next invoke in the chain, once again 
without worrying about having to retry due to collection.
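
As a rough sketch of that sequence in JDI terms (assuming the invoke is done 
with the default options, so that all threads are suspended again when it 
completes; the names PinnedInvoke and invokeAndPin are made up for this 
example and are not part of JDI):

```java
import com.sun.jdi.ClassNotLoadedException;
import com.sun.jdi.IncompatibleThreadStateException;
import com.sun.jdi.InvalidTypeException;
import com.sun.jdi.InvocationException;
import com.sun.jdi.Method;
import com.sun.jdi.ObjectReference;
import com.sun.jdi.ThreadReference;
import com.sun.jdi.Value;
import java.util.List;

// Hypothetical helper, for illustration only.
final class PinnedInvoke {

    /**
     * Invokes a method in the target VM and, if the result is an object,
     * disables collection on it right away, while the target VM is still
     * suspended after the invoke. Returns null for null or primitive results.
     */
    static ObjectReference invokeAndPin(ObjectReference receiver,
                                        ThreadReference thread,
                                        Method method,
                                        List<? extends Value> args) {
        Value result;
        try {
            // Options 0: default invoke (not INVOKE_SINGLE_THREADED).
            result = receiver.invokeMethod(thread, method, args, 0);
        } catch (InvalidTypeException | ClassNotLoadedException
                 | IncompatibleThreadStateException | InvocationException e) {
            throw new IllegalStateException("invoke failed", e);
        }
        if (!(result instanceof ObjectReference)) {
            return null;
        }
        ObjectReference ref = (ObjectReference) result;
        // Assumption: nothing has resumed the target VM since the invoke
        // returned, so the object cannot have been collected yet.
        ref.disableCollection();
        return ref;
    }
}
```

For a chain such as foo().boo().zoo(), each pinned result would serve as the 
receiver of the next call, and enableCollection() would be called on it once 
it is no longer needed.
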

cheers,

Chris

On 3/12/20 3:12 AM, Egor Ushakov wrote:
> Hi all,
>
> it seems that the result of the invokeMethod could be gced 
> immediately, which is quite strange.
> Currently we have to do:
> invoke + disableCollection
> new(Array)Instance + disableCollection
> (String)mirrorOf + disableCollection
> in a loop until succeeded, to allow something like foo().boo().zoo() 
> to evaluate successfully.
> Is there a way to automatically disable collection for newly created 
> objects from jdi?
> Maybe there's a bug about this?
>
> Thanks!
>


From jonathan.gibbons at oracle.com  Thu Mar 12 20:50:01 2020
From: jonathan.gibbons at oracle.com (Jonathan Gibbons)
Date: Thu, 12 Mar 2020 13:50:01 -0700
Subject: RFR: [small,docs] JDK-8240971 Fix CSS styles in some doc comments
Message-ID: <a70477b4-36f4-4642-867e-c76218ade93c@oracle.com>

Please review a simple fix regarding the non-standard use of some CSS in 
some doc comments.

 From the JBS Description:

Recently, for the display of javadoc block tags, javadoc changed from 
using an inconsistent set of CSS class names on the generated 'dt' 
elements to using a single new name ("notes") on the enclosing 'dl' 
element.

There are a few (4) places in the main JDK code where the old-style 
names were used explicitly in doc comments, in order to emulate the 
appearance of a list of block tags. These use-sites should be fixed up. 
They are in the following files:

open/src/java.base/share/classes/module-info.java
open/src/java.se/share/classes/module-info.java
open/src/java.management.rmi/share/classes/module-info.java
open/src/jdk.jconsole/share/classes/module-info.java

In addition, these four files used the style attribute to force the font 
to be used. The font is now set in the standard CSS for "notes", and so 
the local use of a "style" attribute is no longer necessary.

-- Jon

JBS: https://bugs.openjdk.java.net/browse/JDK-8240971
Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html
API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html


From mandy.chung at oracle.com  Thu Mar 12 20:53:31 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Thu, 12 Mar 2020 13:53:31 -0700
Subject: RFR: [small, docs] JDK-8240971 Fix CSS styles in some doc comments
In-Reply-To: <a70477b4-36f4-4642-867e-c76218ade93c@oracle.com>
References: <a70477b4-36f4-4642-867e-c76218ade93c@oracle.com>
Message-ID: <3b7e1339-a173-9739-fa8c-f603304a5886@oracle.com>

This change looks okay.

Mandy

On 3/12/20 1:50 PM, Jonathan Gibbons wrote:
> Please review a simple fix regarding the non-standard use of some CSS 
> in some doc comments.
>
> From the JBS Description:
>
> Recently, for the display of javadoc block tags, javadoc changed from 
> using an inconsistent set of CSS class names on the generated 'dt' 
> elements to using a single new name ("notes") on the enclosing 'dl' 
> element.
>
> There are a few (4) places in the main JDK code where the old-style 
> names were used explicitly in doc comments, in order to emulate the 
> appearance of a list of block tags. These use-sites should be fixed 
> up. They are in the following files:
>
> open/src/java.base/share/classes/module-info.java
> open/src/java.se/share/classes/module-info.java
> open/src/java.management.rmi/share/classes/module-info.java
> open/src/jdk.jconsole/share/classes/module-info.java
>
> In addition, these four files used the style attribute to force the 
> font to be used. The font is now set in the standard CSS for "notes", 
> and so the local use of a "style" attribute is no longer necessary.
>
> -- Jon
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8240971
> Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html
> API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html
>


From alexey.menkov at oracle.com  Fri Mar 13 00:33:21 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 12 Mar 2020 17:33:21 -0700
Subject: RFR: [small, docs] JDK-8240971 Fix CSS styles in some doc comments
In-Reply-To: <3b7e1339-a173-9739-fa8c-f603304a5886@oracle.com>
References: <a70477b4-36f4-4642-867e-c76218ade93c@oracle.com>
 <3b7e1339-a173-9739-fa8c-f603304a5886@oracle.com>
Message-ID: <fe200dc6-5cec-e0f9-8ed9-798c952b00e9@oracle.com>

+1

--alex

On 03/12/2020 13:53, Mandy Chung wrote:
> This change looks okay.
> 
> Mandy
> 
> On 3/12/20 1:50 PM, Jonathan Gibbons wrote:
>> Please review a simple fix regarding the non-standard use of some CSS 
>> in some doc comments.
>>
>> From the JBS Description:
>>
>> Recently, for the display of javadoc block tags, javadoc changed from 
>> using an inconsistent set of CSS class names on the generated 'dt' 
>> elements to using a single new name ("notes") on the enclosing 'dl' 
>> element.
>>
>> There are a few (4) places in the main JDK code where the old-style 
>> names were used explicitly in doc comments, in order to emulate the 
>> appearance of a list of block tags. These use-sites should be fixed 
>> up. They are in the following files:
>>
>> open/src/java.base/share/classes/module-info.java
>> open/src/java.se/share/classes/module-info.java
>> open/src/java.management.rmi/share/classes/module-info.java
>> open/src/jdk.jconsole/share/classes/module-info.java
>>
>> In addition, these four files used the style attribute to force the 
>> font to be used. The font is now set in the standard CSS for "notes", 
>> and so the local use of a "style" attribute is no longer necessary.
>>
>> -- Jon
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8240971
>> Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html
>> API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html
>>
> 

From chris.plummer at oracle.com  Fri Mar 13 06:06:03 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Mar 2020 23:06:03 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
Message-ID: <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200312/0493cd93/attachment-0001.htm>

From richard.reingruber at sap.com  Fri Mar 13 09:08:51 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 13 Mar 2020 09:08:51 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB3331B757ECD4E52CACC121989BFA0@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Martin,

thanks a lot for reviewing and the feedback. I'll dig into the details as soon as possible. Looking forward to it :)

Thanks, Richard.

-----Original Message-----
From: Doerr, Martin <martin.doerr at sap.com> 
Sent: Thursday, 12 March 2020 17:28
To: Reingruber, Richard <richard.reingruber at sap.com>; 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,


I managed to find time for an (almost) complete review of webrev.4. (I'll review the tests separately.)

First of all, the change seems to be of pretty good quality given its significant complexity. I couldn't find any real bugs. But I'd like to propose minor improvements.
I'm convinced that it's mature because we did substantial testing.

I like the new functionality for object deoptimization. It can possibly be reused for future escape analysis based optimizations. So I appreciate having it available in the code base.
In addition to that, your change makes the JVMTI implementation better integrated into the VM.


Now to the details:


src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters. Ok.


src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability. Ok.


src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!


src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.


src/hotspot/share/code/nmethod.cpp
Nice cleanup!


src/hotspot/share/code/pcDesc.hpp
Additional parameters. Ok.


src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters. Ok.


src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.


src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters. Ok.


src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMTI capabilities. Nice!


src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes. Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest factoring out something like has_not_global_escape and has_arg_escape. So the code could look like this:
      SafePointNode* sfn = sfn_worklist.at(next);
      sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
      if (sfn->is_CallJava()) {
        CallJavaNode* call = sfn->as_CallJava();
        call->set_arg_escape(has_arg_escape(call));
      }
This would also allow us to get rid of the found_..._escape_in_args variables, making the loops more readable.

It's kind of ugly to use strcmp to recognize an uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.


src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations. Ok.


src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters. Ok.


src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!


src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread. Good.


src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.

VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.


src/hotspot/share/prims/jvmtiTagMap.cpp
Escape barriers + deoptimize objects for all threads. Ok.


src/hotspot/share/prims/whitebox.cpp
Added WB_IsFrameDeoptimized to API. Ok.


src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals, here.
First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
Comments are sufficient to understand why things are done as they are implemented.

BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
Anyway, looks correct, too.

Typo in comment: "regularily" => "regularly"

Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates gets deallocated (except for the JavaThread destructor). But I think we always go through it, so I can't see a memory leak or similar issues.

EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().

You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.

I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.

Typo in comment: "we must only deoptimize" => "we only have to deoptimize"

"bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.

I'll get back to suspend flags, later.

There are weird cases regarding _self_deoptimization_in_progress.
Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.

I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.

Change in thread_added:
I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
Your version makes new threads come up with the suspend flag set. That looks correct, too. The advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
For now, I'm ok with your version.

I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).

Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
Maybe adding suffixes would help a little bit, but I can also live with what you have.
Implementation looks correct to me.


src/hotspot/share/runtime/deoptimization.hpp
Escape barriers and object deoptimization functions.
Typo in comment: "helt" => "held"


src/hotspot/share/runtime/globals.hpp
Addition of develop flag DeoptimizeObjectsALotInterval. Ok.


src/hotspot/share/runtime/interfaceSupport.cpp
InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.


src/hotspot/share/runtime/interfaceSupport.inline.hpp
Addition of deoptimizeAllObjects. Ok.


src/hotspot/share/runtime/mutexLocker.cpp
src/hotspot/share/runtime/mutexLocker.hpp
Addition of EscapeBarrier_lock. Ok.


src/hotspot/share/runtime/objectMonitor.cpp
Make recursion count relock aware. Ok.


src/hotspot/share/runtime/stackValue.hpp
Better reinitialization in StackValue. Good.


src/hotspot/share/runtime/thread.cpp
src/hotspot/share/runtime/thread.hpp
src/hotspot/share/runtime/thread.inline.hpp
wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.

In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as a temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.

You can use MutexLocker with Thread*.

JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.


src/hotspot/share/runtime/vframe.cpp
Added support for entry frame to new_vframe. Ok.


src/hotspot/share/runtime/vframe_hp.cpp
src/hotspot/share/runtime/vframe_hp.hpp

I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should preferably be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).

jvmtiDeferredLocalVariableSet::update_monitors:
Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.


src/hotspot/share/utilities/macros.hpp
Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.


test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
New test. Will review separately.


test/jdk/TEST.ROOT
Addition of vm.jvmci as required property. Ok.


test/jdk/com/sun/jdi/EATests.java
test/jdk/com/sun/jdi/EATestsJVMCI.java
New test. Will review separately.


test/lib/sun/hotspot/WhiteBox.java
Added isFrameDeoptimized to API. Ok.


That was it. Best regards,
Martin


> -----Original Message-----
> From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net>
> On Behalf Of Reingruber, Richard
> Sent: Tuesday, 3 March 2020 21:23
> To: 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better
> Performance in the Presence of JVMTI Agents
> 
> Hi Robbin,
> 
> > > I understand that Robbin proposed to replace the usage of
> > > _suspend_flag with handshakes. Apparently, async handshakes
> > > are needed to do so. We have been waiting a while for removal
> > > of the _suspend_flag / introduction of async handshakes [2].
> > > What is the status here?
> 
> > I have an old prototype which I would like to continue to work on.
> > So do not assume asynch handshakes will make 15.
> > Even if it would, I think there is a lot more investigative work needed to remove
> > _suspend_flag.
> 
> Let us know if we can be of any help to you, even if it is only testing.
> 
> > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> 
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Will do.
> 
> > Does JvmtiDeferredUpdates really need to be in thread.hpp? Can't it be in its own
> > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> You are right. It shouldn't be declared in thread.hpp. I will look into that.
> 
> > Note that we also think we may have a bug in deopt:
> > https://bugs.openjdk.java.net/browse/JDK-8238237
> 
> > I think it would be best, if possible, to push after that is resolved.
> 
> Sure.
> 
> > Not even nearly a full review :)
> 
> I know :)
> 
> Anyways, thanks a lot,
> Richard.
> 
> 
> -----Original Message-----
> From: Robbin Ehn <robbin.ehn at oracle.com>
> Sent: Monday, March 2, 2020 11:17 AM
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard
> <richard.reingruber at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net;
> hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi,
> 
> On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> > Hi,
> >
> > I had a look at the progress of this change. Nothing
> > happened since Richard posted his update using more
> > handshakes [1].
> > But we (SAP) would appreciate a lot if this change could
> > be successfully reviewed and pushed.
> >
> > I think there is basic understanding that this
> > change is helpful. It fixes a number of issues with JVMTI,
> > and will deliver the same performance benefits as EA
> > does in current production mode for debugging scenarios.
> >
> > This is important for us as we run our VMs prepared
> > for debugging in production mode.
> >
> > I understand that Robbin proposed to replace the usage of
> > _suspend_flag with handshakes. Apparently, async handshakes
> > are needed to do so. We have been waiting a while for removal
> > of the _suspend_flag / introduction of async handshakes [2].
> > What is the status here?
> 
> I have an old prototype which I would like to continue to work on.
> So do not assume asynch handshakes will make 15.
> Even if it would, I think there is a lot more investigative work needed to remove
> _suspend_flag.
> 
> >
> > I think we should no longer wait, but proceed with
> > this change. We will look into removing the usage of
> > suspend_flag introduced here once it is possible to implement
> > it with handshakes.
> 
> Yes, sure.
> 
> >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> 
> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Does JvmtiDeferredUpdates really need to be in thread.hpp? Can't it be in its own
> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> Note that we also think we may have a bug in deopt:
> https://bugs.openjdk.java.net/browse/JDK-8238237
> 
> I think it would be best, if possible, to push after that is resolved.
> 
> Not even nearly a full review :)
> 
> Thanks, Robbin
> 
> 
> >> Incremental:
> >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
> >>
> >> I was not able to eliminate the additional suspend flag now. I'll take care of this
> >> as soon as the existing suspend-resume-mechanism is reworked.
> >>
> >> Testing:
> >>
> >> Nightly tests @SAP:
> >>
> >>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
> >>    Suite, SAP specific tests
> >>    with fastdebug and release builds on all platforms
> >>
> >>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel
> >>    for 24h
> >>
> >> Thanks, Richard.
> >>
> >>
> >> More details on the changes:
> >>
> >> * Hide DeoptimizeObjectsALotThread from external view.
> >>
> >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
> >>    It used to be _safepoint_check_sometimes, which will be eliminated sooner or later.
> >>    I added explicit thread state changes with ThreadBlockInVM to code paths where we
> >>    can wait() on EscapeBarrier_lock to become safepoint safe.
> >>
> >> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of
> >>    vm operation VM_ThreadSuspendAllForObjDeopt.
> >>
> >> * Removed uses of Threads_lock. When adding a new thread we suspend it iff EA
> >>    optimizations are being reverted. In the previous version we were waiting on
> >>    Threads_lock while EA optimizations were reverted. See EscapeBarrier::thread_added().
> >>
> >> * Made tests require Xmixed compilation mode.
> >>
> >> * Made tests agnostic regarding tiered compilation.
> >>    I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or disabled.
> >>
> >> * Exercising EATests.java as well with stress test options DeoptimizeObjectsALot*
> >>    Due to the non-deterministic deoptimizations some tests need to be skipped.
> >>    We do this to prevent bit-rot of the stress test code.
> >>
> >> * Executing EATests.java as well with graal if available. Driver for this is
> >>    EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all
> >>    the new debug info (namely not_global_escape_in_scope and arg_escape in
> >>    scopeDesc.hpp). And graal does not yet support the JVMTI operations force early
> >>    return and pop frame.
> >>
> >> * Removed tracing from new jdi tests in EATests.java. Too much trace output before
> >>    the debugging connection is established can cause deadlock because output buffers
> >>    fill up. (See https://bugs.openjdk.java.net/browse/JDK-8173304)
> >>
> >> * Many copyright year changes and smaller clean-up changes of testing code
> >>    (trailing white-space and the like).
> >>
> >>
> >> -----Original Message-----
> >> From: David Holmes <david.holmes at oracle.com>
> >> Sent: Thursday, 19 December 2019 03:12
> >> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >> serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
> >> hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
> >> <vladimir.kozlov at oracle.com>
> >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
> >> the Presence of JVMTI Agents
> >>
> >> Hi Richard,
> >>
> >> I think my issue is with the way EliminateNestedLocks works so I'm going
> >> to look into that more deeply.
> >>
> >> Thanks for the explanations.
> >>
> >> David
> >>
> >> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> >>> Hi David,
> >>>
> >>>     > >    > Some further queries/concerns:
> >>>     > >    >
> >>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
> >>>     > >    >
> >>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
> >>>     > >    >
> >>>     > >    > !   _recursions = save      // restore the old recursion count
> >>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> >>>     > >    >
> >>>     > >    > what is the "deferred relock count"? I gather it relates to
> >>>     > >    >
> >>>     > >    > "The code was extended to be able to deoptimize objects of a frame that
> >>>     > >    > is not the top frame and to let another thread than the owning thread do
> >>>     > >    > it."
> >>>     > >
> >>>     > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
> >>>     > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
> >>>     > > locking. New with the enhancement is that we do this also just before object references are
> >>>     > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we
> >>>     > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized,
> >>>     > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again
> >>>     > > (relocking twice would be incorrect of course). Deferred updates are copied into the new
> >>>     > > interpreter frames.
> >>>     > >
> >>>     > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to
> >>>     > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking
> >>>     > > is deferred until T owns the monitor again. This is what the piece of code above does.
> >>>     >
> >>>     >  Sorry I need some more detail here. How can you wait() on an object
> >>>     >  monitor if the object allocation and/or locking was optimised away? And
> >>>     >  what is a "non-local object" in this context? Isn't EA restricted to
> >>>     >  thread-confined objects?
> >>>
> >>> "Non-local object" is an object that escapes its thread. The issue I'm addressing with the changes
> >>> in ObjectMonitor::wait is almost unrelated to EA. It is caused by EliminateNestedLocks, where C2
> >>> eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it
> >>> is locked and you can call wait() on it.
> >>>
> >>> EliminateLocks is the C2 option that controls lock elimination based on EA. Both optimizations have
> >>> in common that objects with eliminated locking need to be relocked when deoptimizing a frame,
> >>> i.e. when replacing a compiled frame with equivalent interpreter
> >>> frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can
> >>> be a mix of eliminated nested locks and locks of not-escaping objects.
> >>>
> >>> New with the enhancement: I call relock_objects earlier, just before objects potentially
> >>> escape. But then later when the owning compiled frame gets deoptimized, I must not do it again:
> >>>
> >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> >>>
> >>>    if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> >>>        && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> >>>      bool unused;
> >>>      eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> >>>    }
> >>>
> >>> Now when calling relock_objects early it is quite possible that I have to relock an object the
> >>> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
> >>> introduce relock_count_after_wait to JavaThread.
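
The bookkeeping described here can be illustrated with a toy model in Java. This is not HotSpot code: ToyThread, ToyMonitor and all of their members are hypothetical stand-ins, contention is not modelled, and the model counts lock holds in a simplified way. It only shows the arithmetic of "save + relock count": relocks that cannot be performed while the owner waits are recorded on the thread and added when the wait restores the count.

```java
// Toy model only: these classes are hypothetical and are not HotSpot's
// JavaThread or ObjectMonitor. Concurrency is deliberately left out.
final class ToyThread {
    private int relockCountAfterWait;

    // Records one relock that could not be performed because this thread
    // is currently waiting on the monitor in question.
    void deferRelock() {
        relockCountAfterWait++;
    }

    int getAndResetRelockCountAfterWait() {
        int count = relockCountAfterWait;
        relockCountAfterWait = 0;
        return count;
    }
}

final class ToyMonitor {
    private ToyThread owner;
    private int holdCount; // simplified: total number of holds by the owner

    void enter(ToyThread t) {
        if (owner != null && owner != t) {
            throw new IllegalStateException("toy model does not handle contention");
        }
        owner = t;
        holdCount++;
    }

    // wait() releases the monitor completely and remembers the old count.
    int beginWait(ToyThread t) {
        if (owner != t) {
            throw new IllegalMonitorStateException("not the owner");
        }
        int save = holdCount;
        holdCount = 0;
        owner = null;
        return save;
    }

    // On reacquisition the old count is restored, increased by the relocks
    // that were deferred while the thread was waiting.
    void endWait(ToyThread t, int save) {
        owner = t;
        holdCount = save + t.getAndResetRelockCountAfterWait();
    }

    int holdCount() {
        return holdCount;
    }
}
```

In this model, a thread that holds the monitor once when it starts waiting, and for which two eliminated nested locks are reverted in the meantime, ends up with a count of three after the wait.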
> >>>
> >>>     >  Is it just that some of the locking gets optimized away e.g.
> >>>     >
> >>>     >  synchronized(obj) {
> >>>     >     synchronized(obj) {
> >>>     >       synchronized(obj) {
> >>>     >         obj.wait();
> >>>     >       }
> >>>     >     }
> >>>     >  }
> >>>     >
> >>>     >  If this is reduced to a form as-if it were a single lock of the monitor
> >>>     >  (due to EA) and the wait() triggers a JVM TI event which leads to the
> >>>     >  escape of "obj" then we need to reconstruct the true lock state, and so
> >>>     >  when the wait() internally unblocks and reacquires the monitor it has to
> >>>     >  set the true recursion count to 3, not the 1 that it appeared to be when
> >>>     >  wait() was initially called. Is that the scenario?
> >>>
> >>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
> >>> triggered by wait.
> >>>
> >>> Add
> >>>
> >>> LocalObject l1 = new LocalObject();
> >>>
> >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
> >>> question.
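
As a concrete, runnable version of the scenario discussed here, the following Java sketch nests three synchronized blocks on a shared object, calls wait() in the innermost one, and lets a second thread notify. The names NestedWaitDemo, LocalObject and run are made up for illustration, and whether C2 actually eliminates the nested locks or scalar-replaces l1 depends on JIT decisions the program cannot observe. What it does show is the invariant that relocking has to preserve: after wait() returns, the thread owns the monitor with its full nesting depth, so leaving the two inner blocks must not release it.

```java
final class NestedWaitDemo {
    // Stand-in for the non-escaping object mentioned in the thread.
    static final class LocalObject {
        int value = 42;
    }

    /** Returns {held after wait, held after inner exits, held after outer exit}. */
    static boolean[] run() throws InterruptedException {
        final Object obj = new Object();      // escapes: shared with the notifier
        final boolean[] notified = {false};   // guarded by obj
        final boolean[] held = new boolean[3];

        Thread notifier = new Thread(() -> {
            synchronized (obj) {
                notified[0] = true;
                obj.notifyAll();
            }
        });

        LocalObject l1 = new LocalObject();   // never leaves this method
        synchronized (obj) {
            // Started while we own obj, so it can only notify once we wait.
            notifier.start();
            synchronized (obj) {
                synchronized (obj) {
                    while (!notified[0]) {
                        obj.wait();           // releases all three holds
                    }
                    held[0] = Thread.holdsLock(obj);
                }
            }
            // Two nested exits later the outer hold must still be there.
            held[1] = Thread.holdsLock(obj);
        }
        held[2] = Thread.holdsLock(obj);
        notifier.join();
        if (l1.value != 42) {
            throw new AssertionError("local object was modified unexpectedly");
        }
        return held;
    }
}
```

A debugger or JVM TI agent reading l1 while the thread is blocked in wait() would correspond to the case Richard describes; the sketch itself does not attach an agent.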
> >>>
> >>> Note that relocking/reallocating is transactional. If it is done, then it is done for /all/ objects in scope and it is
> >>> done at most once. It wouldn't be quite so easy to split this into relocking of nested/EA-based
> >>> eliminated locks.
> >>>
> >>>     >  If so I find this truly awful. Anyone using wait() in a realistic form
> >>>     >  requires a notification and so the object cannot be thread confined. In
> >>>
> >>> It is not thread confined.
> >>>
> >>>     >  which case I would strongly argue that upon hitting the wait() the deopt
> >>>     >  should occur unconditionally and so the lock state is correct before we
> >>>     >  wait and so we don't need to mess with the recursion count internally
> >>>     >  when we reacquire the monitor.
> >>>     >
> >>>     > >
> >>>     > >    > which I don't like the sound of at all when it comes to ObjectMonitor
> >>>     > >    > state. So I'd like to understand in detail exactly what is going on here
> >>>     > >    > and why.  This is a very intrusive change that seems to badly break
> >>>     > >    > encapsulation and impacts future changes to ObjectMonitor that are under
> >>>     > >    > investigation.
> >>>     > >
> >>>     > > I would not regard this as breaking encapsulation. Certainly not badly.
> >>>     > >
> >>>     > > I've added a property relock_count_after_wait to JavaThread. The property is well
> >>>     > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free
> >>>     > > in choosing a way to do that as long as that property is taken into account. This is hardly a
> >>>     > > limitation.
> >>>     >
> >>>     >  I do think this badly breaks encapsulation as you have to add a callout
> >>>     >  from the guts of the ObjectMonitor code to reach into the thread to get
> >>>     >  this lock count adjustment. I understand why you have had to do this but
> >>>     >  I would much rather see a change to the EA optimisation strategy so that
> >>>     >  this is not needed.
> >>>     >
> >>>     > > Note also that the property is a straight forward extension of the existing concept of deferred
> >>>     > > local updates. It is embedded into the structure holding them. So not even the footprint of a
> >>>     > > JavaThread is enlarged if no deferred updates are generated.
> >>>     >
> >>>     > [...]
> >>>     >
> >>>     > >
> >>>     > > I'm actually duplicating the existing external suspend mechanism, because a thread can be
> >>>     > > suspended at most once. And hey, I don't like that either! But it seems not unlikely that the
> >>>     > > duplicate can be removed together with the original and the new type of handshakes that will be
> >>>     > > used for thread suspend can be used for object deoptimization too. See today's discussion in
> >>>     > > JDK-8227745 [2].
> >>>     >
> >>>     >  I hope that discussion bears some fruit, at the moment it seems not to
> >>>     >  be possible to use handshakes here. :(
> >>>     >
> >>>     >  The external suspend mechanism is a royal pain in the proverbial that we
> >>>     >  have to carefully live with. The idea that we're duplicating that for
> >>>     >  use in another fringe area of functionality does not thrill me at all.
> >>>     >
> >>>     >  To be clear, I understand the problem that exists and that you wish to
> >>>     >  solve, but for the runtime parts I balk at the complexity cost of
> >>>     >  solving it.
> >>>
> >>> I know it's complex, but by far no rocket science.
> >>>
> >>> Also I find it hard to imagine another fix for JDK-8233915 besides changing the JVM TI specification.
> >>>
> >>> Thanks, Richard.
> >>>
> >>> -----Original Message-----
> >>> From: David Holmes <david.holmes at oracle.com>
> >>> Sent: Dienstag, 17. Dezember 2019 08:03
> >>> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> >>>
> >>> <resend as my mailer crashed during last send>
> >>>
> >>> David
> >>>
> >>> On 17/12/2019 4:57 pm, David Holmes wrote:
> >>>> Hi Richard,
> >>>>
> >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
> >>>>> Hi David,
> >>>>>
> >>>>>     > Some further queries/concerns:
> >>>>>     >
> >>>>>     > src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>     >
> >>>>>     > Can you please explain the changes to ObjectMonitor::wait:
> >>>>>     >
> >>>>>     > !   _recursions = save      // restore the old recursion count
> >>>>>     > !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> >>>>>     >
> >>>>>     > what is the "deferred relock count"? I gather it relates to
> >>>>>     >
> >>>>>     > "The code was extended to be able to deoptimize objects of a frame that
> >>>>>     > is not the top frame and to let another thread than the owning thread do
> >>>>>     > it."
> >>>>>
> >>>>> Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is replaced
> >>>>> with corresponding interpreter frames. Part of this is relocking objects with eliminated
> >>>>> locking. New with the enhancement is that we do this also just before object references are acquired
> >>>>> through JVMTI. In this case we deoptimize also the owning compiled frame C and we register
> >>>>> deoptimized objects as deferred updates. When control returns to C it gets deoptimized, we notice
> >>>>> that objects are already deoptimized (reallocated and relocked), so we don't do it again (relocking
> >>>>> twice would be incorrect of course). Deferred updates are copied into the new interpreter frames.
> >>>>>
> >>>>> Problem: relocking is not possible if the target thread T is waiting
> >>>>> on the monitor that needs to be
> >>>>> relocked. This happens only with non-local objects with
> >>>>> EliminateNestedLocks. Instead relocking is
> >>>>> deferred until T owns the monitor again. This is what the piece of
> >>>>> code above does.
> >>>>
> >>>> Sorry I need some more detail here. How can you wait() on an object
> >>>> monitor if the object allocation and/or locking was optimised away? And
> >>>> what is a "non-local object" in this context? Isn't EA restricted to
> >>>> thread-confined objects?
> >>>>
> >>>> Is it just that some of the locking gets optimized away e.g.
> >>>>
> >>>> synchronized(obj) {
> >>>>   synchronized(obj) {
> >>>>     synchronized(obj) {
> >>>>       obj.wait();
> >>>>     }
> >>>>   }
> >>>> }
> >>>>
> >>>> If this is reduced to a form as-if it were a single lock of the monitor
> >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
> >>>> escape of "obj" then we need to reconstruct the true lock state, and so
> >>>> when the wait() internally unblocks and reacquires the monitor it has to
> >>>> set the true recursion count to 3, not the 1 that it appeared to be when
> >>>> wait() was initially called. Is that the scenario?
> >>>>
> >>>> If so I find this truly awful. Anyone using wait() in a realistic form
> >>>> requires a notification and so the object cannot be thread confined. In
> >>>> which case I would strongly argue that upon hitting the wait() the deopt
> >>>> should occur unconditionally and so the lock state is correct before we
> >>>> wait and so we don't need to mess with the recursion count internally
> >>>> when we reacquire the monitor.
> >>>>
> >>>>>
> >>>>>     > which I don't like the sound of at all when it comes to ObjectMonitor
> >>>>>     > state. So I'd like to understand in detail exactly what is going on here
> >>>>>     > and why. This is a very intrusive change that seems to badly break
> >>>>>     > encapsulation and impacts future changes to ObjectMonitor that are under
> >>>>>     > investigation.
> >>>>>
> >>>>> I would not regard this as breaking encapsulation. Certainly not badly.
> >>>>>
> >>>>> I've added a property relock_count_after_wait to JavaThread. The property is well
> >>>>> encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free in
> >>>>> choosing a way to do that as long as that property is taken into account. This is hardly a
> >>>>> limitation.
> >>>>
> >>>> I do think this badly breaks encapsulation as you have to add a callout
> >>>> from the guts of the ObjectMonitor code to reach into the thread to get
> >>>> this lock count adjustment. I understand why you have had to do this but
> >>>> I would much rather see a change to the EA optimisation strategy so that
> >>>> this is not needed.
> >>>>
> >>>>> Note also that the property is a straight forward extension of the
> >>>>> existing concept of deferred
> >>>>> local updates. It is embedded into the structure holding them. So not
> >>>>> even the footprint of a
> >>>>> JavaThread is enlarged if no deferred updates are generated.
> >>>>>
> >>>>>     > ---
> >>>>>     >
> >>>>>     > src/hotspot/share/runtime/thread.cpp
> >>>>>     >
> >>>>>     > Can you please explain why JavaThread::wait_for_object_deoptimization
> >>>>>     > has to be handcrafted in this way rather than using proper transitions.
> >>>>>     >
> >>>>>
> >>>>> I wrote wait_for_object_deoptimization taking
> >>>>> JavaThread::java_suspend_self_with_safepoint_check
> >>>>> as template. So in short: for the same reasons :)
> >>>>>
> >>>>> Threads reach both methods as part of thread state transitions,
> >>>>> therefore special handling is
> >>>>> required to change thread state on top of ongoing transitions.
> >>>>>
> >>>>>     > We got rid of "deopt suspend" some time ago and it is disturbing to see
> >>>>>     > it being added back (effectively). This seems like it may be something
> >>>>>     > that handshakes could be used for.
> >>>>>
> >>>>> Deopt suspend used to be something rather different with a similar
> >>>>> name[1]. It is not being added back.
> >>>>
> >>>> I stand corrected. Despite comments in the code to the contrary
> >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
> >>>> cleanup in this area 13 years ago :)
> >>>>
> >>>>>
> >>>>> I'm actually duplicating the existing external suspend mechanism, because a thread can be suspended
> >>>>> at most once. And hey, I don't like that either! But it seems not unlikely that the duplicate can
> >>>>> be removed together with the original and the new type of handshakes that will be used for
> >>>>> thread suspend can be used for object deoptimization too. See today's discussion in JDK-8227745 [2].
> >>>>
> >>>> I hope that discussion bears some fruit, at the moment it seems not to
> >>>> be possible to use handshakes here. :(
> >>>>
> >>>> The external suspend mechanism is a royal pain in the proverbial that we
> >>>> have to carefully live with. The idea that we're duplicating that for
> >>>> use in another fringe area of functionality does not thrill me at all.
> >>>>
> >>>> To be clear, I understand the problem that exists and that you wish to
> >>>> solve, but for the runtime parts I balk at the complexity cost of
> >>>> solving it.
> >>>>
> >>>> Thanks,
> >>>> David
> >>>> -----
> >>>>
> >>>>> Thanks, Richard.
> >>>>>
> >>>>> [1] Deopt suspend was something like an async. handshake for architectures with register windows,
> >>>>>     where patching the return pc for deoptimization of a compiled frame was racy if the owner thread
> >>>>>     was in native code. Instead a "deopt" suspend flag was set on which the thread patched its own
> >>>>>     frame upon return from native. So no thread was suspended. It got its name only from the name of
> >>>>>     the flags.
> >>>>>
> >>>>> [2] Discussion about using handshakes to sync. with the target thread:
> >>>>>
> >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>> Sent: Freitag, 13. Dezember 2019 00:56
> >>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>>>> serviceability-dev at openjdk.java.net;
> >>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>> Performance in the Presence of JVMTI Agents
> >>>>>
> >>>>> Hi Richard,
> >>>>>
> >>>>> Some further queries/concerns:
> >>>>>
> >>>>> src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>
> >>>>> Can you please explain the changes to ObjectMonitor::wait:
> >>>>>
> >>>>> !   _recursions = save      // restore the old recursion count
> >>>>> !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> >>>>>
> >>>>> what is the "deferred relock count"? I gather it relates to
> >>>>>
> >>>>> "The code was extended to be able to deoptimize objects of a frame that
> >>>>> is not the top frame and to let another thread than the owning thread do
> >>>>> it."
> >>>>>
> >>>>> which I don't like the sound of at all when it comes to ObjectMonitor
> >>>>> state. So I'd like to understand in detail exactly what is going on here
> >>>>> and why. This is a very intrusive change that seems to badly break
> >>>>> encapsulation and impacts future changes to ObjectMonitor that are under
> >>>>> investigation.
> >>>>>
> >>>>> ---
> >>>>>
> >>>>> src/hotspot/share/runtime/thread.cpp
> >>>>>
> >>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
> >>>>> has to be handcrafted in this way rather than using proper transitions.
> >>>>>
> >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
> >>>>> it being added back (effectively). This seems like it may be something
> >>>>> that handshakes could be used for.
> >>>>>
> >>>>> Thanks,
> >>>>> David
> >>>>> -----
> >>>>>
> >>>>> On 12/12/2019 7:02 am, David Holmes wrote:
> >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> >>>>>>> Hi David,
> >>>>>>>
> >>>>>>>      > Most of the details here are in areas I can comment on in detail, but I
> >>>>>>>      > did take an initial general look at things.
> >>>>>>>
> >>>>>>> Thanks for taking the time!
> >>>>>>
> >>>>>> Apologies the above should read:
> >>>>>>
> >>>>>> "Most of the details here are in areas I *can't* comment on in detail
> >>>>>> ..."
> >>>>>>
> >>>>>> David
> >>>>>>
> >>>>>>>      > The only thing that jumped out at me is that I think the
> >>>>>>>      > DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>      >
> >>>>>>>      > +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Yes, it should. Will add the method like above.
> >>>>>>>
> >>>>>>>      > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
> >>>>>>>      > active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
> >>>>>>> workload. I will add a minimal test
> >>>>>>> to keep it fresh.
> >>>>>>>
> >>>>>>>      > Also on the tests I don't understand your @requires clause:
> >>>>>>>      >
> >>>>>>>      >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>>      > (vm.opt.TieredCompilation != true))
> >>>>>>>      >
> >>>>>>>      > This seems to require that TieredCompilation is disabled, but tiered is
> >>>>>>>      > our normal mode of operation. ??
> >>>>>>>      >
> >>>>>>>
> >>>>>>> I removed the clause. I guess I wanted to target the tests towards the code they are supposed to
> >>>>>>> test, and it's easier to analyze failures w/o tiered compilation and with just one compiler thread.
> >>>>>>>
> >>>>>>> Additionally I will make use of
> >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Richard.
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
> >>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>>>>>> serviceability-dev at openjdk.java.net;
> >>>>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>>>> Performance in the Presence of JVMTI Agents
> >>>>>>>
> >>>>>>> Hi Richard,
> >>>>>>>
> >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I would like to get reviews please for
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> >>>>>>>>
> >>>>>>>> Corresponding RFE:
> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
> >>>>>>>>
> >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> >>>>>>>>
> >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
> >>>>>>>> change is being tested at SAP since I posted the first RFR some months ago.
> >>>>>>>>
> >>>>>>>> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
> >>>>>>>> agents request capabilities that allow them to access local variable values. E.g. if you start-up
> >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
> >>>>>>>> from the beginning, well before a debugger attaches -- if ever one should do so. With the
> >>>>>>>> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
> >>>>>>>> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
> >>>>>>>> you'll find more details.
> >>>>>>>
> >>>>>>> Most of the details here are in areas I can comment on in detail, but I
> >>>>>>> did take an initial general look at things.
> >>>>>>>
> >>>>>>> The only thing that jumped out at me is that I think the
> >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>
> >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
> >>>>>>> Without
> >>>>>>> active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> Also on the tests I don't understand your @requires clause:
> >>>>>>>
> >>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>> (vm.opt.TieredCompilation != true))
> >>>>>>>
> >>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
> >>>>>>> our normal mode of operation. ??
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Richard.
> >>>>>>>>
> >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>

From ralf.schmelter at sap.com  Fri Mar 13 11:43:08 2020
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 13 Mar 2020 11:43:08 +0000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To: <e8943a91-839b-3601-cc33-77f338aab96e@oracle.com>
References: <AM6PR02MB450135717FBE0EC6172A7E0D9F1C0@AM6PR02MB4501.eurprd02.prod.outlook.com>
 <AM0PR02MB4500CB0F50EEBEFA6400D5009F180@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <b6aa39c9-c044-4ae3-9018-a3bf07330798@oracle.com>
 <01361a9d-2855-db67-a176-73731fada08f@oracle.com>
 <AM0PR02MB4500CE953024FA02ACFFC53A9F1B0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com>
 <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com>
 <d852dfe2-254b-d6c4-089b-13ffce8b8257@oracle.com>
 <AM0PR02MB4500BB43A864A446E094B23E9F100@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <c32fe0b4-af56-b5ec-da51-c720dba9b030@oracle.com>
 <e726a869-cc64-4d22-78f5-c77e702615e6@oss.nttdata.com>
 <AM0PR02MB4500D5AB3290D4CE47E718789F130@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <AM0PR02MB57144A77A755760F8BD4E75B8A120@AM0PR02MB5714.eurprd02.prod.outlook.com>
 <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com>
 <AM0PR02MB5714BFA6B42AF1AA86A3384D8AED0@AM0PR02MB5714.eurprd02.prod.outlook.com>
 <e8943a91-839b-3601-cc33-77f338aab96e@oracle.com>
Message-ID: <AM0PR02MB45007871015FB278E8406F5D9FFA0@AM0PR02MB4500.eurprd02.prod.outlook.com>

Hi,

I have updated the webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.1/

It has the following significant changes:

- The jcmd now uses two separate flags. The -gz flag is now a boolean flag which toggles the compression on/off. And the new -gz-level flag can be used to change the compression level. I tried to change the jlong flag coding to allow the old behavior (only one flag, which acts both as a boolean flag and a jlong flag), but decided against it, since it changes the semantics of a jlong flag. And I don't expect the -gz-level flag to be used all that much.

- I no longer use my own threads. Instead I use the WorkGang returned from CollectedHeap::get_safepoint_workers(). This works fine, apart from Shenandoah GC, which runs into assertions when calling the CollectedHeap::object_iterate() method from a worker thread. I'm not sure if the assertion is too strong, but since the GC is currently experimental, I switch back to single threading in this case (as would be the case for serial GC or epsilon GC). Using the worker threads removes the problems the original code had regarding destruction of the monitor used.

- The reported number of bytes is now the one written to disk.
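
To illustrate the two-flag behavior and the "bytes written to disk" reporting described above, here is a minimal sketch in Java. It is not the HotSpot implementation (which is C++ and lives in the heap dumper); the class and method names are made up for this illustration, and the parameters gz and gzLevel only mirror the -gz and -gz-level flags by analogy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

// Illustrative only: all names here are hypothetical, not part of the webrev.
final class GzipDumpSketch {

    // GZIPOutputStream has no public level setter, so this subclass sets the
    // level on the protected Deflater field 'def' right after construction.
    static final class LeveledGzipStream extends GZIPOutputStream {
        LeveledGzipStream(OutputStream out, int level) throws IOException {
            super(out);
            def.setLevel(level); // valid levels: 0..9, or -1 for the default
        }
    }

    // Writes 'data' either plain or gzipped and returns the number of bytes
    // that actually reached the sink, i.e. the size that would end up on disk.
    static int bytesWritten(byte[] data, boolean gz, int gzLevel) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = gz ? new LeveledGzipStream(sink, gzLevel) : sink) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size();
    }
}
```

Keeping the on/off switch and the level as separate parameters means the level argument is simply ignored when gz is false, which matches the reasoning for not overloading a single flag with both meanings.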

Best regards,
Ralf

-----Original Message-----
From: Ioi Lam <ioi.lam at oracle.com> 
Sent: Dienstag, 25. Februar 2020 18:03
To: Langer, Christoph <christoph.langer at sap.com>; Schmelter, Ralf <ralf.schmelter at sap.com>; Yasumasa Suenaga <suenaga at oss.nttdata.com>; serguei.spitsyn at oracle.com; hotspot-runtime-dev at openjdk.java.net runtime <hotspot-runtime-dev at openjdk.java.net>
Cc: serviceability-dev at openjdk.java.net
Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump

Hi Christoph,

This sounds fair. I will remove my objection :-)

Thanks
- Ioi

From igor.ignatyev at oracle.com  Fri Mar 13 16:26:07 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 13 Mar 2020 09:26:07 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
 <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
Message-ID: <BAAB2AF7-B0C0-4C01-890A-F2F0FC360C42@oracle.com>

Hi Chris,

overall looks good to me, a few comments though:
1. since you removed vm.hasSAandCanAttach from VMProps, you also need to remove it from all TEST.ROOT files which mention it (test/jdk/TEST.ROOT and test/hotspot/jtreg/TEST.ROOT) so people won't be confused by undefined property and jtreg will be able to properly report invalid usages of it if any.

2. in SATestUtils::canAddPrivileges, could you please add some meaningful message to the RuntimeException at L#102?

3. SATestUtils::checkAttachOk method name is somewhat misleading (hence you had to put a comment every time you used it), I'd recommend renaming it to something like skipIfCannotAttach().

4. in SATestUtils::checkAttachOk's javadoc, it would be better to use @throws tag like:
> +    /**
> +     * Checks if SA Attach is expected to work.
> +     * @throws SkippedException if SA Attach is not expected to work.
> +     */


5. it also might make sense to catch IOException within SATestUtils::checkAttachOk and throw it as Error or RuntimeException.
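
A rough sketch of what comments 3 to 5 could look like when combined, assuming nothing about the actual webrev beyond what is quoted in this thread. SkippedException here is a local stand-in so the example compiles on its own, and AttachProbe is a hypothetical interface standing in for whatever checks the real method performs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative only: not the SATestUtils code from the webrev.
final class SkipIfCannotAttachSketch {

    // Local stand-in for the test library's SkippedException.
    static final class SkippedException extends RuntimeException {
        SkippedException(String message) {
            super(message);
        }
    }

    // Hypothetical probe; the real check may perform I/O (e.g. running sudo).
    interface AttachProbe {
        boolean canAttach() throws IOException;
    }

    /**
     * Skips the calling test if SA Attach is not expected to work.
     *
     * @throws SkippedException if SA Attach is not expected to work.
     * @throws UncheckedIOException if the probe itself fails with an I/O error.
     */
    static void skipIfCannotAttach(AttachProbe probe) {
        boolean ok;
        try {
            ok = probe.canAttach();
        } catch (IOException e) {
            // Callers no longer need 'throws IOException' on their test methods.
            throw new UncheckedIOException("Could not determine whether SA Attach can work", e);
        }
        if (!ok) {
            throw new SkippedException("SA attach not expected to work. Insufficient privileges.");
        }
    }
}
```

With the IOException wrapped inside the helper, a test would only need a single call such as skipIfCannotAttach(probe) at the top of main, and the method name itself says what happens when attaching is not possible.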

I've briefly looked at all the changed tests and they look good.

Thanks,
-- Igor 


> On Mar 12, 2020, at 11:06 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
> 
> Hi Serguei,
> 
> Thanks for the review!
> 
> Can I get one more reviewer please?
> 
> thanks,
> 
> Chris
> 
> On 3/12/20 12:06 AM, serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com> wrote:
>> Hi Chris,
>> 
>> 
>> On 3/12/20 00:03, Chris Plummer wrote:
>>> Hi Serguei,
>>> 
>>> That check used to be in Platform.shouldSAAttach(), which essentially was moved to SATestUtils.checkAttachOk() and reworked some. It was necessary in Platform.shouldSAAttach() since that was used to evaluate vm.hasSAandCanAttach (which is now gone). When I moved everything to SATestUtils.checkAttachOk(), I recall thinking it wasn't really necessary since all tests that call it should have @requires vm.hasSA, but left it in anyway just to be extra safe. I'm still inclined to just leave it in, but would not be opposed to removing it.
>> 
>> I agree, it is safer to keep it, at least for now.
>> 
>> 
>> Thanks,
>> Serguei
>> 
>>> thanks,
>>> 
>>> Chris
>>> 
>>> On 3/11/20 11:20 PM, serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com> wrote:
>>>> Hi Chris,
>>>> 
>>>> I've made another pass today.
>>>> It looks good to me.
>>>> 
>>>> I have just one minor question.
>>>> 
>>>> There is some overlap between the requires vm.hasSA check and checkAttachOk:
>>>> +    public static  void checkAttachOk() throws IOException {
>>>> +        if (!Platform.hasSA()) {
>>>> +            throw new SkippedException("SA not supported.");
>>>> +        }
>>>> In the former case, the test is not run but in the latter the SkippedException is thrown.
>>>> As I see, all tests with the checkAttachOk call use requires vm.hasSA as well.
>>>> It can be that the first check "if (!Platform.hasSA())" in the checkAttachOk is redundant.
>>>> It is okay and safer in general but generates a little confusion.
>>>> I'm okay if you don't do anything with this but wanted to know your view.
>>>> 
>>>> Thanks,
>>>> Serguei
>>>> 
>>>> 
>>>> On 3/10/20 18:57, Chris Plummer wrote:
>>>>> On 3/10/20 6:07 PM, serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com> wrote:
>>>>>> Hi Chris,
>>>>>> 
>>>>>> Overall, this looks like the right direction to me, although it is not easy to verify all the details.
>>>>> Yes, there are a lot of tests with quite a few different types of changes. I did a lot of testing and verified that when the tests pass, they pass for the right reasons (really ran the test, skipped due to lack of privileges, or skipped due to running signed on OSX 10.14 or later). I also verified locally running as root, running with a cached sudo, and running without sudo.
>>>>>> I'll make another pass tomorrow. 
>>>>> Thanks!
>>>>>> 
>>>>>> A couple of quick nits so far:
>>>>>> 
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html>
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html>
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html>
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html>
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html>
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html>
>>>>>>  import jdk.test.lib.Utils;
>>>>>> -import jdk.test.lib.Asserts;
>>>>>> +import jdk.test.lib.SA.SATestUtils;
>>>>>> Need to swap these imports.
>>>>>> 
>>>>>> 
>>>>> Ok
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html>
>>>>>>   48         if (SATestUtils.needsPrivileges()) {
>>>>>>   49             cmdStringList = SATestUtils.addPrivileges(cmdStringList);
>>>>>> The method calls are local, so the class name can be omitted in the method names:
>>>>>>   SATestUtils.needsPrivileges and SATestUtils.addPrivileges.
>>>>> Ok
>>>>>> 
>>>>>> 
>>>>>>   94        try {
>>>>>>   95            if (echoProcess.waitFor(60, TimeUnit.SECONDS) == false) {
>>>>>>   96                // Due to using the "-n" option, sudo should complete almost immediately. 60 seconds
>>>>>>   97                // is more than generous. If it didn't complete in that time, something went very wrong.
>>>>>>   98                echoProcess.destroyForcibly();
>>>>>>   99                throw new RuntimeException("Timed out waiting for sudo to execute.");
>>>>>>  100            }
>>>>>>  101         } catch (InterruptedException e) {
>>>>>>  102            throw new RuntimeException(e);
>>>>>>  103         }
>>>>>> The lines 101/103 are misaligned.
>>>>> Ok.
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Serguei
>>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Chris
>>>>>> 
>>>>>> 
>>>>>> On 3/9/20 19:29, Chris Plummer wrote:
>>>>>>> Hi, 
>>>>>>> 
>>>>>>> Please help review the following: 
>>>>>>> 
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238268 <https://bugs.openjdk.java.net/browse/JDK-8238268> 
>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/ <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/> 
>>>>>>> 
>>>>>>> I'll try to give enough background first to make it easier to understand the changes. On OSX you must run SA tests that attach to a live process as root or using sudo. For example: 
>>>>>>> 
>>>>>>>   sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java 
>>>>>>> 
>>>>>>> Whether running as root or under sudo, the check to allow the test to run is done with: 
>>>>>>> 
>>>>>>>     private static boolean canAttachOSX() { 
>>>>>>>           return userName.equals("root"); 
>>>>>>>     } 
>>>>>>> 
>>>>>>> Any test using "@requires vm.hasSAandCanAttach" must pass this check via Platform.shouldSAAttach(), which for OSX returns: 
>>>>>>> 
>>>>>>>              return canAttachOSX() && !isSignedOSX(); 
>>>>>>> 
>>>>>>> So if running as root the "@requires vm.hasSAandCanAttach" passes, otherwise it does not. However, using a root login to run tests is not very desirable, nor is issuing a "sudo make run-test" (any created file ends up with root ownership). Because of this, support was previously added for just running the attaching process using sudo, not the entire test. This was only done for the 20 or so tests that use ClhsdbLauncher. These tests use "@requires vm.hasSA", and then while running the test will do a "sudo" check if canAttachOSX() returns false: 
>>>>>>> 
>>>>>>>         if (!Platform.shouldSAAttach()) { 
>>>>>>>             if (Platform.isOSX()) { 
>>>>>>>                 if (Platform.isSignedOSX()) { 
>>>>>>>                     throw new SkippedException("SA attach not expected to work. JDK is signed."); 
>>>>>>>                 } else if (SATestUtils.canAddPrivileges()) { 
>>>>>>>                     needPrivileges = true; 
>>>>>>>                 } 
>>>>>>>             } 
>>>>>>>             if (!needPrivileges)  { 
>>>>>>>                // Skip the test if we don't have enough permissions to attach 
>>>>>>>                // and cannot add privileges. 
>>>>>>>                throw new SkippedException( 
>>>>>>>                    "SA attach not expected to work. Insufficient privileges."); 
>>>>>>>            } 
>>>>>>>         } 
>>>>>>> 
>>>>>>> So basically it does a runtime check of vm.hasSAandCanAttach, and if it fails then checks if running with sudo will work. This allows for either a passwordless sudo to be used when running clhsdb, or for the user to be prompted for the sudo password (note I've removed support for the latter with my changes). 
>>>>>>> 
>>>>>>> That brings us to the CR that is being fixed. ClhsdbLauncher tests support sudo and will therefore run with our CI testing on OSX, but the 25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and therefore are never run with our CI OSX testing. The changes in this webrev fix that. 
>>>>>>> 
>>>>>>> There are two possible approaches to the fix. One is having the check for sudo be done as part of the vm.hasSAandCanAttach evaluation. The other approach is to do the check in the test at runtime, similar to how ClhsdbLauncher currently does it. This would mean just using "@requires vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach". I chose the latter because there is an advantage to throwing SkippedException rather than just silently skipping the test using @requires. The advantage is that mdash tells you how many tests were skipped, and when you hover over the reason you can see the SkippedException message, which will differentiate between reasons like the JDK was signed or there are insufficient privileges. If all the checking were done by the vm.hasSAandCanAttach evaluation, you would not know why the test wasn't run. 
>>>>>>> 
>>>>>>> The "support" related changes made are all in the following 3 files. The rest of the changes are in the tests: 
>>>>>>> 
>>>>>>> test/jtreg-ext/requires/VMProps.java 
>>>>>>> test/lib/jdk/test/lib/Platform.java 
>>>>>>> test/lib/jdk/test/lib/SA/SATestUtils.java 
>>>>>>> 
>>>>>>> You'll notice that one change I made to the sudo support in SATestUtils.canAddPrivileges() is to make sudo non-interactive, which means no password prompt. So either the user must not require a password, or the credentials must have been cached; otherwise the sudo check will fail. On most platforms, if you execute a sudo command, the credentials are cached for 5 minutes. So if your user is not set up for passwordless sudo, a sudo command can be issued before running the tests, and the credentials will likely remain cached until the test is run. The reason for using non-interactive sudo is that prompting in the middle of running tests can be confusing (you usually walk away once the tests are launched and miss the prompt anyway), and it avoids unnecessary delays in automated testing due to waiting for the password prompt to time out (it used to wait 1 minute). 
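
For readers following along, a non-interactive sudo probe can look roughly like the sketch below. This is not the code in SATestUtils.canAddPrivileges(); the class and method names are hypothetical. The only assumption is that "sudo -n" fails instead of prompting when a password would be required.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
class SudoProbe {
    // Prefix a command with "sudo -n" so that sudo never prompts for a password.
    static List<String> withNonInteractiveSudo(List<String> command) {
        List<String> result = new ArrayList<>();
        result.add("sudo");
        result.add("-n");
        result.addAll(command);
        return result;
    }

    // Returns true only if "sudo -n true" exits with status 0 within the timeout.
    static boolean canSudoWithoutPrompt() {
        ProcessBuilder pb = new ProcessBuilder(withNonInteractiveSudo(List.of("true")));
        pb.redirectErrorStream(true);
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);
        try {
            Process p = pb.start();
            if (!p.waitFor(30, TimeUnit.SECONDS)) {
                p.destroyForcibly();
                return false;
            }
            return p.exitValue() == 0;
        } catch (IOException e) {
            return false; // sudo is missing or cannot be executed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("sudo without prompt: " + canSudoWithoutPrompt());
    }
}
```

A test would call something like canSudoWithoutPrompt() once and skip with a clear message if it returns false, rather than blocking on a prompt.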
>>>>>>> 
>>>>>>> There are essentially 3 types of tests that SA Attach to a process, each needing a slightly different fix: 
>>>>>>> 
>>>>>>> 1. Tests that directly launch a jdk.hotspot.agent class, such as TestClassDump.java. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if needed. They also need to switch from using hasSAandCanAttach to using hasSA. 
>>>>>>> 
>>>>>>> 2. Tests that launch command line tools such as jhsdb. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.createProcessBuilder() to create a process that will be launched using sudo if necessary. They also need to switch from using hasSAandCanAttach to using hasSA. 
>>>>>>> 
>>>>>>> 3. Tests that use ClhsdbLauncher. They already use hasSA instead of hasSAandCanAttach, and rely on ClhsdbLauncher to check at runtime whether attaching will work, so for the most part these tests are unchanged. ClhsdbLauncher was modified to take advantage of the new SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk() APIs. 
>>>>>>> 
>>>>>>> Some tests required special handling: 
>>>>>>> 
>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java 
>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java 
>>>>>>> 
>>>>>>> - These two tests SA Attach to a core file, not to a process, so only need hasSA, 
>>>>>>>   not hasSAandCanAttach. No other changes were needed. 
>>>>>>> 
>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java 
>>>>>>> 
>>>>>>> - The output should never be null. If the test was skipped due to lack of privileges, you 
>>>>>>>   would never get to this section of the test. 
>>>>>>> 
>>>>>>> test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java 
>>>>>>> test/hotspot/jtreg/serviceability/sa/TestIntConstant.java 
>>>>>>> test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java 
>>>>>>> test/hotspot/jtreg/serviceability/sa/TestType.java 
>>>>>>> test/hotspot/jtreg/serviceability/sa/TestUniverse.java 
>>>>>>> 
>>>>>>> - These are ClhsdbLauncher tests, so they should have been using hasSA instead of 
>>>>>>>   hasSAandCanAttach in the first place. No other changes were needed. 
>>>>>>> 
>>>>>>> test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java 
>>>>>>> test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java 
>>>>>>> test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java 
>>>>>>> 
>>>>>>> - These tests used to "@require mac" but seem to run fine on OSX, so I removed this requirement. 
>>>>>>> 
>>>>>>> test/jdk/sun/tools/jhsdb/BasicLauncherTest.java 
>>>>>>> 
>>>>>>> - This test had a runtime check to not run on OSX due to not having core file stack 
>>>>>>>   walking support. However, this test always attaches to a process, not a core file, 
>>>>>>>   and seems to run just fine on OSX. 
>>>>>>> 
>>>>>>> test/jdk/sun/tools/jstack/DeadlockDetectionTest.java 
>>>>>>> 
>>>>>>> - I changed the test to throw a SkippedException if it gets the unexpected error code 
>>>>>>>   rather than just println. 
>>>>>>> 
>>>>>>> And a few other miscellaneous changes not already covered: 
>>>>>>> 
>>>>>>> test/lib/jdk/test/lib/Platform.java 
>>>>>>> - Made canPtraceAttachLinux() public so it can be called from SATestUtils. 
>>>>>>> - vm.hasSAandCanAttach is now gone. 
>>>>>>> 
>>>>>>> thanks, 
>>>>>>> 
>>>>>>> Chris 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 


From pavel.rappo at oracle.com  Fri Mar 13 16:26:54 2020
From: pavel.rappo at oracle.com (Pavel Rappo)
Date: Fri, 13 Mar 2020 16:26:54 +0000
Subject: RFR: [small, docs] JDK-8240971 Fix CSS styles in some doc comments
In-Reply-To: <a70477b4-36f4-4642-867e-c76218ade93c@oracle.com>
References: <a70477b4-36f4-4642-867e-c76218ade93c@oracle.com>
Message-ID: <2216FA2C-C369-4FF1-B0D2-800D3AD59B1B@oracle.com>

This is really nice. Incidentally, it also makes

  https://bugs.openjdk.java.net/browse/JDK-8234395

less relevant.

-Pavel

> On 12 Mar 2020, at 20:50, Jonathan Gibbons <jonathan.gibbons at oracle.com> wrote:
> 
> Please review a simple fix regarding the non-standard use of some CSS in some doc comments.
> 
> From the JBS Description:
> 
> Recently, for the display of javadoc block tags, javadoc changed from using an inconsistent set of CSS class names on the generated 'dt' elements to using a single new name ("notes") on the enclosing 'dl' element.
> 
> There are a few (4) places in the main JDK code where the old-style names were used explicitly in doc comments, in order to emulate the appearance of a list of block tags. These use-sites should be fixed up. They are in the following files:
> 
> open/src/java.base/share/classes/module-info.java
> open/src/java.se/share/classes/module-info.java
> open/src/java.management.rmi/share/classes/module-info.java
> open/src/jdk.jconsole/share/classes/module-info.java
> 
> In addition, these four files used the style attribute to force the font to be used. The font is now set in the standard CSS for "notes", and so the local use of a "style" attribute is no longer necessary.
> 
> -- Jon
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8240971
> Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html
> API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html
> 


From jonathan.gibbons at oracle.com  Fri Mar 13 16:34:04 2020
From: jonathan.gibbons at oracle.com (Jonathan Gibbons)
Date: Fri, 13 Mar 2020 09:34:04 -0700
Subject: RFR: [small, docs] JDK-8240971 Fix CSS styles in some doc comments
In-Reply-To: <2216FA2C-C369-4FF1-B0D2-800D3AD59B1B@oracle.com>
References: <a70477b4-36f4-4642-867e-c76218ade93c@oracle.com>
 <2216FA2C-C369-4FF1-B0D2-800D3AD59B1B@oracle.com>
Message-ID: <158d778e-c70d-b5df-b8ab-2296fea609ac@oracle.com>

At some point, we should separate JDK-specific definitions from 
javadoc-general definitions, using a separate stylesheet.

-- Jon

On 3/13/20 9:26 AM, Pavel Rappo wrote:
> This is really nice. Incidentally, it also makes
>
>    https://bugs.openjdk.java.net/browse/JDK-8234395
>
> less relevant.
>
> -Pavel
>
>> On 12 Mar 2020, at 20:50, Jonathan Gibbons <jonathan.gibbons at oracle.com> wrote:
>>
>> Please review a simple fix regarding the non-standard use of some CSS in some doc comments.
>>
>>  From the JBS Description:
>>
>> Recently, for the display of javadoc block tags, javadoc changed from using an inconsistent set of CSS class names on the generated 'dt' elements to using a single new name ("notes") on the enclosing 'dl' element.
>>
>> There are a few (4) places in the main JDK code where the old-style names were used explicitly in doc comments, in order to emulate the appearance of a list of block tags. These use-sites should be fixed up. They are in the following files:
>>
>> open/src/java.base/share/classes/module-info.java
>> open/src/java.se/share/classes/module-info.java
>> open/src/java.management.rmi/share/classes/module-info.java
>> open/src/jdk.jconsole/share/classes/module-info.java
>>
>> In addition, these four files used the style attribute to force the font to be used. The font is now set in the standard CSS for "notes", and so the local use of a "style" attribute is no longer necessary.
>>
>> -- Jon
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8240971
>> Webrev: http://cr.openjdk.java.net/~jjg/8240971/webrev.00/index.html
>> API: http://cr.openjdk.java.net/~jjg/8240971/api.00/index.html
>>

From daniil.x.titov at oracle.com  Fri Mar 13 22:05:11 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 13 Mar 2020 15:05:11 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
Message-ID: <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>

Hi Yasumasa, Serguei and Alex,

Please review a new version of the webrev that includes the changes Yasumasa suggested.

> Shutdown hook is already registered in c'tor of HotSpotAgent.
>    It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.

The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression.

    public HotSpotAgent() {
        // for non-server add shutdown hook to clean-up debugger in case
        // of forced exit. For remote server, shutdown hook is added by
        // DebugServer.
        Runtime.getRuntime().addShutdownHook(new java.lang.Thread(
        new Runnable() {
            public void run() {
                synchronized (HotSpotAgent.this) {
                    if (!isServer) {
                        detach();
                    }
                }
            }
        }));
    }
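
For comparison, a server-side shutdown hook written with a lambda can be sketched as follows. This is only an illustration of the pattern, not the code in the webrev: FakeServer and the method names are hypothetical stand-ins.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustration only; none of these names come from SALauncher or HotSpotAgent.
class ServerShutdownHookSketch {
    // Hypothetical stand-in for a debug server that must be shut down on exit.
    static class FakeServer {
        private final AtomicBoolean running = new AtomicBoolean(true);
        void shutdown() { running.set(false); }
        boolean isRunning() { return running.get(); }
    }

    // The lambda replaces the anonymous Runnable used in the older style.
    static Runnable shutdownAction(FakeServer server) {
        return () -> server.shutdown();
    }

    // Registers the hook with the runtime and returns it so it can be removed later.
    static Thread register(FakeServer server) {
        Thread hook = new Thread(shutdownAction(server), "server-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        FakeServer server = new FakeServer();
        register(server);
        System.out.println("hook registered, server running: " + server.isRunning());
        // On JVM exit the hook thread runs and calls server.shutdown().
    }
}
```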

>>    Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains 
>> `exclusiveAccess.dirs=.` to avoid concurrent execution
As I understand it, exclusiveAccess.dirs only prevents the tests located in this directory from being run simultaneously; other tests could still run in parallel with one of them. Thus I would prefer to keep the retry mechanism in place. I simplified the code by using class variables instead of local arrays.
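
To make the retry idea concrete, here is a minimal sketch of retrying only on "port already in use". It is not the code in SADebugDTest; the class and method names are hypothetical, and a plain ServerSocket stands in for the debug server.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.function.IntSupplier;

// Illustration only; a ServerSocket stands in for the process under test.
class PortRetrySketch {
    // Tries candidate ports from the supplier, retrying only when a port is in use.
    static ServerSocket bindWithRetry(IntSupplier nextPort, int maxAttempts) throws IOException {
        BindException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = nextPort.getAsInt();
            try {
                return new ServerSocket(port);
            } catch (BindException e) {
                last = e; // port already in use, try the next candidate
            }
        }
        throw new IOException("No free port after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket busy = new ServerSocket(0)) {
            int busyPort = busy.getLocalPort();
            int[] candidates = { busyPort, 0 }; // 0 lets the OS choose a free port
            int[] index = { 0 };                // one-element array so the lambda can update it
            try (ServerSocket s = bindWithRetry(() -> candidates[index[0]++], 2)) {
                System.out.println("busy port " + busyPort + ", bound to " + s.getLocalPort());
            }
        }
    }
}
```

The one-element index array in main() is the local-array workaround for lambda capture discussed earlier in the thread; a class field would serve the same purpose.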

Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/
[2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
[3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751 

Thank you,
Daniil

On 3/6/20, 6:15 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:

    Hi Daniil,
    
    On 2020/03/07 3:38, Daniil Titov wrote:
    > Hi Yasumasa,
    > 
    >   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
    > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side effect (it throws an exception if parameters are incorrect) and ignores its return value might look confusing. Thus, I found it more appropriate to wrap it inside a method with a more relevant name.
    
    OK, but I would prefer that you add a comment explaining it.
    
    
    >   > SADebugDTest
    >   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
    > We cannot use primitives there since these local variables are captured in a lambda expression and are required to be final.
    > The other option is to use some other wrapper for them, but I don't see any obvious benefit in it compared to the array.
    
    Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
    If you drop this error check, the test code becomes simpler.
    
    
    > I will include your other suggestion in the new version of the webrev.
    
    Sorry, I have one more comment:
    
    >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
    
    A shutdown hook is already registered in the c'tor of HotSpotAgent.
    It works the same as shutdownServer(), so I think the shutdown hook in SALauncher is not needed.
    
    
    Thanks,
    
    Yasumasa
    
    
    > Thanks!
    > Daniil
    > 
    > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    > 
    >      Hi Daniil,
    >      
    >      
    >      - SALauncher.java
    >           - Is checkBasicOptions() needed? I think you can remove this method and embed it in the caller.
    >           - I think registryPort should be checked with Integer.parseInt() like the others (rmiPort and pid) rather than with a regex.
    >           - The shutdown hook is a very good idea. You can implement it more simply if you use a lambda expression.
    >      
    >      - SADebugDTest.java
    >           - Please add bug ID to @bug.
    >           - Why do you declare portInUse and testResult as arrays? Their length is 1, so I think you don't need to use an array.
    >      
    >      
    >      Thanks,
    >      
    >      Yasumasa
    >      
    >      
    >      On 2020/03/06 10:15, Daniil Titov wrote:
    >      > Hi Yasumasa, Serguei and Alex,
    >      >
    >      > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
    >      > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
    >      > last two settings can be specified using system properties, but system properties have the following disadvantages
    >      > compared to command line options:
    >      >     - It's hard to know about them: they are not listed in the tool's help.
    >      >     - They have long names that are hard to remember.
    >      >     - It is easy to mistype them on the command line, and you will not get any warning about it.
    >      >
    >      > The CSR [2] was also updated and needs to be reviewed.
    >      >
    >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >
    >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
    >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >
    >      > Thank you,
    >      > Daniil
    >      >
    >      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >
    >      >      Hi Daniil,
    >      >
    >      >         - SALauncher::buildAttachArgs not only builds arguments but also checks their consistency.
    >      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
    >      >
    >      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
    >      >           But you can use the same port number as the RMI registry (1099).
    >      >           It is the same as the relation between jmxremote.port and jmxremote.rmi.port.
    >      >
    >      >
    >      >      Thanks,
    >      >
    >      >      Yasumasa
    >      >
    >      >
    >      >      On 2020/02/24 13:21, Daniil Titov wrote:
    >      >      > Please review a change that adds a new command line option to the jhsdb tool for the debugd mode to specify an RMI connector port.
    >      >      > Currently a random port is used, which prevents the debug server from being used behind a firewall or in a container.
    >      >      >
    >      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
    >      >      >
    >      >      > Man pages for jhsdb will be updated in a separate issue.
    >      >      >
    >      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
    >      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
    >      >      >
    >      >      >                // delegate to the actual SA debug server.
    >      >      >                DebugServer.main(newArgArray.toArray(new String[0]));
    >      >      >
    >      >      > However, sun.jvm.hotspot.DebugServer doesn't support named options, and that prevents new options from being added to the tool efficiently.
    >      >      > I found it more suitable to start the HotSpot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher
    >      >      > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as deprecated,
    >      >      > but I would prefer to address it in a separate issue.
    >      >      >
    >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >      >      >                  container  and connecting  to it with the GUI debugger.
    >      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >      >
    >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
    >      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >      >
    >      >      > Thank you,
    >      >      > Daniil
    >      >      >
    >      >      >
    >      >
    >      >
    >      >
    >      
    > 
    > 
    

From suenaga at oss.nttdata.com  Sat Mar 14 01:35:35 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 14 Mar 2020 10:35:35 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
Message-ID: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>

Hi all,

Please review this change:

   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/

JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
However, some errors have been seen intermittently since then.

I investigated the cause and found two concerns:

   A: There is no range check on the buffer (.eh_frame section data)
   B: The language personality routine and Language Specific Data Area (LSDA) are not considered

I added a range check for .eh_frame processing, and the personality routine and LSDA are ignored in this webrev.
I also added bailout code for the case where DWARF processing fails due to these concerns.
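
To illustrate the kind of range check and bailout meant here, below is a small Java sketch of a bounded reader for ULEB128 values, one of the encodings used in DWARF data. It is not the code in the webrev (the actual parser is native code); the class name and API are hypothetical.

```java
import java.util.OptionalLong;

// Illustration only: a reader that refuses to read past the end of its buffer range.
class BoundedUleb128Sketch {
    private final byte[] buf;
    private int pos;
    private final int end; // exclusive upper bound of the readable range

    BoundedUleb128Sketch(byte[] buf, int start, int end) {
        if (start < 0 || end > buf.length || start > end) {
            throw new IllegalArgumentException("bad range");
        }
        this.buf = buf;
        this.pos = start;
        this.end = end;
    }

    // Reads one ULEB128 value; returns empty instead of reading past 'end'.
    OptionalLong readUleb128() {
        long result = 0;
        int shift = 0;
        while (true) {
            if (pos >= end || shift >= 64) {
                return OptionalLong.empty(); // truncated or oversized value: bail out
            }
            int b = buf[pos++] & 0xff;
            result |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return OptionalLong.of(result);
            }
            shift += 7;
        }
    }

    public static void main(String[] args) {
        byte[] data = { (byte) 0xE5, (byte) 0x8E, 0x26 };
        System.out.println(new BoundedUleb128Sketch(data, 0, 3).readUleb128());
        System.out.println(new BoundedUleb128Sketch(data, 0, 2).readUleb128());
    }
}
```

The caller treats an empty result as "stop DWARF processing for this frame" rather than continuing with garbage.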

This change has passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671).
I also tested it on my Fedora 31 box and in an Oracle Linux 7.7 container.


Thanks,

Yasumasa

From suenaga at oss.nttdata.com  Sat Mar 14 02:23:37 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 14 Mar 2020 11:23:37 +0900
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
 <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
Message-ID: <c00b118a-8619-3984-10d7-63134d7210a0@oss.nttdata.com>

Hi Daniil,

On 2020/03/14 7:05, Daniil Titov wrote:
> Hi Yasumasa, Serguei and Alex,
> 
> Please review a new version of the webrev that includes the changes Yasumasa suggested.
> 
>> Shutdown hook is already registered in c'tor of HotSpotAgent.
>>     It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
> 
> The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
> the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression.
> 
>     public HotSpotAgent() {
>         // for non-server add shutdown hook to clean-up debugger in case
>         // of forced exit. For remote server, shutdown hook is added by
>         // DebugServer.
>         Runtime.getRuntime().addShutdownHook(new java.lang.Thread(
>         new Runnable() {
>             public void run() {
>                 synchronized (HotSpotAgent.this) {
>                     if (!isServer) {
>                         detach();
>                     }
>                 }
>             }
>         }));
>     }

I missed it, thanks!


>>>     Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains
>>> `exclusiveAccess.dirs=.` to avoid concurrent execution
> As I understand it, exclusiveAccess.dirs only prevents the tests located in this directory from being run simultaneously; other tests could still run in parallel with one of them. Thus I would prefer to keep the retry mechanism in place. I simplified the code by using class variables instead of local arrays.

OK, but I think it might be simpler with TestLibrary.
For example, can you use TestLibrary::getUnusedRandomPort? It is used in test/jdk/java/rmi/testlibrary/RMID.java.
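
The general idea of asking for an unused port can be sketched as below. This is not the implementation of TestLibrary::getUnusedRandomPort; it is a hypothetical helper that lets the OS pick a free port by binding to port 0.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Illustration only; not taken from the RMI test library.
class UnusedPortSketch {
    // Binds to port 0 so the OS assigns a free port, then releases it and returns the number.
    static int findUnusedPort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unused port: " + findUnusedPort());
    }
}
```

Another process can still grab the port between the close and the later bind, so a helper like this reduces collisions but does not fully replace a retry.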


Thanks,

Yasumasa


> Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> 
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/
> [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
> [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751
> 
> Thank you,
> Daniil
> 
> On 3/6/20, 6:15 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
> 
>      Hi Daniil,
>      
>      On 2020/03/07 3:38, Daniil Titov wrote:
>      > Hi Yasumasa,
>      >
>      >   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
>      > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side effect (it throws an exception if parameters are incorrect) and ignores its return value might look confusing. Thus, I found it more appropriate to wrap it inside a method with a more relevant name.
>      
>      OK, but I would prefer that you add a comment explaining it.
>      
>      
>      >   > SADebugDTest
>      >   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
>      > We cannot use primitives there since these local variables are captured in a lambda expression and are required to be final.
>      > The other option is to use some other wrapper for them, but I don't see any obvious benefit in it compared to the array.
>      
>      Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
>      If you drop this error check, the test code becomes simpler.
>      
>      
>      > I will include your other suggestion in the new version of the webrev.
>      
>      Sorry, I have one more comment:
>      
>      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
>      
>      A shutdown hook is already registered in the c'tor of HotSpotAgent.
>      It works the same as shutdownServer(), so I think the shutdown hook in SALauncher is not needed.
>      
>      
>      Thanks,
>      
>      Yasumasa
>      
>      
>      > Thanks!
>      > Daniil
>      >
>      > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >
>      >      Hi Daniil,
>      >
>      >
>      >      - SALauncher.java
>      >           - Is checkBasicOptions() needed? I think you can remove this method and embed it in the caller.
>      >           - I think registryPort should be checked with Integer.parseInt() like the others (rmiPort and pid) rather than with a regex.
>      >           - The shutdown hook is a very good idea. You can implement it more simply if you use a lambda expression.
>      >
>      >      - SADebugDTest.java
>      >           - Please add bug ID to @bug.
>      >           - Why do you declare portInUse and testResult as arrays? Their length is 1, so I think you don't need to use an array.
>      >
>      >
>      >      Thanks,
>      >
>      >      Yasumasa
>      >
>      >
>      >      On 2020/03/06 10:15, Daniil Titov wrote:
>      >      > Hi Yasumasa, Serguei and Alex,
>      >      >
>      >      > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
>      >      > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
>      >      > last two settings can be specified using system properties, but system properties have the following disadvantages
>      >      > compared to command line options:
>      >      >     - It's hard to know about them: they are not listed in the tool's help.
>      >      >     - They have long names that are hard to remember.
>      >      >     - It is easy to mistype them on the command line, and you will not get any warning about it.
>      >      >
>      >      > The CSR [2] was also updated and needs to be reviewed.
>      >      >
>      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >
>      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
>      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >
>      >      > Thank you,
>      >      > Daniil
>      >      >
>      >      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >      >
>      >      >      Hi Daniil,
>      >      >
>      >      >         - SALauncher::buildAttachArgs not only builds arguments but also checks their consistency.
>      >      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
>      >      >
>      >      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
>      >      >           But you can use the same port number as the RMI registry (1099).
>      >      >           It is the same as the relation between jmxremote.port and jmxremote.rmi.port.
>      >      >
>      >      >
>      >      >      Thanks,
>      >      >
>      >      >      Yasumasa
>      >      >
>      >      >
>      >      >      On 2020/02/24 13:21, Daniil Titov wrote:
>      >      >      > Please review a change that adds a new command line option to the jhsdb tool for the debugd mode to specify an RMI connector port.
>      >      >      > Currently a random port is used, which prevents the debug server from being used behind a firewall or in a container.
>      >      >      >
>      >      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
>      >      >      >
>      >      >      > Man pages for jhsdb will be updated in a separate issue.
>      >      >      >
>      >      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
>      >      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
>      >      >      >
>      >      >      >                // delegate to the actual SA debug server.
>      >      >      >                DebugServer.main(newArgArray.toArray(new String[0]));
>      >      >      >
>      >      >      > However, sun.jvm.hotspot.DebugServer doesn't support named options, and that prevents new options from being added to the tool efficiently.
>      >      >      > I found it more suitable to start the HotSpot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher
>      >      >      > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as deprecated,
>      >      >      > but I would prefer to address it in a separate issue.
>      >      >      >
>      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      >      >                  container  and connecting  to it with the GUI debugger.
>      >      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >      >
>      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
>      >      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      >      >
>      >      >      > Thank you,
>      >      >      > Daniil
>      >      >      >
>      >      >      >
>      >      >
>      >      >
>      >      >
>      >
>      >
>      >
>      
> 
> 

From chris.plummer at oracle.com  Sun Mar 15 23:35:16 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Sun, 15 Mar 2020 16:35:16 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <BAAB2AF7-B0C0-4C01-890A-F2F0FC360C42@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
 <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
 <BAAB2AF7-B0C0-4C01-890A-F2F0FC360C42@oracle.com>
Message-ID: <f567ce96-4d34-8888-e74c-58e7ab64055f@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200315/1f1c280c/attachment-0001.htm>

From igor.ignatyev at oracle.com  Sun Mar 15 23:49:28 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Sun, 15 Mar 2020 16:49:28 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <f567ce96-4d34-8888-e74c-58e7ab64055f@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
 <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
 <BAAB2AF7-B0C0-4C01-890A-F2F0FC360C42@oracle.com>
 <f567ce96-4d34-8888-e74c-58e7ab64055f@oracle.com>
Message-ID: <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com>

Hi Chris,

looks good, thanks!

one minor nit, in SATestUtils::skipIfCannotAttach, you have two exception messages which start with 'SA attach not expected to work.', and one w/ 'SA Attach not expected to work.' (w/ Attach instead of attach), it'd be nicer to have them uniform. 
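
One way to keep such messages uniform is to build them all from a single shared prefix. The following is only a minimal sketch of that idea; the class name AttachMessages and its members are hypothetical and are not taken from the webrev or from SATestUtils.

```java
// Hypothetical helper, not part of SATestUtils: every skip message is
// built from one constant so the capitalization cannot drift.
final class AttachMessages {
    static final String PREFIX = "SA attach not expected to work. ";

    private AttachMessages() {}

    // Returns the shared prefix followed by the specific reason.
    static String skipMessage(String reason) {
        return PREFIX + reason;
    }
}
```

A caller would then write something like `throw new SkippedException(AttachMessages.skipMessage("JDK is signed."))`, assuming the existing SkippedException(String) constructor.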

Cheers,
-- Igor

> On Mar 15, 2020, at 4:35 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
> 
> Hi Igor,
> 
> Thanks for the review. Here's an updated webrev with all of the suggestions from you and Serguei:
> 
> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.01/index.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.01/index.html>
> 
> Also some comments inline below.
> 
> On 3/13/20 9:26 AM, Igor Ignatyev wrote:
>> Hi Chris,
>> 
>> overall looks good to me, a few comments though:
>> 1. since you removed vm.hasSAandCanAttach from VMProps, you also need to remove it from all TEST.ROOT files which mention it (test/jdk/TEST.ROOT and test/hotspot/jtreg/TEST.ROOT) so people won't be confused by undefined property and jtreg will be able to properly report invalid usages of it if any.
> Ok, but it's unclear to me what requires.properties is even for, and what is the impact of extra or missing properties. What kind of test would catch these errors?
jtreg uses 'requires.properties' as a list of extra variables for @requires expressions. If a test uses a name which isn't in 'requires.properties' (or otherwise known to jtreg), jtreg won't execute the test and will set its status to Error w/ an 'invalid name ...' message.

>> 
>> 2. in SATestUtils::canAddPrivileges, could you please add some meaningful message to the RuntimeException at L#102?
>> 
> Ok.
> 
>            throw new RuntimeException("sudo process interrupted", e);
> 
>> 3. The SATestUtils::checkAttachOk method name is somewhat misleading (hence you had to put a comment every time you used it), I'd recommend renaming it to something like skipIfCannotAttach().
> Ok, but I still left the comment in place.
>> 
>> 4. in SATestUtils::checkAttachOk's javadoc, it would be better to use @throws tag like:
>>> +    /**
>>> +     * Checks if SA Attach is expected to work.
>>> +     * @throws SkippedException if SA Attach is not expected to work.
>>> +     */
>> 
>> 
> Ok.
>> 5. it also might make sense to catch IOException within SATestUtils::checkAttachOk and throw it as Error or RuntimeException.
>> 
> Ok.
>> I've briefly looked at all the changed tests and they look good.
> 
> Thanks!
> 
> Chris
>> 
>> Thanks,
>> -- Igor 
>> 
>> 
>>> On Mar 12, 2020, at 11:06 PM, Chris Plummer <chris.plummer at oracle.com <mailto:chris.plummer at oracle.com>> wrote:
>>> 
>>> Hi Serguei,
>>> 
>>> Thanks for the review!
>>> 
>>> Can I get one more reviewer please?
>>> 
>>> thanks,
>>> 
>>> Chris
>>> 
>>> On 3/12/20 12:06 AM, serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com> wrote:
>>>> Hi Chris,
>>>> 
>>>> 
>>>> On 3/12/20 00:03, Chris Plummer wrote:
>>>>> Hi Serguei,
>>>>> 
>>>>> That check used to be in Platform.shouldSAAttach(), which essentially was moved to SATestUtils.checkAttachOk() and reworked some. It was necessary in Platform.shouldSAAttach() since that was used to evaluate vm.hasSAandCanAttach (which is now gone). When I moved everything to SATestUtils.checkAttachOk(), I recall thinking it wasn't really necessary since all tests that call it should have @requires vm.hasSA, but left it in anyway just to be extra safe. I'm still inclined to just leave it in, but would not be opposed to removing it.
>>>> 
>>>> I agree, it is safer to keep it, at least for now.
>>>> 
>>>> 
>>>> Thanks,
>>>> Serguei
>>>> 
>>>>> thanks,
>>>>> 
>>>>> Chris
>>>>> 
>>>>> On 3/11/20 11:20 PM, serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com> wrote:
>>>>>> Hi Chris,
>>>>>> 
>>>>>> I've made another pass today.
>>>>>> It looks good to me.
>>>>>> 
>>>>>> I have just one minor question.
>>>>>> 
>>>>>> There is some overlap between the requires vm.hasSA check and checkAttachOk:
>>>>>> +    public static  void checkAttachOk() throws IOException {
>>>>>> +        if (!Platform.hasSA()) {
>>>>>> +            throw new SkippedException("SA not supported.");
>>>>>> +        }
>>>>>> In the former case, the test is not run but in the latter the SkippedException is thrown.
>>>>>> As I see, all tests with the checkAttachOk call use requires vm.hasSA as well.
>>>>>> It may be that the first check "if (!Platform.hasSA())" in checkAttachOk is redundant.
>>>>>> It is okay and safer in general, but it causes a little confusion.
>>>>>> I'm okay if you don't do anything with this but wanted to know your view.
>>>>>> 
>>>>>> Thanks,
>>>>>> Serguei
>>>>>> 
>>>>>> 
>>>>>> On 3/10/20 18:57, Chris Plummer wrote:
>>>>>>> On 3/10/20 6:07 PM, serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com> wrote:
>>>>>>>> Hi Chris,
>>>>>>>> 
>>>>>>>> Overall, this looks like the right direction to me, although it is not easy to verify all the details.
>>>>>>> Yes, there are a lot of tests with quite a few different types of changes. I did a lot of testing and verified that when the tests pass, they pass for the right reasons (really ran the test, skipped due to lack of privileges, or skipped due to running signed on OSX 10.14 or later). I also verified locally running as root, running with a cached sudo, and running without sudo.
>>>>>>>> I'll make another pass tomorrow. 
>>>>>>> Thanks!
>>>>>>>> 
>>>>>>>> A couple of quick nits so far:
>>>>>>>> 
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html>
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html>
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html>
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html>
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html>
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html>
>>>>>>>>  import jdk.test.lib.Utils;
>>>>>>>> -import jdk.test.lib.Asserts;
>>>>>>>> +import jdk.test.lib.SA.SATestUtils;
>>>>>>>> Need to swap these imports.
>>>>>>>> 
>>>>>>>> 
>>>>>>> Ok
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html>
>>>>>>>>   48         if (SATestUtils.needsPrivileges()) {
>>>>>>>>   49             cmdStringList = SATestUtils.addPrivileges(cmdStringList);
>>>>>>>> The method calls are local, so the class name can be omitted in the method names:
>>>>>>>>   SATestUtils.needsPrivileges and SATestUtils.addPrivileges.
>>>>>>> Ok
>>>>>>>> 
>>>>>>>> 
>>>>>>>>   94        try {
>>>>>>>>   95            if (echoProcess.waitFor(60, TimeUnit.SECONDS) == false) {
>>>>>>>>   96                // Due to using the "-n" option, sudo should complete almost immediately. 60 seconds
>>>>>>>>   97                // is more than generous. If it didn't complete in that time, something went very wrong.
>>>>>>>>   98                echoProcess.destroyForcibly();
>>>>>>>>   99                throw new RuntimeException("Timed out waiting for sudo to execute.");
>>>>>>>>  100            }
>>>>>>>>  101         } catch (InterruptedException e) {
>>>>>>>>  102            throw new RuntimeException(e);
>>>>>>>>  103         }
>>>>>>>> The lines 101/103 are misaligned.
>>>>>>> Ok.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Chris
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 3/9/20 19:29, Chris Plummer wrote:
>>>>>>>>> Hi, 
>>>>>>>>> 
>>>>>>>>> Please help review the following: 
>>>>>>>>> 
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238268 <https://bugs.openjdk.java.net/browse/JDK-8238268> 
>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/ <http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/> 
>>>>>>>>> 
>>>>>>>>> I'll try to give enough background first to make it easier to understand the changes. On OSX you must run SA tests that attach to a live process as root or using sudo. For example: 
>>>>>>>>> 
>>>>>>>>>   sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java 
>>>>>>>>> 
>>>>>>>>> Whether running as root or under sudo, the check to allow the test to run is done with: 
>>>>>>>>> 
>>>>>>>>>     private static boolean canAttachOSX() { 
>>>>>>>>>           return userName.equals("root"); 
>>>>>>>>>     } 
>>>>>>>>> 
>>>>>>>>> Any test using "@requires vm.hasSAandCanAttach" must pass this check via Platform.shouldSAAttach(), which for OSX returns: 
>>>>>>>>> 
>>>>>>>>>              return canAttachOSX() && !isSignedOSX(); 
>>>>>>>>> 
>>>>>>>>> So if running as root the "@requires vm.hasSAandCanAttach" passes, otherwise it does not. However, using a root login to run tests is not very desirable, nor is issuing a "sudo make run-test" (any created file ends up with root ownership). Because of this, support was previously added for just running the attaching process using sudo, not the entire test. This was only done for the 20 or so tests that use ClhsdbLauncher. These tests use "@requires vm.hasSA", and then while running the test will do a "sudo" check if canAttachOSX() returns false: 
>>>>>>>>> 
>>>>>>>>>         if (!Platform.shouldSAAttach()) { 
>>>>>>>>>             if (Platform.isOSX()) { 
>>>>>>>>>                 if (Platform.isSignedOSX()) { 
>>>>>>>>>                     throw new SkippedException("SA attach not expected to work. JDK is signed."); 
>>>>>>>>>                 } else if (SATestUtils.canAddPrivileges()) { 
>>>>>>>>>                     needPrivileges = true; 
>>>>>>>>>                 } 
>>>>>>>>>             } 
>>>>>>>>>             if (!needPrivileges)  { 
>>>>>>>>>                // Skip the test if we don't have enough permissions to attach 
>>>>>>>>>                // and cannot add privileges. 
>>>>>>>>>                throw new SkippedException( 
>>>>>>>>>                    "SA attach not expected to work. Insufficient privileges."); 
>>>>>>>>>            } 
>>>>>>>>>         } 
>>>>>>>>> 
>>>>>>>>> So basically it does a runtime check of vm.hasSAandCanAttach, and if it fails then checks if running with sudo will work. This allows for either a passwordless sudo to be used when running clhsdb, or for the user to be prompted for the sudo password (note I've removed support for the latter with my changes). 
>>>>>>>>> 
>>>>>>>>> That brings us to the CR that is being fixed. ClhsdbLauncher tests support sudo and will therefore run with our CI testing on OSX, but the 25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and therefore are never run with our CI OSX testing. The changes in this webrev fix that. 
>>>>>>>>> 
>>>>>>>>> There are two possible approaches to the fix. One is having the check for sudo be done as part of the vm.hasSAandCanAttach evaluation. The other approach is to do the check in the test at runtime, similar to how ClhsdbLauncher currently does. This would mean just using "@requires vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach". I chose the latter because there is an advantage to throwing SkippedException rather than just silently skipping the test using @requires. The advantage is that mdash tells you how many tests were skipped, and when you hover over the reason you can see the SkippedException message, which will differentiate between reasons like the JDK was signed or there are insufficient privileges. If all the checking was done by the vm.hasSAandCanAttach evaluation, you would not know why the test wasn't run. 
>>>>>>>>> 
>>>>>>>>> The "support" related changes made are all in the following 3 files. The rest of the changes are in the tests: 
>>>>>>>>> 
>>>>>>>>> test/jtreg-ext/requires/VMProps.java 
>>>>>>>>> test/lib/jdk/test/lib/Platform.java 
>>>>>>>>> test/lib/jdk/test/lib/SA/SATestUtils.java 
>>>>>>>>> 
>>>>>>>>> You'll notice that one change I made to the sudo support in SATestUtils.canAddPrivileges() is to make sudo non-interactive, which means no password prompt. So that means either the user does not require a password, or the credentials have been cached. Otherwise the sudo check will fail. On most platforms if you execute a sudo command, the credentials are cached for 5 minutes. So if your user is not set up for passwordless sudo, then a sudo command can be issued before running the tests, and will likely remain cached until the test is run. The reason for going non-interactive is that prompting in the middle of running tests can be confusing (you usually walk away once the tests are launched and miss the prompt anyway), and it avoids unnecessary delays in automated testing due to waiting for the password prompt to time out (it used to wait 1 minute). 
>>>>>>>>> 
>>>>>>>>> There are essentially 3 types of tests that SA Attach to a process, each needing a slightly different fix: 
>>>>>>>>> 
>>>>>>>>> 1. Tests that directly launch a jdk.hotspot.agent class, such as TestClassDump.java. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if needed. They also need to switch from using hasSAandCanAttach to using hasSA. 
>>>>>>>>> 
>>>>>>>>> 2. Tests that launch command line tools such as jhsdb. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.createProcessBuilder() to create a process that will be launched using sudo if necessary. They also need to switch from using hasSAandCanAttach to using hasSA. 
>>>>>>>>> 
>>>>>>>>> 3. Tests that use ClhsdbLauncher. They already use hasSA instead of hasSAandCanAttach, and rely on ClhsdbLauncher to check at runtime whether attaching will work, so for the most part all of these tests are unchanged. ClhsdbLauncher was modified to take advantage of the new SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk() APIs. 
>>>>>>>>> 
>>>>>>>>> Some tests required special handling: 
>>>>>>>>> 
>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java 
>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java 
>>>>>>>>> 
>>>>>>>>> - These two tests SA Attach to a core file, not to a process, so only need hasSA, 
>>>>>>>>>   not hasSAandCanAttach. No other changes were needed. 
>>>>>>>>> 
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java 
>>>>>>>>> 
>>>>>>>>> - The output should never be null. If the test was skipped due to lack of privileges, you 
>>>>>>>>>   would never get to this section of the test. 
>>>>>>>>> 
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java 
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestIntConstant.java 
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java 
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestType.java 
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestUniverse.java 
>>>>>>>>> 
>>>>>>>>> - These are ClhsdbLauncher tests, so they should have been using hasSA instead of 
>>>>>>>>>   hasSAandCanAttach in the first place. No other changes were needed. 
>>>>>>>>> 
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java 
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java 
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java 
>>>>>>>>> 
>>>>>>>>> - These tests used to "@require mac" but seem to run fine on OSX, so I removed this requirement. 
>>>>>>>>> 
>>>>>>>>> test/jdk/sun/tools/jhsdb/BasicLauncherTest.java 
>>>>>>>>> 
>>>>>>>>> - This test had a runtime check to not run on OSX due to not having core file stack 
>>>>>>>>>   walking support. However, this test always attaches to a process, not a core file, 
>>>>>>>>>   and seems to run just fine on OSX. 
>>>>>>>>> 
>>>>>>>>> test/jdk/sun/tools/jstack/DeadlockDetectionTest.java 
>>>>>>>>> 
>>>>>>>>> - I changed the test to throw a SkippedException if it gets the unexpected error code 
>>>>>>>>>   rather than just println. 
>>>>>>>>> 
>>>>>>>>> And a few other miscellaneous changes not already covered: 
>>>>>>>>> 
>>>>>>>>> test/lib/jdk/test/lib/Platform.java 
>>>>>>>>> - Made canPtraceAttachLinux() public so it can be called from SATestUtils. 
>>>>>>>>> - vm.hasSAandCanAttach is now gone. 
>>>>>>>>> 
>>>>>>>>> thanks, 
>>>>>>>>> 
>>>>>>>>> Chris 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
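
For readers following the quoted discussion, the non-interactive sudo check can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the webrev: the class and method names (NonInteractiveCheck, runsSuccessfully, canSudoWithoutPrompt) are hypothetical, and "true" is assumed to be available as a command on the test machine. The only sudo option relied on is "-n", which makes sudo fail instead of prompting for a password.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names are illustrative and not from SATestUtils.
final class NonInteractiveCheck {
    private NonInteractiveCheck() {}

    // Runs the command, discarding its output, and reports whether it
    // exited with status 0 within the timeout.
    static boolean runsSuccessfully(List<String> command, long timeoutSeconds) {
        Process process;
        try {
            process = new ProcessBuilder(command)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .redirectError(ProcessBuilder.Redirect.DISCARD)
                    .start();
        } catch (IOException e) {
            // The command could not be started at all (e.g. sudo is not installed).
            return false;
        }
        try {
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                // With "-n" sudo should return almost immediately, so a
                // timeout means something went wrong.
                process.destroyForcibly();
                return false;
            }
            return process.exitValue() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            process.destroyForcibly();
            return false;
        }
    }

    // True only if sudo works right now without a password prompt, i.e. the
    // user is set up for passwordless sudo or has cached credentials.
    static boolean canSudoWithoutPrompt() {
        return runsSuccessfully(List.of("sudo", "-n", "true"), 60);
    }
}
```

Unlike the quoted code, this sketch returns false on timeout or interruption instead of throwing; that is a design choice of the sketch only.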


From chris.plummer at oracle.com  Mon Mar 16 00:47:38 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Sun, 15 Mar 2020 17:47:38 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
 <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
 <BAAB2AF7-B0C0-4C01-890A-F2F0FC360C42@oracle.com>
 <f567ce96-4d34-8888-e74c-58e7ab64055f@oracle.com>
 <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com>
Message-ID: <a2af1551-2450-4efb-00a0-ab921dc68fa7@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200315/d141cebc/attachment-0001.htm>

From david.holmes at oracle.com  Mon Mar 16 02:17:03 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 12:17:03 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
Message-ID: <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>

Hi Yasumasa,

I can't review this as I know nothing about the code, but I'm putting 
the patch through our internal testing.

David

On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
> Hi all,
> 
> Please review this change:
> 
>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
> 
> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in 
> jstack mixed mode.
> However, some errors have been seen intermittently since then.
> 
> I investigated the cause and found two concerns:
> 
>    A: lack of a range check on the buffer (.eh_frame section data)
>    B: Language personality routine and Language Specific Data Area 
> (LSDA) are not considered
> 
> I added a range check for .eh_frame processing, and ignore the personality 
> routine and LSDA in this webrev.
> I also added bailout code for when DWARF processing fails due to these 
> concerns.
> 
> This change has passed all tests on submit repo 
> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
> 
> 
> Thanks,
> 
> Yasumasa
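
To illustrate what concern A above means in practice: .eh_frame data contains variable-length ULEB128 values, and a reader that does not check the buffer end can run past the section. The actual fix is in the native DwarfParser code; the following Java sketch is only an illustration of the range-check-and-bail-out idea, and the class name BoundedReader and its API are hypothetical.

```java
import java.util.OptionalLong;

// Hypothetical sketch of a bounds-checked ULEB128 reader; not the SA code.
final class BoundedReader {
    private final byte[] buf;
    private int pos;

    BoundedReader(byte[] buf) {
        this.buf = buf;
    }

    // Decodes one unsigned LEB128 value. Returns empty (the "bailout")
    // if the value runs past the end of the buffer or needs more than
    // 63 bits, instead of reading out of range.
    OptionalLong readUleb128() {
        long result = 0;
        int shift = 0;
        while (true) {
            if (pos >= buf.length || shift > 56) {
                return OptionalLong.empty();
            }
            int b = buf[pos++] & 0xff;
            result |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return OptionalLong.of(result);
            }
            shift += 7;
        }
    }
}
```

For example, the bytes 0xE5 0x8E 0x26 decode to 624485, while the truncated input 0xE5 0x8E yields an empty result rather than a read beyond the buffer.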

From david.holmes at oracle.com  Mon Mar 16 02:53:48 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 12:53:48 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
Message-ID: <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>

On 16/03/2020 12:17 pm, David Holmes wrote:
> Hi Yasumasa,
> 
> I can't review this as I know nothing about the code, but I'm putting 
> the patch through our internal testing.

Sorry but the crashes still exist:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
#
# JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 
15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, 
sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e

in fact they seem worse as the test seems to always crash now.

David

> David
> 
> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Please review this change:
>>
>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>
>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames 
>> in jstack mixed mode.
>> However, some errors have been seen intermittently since then.
>>
>> I investigated the cause and found two concerns:
>>
>>    A: lack of a range check on the buffer (.eh_frame section data)
>>    B: Language personality routine and Language Specific Data Area 
>> (LSDA) are not considered
>>
>> I added a range check for .eh_frame processing, and ignore the personality 
>> routine and LSDA in this webrev.
>> I also added bailout code for when DWARF processing fails due to these 
>> concerns.
>>
>> This change has passed all tests on submit repo 
>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>
>>
>> Thanks,
>>
>> Yasumasa

From david.holmes at oracle.com  Mon Mar 16 04:12:07 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 14:12:07 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
Message-ID: <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>

Correction ...

On 16/03/2020 12:53 pm, David Holmes wrote:
> On 16/03/2020 12:17 pm, David Holmes wrote:
>> Hi Yasumasa,
>>
>> I can't review this as I know nothing about the code, but I'm putting 
>> the patch through our internal testing.
> 
> Sorry but the crashes still exist:
> 
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
> #
> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 
> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, 
> sharing, tiered, compressed oops, g1 gc, linux-amd64)
> # Problematic frame:
> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
> 
> in fact they seem worse as the test seems to always crash now.

Not worse - sorry. I see 6 failures out of 119 runs of the test in 
linux-x64. I don't see a pattern as to where it fails versus passes.

It doesn't fail for me locally.

David

> David
> 
>> David
>>
>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> Please review this change:
>>>
>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>
>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames 
>>> in jstack mixed mode.
>>> However, some errors have been seen intermittently since then.
>>>
>>> I investigated the cause and found two concerns:
>>>
>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>    B: Language personality routine and Language Specific Data Area 
>>> (LSDA) are not considered
>>>
>>> I added a range check for .eh_frame processing, and ignore the personality 
>>> routine and LSDA in this webrev.
>>> I also added bailout code for when DWARF processing fails due to these 
>>> concerns.
>>>
>>> This change has passed all tests on submit repo 
>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa

From serguei.spitsyn at oracle.com  Mon Mar 16 05:22:53 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 15 Mar 2020 22:22:53 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <a2af1551-2450-4efb-00a0-ab921dc68fa7@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
 <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
 <BAAB2AF7-B0C0-4C01-890A-F2F0FC360C42@oracle.com>
 <f567ce96-4d34-8888-e74c-58e7ab64055f@oracle.com>
 <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com>
 <a2af1551-2450-4efb-00a0-ab921dc68fa7@oracle.com>
Message-ID: <c15a019e-1dad-16de-c8b7-0ca9e97ada97@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200315/2117341c/attachment-0001.htm>

From suenaga at oss.nttdata.com  Mon Mar 16 06:36:28 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 16 Mar 2020 15:36:28 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
Message-ID: <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>

Hi David,

Thank you for testing it.

I updated the webrev to avoid bailing out to the Java frame when the DWARF data has a language personality routine or LSDA.
Could you try it?

   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/

It works well on my Fedora 31 and Oracle Linux 7.7.
I've pushed it to the submit repo.

Diff from webrev.00 is here:
   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652


Thanks,

Yasumasa


On 2020/03/16 13:12, David Holmes wrote:
> Correction ...
> 
> On 16/03/2020 12:53 pm, David Holmes wrote:
>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>
>> Sorry but the crashes still exist:
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>> #
>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>> # Problematic frame:
>> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
>>
>> in fact they seem worse as the test seems to always crash now.
> 
> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
> 
> It doesn't fail for me locally.
> 
> David
> 
>> David
>>
>>> David
>>>
>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> Please review this change:
>>>>
>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>
>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>> However, some errors have been seen intermittently since then.
>>>>
>>>> I investigated the cause and found two concerns:
>>>>
>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>    B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>
>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>> I also added bailout code for when DWARF processing fails due to these concerns.
>>>>
>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa

From chris.plummer at oracle.com  Mon Mar 16 06:43:39 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Sun, 15 Mar 2020 23:43:39 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
Message-ID: <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>

BTW, if you submit it to the submit repo, we can then go and run 
additional internal tests (and even more builds) using that job.

Chris

On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
> Hi David,
>
> Thank you for testing it.
>
> I updated webrev to avoid bailout to Java frame when DWARF has 
> language personality routine or LSDA.
> Could you try it?
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>
> It works well on my Fedora 31 and Oracle Linux 7.7.
> I've pushed it to submit repo.
>
> Diff from webrev.00 is here:
>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/03/16 13:12, David Holmes wrote:
>> Correction ...
>>
>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>> Hi Yasumasa,
>>>>
>>>> I can't review this as I know nothing about the code, but I'm 
>>>> putting the patch through our internal testing.
>>>
>>> Sorry but the crashes still exist:
>>>
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>> #
>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug 
>>> build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, 
>>> sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>> # Problematic frame:
>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned 
>>> long)+0x4e
>>>
>>> in fact they seem worse as the test seems to always crash now.
>>
>> Not worse - sorry. I see 6 failures out of 119 runs of the test in 
>> linux-x64. I don't see a pattern as to where it fails versus passes.
>>
>> It doesn't fail for me locally.
>>
>> David
>>
>>> David
>>>
>>>> David
>>>>
>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review this change:
>>>>>
>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>    webrev: 
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>
>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native 
>>>>> frames in jstack mixed mode.
>>>>> However, some errors have been seen intermittently since then.
>>>>>
>>>>> I investigated the cause and found two concerns:
>>>>>
>>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>>    B: the language personality routine and Language Specific Data Area 
>>>>> (LSDA) are not considered
>>>>>
>>>>> I added a range check for .eh_frame processing, and ignore the 
>>>>> personality routine and LSDA in this webrev.
>>>>> I also added bailout code for when DWARF processing fails due to 
>>>>> these concerns.
>>>>>
>>>>> This change has passed all tests on submit repo 
>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa


From suenaga at oss.nttdata.com  Mon Mar 16 06:51:03 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 16 Mar 2020 15:51:03 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
Message-ID: <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>

On 2020/03/16 15:43, Chris Plummer wrote:
> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.

I've pushed the change to the submit repo, but I've not yet received the result.
I will share the job ID when I get it.

Yasumasa

> Chris
> 
> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> Thank you for testing it.
>>
>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>> Could you try it?
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>
>> It works well on my Fedora 31 and Oracle Linux 7.7.
>> I've pushed it to submit repo.
>>
>> Diff from webrev.00 is here:
>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/16 13:12, David Holmes wrote:
>>> Correction ...
>>>
>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>
>>>> Sorry but the crashes still exist:
>>>>
>>>> #
>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>> #
>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>> #
>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>> # Problematic frame:
>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>
>>>> in fact they seem worse as the test seems to always crash now.
>>>
>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>
>>> It doesn't fail for me locally.
>>>
>>> David
>>>
>>>> David
>>>>
>>>>> David
>>>>>
>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review this change:
>>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>
>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>> However, some errors have been seen intermittently since then.
>>>>>>
>>>>>> I investigated the cause and found two concerns:
>>>>>>
>>>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>>>    B: the language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>
>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>>>> I also added bailout code for when DWARF processing fails due to these concerns.
>>>>>>
>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
> 
> 

From serguei.spitsyn at oracle.com  Mon Mar 16 06:57:13 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 15 Mar 2020 23:57:13 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
Message-ID: <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200315/9daf5587/attachment-0001.htm>

From david.holmes at oracle.com  Mon Mar 16 06:57:48 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 16:57:48 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
Message-ID: <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>

On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
> On 2020/03/16 15:43, Chris Plummer wrote:
>> BTW, if you submit it to the submit repo, we can then go and run 
>> additional internal tests (and even more builds) using that job.

Thanks for that tip Chris!

> I've pushed the change to the submit repo, but I've not yet received the 
> result.
> I will share the job ID when I get it.

We can see the id. Just need to wait for the builds to complete before 
submitting the additional tests.

David

> Yasumasa
> 
>> Chris
>>
>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>> Thank you for testing it.
>>>
>>> I updated webrev to avoid bailout to Java frame when DWARF has 
>>> language personality routine or LSDA.
>>> Could you try it?
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>
>>> It works well on my Fedora 31 and Oracle Linux 7.7.
>>> I've pushed it to submit repo.
>>>
>>> Diff from webrev.00 is here:
>>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/16 13:12, David Holmes wrote:
>>>> Correction ...
>>>>
>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> I can't review this as I know nothing about the code, but I'm 
>>>>>> putting the patch through our internal testing.
>>>>>
>>>>> Sorry but the crashes still exist:
>>>>>
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>> #
>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug 
>>>>> build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, 
>>>>> sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>> # Problematic frame:
>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned 
>>>>> long)+0x4e
>>>>>
>>>>> in fact they seem worse as the test seems to always crash now.
>>>>
>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in 
>>>> linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>
>>>> It doesn't fail for me locally.
>>>>
>>>> David
>>>>
>>>>> David
>>>>>
>>>>>> David
>>>>>>
>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Please review this change:
>>>>>>>
>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>    webrev: 
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>
>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native 
>>>>>>> frames in jstack mixed mode.
>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>
>>>>>>> I investigated the cause and found two concerns:
>>>>>>>
>>>>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>>>>    B: the language personality routine and Language Specific Data 
>>>>>>> Area (LSDA) are not considered
>>>>>>>
>>>>>>> I added a range check for .eh_frame processing, and ignore the 
>>>>>>> personality routine and LSDA in this webrev.
>>>>>>> I also added bailout code for when DWARF processing fails due to 
>>>>>>> these concerns.
>>>>>>>
>>>>>>> This change has passed all tests on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>
>>

From chris.plummer at oracle.com  Mon Mar 16 06:57:50 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Sun, 15 Mar 2020 23:57:50 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
Message-ID: <516600a1-a9f9-4ad1-89dc-049bb8e8d131@oracle.com>

On 3/15/20 11:51 PM, Yasumasa Suenaga wrote:
> On 2020/03/16 15:43, Chris Plummer wrote:
>> BTW, if you submit it to the submit repo, we can then go and run 
>> additional internal tests (and even more builds) using that job.
>
> I've pushed the change to the submit repo, but I've not yet received the 
> result.
> I will share the job ID when I get it.
I see it, but I'm off to bed and am not sure what David was running, so 
I'll let him take a stab at it.

Chris
>
> Yasumasa
>
>> Chris
>>
>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>> Thank you for testing it.
>>>
>>> I updated webrev to avoid bailout to Java frame when DWARF has 
>>> language personality routine or LSDA.
>>> Could you try it?
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>
>>> It works well on my Fedora 31 and Oracle Linux 7.7.
>>> I've pushed it to submit repo.
>>>
>>> Diff from webrev.00 is here:
>>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/16 13:12, David Holmes wrote:
>>>> Correction ...
>>>>
>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> I can't review this as I know nothing about the code, but I'm 
>>>>>> putting the patch through our internal testing.
>>>>>
>>>>> Sorry but the crashes still exist:
>>>>>
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>> #
>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug 
>>>>> build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, 
>>>>> sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>> # Problematic frame:
>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned 
>>>>> long)+0x4e
>>>>>
>>>>> in fact they seem worse as the test seems to always crash now.
>>>>
>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in 
>>>> linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>
>>>> It doesn't fail for me locally.
>>>>
>>>> David
>>>>
>>>>> David
>>>>>
>>>>>> David
>>>>>>
>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Please review this change:
>>>>>>>
>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>    webrev: 
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>
>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native 
>>>>>>> frames in jstack mixed mode.
>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>
>>>>>>> I investigated the cause and found two concerns:
>>>>>>>
>>>>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>>>>    B: the language personality routine and Language Specific Data 
>>>>>>> Area (LSDA) are not considered
>>>>>>>
>>>>>>> I added a range check for .eh_frame processing, and ignore the 
>>>>>>> personality routine and LSDA in this webrev.
>>>>>>> I also added bailout code for when DWARF processing fails due to 
>>>>>>> these concerns.
>>>>>>>
>>>>>>> This change has passed all tests on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 
>>>>>>> container.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>
>>


From serguei.spitsyn at oracle.com  Mon Mar 16 07:05:56 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Mar 2020 00:05:56 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
Message-ID: <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200316/4bcb5ad1/attachment-0001.htm>

From david.holmes at oracle.com  Mon Mar 16 07:17:01 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 17:17:01 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
Message-ID: <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>

Sorry it is still crashing.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
#
# JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 
15-internal+0-2020-03-16-0640217.suenaga.source)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, 
tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
#

Same as before.

David
-----

On 16/03/2020 4:57 pm, David Holmes wrote:
> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>> On 2020/03/16 15:43, Chris Plummer wrote:
>>> BTW, if you submit it to the submit repo, we can then go and run 
>>> additional internal tests (and even more builds) using that job.
> 
> Thanks for that tip Chris!
> 
>> I've pushed the change to the submit repo, but I've not yet received the 
>> result.
>> I will share the job ID when I get it.
> 
> We can see the id. Just need to wait for the builds to complete before 
> submitting the additional tests.
> 
> David
> 
>> Yasumasa
>>
>>> Chris
>>>
>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>> Thank you for testing it.
>>>>
>>>> I updated webrev to avoid bailout to Java frame when DWARF has 
>>>> language personality routine or LSDA.
>>>> Could you try it?
>>>>
>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>
>>>> It works well on my Fedora 31 and Oracle Linux 7.7.
>>>> I've pushed it to submit repo.
>>>>
>>>> Diff from webrev.00 is here:
>>>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>> Correction ...
>>>>>
>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> I can't review this as I know nothing about the code, but I'm 
>>>>>>> putting the patch through our internal testing.
>>>>>>
>>>>>> Sorry but the crashes still exist:
>>>>>>
>>>>>> #
>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>> #
>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>> #
>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug 
>>>>>> build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, 
>>>>>> sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>> # Problematic frame:
>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned 
>>>>>> long)+0x4e
>>>>>>
>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>
>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in 
>>>>> linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>
>>>>> It doesn't fail for me locally.
>>>>>
>>>>> David
>>>>>
>>>>>> David
>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Please review this change:
>>>>>>>>
>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>    webrev: 
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>
>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native 
>>>>>>>> frames in jstack mixed mode.
>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>
>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>
>>>>>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>>>>>    B: the language personality routine and Language Specific Data 
>>>>>>>> Area (LSDA) are not considered
>>>>>>>>
>>>>>>>> I added a range check for .eh_frame processing, and ignore the 
>>>>>>>> personality routine and LSDA in this webrev.
>>>>>>>> I also added bailout code for when DWARF processing fails due to 
>>>>>>>> these concerns.
>>>>>>>>
>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 
>>>>>>>> container.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>
>>>

From serguei.spitsyn at oracle.com  Mon Mar 16 08:10:41 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Mar 2020 01:10:41 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
 <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
Message-ID: <79af4ca3-9c8d-ada0-2a72-88b538b01077@oracle.com>

Hi Daniil,

The update looks pretty good to me so far.
I'll make another pass tomorrow.

Thanks,
Serguei


On 3/13/20 15:05, Daniil Titov wrote:
> Hi Yasumasa, Serguei and Alex,
>
> Please review a new version of the webrev that includes the changes Yasumasa suggested.
>
>> Shutdown hook is already registered in c'tor of HotSpotAgent.
>>     It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
> The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
> the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression.
>
>     public HotSpotAgent() {
>         // for non-server add shutdown hook to clean-up debugger in case
>         // of forced exit. For remote server, shutdown hook is added by
>         // DebugServer.
>         Runtime.getRuntime().addShutdownHook(new java.lang.Thread(
>         new Runnable() {
>             public void run() {
>                 synchronized (HotSpotAgent.this) {
>                     if (!isServer) {
>                         detach();
>                     }
>                 }
>             }
>         }));
>     }
>
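
For reference, a minimal sketch of what a shutdown hook written as a lambda expression can look like. ShutdownHookSketch and its methods are hypothetical stand-ins and not the SALauncher code from the webrev; only Runtime.addShutdownHook and Thread are real JDK APIs here.

```java
/** Hypothetical stand-in for a debug server that must be stopped on exit. */
final class ShutdownHookSketch {
    private boolean running = true;

    /** Stops the (imaginary) server; synchronized like detach() in HotSpotAgent. */
    synchronized void shutdown() {
        running = false;
    }

    synchronized boolean isRunning() {
        return running;
    }

    /** Builds the hook thread from a lambda instead of an anonymous Runnable. */
    Thread newShutdownHook() {
        return new Thread(() -> shutdown());
    }

    /** Registers the hook with the JVM; it runs when the VM begins shutting down. */
    void registerShutdownHook() {
        Runtime.getRuntime().addShutdownHook(newShutdownHook());
    }
}
```

Keeping the hook creation separate from its registration is only a convenience of this sketch: it lets the hook body be exercised directly without terminating the VM.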
>>>     Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains
>>> `exclusiveAccess.dirs=.` to avoid concurrent execution
> As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests.  Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays.
>
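
The retry idea can be sketched as follows, assuming the test simply tries successive port numbers until one of them can be bound. PortRetrySketch and bindWithRetry are hypothetical names; the actual retry logic in SADebugDTest is in the webrev and may well differ.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical helper: find a usable port even if another test holds one. */
final class PortRetrySketch {

    /**
     * Tries firstPort, firstPort + 1, ... for at most maxAttempts ports and
     * returns the first socket that could be bound.
     */
    static ServerSocket bindWithRetry(int firstPort, int maxAttempts) throws IOException {
        IOException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return new ServerSocket(firstPort + i);
            } catch (IOException e) {   // typically a BindException: port already in use
                last = e;
            }
        }
        throw new IOException("no free port found after " + maxAttempts + " attempts", last);
    }
}
```

A caller would close the returned socket (or hand its port number to the process under test) once a free port has been found.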
> Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/
> [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
> [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751
>
> Thank you,
> Daniil
>
> On 3/6/20, 6:15 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>
>      Hi Daniil,
>      
>      On 2020/03/07 3:38, Daniil Titov wrote:
>      > Hi Yasumasa,
>      >
>      >   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
>      > I think that having a piece of code that invokes  a method  named "buildAttachArgs" with a copy of the argument map  just for its side-effect ( it throws an exception if parameters are incorrect)  and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name .
>      
>      Ok, but I prefer to leave comment it.
>      
>      
>      >   > SADebugDTest
>      >   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
>      > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final.
>      > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array.
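
A small sketch of the capture rule being discussed: a local variable assigned inside a lambda must be effectively final, so a one-element array or an AtomicBoolean is used as a mutable holder. LambdaCaptureSketch and the "Port already in use" marker string are hypothetical and only serve this illustration; they are not taken from SADebugDTest.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical illustration of mutable state captured by a lambda. */
final class LambdaCaptureSketch {

    static boolean sawPortInUse(String[] outputLines) {
        // A plain "boolean portInUse = false;" could not be assigned inside the
        // lambda below, because captured locals must be effectively final.
        final boolean[] viaArray = { false };                      // holder variant 1
        final AtomicBoolean viaAtomic = new AtomicBoolean(false);  // holder variant 2

        Consumer<String> lineHandler = line -> {
            if (line.contains("Port already in use")) {  // hypothetical marker text
                viaArray[0] = true;
                viaAtomic.set(true);
            }
        };
        for (String line : outputLines) {
            lineHandler.accept(line);
        }
        return viaArray[0] && viaAtomic.get();  // both holders always agree
    }
}
```

Both holders behave the same in single-threaded code; AtomicBoolean additionally gives visibility guarantees if the lambda runs on another thread.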
>      
>      Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
>      If you drop this error check, the test code becomes simpler.
>      
>      
>      > I will include your other suggestion in the new version of the webrev.
>      
>      Sorry, I have one more comment:
>      
>      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
>      
>      Shutdown hook is already registered in c'tor of HotSpotAgent.
>      It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
>      
>      
>      Thanks,
>      
>      Yasumasa
>      
>      
>      > Thanks!
>      > Daniil
>      >
>      > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >
>      >      Hi Daniil,
>      >
>      >
>      >      - SALauncher.java
>      >           - checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
>      >           - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex.
>      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
>      >
>      >      - SADebugDTest.java
>      >           - Please add bug ID to @bug.
>      >           - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
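
The suggestion to validate the registry port with Integer.parseInt() rather than a regex could look roughly like this. PortOptionSketch and parsePort are hypothetical names, and the accepted range 1 to 65535 is an assumption of this sketch, not something stated in the review.

```java
/** Hypothetical option check: parse a port number and reject bad input. */
final class PortOptionSketch {

    static int parsePort(String value) {
        int port;
        try {
            port = Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid port: " + value, e);
        }
        if (port < 1 || port > 65535) {   // assumed valid range for this sketch
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return port;
    }
}
```

Compared with a regex, this rejects non-numeric input and yields the numeric value in one step, and the range check catches values a digits-only pattern would let through.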
>      >
>      >
>      >      Thanks,
>      >
>      >      Yasumasa
>      >
>      >
>      >      On 2020/03/06 10:15, Daniil Titov wrote:
>      >      > Hi Yasumasa, Serguei and Alex,
>      >      >
>      >      > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
>      >      > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
>      >      > last two settings could be specified using the system properties but the system properties have the following disadvantages
>      >      > comparing to the command line options:
>      >      >     -  It's hard to know about them: they are not listed in the tool's help.
>      >      >     -  They have long names that are hard to remember.
>      >      >     -  It is easy to mistype them on the command line, and you will not get any warning about it.
>      >      >
>      >      > The CSR [2] was also updated and needs to be reviewed.
>      >      >
>      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >
>      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
>      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >
>      >      > Thank you,
>      >      > Daniil
>      >      >
>      >      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >      >
>      >      >      Hi Daniil,
>      >      >
>      >      >         - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments.
>      >      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply.
>      >      >
>      >      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
>      >      >           But you can use same port number as RMI registry (1099).
>      >      >           It is same as relation between jmxremote.port and jmxremote.rmi.port.
>      >      >
>      >      >
>      >      >      Thanks,
>      >      >
>      >      >      Yasumasa
>      >      >
>      >      >
>      >      >      On 2020/02/24 13:21, Daniil Titov wrote:
>      >      >      > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
>      >      >      > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
>      >      >      >
>      >      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
>      >      >      >
>      >      >      > Man pages for jhsdb will be updated in a separate issue.
>      >      >      >
>      >      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
>      >      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
>      >      >      >
>      >      >      >                // delegate to the actual SA debug server.
>      >      >      >   367         DebugServer.main(newArgArray.toArray(new String[0]));
>      >      >      >
>      >      >      > However, sun.jvm.hotspot.DebugServer doesn't support named options, which makes it hard to add new options to the tool efficiently.
>      >      >      > I found it more suitable to start the HotSpot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher
>      >      >      > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as deprecated,
>      >      >      > but I would prefer to address that in a separate issue.
>      >      >      >
>      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      >      >                  container  and connecting  to it with the GUI debugger.
>      >      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >      >
>      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
>      >      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      >      >
>      >      >      > Thank you,
>      >      >      > Daniil
>      >      >      >
>      >      >      >
>      >      >
>      >      >
>      >      >
>      >
>      >
>      >
>      
>
>


From linzang at tencent.com  Mon Mar 16 09:18:18 2020
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Mon, 16 Mar 2020 09:18:18 +0000
Subject: RFR(L): 8215624: add parallel heap inspection support for jmap
 histo(G1)
Message-ID: <FC4AB890-98F3-4D4F-B5ED-26F23D69006D@tencent.com>

I just uploaded a new patch. My preliminary measurements show about a 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with the parallel thread number set to 4).
webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/
bug: https://bugs.openjdk.java.net/browse/JDK-8215624
CSR: https://bugs.openjdk.java.net/browse/JDK-8239290

BRs,
Lin

> On 2020/3/2, 9:56 PM, "linzang(臧琳)" <linzang at tencent.com> wrote:
>
>    Dear all, 
>          Let me try to ease the review with some explanation :P
>          The goal of this patch is to speed up heap iteration for jmap -histo. In my experience this is necessary when investigating large heaps. E.g., in a big-data scenario I ran jmap -histo against a 180GB heap, and it took quite a while.
>          If my understanding is correct, even jmap -histo without the "live" option performs heap inspection with the heap lock held, so it is very likely to block mutator threads in allocation-sensitive scenarios. The faster the heap inspection runs, the shorter the time the mutators are blocked. This is why parallel iteration for jmap is necessary.
>          I think parallel heap inspection should eventually apply to all kinds of heaps. However, since the heap layout differs between GCs, it takes considerable time to understand every layout well enough to make the whole change. IMO, it is not wise to deliver the whole solution as one huge patch, which would also be harder to review. So I plan to implement it incrementally. The first patch (this one) is meant to settle the implementation details: how jmap accepts the new option and passes it to the attachListener of the JVM process, how to make the parallel inspection closure generic enough to extend easily to different heap layouts, and how to implement the heap inspection in a specific GC's heap. This patch uses G1's heap as the starting point.
>          This patch actually does several things:
>          1. Add an option "parallelThreadNum=<N>" to jmap -histo. The default is N=0, which lets the JVM decide how many threads to use for heap inspection. Setting this option to 1 disables parallel heap inspection. (More details in the CSR: https://bugs.openjdk.java.net/browse/JDK-8239290)
>          2. Change how JMap passes arguments (see http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html). Originally it passed options as separate arguments to attachListener; with this patch all options are composed into a single string. So arg_count_max in attachListener.hpp does not need to change, which avoids the compatibility issue discussed at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html
>          3. Add an abstract class ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp. Its work(uint worker_id) method prepares the data structure (KlassInfoTable) needed by every parallel worker thread, and then calls do_object_iterate_parallel(), which is the heap-specific implementation. I also added some mechanisms to KlassInfoTable to support parallel iteration, such as merge().
>          4. For a specific heap (G1 in this patch), create a subclass of ParHeapInspectTask and implement do_object_iterate_parallel() for parallel heap inspection. For G1, it simply invokes g1CollectedHeap's object_iterate_parallel().
>          5. Add related tests.
>          6. It should be easy to extend this patch to other kinds of heaps by creating a subclass of ParHeapInspectTask and implementing do_object_iterate_parallel().
>    
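As a rough illustration of items 3 and 4 in the list above, here is a minimal, self-contained C++ sketch of the "private table per worker, then merge()" idea. It is not the code in the webrev: HistoTable, FakeObject, ParInspectTaskSketch, RegionedHeapTask and inspect() are hypothetical stand-ins for KlassInfoTable, heap objects, ParHeapInspectTask and a G1-style subclass, and the workers are simply run one after another instead of on real worker threads.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for KlassInfoTable: per-class instance count and bytes.
struct ClassStats {
  std::size_t instances = 0;
  std::size_t bytes = 0;
};

class HistoTable {
 public:
  void record(const std::string& klass, std::size_t size) {
    ClassStats& s = _stats[klass];
    s.instances += 1;
    s.bytes += size;
  }

  // Fold another (worker-private) table into this one.
  void merge(const HistoTable& other) {
    for (const auto& entry : other._stats) {
      ClassStats& s = _stats[entry.first];
      s.instances += entry.second.instances;
      s.bytes += entry.second.bytes;
    }
  }

  ClassStats get(const std::string& klass) const {
    auto it = _stats.find(klass);
    return it == _stats.end() ? ClassStats() : it->second;
  }

 private:
  std::map<std::string, ClassStats> _stats;
};

// Toy heap object: only a class name and a size in bytes.
struct FakeObject {
  std::string klass;
  std::size_t size;
};

// Abstract task: work() prepares a private table for the worker, lets the
// heap-specific subclass fill it, then merges it into the shared result.
class ParInspectTaskSketch {
 public:
  virtual ~ParInspectTaskSketch() {}

  void work(unsigned worker_id, HistoTable* shared) {
    HistoTable local;
    do_object_iterate_parallel(worker_id, &local);
    shared->merge(local);  // with real threads this merge would need a lock
  }

 protected:
  virtual void do_object_iterate_parallel(unsigned worker_id,
                                          HistoTable* local) = 0;
};

// Heap-specific subclass for a toy region-based heap: worker i visits
// regions i, i + workers, i + 2 * workers, ...
class RegionedHeapTask : public ParInspectTaskSketch {
 public:
  RegionedHeapTask(const std::vector<std::vector<FakeObject> >& regions,
                   unsigned workers)
      : _regions(regions), _workers(workers) {}

 protected:
  void do_object_iterate_parallel(unsigned worker_id,
                                  HistoTable* local) override {
    for (std::size_t r = worker_id; r < _regions.size(); r += _workers) {
      for (const FakeObject& obj : _regions[r]) {
        local->record(obj.klass, obj.size);
      }
    }
  }

 private:
  const std::vector<std::vector<FakeObject> >& _regions;
  unsigned _workers;
};

// Sequential stand-in for dispatching the task to `workers` worker threads.
HistoTable inspect(const std::vector<std::vector<FakeObject> >& regions,
                   unsigned workers) {
  HistoTable shared;
  RegionedHeapTask task(regions, workers);
  for (unsigned id = 0; id < workers; id++) {
    task.work(id, &shared);
  }
  return shared;
}
```

The point of the private table is that workers do not contend on a shared structure while walking the heap; only the final merge() has to be serialized. Whatever the number of workers, the merged histogram should equal the one a single worker would produce.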
>    I hope this information helps with the code review and initiates the discussion :-)
>    Thanks!
>     
>    BRs,
>    Lin
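To make items 1 and 2 of the explanation above more concrete, here is a small C++ sketch of how a receiver could unpack options that arrive as one composed string. The wire format is an assumption made purely for illustration (comma-separated tokens such as "live,parallelThreadNum=4"); the thread does not spell out the real format, and HistoOptions, parse_histo_options() and use_parallel_inspection() are hypothetical names, not HotSpot functions. Only the meaning of the values follows the description: 0 lets the JVM decide and 1 disables parallel inspection.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical parsed form of the single option string.
struct HistoOptions {
  bool live_only = false;
  unsigned parallel_thread_num = 0;  // 0: let the JVM choose the thread count
  bool valid = true;
};

// Parse a short, non-negative decimal number; reject anything else.
static bool parse_uint(const std::string& text, unsigned* out) {
  if (text.empty() || text.size() > 9) return false;  // 9 digits fit in 32 bits
  unsigned value = 0;
  for (char c : text) {
    if (c < '0' || c > '9') return false;
    value = value * 10 + static_cast<unsigned>(c - '0');
  }
  *out = value;
  return true;
}

// Split the composed argument on ',' and interpret each token.
// Assumed format (illustration only): "live,parallelThreadNum=<N>".
HistoOptions parse_histo_options(const std::string& arg) {
  HistoOptions opts;
  const std::string key = "parallelThreadNum=";
  std::size_t start = 0;
  while (start <= arg.size()) {
    std::size_t comma = arg.find(',', start);
    if (comma == std::string::npos) comma = arg.size();
    const std::string token = arg.substr(start, comma - start);
    if (token == "live") {
      opts.live_only = true;
    } else if (token.compare(0, key.size(), key) == 0) {
      if (!parse_uint(token.substr(key.size()), &opts.parallel_thread_num)) {
        opts.valid = false;
      }
    } else if (!token.empty()) {
      opts.valid = false;  // unknown option
    }
    start = comma + 1;
  }
  return opts;
}

// Per the description: 1 disables parallel inspection, 0 or N > 1 allow it.
bool use_parallel_inspection(const HistoOptions& opts) {
  return opts.valid && opts.parallel_thread_num != 1;
}
```

Because everything travels in one argument, new options can be added later without raising the argument count on the receiving side, which is the compatibility concern mentioned in item 2.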
    
>    >On 2020/2/19, 9:40 AM, "linzang(臧琳)" <linzang at tencent.com> wrote:
>    >
>    >  I am re-posting this RFR with the correct enhancement number to make it trackable.
>    >  Please ignore the previous wrong post. Sorry for the trouble.
>    >    
>    >    webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/
>    >    bug: https://bugs.openjdk.java.net/browse/JDK-8215624
>    >    CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
>    >    --------------
>    >    Lin
>    >    >Hi Lin,
>    >    >
>    >    >Could you, please, re-post your RFR with the right enhancement number in
>    >    >the message subject?
>    >    >It will be more trackable this way.
>    >    >
>    >    >Thanks,
>    >    >Serguei
>    >    >
>    >    >
>    >    >On 2/17/20 10:29 PM, linzang(臧琳) wrote:
>    >    >> Dear David,
>    >    >>        Thanks a lot!
>    >    >>        I have uploaded the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/.
>    >    >>        IMHO the parallel heap inspection can be extended to all kinds of heaps as long as the heap layout supports parallel iteration.
>    >    >>        Maybe we can first use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap; then we can extend the solution to other kinds of heaps.
>    >    >>    
>    >    >> Thanks,
>    >    >> --------------
>    >    >> Lin
>    >    >>> Hi Lin,
>    >    >>>
>    >    >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC
>    >    >>> worker threads, and whether it needs to be extended beyond G1.
>    >    >>>
>    >    >>> I happened to spot one nit when browsing:
>    >    >>>
>    >    >>> src/hotspot/share/gc/shared/collectedHeap.hpp
>    >    >>>
>    >    >>> +   virtual bool run_par_heap_inspect_task(KlassInfoTable* cit,
>    >    >>> +                                          BoolObjectClosure* filter,
>    >    >>> +                                          size_t* missed_count,
>    >    >>> +                                          size_t thread_num) {
>    >    >>> +     return NULL;
>    >    >>>
>    >    >>> s/NULL/false/
>    >    >>>
>    >    >>> Cheers,
>    >    >>> David
>    >    >>>
>    >    >>> On 18/02/2020 2:15 pm, linzang(臧琳) wrote:
>    >    >>>> Dear All,
>    >    >>>>         May I ask for your help in reviewing the following changes:
>    >    >>>>         webrev:
>    >    >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/
>    >    >>>>      bug: https://bugs.openjdk.java.net/browse/JDK-8215624
>    >    >>>>      related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
>    >    >>>>         This patch enables parallel heap inspection of G1 for jmap histo.
>    >    >>>>         My simple test showed a 2x speedup of jmap -histo with
>    >    >>>> parallelThreadNum set to 2 for a heap of ~500M on a 4-core platform.
>    >    >>>>
>    >    >>>> ------------------------------------------------------------------------
>    >    >>>> BRs,
>    >    >>>> Lin
>    >    >> >
>    >    >
    
    
From suenaga at oss.nttdata.com  Mon Mar 16 09:20:28 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 16 Mar 2020 18:20:28 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
Message-ID: <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>

Hi David,

I missed a loop condition, so I fixed it and pushed the fix to the submit repo.
Could you try again?

   http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23

webrev is here:

   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/


Thanks a lot!

Yasumasa


On 2020/03/16 16:17, David Holmes wrote:
> Sorry it is still crashing.
> 
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
> #
> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
> # Problematic frame:
> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
> #
> 
> Same as before.
> 
> David
> -----
> 
> On 16/03/2020 4:57 pm, David Holmes wrote:
>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>
>> Thanks for that tip Chris!
>>
>>> I've pushed the change to submit repo, but I've not yet received the result.
>>> I will share it with you when I get the job ID.
>>
>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>
>> David
>>
>>> Yasumasa
>>>
>>>> Chris
>>>>
>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>> Thank you for testing it.
>>>>>
>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>> Could you try it?
>>>>>
>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>
>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>> I've pushed it to submit repo.
>>>>>
>>>>> Diff from webrev.00 is here:
>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>> Correction ...
>>>>>>
>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>
>>>>>>> Sorry but the crashes still exist:
>>>>>>>
>>>>>>> #
>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>> #
>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>> #
>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>> # Problematic frame:
>>>>>>> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>
>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>
>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>
>>>>>> It doesn't fail for me locally.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review this change:
>>>>>>>>>
>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>
>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>
>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>
>>>>>>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>>>>>>    B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>
>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>>>>>>> I also added bailout code for when DWARF processing fails due to these concerns.
>>>>>>>>>
>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>
>>>>
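For readers following concern A in the original proposal quoted above (no range check on the .eh_frame buffer), the sketch below shows one common way to guard such reads in C++. It is an illustration under stated assumptions, not the code in the webrev: BoundedReader and its methods are hypothetical names. The idea is that every read checks the remaining bytes first and reports failure instead of touching memory past the end, so a caller can bail out of DWARF processing rather than crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical cursor over a copy of .eh_frame section data.
// Every read is checked against the end of the buffer.
class BoundedReader {
 public:
  BoundedReader(const unsigned char* buf, std::size_t len)
      : _pos(buf), _end(buf + len) {}

  std::size_t remaining() const {
    return static_cast<std::size_t>(_end - _pos);
  }

  bool read_u8(std::uint8_t* out) {
    if (remaining() < 1) return false;
    *out = *_pos++;
    return true;
  }

  // Little-endian 32-bit value, e.g. the length field that starts a CIE/FDE.
  bool read_u32_le(std::uint32_t* out) {
    if (remaining() < 4) return false;  // nothing is consumed on failure
    std::uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
      value |= static_cast<std::uint32_t>(_pos[i]) << (8 * i);
    }
    _pos += 4;
    *out = value;
    return true;
  }

  // Unsigned LEB128, the variable-length encoding used in DWARF.
  // On failure the reader may have consumed some bytes; the caller is
  // expected to abandon it (bail out) rather than continue parsing.
  bool read_uleb128(std::uint64_t* out) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    std::uint8_t byte = 0;
    do {
      if (!read_u8(&byte)) return false;  // ran off the end of the buffer
      if (shift < 64) {
        result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
      }
      shift += 7;
    } while (byte & 0x80);
    *out = result;
    return true;
  }

 private:
  const unsigned char* _pos;
  const unsigned char* _end;
};
```

A parser built on such a reader turns a truncated or malformed section into an ordinary "false" result, which matches the bailout behaviour described in the proposal, instead of reading past the buffer.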

From david.holmes at oracle.com  Mon Mar 16 11:46:41 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 21:46:41 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
Message-ID: <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>

On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
> Hi David,
> 
> I missed a loop condition, so I fixed it and pushed the fix to the submit repo.
> Could you try again?
> 
>    http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
> 
> webrev is here:
> 
>    http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/

Test job resubmitted. Will advise results if it completes before I go to 
bed :)

David

> 
> Thanks a lot!
> 
> Yasumasa
> 
> 
> On 2020/03/16 16:17, David Holmes wrote:
>> Sorry it is still crashing.
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>> #
>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 
>> 15-internal+0-2020-03-16-0640217.suenaga.source)
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, 
>> tiered, compressed oops, g1 gc, linux-amd64)
>> # Problematic frame:
>> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned
>> long)+0x4e
>> #
>>
>> Same as before.
>>
>> David
>> -----
>>
>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>> BTW, if you submit it to the submit repo, we can then go and run 
>>>>> additional internal tests (and even more builds) using that job.
>>>
>>> Thanks for that tip Chris!
>>>
>>>> I've pushed the change to submit repo, but I've not yet received the 
>>>> result.
>>>> I will share it with you when I get the job ID.
>>>
>>> We can see the id. Just need to wait for the builds to complete 
>>> before submitting the additional tests.
>>>
>>> David
>>>
>>>> Yasumasa
>>>>
>>>>> Chris
>>>>>
>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Thank you for testing it.
>>>>>>
>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has 
>>>>>> language personality routine or LSDA.
>>>>>> Could you try it?
>>>>>>
>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>
>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>> I've pushed it to submit repo.
>>>>>>
>>>>>> Diff from webrev.00 is here:
>>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>> Correction ...
>>>>>>>
>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> I can't review this as I know nothing about the code, but I'm 
>>>>>>>>> putting the patch through our internal testing.
>>>>>>>>
>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>
>>>>>>>> #
>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>> #
>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>> #
>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug 
>>>>>>>> build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed 
>>>>>>>> mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>> # Problematic frame:
>>>>>>>> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned
>>>>>>>> long)+0x4e
>>>>>>>>
>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>
>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test 
>>>>>>> in linux-x64. I don't see a pattern as to where it fails versus 
>>>>>>> passes.
>>>>>>>
>>>>>>> It doesn't fail for me locally.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Please review this change:
>>>>>>>>>>
>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>    webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>
>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native
>>>>>>>>>> frames in jstack mixed mode.
>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>
>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>
>>>>>>>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>>>>>>>    B: Language personality routine and Language Specific Data
>>>>>>>>>> Area (LSDA) are not considered
>>>>>>>>>>
>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the
>>>>>>>>>> personality routine and LSDA in this webrev.
>>>>>>>>>> I also added bailout code for when DWARF processing fails due to
>>>>>>>>>> these concerns.
>>>>>>>>>>
>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 
>>>>>>>>>> container.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>
>>>>>

From david.holmes at oracle.com  Mon Mar 16 12:01:27 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Mar 2020 22:01:27 +1000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
Message-ID: <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>

On 16/03/2020 9:46 pm, David Holmes wrote:
> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> I missed a loop condition, so I fixed it and pushed the fix to the submit repo.
>> Could you try again?
>>
>>    http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>
>> webrev is here:
>>
>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
> 
> Test job resubmitted. Will advise results if it completes before I go to 
> bed :)

Seems to have passed okay.

David

> David
> 
>>
>> Thanks a lot!
>>
>> Yasumasa
>>
>>
>> On 2020/03/16 16:17, David Holmes wrote:
>>> Sorry it is still crashing.
>>>
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>> #
>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug 
>>> build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, 
>>> tiered, compressed oops, g1 gc, linux-amd64)
>>> # Problematic frame:
>>> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned
>>> long)+0x4e
>>> #
>>>
>>> Same as before.
>>>
>>> David
>>> -----
>>>
>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>> BTW, if you submit it to the submit repo, we can then go and run 
>>>>>> additional internal tests (and even more builds) using that job.
>>>>
>>>> Thanks for that tip Chris!
>>>>
>>>>> I've pushed the change to submit repo, but I've not yet received 
>>>>> the result.
>>>>> I will share it with you when I get the job ID.
>>>>
>>>> We can see the id. Just need to wait for the builds to complete 
>>>> before submitting the additional tests.
>>>>
>>>> David
>>>>
>>>>> Yasumasa
>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Thank you for testing it.
>>>>>>>
>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has 
>>>>>>> language personality routine or LSDA.
>>>>>>> Could you try it?
>>>>>>>
>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>
>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>> I've pushed it to submit repo.
>>>>>>>
>>>>>>> Diff from webrev.00 is here:
>>>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>> Correction ...
>>>>>>>>
>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> I can't review this as I know nothing about the code, but I'm 
>>>>>>>>>> putting the patch through our internal testing.
>>>>>>>>>
>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>
>>>>>>>>> #
>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>> #
>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>> #
>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) 
>>>>>>>>> (fastdebug build 
>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed 
>>>>>>>>> mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>> # Problematic frame:
>>>>>>>>> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned
>>>>>>>>> long)+0x4e
>>>>>>>>>
>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>
>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test 
>>>>>>>> in linux-x64. I don't see a pattern as to where it fails versus 
>>>>>>>> passes.
>>>>>>>>
>>>>>>>> It doesn't fail for me locally.
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Please review this change:
>>>>>>>>>>>
>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>    webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>
>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding
>>>>>>>>>>> native frames in jstack mixed mode.
>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>
>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>
>>>>>>>>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>>>>>>>>    B: Language personality routine and Language Specific Data
>>>>>>>>>>> Area (LSDA) are not considered
>>>>>>>>>>>
>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the
>>>>>>>>>>> personality routine and LSDA in this webrev.
>>>>>>>>>>> I also added bailout code for when DWARF processing fails due
>>>>>>>>>>> to these concerns.
>>>>>>>>>>>
>>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 
>>>>>>>>>>> container.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>
>>>>>>

From suenaga at oss.nttdata.com  Mon Mar 16 12:03:51 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 16 Mar 2020 21:03:51 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
Message-ID: <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>

Thank you so much, David!

Yasumasa


On 2020/03/16 21:01, David Holmes wrote:
> On 16/03/2020 9:46 pm, David Holmes wrote:
>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>> I missed a loop condition, so I fixed it and pushed the fix to the submit repo.
>>> Could you try again?
>>>
>>>    http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>
>>> webrev is here:
>>>
>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>
>> Test job resubmitted. Will advise results if it completes before I go to bed :)
> 
> Seems to have passed okay.
> 
> David
> 
>> David
>>
>>>
>>> Thanks a lot!
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/16 16:17, David Holmes wrote:
>>>> Sorry it is still crashing.
>>>>
>>>> #
>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>> #
>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>> #
>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>> # Problematic frame:
>>>> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
>>>> #
>>>>
>>>> Same as before.
>>>>
>>>> David
>>>> -----
>>>>
>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>
>>>>> Thanks for that tip Chris!
>>>>>
>>>>>> I've pushed the change to submit repo, but I've not yet received the result.
>>>>>> I will share it with you when I get the job ID.
>>>>>
>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>
>>>>> David
>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> Thank you for testing it.
>>>>>>>>
>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>>>>> Could you try it?
>>>>>>>>
>>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>
>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>> I've pushed it to submit repo.
>>>>>>>>
>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>> Correction ...
>>>>>>>>>
>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>
>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>>>>
>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>
>>>>>>>>>> #
>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>> #
>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>> #
>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>> # Problematic frame:
>>>>>>>>>> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>
>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>
>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>
>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>
>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>
>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>
>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>
>>>>>>>>>>>>    A: lack of a range check on the buffer (.eh_frame section data)
>>>>>>>>>>>>    B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>
>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>>>>>>>>>> I also added bailout code for when DWARF processing fails due to these concerns.
>>>>>>>>>>>>
>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>
>>>>>>>

From suenaga at oss.nttdata.com  Mon Mar 16 12:07:14 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 16 Mar 2020 21:07:14 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
Message-ID: <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>

Hi all,

This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
So please review it:

   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
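
For anyone joining the thread at this point, concern A from the original request (no range check on the .eh_frame buffer) is the kind of problem the sketch below guards against. It is only an illustration, written in Java for readability: the real parser is native code in libsaproc, and the names used here (BoundedLeb128, readUleb128) are made up for this example and are not taken from the webrev. The idea is that every read first checks that it stays inside the section data and reports failure instead of reading past the end, so the caller can bail out.

```java
import java.nio.ByteBuffer;
import java.util.OptionalLong;

// Hypothetical illustration only: not the DwarfParser code from the webrev.
final class BoundedLeb128 {

    // Decodes one unsigned LEB128 value from buf without reading past buf.limit().
    // Returns empty if the data is truncated or has too many continuation bytes,
    // so the caller can bail out instead of touching memory outside the section.
    static OptionalLong readUleb128(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        while (buf.hasRemaining()) {              // the range check
            byte b = buf.get();
            result |= (long) (b & 0x7f) << shift; // low 7 bits carry the payload
            if ((b & 0x80) == 0) {                // high bit clear: last byte
                return OptionalLong.of(result);
            }
            shift += 7;
            if (shift >= 64) {
                return OptionalLong.empty();      // malformed: value would not fit
            }
        }
        return OptionalLong.empty();              // truncated: ran out of section data
    }

    public static void main(String[] args) {
        ByteBuffer complete = ByteBuffer.wrap(new byte[] {(byte) 0xE5, (byte) 0x8E, 0x26});
        System.out.println(readUleb128(complete));
        ByteBuffer truncated = ByteBuffer.wrap(new byte[] {(byte) 0xE5, (byte) 0x8E});
        System.out.println(readUleb128(truncated).isPresent());
    }
}
```

Returning an empty value rather than throwing keeps the "bail out and fall back" decision with the caller, which is the behaviour the request describes for failed DWARF processing.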


Thanks,

Yasumasa


On 2020/03/16 21:03, Yasumasa Suenaga wrote:
> Thank you so much, David!
> 
> Yasumasa
> 
> 
> On 2020/03/16 21:01, David Holmes wrote:
>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>> I missed a loop condition, so I fixed it and pushed the fix to the submit repo.
>>>> Could you try again?
>>>>
>>>>    http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>
>>>> webrev is here:
>>>>
>>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>
>>> Test job resubmitted. Will advise results if it completes before I go to bed :)
>>
>> Seems to have passed okay.
>>
>> David
>>
>>> David
>>>
>>>>
>>>> Thanks a lot!
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>> Sorry it is still crashing.
>>>>>
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>> #
>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>> # Problematic frame:
>>>>> # C  [libsaproc.so+0x494e]  DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>> #
>>>>>
>>>>> Same as before.
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>>
>>>>>> Thanks for that tip Chris!
>>>>>>
>>>>>>> I've pushed the change to the submit repo, but I've not yet received the result.
>>>>>>> I will share it when I get the job ID.
>>>>>>
>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> Thank you for testing it.
>>>>>>>>>
>>>>>>>>> I updated the webrev to avoid bailing out to the Java frame when the DWARF has a language personality routine or LSDA.
>>>>>>>>> Could you try it?
>>>>>>>>>
>>>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>
>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7.
>>>>>>>>> I've pushed it to the submit repo.
>>>>>>>>>
>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>> Correction ...
>>>>>>>>>>
>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>>>>>
>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>
>>>>>>>>>>> #
>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>> #
>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>> #
>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>
>>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>>
>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>>
>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>
>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    A: lack of a buffer (.eh_frame section data) range check
>>>>>>>>>>>>>    B: the language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>
>>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>>>>>>>>>>> I also added bailout code for when DWARF processing fails due to these concerns.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>

From chris.plummer at oracle.com  Mon Mar 16 18:26:37 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Mar 2020 11:26:37 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <c15a019e-1dad-16de-c8b7-0ca9e97ada97@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
 <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
 <BAAB2AF7-B0C0-4C01-890A-F2F0FC360C42@oracle.com>
 <f567ce96-4d34-8888-e74c-58e7ab64055f@oracle.com>
 <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com>
 <a2af1551-2450-4efb-00a0-ab921dc68fa7@oracle.com>
 <c15a019e-1dad-16de-c8b7-0ca9e97ada97@oracle.com>
Message-ID: <af2d05b8-01b2-61e3-a729-c29d2479599c@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200316/2147c15a/attachment-0001.htm>

From serguei.spitsyn at oracle.com  Mon Mar 16 18:43:40 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Mar 2020 11:43:40 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <af2d05b8-01b2-61e3-a729-c29d2479599c@oracle.com>
References: <769c20ac-52e1-bbdf-dadb-370a92f9615a@oracle.com>
 <1b344718-9485-eb5c-1b85-9e358851c369@oracle.com>
 <c26a5f69-3127-bd0a-0e25-8b7afe4464aa@oracle.com>
 <150f87f8-5b54-deae-ced1-1227bb1ed17a@oracle.com>
 <227652ff-ffb0-a1f0-531d-03b75ecec921@oracle.com>
 <33ffd0e7-d017-c1bf-feb5-4d2f07e753f2@oracle.com>
 <90e1ffcc-6636-b328-b167-fd1d42072575@oracle.com>
 <BAAB2AF7-B0C0-4C01-890A-F2F0FC360C42@oracle.com>
 <f567ce96-4d34-8888-e74c-58e7ab64055f@oracle.com>
 <78834ED4-055B-4E43-86A8-01BB49BC3C73@oracle.com>
 <a2af1551-2450-4efb-00a0-ab921dc68fa7@oracle.com>
 <c15a019e-1dad-16de-c8b7-0ca9e97ada97@oracle.com>
 <af2d05b8-01b2-61e3-a729-c29d2479599c@oracle.com>
Message-ID: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200316/afa8f528/attachment-0001.htm>

From daniil.x.titov at oracle.com  Mon Mar 16 19:00:26 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 16 Mar 2020 12:00:26 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
Message-ID: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>

Please review the change [1] that fixes the intermittent failure of the test.

The problem here is that if the RMI port is in use, then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case
it doesn't happen.

	at java.util.concurrent.CountDownLatch.await(java.base@15-internal/CountDownLatch.java:232)
	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
	at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
	at jdk.test.lib.thread.XRun.run(XRun.java:40)
	at java.lang.Thread.run(java.base@15-internal/Thread.java:832)
	at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
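
To make the failure mode concrete, here is a minimal sketch of a startup wait that cannot hang this way. It is not the code from the webrev and does not use ProcessTools; the class StartupWatcher, its Outcome enum, and the "Port already in use" text it matches (taken from the exception message in the bug title) are assumptions made for illustration. The point is that the loop treats end of output, which is what the reader sees once the child has exited, as a result in its own right rather than waiting only for the success line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration only: not the fix from the webrev.
final class StartupWatcher {

    enum Outcome { STARTED, PORT_IN_USE, EXITED_WITHOUT_MARKER }

    // Reads the child's output line by line until it sees the success marker,
    // a port-conflict message, or end of stream (the child closed its output,
    // typically because it exited).
    static Outcome awaitStartup(BufferedReader output, String startedMarker) {
        try {
            String line;
            while ((line = output.readLine()) != null) {
                if (line.contains(startedMarker)) {
                    return Outcome.STARTED;
                }
                if (line.contains("Port already in use")) {
                    return Outcome.PORT_IN_USE;   // caller can retry with another port
                }
            }
            return Outcome.EXITED_WITHOUT_MARKER;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample text for the demo, not a real jstatd log.
        String failing = "java.rmi.server.ExportException: Port already in use: 1099\n";
        BufferedReader reader = new BufferedReader(new StringReader(failing));
        System.out.println(awaitStartup(reader, "jstatd started (bound to "));
    }
}
```

With a real child process the reader would wrap process.getInputStream(), with stderr merged in via ProcessBuilder.redirectErrorStream(true) so the exception text is visible on the same stream.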

Testing: Mach5 tests for sun/tools/jstatd/ successfully passed.  Tier1-tier3 tests are still in progress.

[1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
[2] https://bugs.openjdk.java.net/browse/JDK-8240711 


Thank you,
Daniil


From igor.ignatyev at oracle.com  Mon Mar 16 19:13:00 2020
From: igor.ignatyev at oracle.com (Igor Ignatev)
Date: Mon, 16 Mar 2020 12:13:00 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>
References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>
Message-ID: <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com>


> On Mar 16, 2020, at 11:43 AM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
> 
> 
>> On 3/16/20 11:26, Chris Plummer wrote:
>> I had to make another change. TestMutuallyExclusivePlatformPredicates.java failed when I ran tier 3. I had fixed it a long while back due to Platform.shouldSAAttach() being removed, but there were more changes to Platform.java after that which I didn't account for. isRoot() was added and canPtraceAttachLinux() was made public. So this is what the diff looks like now:
>> 
>> --- a/test/hotspot/jtreg/testlibrary_tests/TestMutuallyExclusivePlatformPredicates.java
>> +++ b/test/hotspot/jtreg/testlibrary_tests/TestMutuallyExclusivePlatformPredicates.java
>> @@ -51,9 +51,9 @@
>>          VM_TYPE("isClient", "isServer", "isMinimal", "isZero", "isEmbedded"),
>>          MODE("isInt", "isMixed", "isComp"),
>>          IGNORED("isEmulatedClient", "isDebugBuild", "isFastDebugBuild",
>> -                "isSlowDebugBuild", "hasSA", "shouldSAAttach", "isTieredSupported",
>> +                "isSlowDebugBuild", "hasSA", "canPtraceAttachLinux", "isTieredSupported",
>>                  "areCustomLoadersSupportedForCDS", "isDefaultCDSArchiveSupported",
>> -                "isSignedOSX");
>> +                "isSignedOSX", "isRoot");
>> 
>> However, I'm thinking maybe I should just move canPtraceAttachLinux() to SATestUtils.java since that's the only user, and it is an SA specific API. What do you think?
> 
> The approach to localize canPtraceAttachLinux() in SATestUtils.java sounds right to me if it is an SA specific API.
> 
+1
-- Igor

> Thanks,
> Serguei
> 
>> 
>> Chris
>> 
>>> On 3/15/20 10:22 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Chris,
>>> 
>>> Looks good.
>>> Thank you for update!
>>> 
>>> Thanks,
>>> Serguei
>>> 
>>> 
>>>> On 3/15/20 17:47, Chris Plummer wrote:
>>>> I changed them all to "SA Attach" and grepped to make sure there are no other occurrences of "SA attach".
>>>> 
>>>> thanks,
>>>> 
>>>> Chris
>>>> 
>>>>> On 3/15/20 4:49 PM, Igor Ignatyev wrote:
>>>>> Hi Chris,
>>>>> 
>>>>> looks good, thanks!
>>>>> 
>>>>> one minor nit, in SATestUtils::skipIfCannotAttach, you have two exception messages which start with 'SA attach not expected to work.', and one w/ 'SA Attach not expected to work.' (w/ Attach instead of attach), it'd be nicer to have them uniform. 
>>>>> 
>>>>> Cheers,
>>>>> -- Igor
>>>>> 
>>>>>> On Mar 15, 2020, at 4:35 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
>>>>>> 
>>>>>> Hi Igor,
>>>>>> 
>>>>>> Thanks for the review. Here's an updated webrev with all of the suggestions from you and Serguei:
>>>>>> 
>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.01/index.html
>>>>>> 
>>>>>> Also some comments inline below.
>>>>>> 
>>>>>>> On 3/13/20 9:26 AM, Igor Ignatyev wrote:
>>>>>>> HI Chris,
>>>>>>> 
>>>>>>> overall looks good to me, a few comments though:
>>>>>>> 1. since you removed vm.hasSAandCanAttach from VMProps, you also need to remove it from all TEST.ROOT files which mention it (test/jdk/TEST.ROOT and test/hotspot/jtreg/TEST.ROOT) so people won't be confused by undefined property and jtreg will be able to properly report invalid usages of it if any.
>>>>>> Ok, but it's unclear to me what requires.properties is even for, and what is the impact of extra or missing properties. What kind of test would catch these errors?
>>>>> jtreg uses 'requires.properties' as a list of extra variables for @requires expressions. If a test uses a name which isn't in 'requires.properties' (or known to jtreg), jtreg won't execute that test and will set its status to Error w/ an 'invalid name ...' message.
>>>>> 
>>>>>>> 
>>>>>>> 2. in SATestUtils::canAddPrivileges, could you please add some meaningful message to the RuntimeException at L#102?
>>>>>>> 
>>>>>> Ok.
>>>>>> 
>>>>>>            throw new RuntimeException("sudo process interrupted", e);
>>>>>> 
>>>>>>> 3. SATestUtils::checkAttachOk method name is somewhat misleading (hence you had to put comment every time you used it), I'd recommend you to rename to smth like skipIfCannotAttach().
>>>>>> Ok, but I still left the comment in place.
>>>>>>> 
>>>>>>> 4. in SATestUtils::checkAttachOk's javadoc, it would be better to use @throws tag like:
>>>>>>>> +    /**
>>>>>>>> +     * Checks if SA Attach is expected to work.
>>>>>>>> +     * @throws SkippedException if SA Attach is not expected to work.
>>>>>>>> +     */
>>>>>>> 
>>>>>>> 
>>>>>> Ok.
>>>>>>> 5. it also might make sense to catch IOException within SATestUtils::checkAttachOk and throw it as Error or RuntimeException.
>>>>>>> 
>>>>>> Ok.
>>>>>>> I've briefly looked at all the changed tests and they look good.
>>>>>> 
>>>>>> Thanks!
>>>>>> 
>>>>>> Chris
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> -- Igor 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Mar 12, 2020, at 11:06 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Serguei,
>>>>>>>> 
>>>>>>>> Thanks for the review!
>>>>>>>> 
>>>>>>>> Can I get one more reviewer please?
>>>>>>>> 
>>>>>>>> thanks,
>>>>>>>> 
>>>>>>>> Chris
>>>>>>>> 
>>>>>>>> On 3/12/20 12:06 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 3/12/20 00:03, Chris Plummer wrote:
>>>>>>>>>> Hi Serguei,
>>>>>>>>>> 
>>>>>>>>>> That check used to be in Platform.shouldSAAttach(), which essentially was moved to SATestUtils.checkAttachOk() and reworked some. It was necessary in Platform.shouldSAAttach() since that was used to evaluate vm.hasSAandCanAttach (which is now gone). When I moved everything to SATestUtils.checkAttachOk(), I recall thinking it wasn't really necessary since all tests that call it should have @requires vm.hasSA, but left it in anyway just to be extra safe. I'm still inclined to just leave it in, but would not be opposed to removing it.
>>>>>>>>> 
>>>>>>>>> I agree, it is safer to keep it, at least for now.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>> 
>>>>>>>>>> thanks,
>>>>>>>>>> 
>>>>>>>>>> Chris
>>>>>>>>>> 
>>>>>>>>>> On 3/11/20 11:20 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>> 
>>>>>>>>>>> I've made another pass today.
>>>>>>>>>>> It looks good to me.
>>>>>>>>>>> 
>>>>>>>>>>> I have just one minor question.
>>>>>>>>>>> 
>>>>>>>>>>> There is some overlap between the requires vm.hasSA check and checkAttachOk:
>>>>>>>>>>> +    public static  void checkAttachOk() throws IOException {
>>>>>>>>>>> +        if (!Platform.hasSA()) {
>>>>>>>>>>> +            throw new SkippedException("SA not supported.");
>>>>>>>>>>> +        }
>>>>>>>>>>> In the former case, the test is not run but in the latter the SkippedException is thrown.
>>>>>>>>>>> As I see, all tests with the checkAttachOk call use requires vm.hasSA as well.
>>>>>>>>>>> It can be that the first check "if (!Platform.hasSA())" in the checkAttachOk is redundant.
>>>>>>>>>>> It is okay and safer in general but generates a little confusion.
>>>>>>>>>>> I'm okay if you don't do anything with this but wanted to know your view.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On 3/10/20 18:57, Chris Plummer wrote:
>>>>>>>>>>>> On 3/10/20 6:07 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Overall, this looks like the right direction to me, though it is not easy to verify all the details.
>>>>>>>>>>>> Yes, there are a lot of tests with quite a few different types of changes. I did a lot of testing and verified that when the tests pass, they pass for the right reasons (really ran the test, skipped due to lack of privileges, or skipped due to running signed on OSX 10.14 or later). I also verified locally running as root, running with a cached sudo, and running without sudo.
>>>>>>>>>>>>> I'll make another pass tomorrow. 
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>> 
>>>>>>>>>>>>> A couple of quick nits so far:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html
>>>>>>>>>>>>>  import jdk.test.lib.Utils;
>>>>>>>>>>>>> -import jdk.test.lib.Asserts;
>>>>>>>>>>>>> +import jdk.test.lib.SA.SATestUtils;
>>>>>>>>>>>>> Need to swap these imports.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> Ok
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html
>>>>>>>>>>>>>   48         if (SATestUtils.needsPrivileges()) {
>>>>>>>>>>>>>   49             cmdStringList = SATestUtils.addPrivileges(cmdStringList);
>>>>>>>>>>>>> The method calls are local, so the class name can be omitted in the method names:
>>>>>>>>>>>>>   SATestUtils.needsPrivileges and SATestUtils.addPrivileges.
>>>>>>>>>>>> Ok
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>   94        try {
>>>>>>>>>>>>>   95            if (echoProcess.waitFor(60, TimeUnit.SECONDS) == false) {
>>>>>>>>>>>>>   96                // Due to using the "-n" option, sudo should complete almost immediately. 60 seconds
>>>>>>>>>>>>>   97                // is more than generous. If it didn't complete in that time, something went very wrong.
>>>>>>>>>>>>>   98                echoProcess.destroyForcibly();
>>>>>>>>>>>>>   99                throw new RuntimeException("Timed out waiting for sudo to execute.");
>>>>>>>>>>>>>  100            }
>>>>>>>>>>>>>  101         } catch (InterruptedException e) {
>>>>>>>>>>>>>  102            throw new RuntimeException(e);
>>>>>>>>>>>>>  103         }
>>>>>>>>>>>>> The lines 101/103 are misaligned.
>>>>>>>>>>>> Ok.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> 
>>>>>>>>>>>> Chris
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 3/9/20 19:29, Chris Plummer wrote:
>>>>>>>>>>>>>> Hi, 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Please help review the following: 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238268 
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/ 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'll try to give enough background first to make it easier to understand the changes. On OSX you must run SA tests that attach to a live process as root or using sudo. For example: 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>   sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Whether running as root or under sudo, the check to allow the test to run is done with: 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>     private static boolean canAttachOSX() { 
>>>>>>>>>>>>>>           return userName.equals("root"); 
>>>>>>>>>>>>>>     } 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Any test using "@requires vm.hasSAandCanAttach" must pass this check via Platform.shouldSAAttach(), which for OSX returns: 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>              return canAttachOSX() && !isSignedOSX(); 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> So if running as root the "@requires vm.hasSAandCanAttach" passes, otherwise it does not. However, using a root login to run tests is not very desirable, nor is issuing a "sudo make run-test" (any created file ends up with root ownership). Because of this, support was previously added for running just the attaching process using sudo, not the entire test. This was only done for the 20 or so tests that use ClhsdbLauncher. These tests use "@requires vm.hasSA", and then while running the test will do a "sudo" check if canAttachOSX() returns false: 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>         if (!Platform.shouldSAAttach()) { 
>>>>>>>>>>>>>>             if (Platform.isOSX()) { 
>>>>>>>>>>>>>>                 if (Platform.isSignedOSX()) { 
>>>>>>>>>>>>>>                     throw new SkippedException("SA attach not expected to work. JDK is signed."); 
>>>>>>>>>>>>>>                 } else if (SATestUtils.canAddPrivileges()) {
>>>>>>>>>>>>>>                     needPrivileges = true;
>>>>>>>>>>>>>>                 } 
>>>>>>>>>>>>>>             } 
>>>>>>>>>>>>>>             if (!needPrivileges)  { 
>>>>>>>>>>>>>>                // Skip the test if we don't have enough permissions to attach 
>>>>>>>>>>>>>>                // and cannot add privileges. 
>>>>>>>>>>>>>>                throw new SkippedException( 
>>>>>>>>>>>>>>                    "SA attach not expected to work. Insufficient privileges."); 
>>>>>>>>>>>>>>            } 
>>>>>>>>>>>>>>         } 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> So basically it does a runtime check of vm.hasSAandCanAttach, and if it fails then checks if running with sudo will work. This allows for either a passwordless sudo to be used when running clhsdb, or for the user to be prompted for the sudo password (note I've removed support for the latter with my changes). 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> That brings us to the CR that is being fixed. ClhsdbLauncher tests support sudo and will therefore run with our CI testing on OSX, but the 25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and therefore are never run with our CI OSX testing. The changes in this webrev fix that. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> There are two possible approaches to the fix. One is having the check for sudo be done as part of the vm.hasSAandCanAttach evaluation. The other approach is to do the check in the test at runtime, similar to how ClhsdbLauncher currently does. This would mean just using "@requires vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach". I chose the latter because there is an advantage to throwing SkippedException rather than just silently skipping the test using @requires. The advantage is that mdash tells you how many tests were skipped, and when you hover over the reason you can see the SkippedException message, which will differentiate between reasons like the JDK was signed or there are insufficient privileges. If all the checking was done by the vm.hasSAandCanAttach evaluation, you would not know why the test wasn't run. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The "support" related changes made are all in the following 3 files. The rest of the changes are in the tests: 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> test/jtreg-ext/requires/VMProps.java 
>>>>>>>>>>>>>> test/lib/jdk/test/lib/Platform.java 
>>>>>>>>>>>>>> test/lib/jdk/test/lib/SA/SATestUtils.java 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> You'll notice that one change I made to the sudo support in SATestUtils.canAddPrivileges() is to make sudo non-interactive, which means no password prompt. So that means either the user does not require a password, or the credentials have been cached. Otherwise the sudo check will fail. On most platforms if you execute a sudo command, the credentials are cached for 5 minutes. So if your user is not set up for passwordless sudo, then a sudo command can be issued before running the tests, and will likely remain cached until the test is run. The reason for not prompting is that a prompt in the middle of running tests can be confusing (you usually walk away once launching the tests and miss the prompt anyway), and it avoids unnecessary delays in automated testing due to waiting for the password prompt to time out (it used to wait 1 minute). 
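
As an aside on the non-interactive sudo check described in the paragraph above, a minimal sketch could look like the following. It is not the SATestUtils code from the webrev: the class name SudoProbe and the helper runsQuietly are invented for this example, the probe command "true" is an arbitrary choice, and the 60 second limit simply mirrors the timeout quoted elsewhere in this thread. "sudo -n" is the standard non-interactive option: it fails right away instead of prompting when a password would be required.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: not the SATestUtils implementation.
final class SudoProbe {

    // Runs a command with its output discarded and reports whether it exited
    // with status 0 within the timeout. Any failure to launch counts as "no".
    static boolean runsQuietly(List<String> command, long timeoutSeconds) {
        ProcessBuilder pb = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD);
        try {
            Process p = pb.start();
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                p.destroyForcibly();              // should have finished long ago
                return false;
            }
            return p.exitValue() == 0;
        } catch (IOException e) {
            return false;                         // e.g. the program is not installed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt for callers
            return false;
        }
    }

    // "-n" makes sudo fail instead of prompting, so this returns true only with
    // passwordless sudo or while credentials are still cached.
    static boolean canSudoWithoutPrompt() {
        return runsQuietly(List.of("sudo", "-n", "true"), 60);
    }

    public static void main(String[] args) {
        System.out.println("sudo without prompt: " + canSudoWithoutPrompt());
    }
}
```

Treating every failure as "cannot add privileges" lets the caller fall through to a single skip path rather than distinguishing a missing sudo binary from a refused request.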
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> There are essentially 3 types of tests that SA Attach to a process, each needing a slightly different fix: 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 1. Tests that directly launch a jdk.hotspot.agent class, such as TestClassDump.java. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if needed. They also need to switch from using hasSAandCanAttach to using hasSA. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2. Tests that launch command line tools such as jhsdb. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.createProcessBuilder() to create a process that will be launched using sudo if necessary. They also need to switch from using hasSAandCanAttach to using hasSA. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 3. Tests that use ClhsdbLauncher. They already use hasSA instead of hasSAandCanAttach, and rely on ClhsdbLauncher to check at runtime if attaching will work, so for the most part all of these tests are unchanged. ClhsdbLauncher was modified to take advantage of the new SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk() APIs. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Some tests required special handling: 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java 
>>>>>>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - These two tests SA Attach to a core file, not to a process, so only need hasSA, 
>>>>>>>>>>>>>>   not hasSAandCanAttach. No other changes were needed. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - The output should never be null. If the test was skipped due to lack of privileges, you 
>>>>>>>>>>>>>>   would never get to this section of the test. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java 
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestIntConstant.java 
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java 
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestType.java 
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestUniverse.java 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - These are ClhsdbLauncher tests, so they should have been using hasSA instead of 
>>>>>>>>>>>>>>   hasSAandCanAttach in the first place. No other changes were needed. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java 
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java 
>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - These tests used to "@require mac" but seem to run fine on OSX, so I removed this requirement. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> test/jdk/sun/tools/jhsdb/BasicLauncherTest.java 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - This test had a runtime check to not run on OSX due to not having core file stack 
>>>>>>>>>>>>>>   walking support. However, this test always attaches to a process, not a core file, 
>>>>>>>>>>>>>>   and seems to run just fine on OSX. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> test/jdk/sun/tools/jstack/DeadlockDetectionTest.java 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - I changed the test to throw a SkippedException if it gets the unexpected error code 
>>>>>>>>>>>>>>   rather than just println. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> And a few other miscellaneous changes not already covered: 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> test/lib/jdk/test/lib/Platform.java 
>>>>>>>>>>>>>> - Made canPtraceAttachLinux() public so it can be called from SATestUtils. 
>>>>>>>>>>>>>> - vm.hasSAandCanAttach is now gone. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> thanks, 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Chris 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200316/5fc0a1ed/attachment-0001.htm>

From alexey.menkov at oracle.com  Mon Mar 16 23:02:08 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 16 Mar 2020 16:02:08 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
Message-ID: <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>

Hi Daniil,

Looks like the test is supposed to handle the "port in use" issue (see lines 
103-114).
I suppose that in the "port in use" case jstatd exits, but 
ProcessTools.startProcess() continues to wait for the "jstatd started" message.
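
To illustrate the hang I suspect, here is a rough sketch (not the actual ProcessTools code; the class and method names are made up for illustration) of a startup wait that also gives up when the child's output ends, which is what happens once the child exits:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Predicate;

// Hypothetical helper, not part of jdk.test.lib.
class StartupWatcher {
    /**
     * Waits until a line of the child's output matches 'started'.
     * Returns true on a match and false if the output ended first,
     * i.e. the child closed its stdout (typically because it exited).
     */
    static boolean awaitLine(Reader childOutput, Predicate<String> started,
                             long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        BufferedReader in = new BufferedReader(childOutput);
        Thread pump = new Thread(() -> {
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    if (started.test(line)) {
                        result.complete(true);
                    }
                    // Keep draining after a match so the child never blocks on a full pipe.
                }
                // End of output: if nothing matched, report that instead of waiting forever.
                result.complete(false);
            } catch (IOException e) {
                result.completeExceptionally(e);
            }
        });
        pump.setDaemon(true);
        pump.start();
        return result.get(timeout, unit);
    }
}
```

With something like this the caller gets false as soon as jstatd's output ends without the "jstatd started" line, and can then retry with another port.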

--alex

On 03/16/2020 12:00, Daniil Titov wrote:
> Please review the change [1] that fixes the intermittent failure of the test.
> 
> The problem here is that if the RMI port is in use then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case
> it doesn't happen.
> 
> 	at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232)
> 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
> 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
> 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
> 	at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
> 	at jdk.test.lib.thread.XRun.run(XRun.java:40)
> 	at java.lang.Thread.run(java.base at 15-internal/Thread.java:832)
> 	at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
> 
> Testing: Mach5 tests for sun/tools/jstatd/ successfully passed.  Tier1-tier3 tests are still in progress.
> 
> [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
> [2] https://bugs.openjdk.java.net/browse/JDK-8240711
> 
> 
> Thank you,
> Daniil
> 
> 
> 

From daniil.x.titov at oracle.com  Mon Mar 16 23:13:18 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 16 Mar 2020 16:13:18 -0700
Subject: 8240711: TestJstatdPort.java failed due to "ExportException: Port
 already in use:"
In-Reply-To: <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
Message-ID: <E615E29F-1285-4624-8FA7-5BB6237DF9F0@oracle.com>

Hi Alex,

Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use" 
case but at least for this specific test (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.

Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports
 might be subject to the "port in use" error and taking into account that it's hard to reproduce such case
I found it safer to leave the original code and just augment it with what was missing for this specific
 case rather than completely replacing it.

Best regards,
Daniil

On 3/16/20, 4:02 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:

    Hi Daniil,
    
    Looks like the test is supposed to handle the "port in use" issue (see lines 
    103-114).
    I suppose that in the "port in use" case jstatd exits, but 
    ProcessTools.startProcess() continues to wait for the "jstatd started" message.
    
    --alex
    
    On 03/16/2020 12:00, Daniil Titov wrote:
    > Please review the change [1] that fixes the intermittent failure of the test.
    > 
    > The problem here is that if the RMI port is in use then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case
    > it doesn't happen.
    > 
    > 	at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232)
    > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
    > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
    > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
    > 	at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
    > 	at jdk.test.lib.thread.XRun.run(XRun.java:40)
    > 	at java.lang.Thread.run(java.base at 15-internal/Thread.java:832)
    > 	at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
    > 
    > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed.  Tier1-tier3 tests are still in progress.
    > 
    > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
    > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
    > 
    > 
    > Thank you,
    > Daniil
    > 
    > 
    > 
    

From daniil.x.titov at oracle.com  Mon Mar 16 23:17:00 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 16 Mar 2020 16:17:00 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
Message-ID: <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>

Resending with the corrected subject ...

Hi Alex,

Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use"
case but at least for this specific test  (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.

Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports
might be subject to the "port in use" error and taking into account that it's hard to reproduce such case
I found it safer to leave the original code and just augment it with what was missing for this specific
case rather than completely replacing it.

Best regards,
Daniil

On 3/16/20, 4:02 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:

    Hi Daniil,
    
    Looks like the test is supposed to handle the "port in use" issue (see lines 
    103-114).
    I suppose that in the "port in use" case jstatd exits, but 
    ProcessTools.startProcess() continues to wait for the "jstatd started" message.
    
    --alex
    
    On 03/16/2020 12:00, Daniil Titov wrote:
    > Please review the change [1] that fixes the intermittent failure of the test.
    > 
    > The problem here is that if the RMI port is in use then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case
    > it doesn't happen.
    > 
    > 	at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232)
    > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
    > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
    > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
    > 	at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
    > 	at jdk.test.lib.thread.XRun.run(XRun.java:40)
    > 	at java.lang.Thread.run(java.base at 15-internal/Thread.java:832)
    > 	at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
    > 
    > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed.  Tier1-tier3 tests are still in progress.
    > 
    > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
    > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
    > 
    > 
    > Thank you,
    > Daniil
    > 
    > 
    > 
    

From alexey.menkov at oracle.com  Mon Mar 16 23:47:05 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 16 Mar 2020 16:47:05 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
Message-ID: <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>

I don't agree.
The code handles exactly the same "port in use" case for the same tool.
So it either works or it doesn't.
And having 2 code blocks which are supposed to do the same thing makes the code messy.
BTW did you test the change (I mean craft the test to get the "port in 
use" error)?
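
One way to force the error locally, as a rough sketch only (the class and method names are made up, and I'm assuming the failure comes from an RMI export onto a busy port): hold the port with a plain ServerSocket and then try to create an RMI registry on it.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.ExportException;

// Hypothetical reproducer, not part of the jstatd tests.
class PortInUseRepro {
    /**
     * Occupies an ephemeral port, then tries to export an RMI registry on it.
     * Returns the ExportException message, or null if the export
     * unexpectedly succeeded.
     */
    static String exportOnBusyPort() throws IOException {
        try (ServerSocket holder = new ServerSocket(0)) {
            int port = holder.getLocalPort();
            try {
                // The port is still held by 'holder', so this bind should fail.
                LocateRegistry.createRegistry(port);
                return null;
            } catch (ExportException e) {
                return e.getMessage();
            }
        }
    }
}
```

A test could hold the port the same way before launching jstatd with that port, to check that the retry path is really taken.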

--alex

On 03/16/2020 16:17, Daniil Titov wrote:
> Resending with the corrected subject ...
> 
> Hi Alex,
> 
> Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use"
> case but at least for this specific test  (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.
> 
> Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports
> might be subject to the "port in use" error and taking into account that it's hard to reproduce such case
> I found it safer to leave the original code and just augment it with what was missing for this specific
> case rather than completely replacing it.
> 
> Best regards,
> Daniil
> 
> On 3/16/20, 4:02 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:
> 
>      Hi Daniil,
>      
>      Looks like the test is supposed to handle the "port in use" issue (see lines
>      103-114).
>      I suppose that in the "port in use" case jstatd exits, but
>      ProcessTools.startProcess() continues to wait for the "jstatd started" message.
>      
>      --alex
>      
>      On 03/16/2020 12:00, Daniil Titov wrote:
>      > Please review the change [1] that fixes the intermittent failure of the test.
>      >
>      > The problem here is that if the RMI port is in use then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case
>      > it doesn't happen.
>      >
>      > 	at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232)
>      > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
>      > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
>      > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
>      > 	at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
>      > 	at jdk.test.lib.thread.XRun.run(XRun.java:40)
>      > 	at java.lang.Thread.run(java.base at 15-internal/Thread.java:832)
>      > 	at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
>      >
>      > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed.  Tier1-tier3 tests are still in progress.
>      >
>      > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
>      > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
>      >
>      >
>      > Thank you,
>      > Daniil
>      >
>      >
>      >
>      
> 
> 

From chris.plummer at oracle.com  Tue Mar 17 00:11:12 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Mar 2020 17:11:12 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com>
References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>
 <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com>
Message-ID: <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200316/ad8c5347/attachment-0001.htm>

From igor.ignatyev at oracle.com  Tue Mar 17 00:14:34 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 16 Mar 2020 17:14:34 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com>
References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>
 <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com>
 <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com>
Message-ID: <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com>

Hi Chris,

does canPtraceAttachLinux have to be public?

otherwise, looks good to me.

-- Igor

> On Mar 16, 2020, at 5:11 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
> 
> Hi Serguei and Igor,
> 
> New webrev:
> 
> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.02/index.html
> 
> Only files changed were Platform.java and SATestUtils.java.
> 
> -Moved canPtraceAttachLinux() from Platform.java to SATestUtils.java
> -Changed Platform.canPtraceAttachLinux() reference in SATestUtils.java to just be canPtraceAttachLinux().
> -Had to change userName.equals("root") reference in canPtraceAttachLinux() to Platform.isRoot(). Probably should have been that way in the first place.
> -Made some adjustments to the imports
> 
> thanks,
> 
> Chris
> 
> On 3/16/20 12:13 PM, Igor Ignatev wrote:
>> 
>> 
>>> On Mar 16, 2020, at 11:43 AM, serguei.spitsyn at oracle.com wrote:
>>> 
>>> 
>>> On 3/16/20 11:26, Chris Plummer wrote:
>>>> I had to make another change. TestMutuallyExclusivePlatformPredicates.java failed when I ran tier 3. I had fixed it a long while back due to Platform.shouldSAAttach() being removed, but there were more changes to Platform.java after that which I didn't account for. isRoot() was added and canPtraceAttachLinux() was made public. So this is what the diff looks like now:
>>>> 
>>>> --- a/test/hotspot/jtreg/testlibrary_tests/TestMutuallyExclusivePlatformPredicates.java
>>>> +++ b/test/hotspot/jtreg/testlibrary_tests/TestMutuallyExclusivePlatformPredicates.java
>>>> @@ -51,9 +51,9 @@
>>>>          VM_TYPE("isClient", "isServer", "isMinimal", "isZero", "isEmbedded"),
>>>>          MODE("isInt", "isMixed", "isComp"),
>>>>          IGNORED("isEmulatedClient", "isDebugBuild", "isFastDebugBuild",
>>>> -                "isSlowDebugBuild", "hasSA", "shouldSAAttach", "isTieredSupported",
>>>> +                "isSlowDebugBuild", "hasSA", "canPtraceAttachLinux", "isTieredSupported",
>>>>                  "areCustomLoadersSupportedForCDS", "isDefaultCDSArchiveSupported",
>>>> -                "isSignedOSX");
>>>> +                "isSignedOSX", "isRoot");
>>>> 
>>>> However, I'm thinking maybe I should just move canPtraceAttachLinux() to SATestUtils.java since that's the only user, and it is an SA specific API. What do you think?
>>> 
>>> The approach to localize canPtraceAttachLinux() in SATestUtils.java sounds right to me if it is an SA specific API.
>>> 
>> +1
>> -- Igor
>> 
>>> Thanks,
>>> Serguei
>>> 
>>>> 
>>>> Chris
>>>> 
>>>> On 3/15/20 10:22 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Chris,
>>>>> 
>>>>> Looks good.
>>>>> Thank you for update!
>>>>> 
>>>>> Thanks,
>>>>> Serguei
>>>>> 
>>>>> 
>>>>> On 3/15/20 17:47, Chris Plummer wrote:
>>>>>> I changed them all to "SA Attach" and grepped to make sure there are no other occurrences of "SA attach".
>>>>>> 
>>>>>> thanks,
>>>>>> 
>>>>>> Chris
>>>>>> 
>>>>>> On 3/15/20 4:49 PM, Igor Ignatyev wrote:
>>>>>>> Hi Chris,
>>>>>>> 
>>>>>>> looks good, thanks!
>>>>>>> 
>>>>>>> one minor nit, in SATestUtils::skipIfCannotAttach, you have two exception messages which start with 'SA attach not expected to work.', and one w/ 'SA Attach not expected to work.' (w/ Attach instead of attach), it'd be nicer to have them uniform. 
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> -- Igor
>>>>>>> 
>>>>>>>> On Mar 15, 2020, at 4:35 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Igor,
>>>>>>>> 
>>>>>>>> Thanks for the review. Here's an updated webrev with all of the suggestions from you and Serguei:
>>>>>>>> 
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.01/index.html
>>>>>>>> 
>>>>>>>> Also some comments inline below.
>>>>>>>> 
>>>>>>>> On 3/13/20 9:26 AM, Igor Ignatyev wrote:
>>>>>>>>> HI Chris,
>>>>>>>>> 
>>>>>>>>> overall looks good to me, a few comments though:
>>>>>>>>> 1. since you removed vm.hasSAandCanAttach from VMProps, you also need to remove it from all TEST.ROOT files which mention it (test/jdk/TEST.ROOT and test/hotspot/jtreg/TEST.ROOT) so people won't be confused by undefined property and jtreg will be able to properly report invalid usages of it if any.
>>>>>>>> Ok, but it's unclear to me what requires.properties is even for, and what is the impact of extra or missing properties. What kind of test would catch these errors?
>>>>>>> jtreg uses 'requires.properties' as a list of extra variables for @requires expressions. If a test uses a name which isn't in 'requires.properties' (or otherwise known to jtreg), jtreg won't execute that test and will set its status to Error w/ an 'invalid name ...' message.
>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 2. in SATestUtils::canAddPrivileges, could you please add some meaningful message to the RuntimeException at L#102?
>>>>>>>>> 
>>>>>>>> Ok.
>>>>>>>> 
>>>>>>>>            throw new RuntimeException("sudo process interrupted", e);
>>>>>>>> 
>>>>>>>>> 3. SATestUtils::checkAttachOk method name is somewhat misleading (hence you had to put comment every time you used it), I'd recommend you to rename to smth like skipIfCannotAttach().
>>>>>>>> Ok, but I still left the comment in place.
>>>>>>>>> 
>>>>>>>>> 4. in SATestUtils::checkAttachOk's javadoc, it would be better to use @throws tag like:
>>>>>>>>>> +    /**
>>>>>>>>>> +     * Checks if SA Attach is expected to work.
>>>>>>>>>> +     * @throws SkippedException if SA Attach is not expected to work.
>>>>>>>>>> +     */
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> Ok.
>>>>>>>>> 5. it also might make sense to catch IOException within SATestUtils::checkAttachOk and throw it as Error or RuntimeException.
>>>>>>>>> 
>>>>>>>> Ok.
>>>>>>>>> I've briefly looked at all the changed tests and they look good.
>>>>>>>> 
>>>>>>>> Thanks!
>>>>>>>> 
>>>>>>>> Chris
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> -- Igor 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Mar 12, 2020, at 11:06 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Serguei,
>>>>>>>>>> 
>>>>>>>>>> Thanks for the review!
>>>>>>>>>> 
>>>>>>>>>> Can I get one more reviewer please?
>>>>>>>>>> 
>>>>>>>>>> thanks,
>>>>>>>>>> 
>>>>>>>>>> Chris
>>>>>>>>>> 
>>>>>>>>>> On 3/12/20 12:06 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On 3/12/20 00:03, Chris Plummer wrote:
>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>> 
>>>>>>>>>>>> That check used to be in Platform.shouldSAAttach(), which essentially was moved to SATestUtils.checkAttachOk() and reworked some. It was necessary in Platform.shouldSAAttach() since that was used to evaluate vm.hasSAandCanAttach (which is now gone). When I moved everything to SATestUtils.checkAttachOk(), I recall thinking it wasn't really necessary since all tests that call it should have @requires vm.hasSA, but left it in anyway just to be extra safe. I'm still inclined to just leave it in, but would not be opposed to removing it.
>>>>>>>>>>> 
>>>>>>>>>>> I agree, it is safer to keep it, at least for now.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>> 
>>>>>>>>>>>> thanks,
>>>>>>>>>>>> 
>>>>>>>>>>>> Chris
>>>>>>>>>>>> 
>>>>>>>>>>>> On 3/11/20 11:20 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I've made another pass today.
>>>>>>>>>>>>> It looks good to me.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I have just one minor question.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> There is some overlap between the requires vm.hasSA check and checkAttachOk:
>>>>>>>>>>>>> +    public static  void checkAttachOk() throws IOException {
>>>>>>>>>>>>> +        if (!Platform.hasSA()) {
>>>>>>>>>>>>> +            throw new SkippedException("SA not supported.");
>>>>>>>>>>>>> +        }
>>>>>>>>>>>>> In the former case, the test is not run but in the latter the SkippedException is thrown.
>>>>>>>>>>>>> As I see, all tests with the checkAttachOk call use requires vm.hasSA as well.
>>>>>>>>>>>>> It can be that the first check "if (!Platform.hasSA())" in the checkAttachOk is redundant.
>>>>>>>>>>>>> It is okay and more safe in general but generates little confusion.
>>>>>>>>>>>>> I'm okay if you don't do anything with this but wanted to know your view.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 3/10/20 18:57, Chris Plummer wrote:
>>>>>>>>>>>>>> On 3/10/20 6:07 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Overall, this looks as a right direction to me while it is not easy to verify all the details.
>>>>>>>>>>>>>> Yes, there are a lot of tests with quite a few different types of changes. I did a lot of testing and verified that when the tests pass, they pass for the right reasons (really ran the test, skipped due to lack of privileges, or skipped due to running signed on OSX 10.14 or later). I also verified locally running as root, running with a cached sudo, and running without sudo.
>>>>>>>>>>>>>>> I'll make another pass tomorrow. 
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> A couple of quick nits so far:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java.udiff.html
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java.udiff.html
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestHeapDumpForInvokeDynamic.java.udiff.html
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestInstanceKlassSizeForInterface.java.udiff.html
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackMixed.java.udiff.html
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/hotspot/jtreg/serviceability/sa/TestRevPtrsForInvokeDynamic.java.udiff.html
>>>>>>>>>>>>>>>  import jdk.test.lib.Utils;
>>>>>>>>>>>>>>> -import jdk.test.lib.Asserts;
>>>>>>>>>>>>>>> +import jdk.test.lib.SA.SATestUtils;
>>>>>>>>>>>>>>> Need to swap these imports.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Ok
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/test/lib/jdk/test/lib/SA/SATestUtils.java.frames.html
>>>>>>>>>>>>>>>   48         if (SATestUtils.needsPrivileges()) {
>>>>>>>>>>>>>>>   49             cmdStringList = SATestUtils.addPrivileges(cmdStringList);
>>>>>>>>>>>>>>> The method calls are local, so the class name can be omitted in the method names:
>>>>>>>>>>>>>>>   SATestUtils.needsPrivileges and SATestUtils.addPrivileges.
>>>>>>>>>>>>>> Ok
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>   94        try {
>>>>>>>>>>>>>>>   95            if (echoProcess.waitFor(60, TimeUnit.SECONDS) == false) {
>>>>>>>>>>>>>>>   96                // Due to using the "-n" option, sudo should complete almost immediately. 60 seconds
>>>>>>>>>>>>>>>   97                // is more than generous. If it didn't complete in that time, something went very wrong.
>>>>>>>>>>>>>>>   98                echoProcess.destroyForcibly();
>>>>>>>>>>>>>>>   99                throw new RuntimeException("Timed out waiting for sudo to execute.");
>>>>>>>>>>>>>>>  100            }
>>>>>>>>>>>>>>>  101         } catch (InterruptedException e) {
>>>>>>>>>>>>>>>  102            throw new RuntimeException(e);
>>>>>>>>>>>>>>>  103         }
>>>>>>>>>>>>>>> The lines 101/103 are misaligned.
>>>>>>>>>>>>>> Ok.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 3/9/20 19:29, Chris Plummer wrote:
>>>>>>>>>>>>>>>> Hi, 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Please help review the following: 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238268 
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238268/webrev.00/ 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I'll try to give enough background first to make it easier to understand the changes. On OSX you must run SA tests that attach to a live process as root or using sudo. For example: 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>   sudo make run-test TEST=serviceability/sa/ClhsdbJstackXcompStress.java 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Whether running as root or under sudo, the check to allow the test to run is done with: 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>     private static boolean canAttachOSX() { 
>>>>>>>>>>>>>>>>           return userName.equals("root"); 
>>>>>>>>>>>>>>>>     } 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Any test using "@requires vm.hasSAandCanAttach" must pass this check via Platform.shouldSAAttach(), which for OSX returns: 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>              return canAttachOSX() && !isSignedOSX(); 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> So if running as root the "@requires vm.hasSAandCanAttach" passes, otherwise it does not. However, using a root login to run tests is not very desirable, nor is issuing a "sudo make run-test" (any created file ends up with root ownership). Because of this, support was previously added for running just the attaching process using sudo, not the entire test. This was only done for the 20 or so tests that use ClhsdbLauncher. These tests use "@requires vm.hasSA", and then while running the test will do a "sudo" check if canAttachOSX() returns false: 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>         if (!Platform.shouldSAAttach()) { 
>>>>>>>>>>>>>>>>             if (Platform.isOSX()) { 
>>>>>>>>>>>>>>>>                 if (Platform.isSignedOSX()) { 
>>>>>>>>>>>>>>>>                     throw new SkippedException("SA attach not expected to work. JDK is signed."); 
>>>>>>>>>>>>>>>>                 } else if (SATestUtils.canAddPrivileges()) { 
>>>>>>>>>>>>>>>>                     needPrivileges = true; 
>>>>>>>>>>>>>>>>                 } 
>>>>>>>>>>>>>>>>             } 
>>>>>>>>>>>>>>>>             if (!needPrivileges)  { 
>>>>>>>>>>>>>>>>                // Skip the test if we don't have enough permissions to attach 
>>>>>>>>>>>>>>>>                // and cannot add privileges. 
>>>>>>>>>>>>>>>>                throw new SkippedException( 
>>>>>>>>>>>>>>>>                    "SA attach not expected to work. Insufficient privileges."); 
>>>>>>>>>>>>>>>>            } 
>>>>>>>>>>>>>>>>         } 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> So basically it does a runtime check of vm.hasSAandCanAttach, and if it fails then checks if running with sudo will work. This allows for either a passwordless sudo to be used when running clhsdb, or for the user to be prompted for the sudo password (note I've removed support for the latter with my changes). 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> That brings us to the CR that is being fixed. ClhsdbLauncher tests support sudo and will therefore run with our CI testing on OSX, but the 25 or so tests that use "@requires vm.hasSAandCanAttach" do not, and therefore are never run with our CI OSX testing. The changes in this webrev fix that. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> There are two possible approaches to the fix. One is having the check for sudo be done as part of the vm.hasSAandCanAttach evaluation. The other approach is to do the check in the test at runtime, similar to how ClhsdbLauncher currently does. This would mean just using "@requires vm.hasSA" for all the tests instead of "@requires vm.hasSAandCanAttach". I chose the latter because there is an advantage to throwing SkippedException rather than just silently skipping the test using @requires. The advantage is that mdash tells you how many tests were skipped, and when you hover over the reason you can see the SkippedException message, which will differentiate between reasons like the JDK was signed or there are insufficient privileges. If all the checking was done by the vm.hasSAandCanAttach evaluation, you would not know why the test wasn't run. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The "support" related changes made are all in the following 3 files. The rest of the changes are in the tests: 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> test/jtreg-ext/requires/VMProps.java 
>>>>>>>>>>>>>>>> test/lib/jdk/test/lib/Platform.java 
>>>>>>>>>>>>>>>> test/lib/jdk/test/lib/SA/SATestUtils.java 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> You'll notice that one change I made to the sudo support in SATestUtils.canAddPrivileges() is to make sudo non-interactive, which means no password prompt. So that means either the user does not require a password, or the credentials have been cached. Otherwise the sudo check will fail. On most platforms if you execute a sudo command, the credentials are cached for 5 minutes. So if your user is not set up for passwordless sudo, then a sudo command can be issued before running the tests, and will likely remain cached until the test is run. The reason for using passwordless sudo is that prompting in the middle of running tests can be confusing (you usually walk away once launching the tests and miss the prompt anyway), and it avoids unnecessary delays in automated testing due to waiting for the password prompt to time out (it used to wait 1 minute). 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> There are essentially 3 types of tests that SA Attach to a process, each needing a slightly different fix: 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 1. Tests that directly launch a jdk.hotspot.agent class, such as TestClassDump.java. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.addPrivilegesIfNeeded(pb) to get the sudo command added if needed. They also need to switch from using hasSAandCanAttach to using hasSA. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 2. Tests that launch command line tools such as jhsdb. They need to call SATestUtils.checkAttachOk() to verify that attaching will be possible, and then SATestUtils.createProcessBuilder() to create a process that will be launched using sudo if necessary. They also need to switch from using hasSAandCanAttach to using hasSA. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 3. Tests that use ClhsdbLauncher. They already use hasSA instead of hasSAandCanAttach, and rely on ClhsdbLauncher to check at runtime if attaching will work, so for the most part all of these tests are unchanged. ClhsdbLauncher was modified to take advantage of the new SATestUtils.createProcessBuilder() and SATestUtils.checkAttachOk() APIs. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Some tests required special handling: 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> - These two tests SA Attach to a core file, not to a process, so only need hasSA, 
>>>>>>>>>>>>>>>>   not hasSAandCanAttach. No other changes were needed. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> - The output should never be null. If the test was skipped due to lack of privileges, you 
>>>>>>>>>>>>>>>>   would never get to this section of the test. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestClhsdbJstackLock.java 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestIntConstant.java 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestType.java 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestUniverse.java 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> - These are ClhsdbLauncher tests, so they should have been using hasSA instead of 
>>>>>>>>>>>>>>>>   hasSAandCanAttach in the first place. No other changes were needed. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestCpoolForInvokeDynamic.java 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestDefaultMethods.java 
>>>>>>>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> - These tests used to "@require mac" but seem to run fine on OSX, so I removed this requirement. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> test/jdk/sun/tools/jhsdb/BasicLauncherTest.java 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> - This test had a runtime check to not run on OSX due to not having core file stack 
>>>>>>>>>>>>>>>>   walking support. However, this test always attaches to a process, not a core file, 
>>>>>>>>>>>>>>>>   and seems to run just fine on OSX. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> test/jdk/sun/tools/jstack/DeadlockDetectionTest.java 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> - I changed the test to throw a SkippedException if it gets the unexpected error code 
>>>>>>>>>>>>>>>>   rather than just println. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> And a few other miscellaneous changes not already covered: 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> test/lib/jdk/test/lib/Platform.java 
>>>>>>>>>>>>>>>> - Made canPtraceAttachLinux() public so it can be called from SATestUtils. 
>>>>>>>>>>>>>>>> - vm.hasSAandCanAttach is now gone. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> thanks, 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Chris 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200316/fb34fbd9/attachment-0001.htm>

From serguei.spitsyn at oracle.com  Tue Mar 17 00:20:35 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Mar 2020 17:20:35 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com>
References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>
 <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com>
 <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com>
 <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com>
Message-ID: <40a5458c-a2e3-5e37-688c-3f82ceb689a8@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200316/489a10ee/attachment-0001.htm>

From daniil.x.titov at oracle.com  Tue Mar 17 00:38:40 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 16 Mar 2020 17:38:40 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
Message-ID: <EB491449-6735-47E9-8CC9-147A3CAAD63D@oracle.com>

Hi Alex,

Yes, I did test the change by modifying the test to use an RMI port that is already in use
(the stack trace in the original email came from exactly this modified test) and then ensured that with the fix
this issue is properly handled.

I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case.
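
For anyone who wants to reproduce a "port in use" condition locally, here is a minimal sketch. It is not the change made to the test; it only assumes that a plain ServerSocket is an acceptable stand-in for the RMI registry port, and shows that a second bind to an occupied port fails with BindException. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical helper for illustration only; not part of the jstatd tests.
class PortInUseDemo {

    /** Returns true if binding a second socket to an occupied port fails. */
    static boolean secondBindFails() {
        // Port 0 asks the OS for any free port; the socket stays bound
        // until the outer try-with-resources block exits.
        try (ServerSocket first = new ServerSocket(0)) {
            int port = first.getLocalPort();
            try (ServerSocket second = new ServerSocket(port)) {
                return false; // unexpected: the port was not really occupied
            } catch (BindException expected) {
                return true;  // the "port in use" case
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind failed: " + secondBindFails());
    }
}
```

In a real test the occupied port number would then be handed to the tool under test, so that its startup fails the same way.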

Thanks!

Best regards,
Daniil


On 3/16/20, 4:47 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:

    I don't agree.
    The code handles exactly the same "port in use" case for the same tool.
    So it either works or doesn't.
    And having 2 code blocks which are supposed to do the same thing makes the code messy.
    BTW did you test the change (I mean, craft the test to get the "port in 
    use" error)?
    
    --alex
    
    On 03/16/2020 16:17, Daniil Titov wrote:
    > Resending with the corrected subject ...
    > 
    > Hi Alex,
    > 
    > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use"
    > case but at least for this specific test  (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.
    > 
    > Since there are multiple tests in the sun/tools/jstatd/* folder that use this class, different ports
    > might be subject to the "port in use" error, and such a case is hard to reproduce,
    > I found it safer to leave the original code and just augment it with what was missing for this specific
    > case rather than completely replacing it.
    > 
    > Best regards,
    > Daniil
    > 
    > On 3/16/20, 4:02 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:
    > 
    >      Hi Daniil,
    >      
    >      Looks like the test is supposed to handle the "port in use" issue (see lines
    >      103-114).
    >      I suppose that in the "port in use" case jstatd exits, but
    >      ProcessTools.startProcess() continues to wait for the "jstatd started" message.
    >      
    >      --alex
    >      
    >      On 03/16/2020 12:00, Daniil Titov wrote:
    >      > Please review the change [1] that fixes the intermittent failure of the test.
    >      >
    >      > The problem here is that if the RMI port is in use then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case
    >      > it never does.
    >      >
    >      > 	at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232)
    >      > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
    >      > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
    >      > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
    >      > 	at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
    >      > 	at jdk.test.lib.thread.XRun.run(XRun.java:40)
    >      > 	at java.lang.Thread.run(java.base at 15-internal/Thread.java:832)
    >      > 	at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
    >      >
    >      > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed.  Tier1-tier3 tests are still in progress.
    >      >
    >      > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
    >      > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
    >      >
    >      >
    >      > Thank you,
    >      > Daniil
    >      >
    >      >
    >      >
    >      
    > 
    > 
    

From chris.plummer at oracle.com  Tue Mar 17 01:16:29 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Mar 2020 18:16:29 -0700
Subject: RFR(L) 8238268: Many SA tests are not running on OSX because they
 do not attempt to use sudo when available
In-Reply-To: <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com>
References: <7bbb95be-1c85-d396-1b1a-c55e6a25e8a8@oracle.com>
 <247B7EA6-7BB5-4F6C-84C4-C110BAF8F063@oracle.com>
 <22798bb3-9800-fa1d-8668-dc0e95b6eccc@oracle.com>
 <556F1CD7-46ED-46D9-BAB9-DD099111D981@oracle.com>
Message-ID: <7f615584-81e2-e49a-0bfb-563adc1f5834@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200316/4f5a5ae3/attachment-0001.htm>

From dms at samersoff.net  Tue Mar 17 14:02:24 2020
From: dms at samersoff.net (Dmitry Samersoff)
Date: Tue, 17 Mar 2020 17:02:24 +0300
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
 <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
Message-ID: <9a14fff8-d4a0-83be-bbf1-cc7558e5c6b4@samersoff.net>

Hello Alexander,

The fix looks good to me.

-Dmitry

On 05.03.2020 17:27, Daniel Fuchs wrote:
> Hi Alexander,
> 
> Fixes to JMX & management agent are reviewed on
> serviceability-dev (added in To:) these days.
> 
> best regards,
> 
> -- daniel
> 
> On 05/03/2020 13:17, Alexander Scherbatiy wrote:
>> Hello,
>>
>> Could you review a small enhancement where the test CustomLauncherTest 
>> is updated to build the binary launcher file from the launcher.c file?
>> The file launcher.c is renamed to exelauncher.c to follow the naming 
>> convention for executable test files built by the JDK make system.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604
>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00
>>
>> The removal of the obsolete binary files from
>> sun/management/jmxremote/bootstrap/linux-* and solaris-* is not
>> included in the webrev. They need to be removed manually.
>>
>> The test passes on Ubuntu 18.04 x86-64, Solaris SPARC v9 11.2, and
>> Solaris x64 11.4 systems.
>>
>> The test is excluded on Windows and Mac OS X systems.
>>
>> Thanks,
>> Alexander.
> 


From igor.ignatyev at oracle.com  Tue Mar 17 17:11:01 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 17 Mar 2020 10:11:01 -0700
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
 <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
Message-ID: <DFCA7043-9F26-4FC6-940F-F448E2185151@oracle.com>

Hi Alexander,

overall looks good to me, I have a few comments though:
- you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH
- CustomLauncherTest::findLibjvm can be simplified by using Platform::jvmLibDir
- exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh; could you please update the comment?
- you have to add the /native flag to the @run action, otherwise jtreg won't exclude this test from runs where test.nativepath is unset
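
To make the last point concrete, here is a rough sketch of how a test might resolve a natively built launcher from the test.nativepath property. The helper class is hypothetical, and the executable name "launcher" (exelauncher.c with the "exe" prefix dropped) is an assumption about the build output, not something confirmed by the webrev.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; the real test may locate the launcher differently.
class NativeLauncherLocator {

    /** Resolves exeName inside nativePath, failing loudly if the path is unset. */
    static Path launcherPath(String nativePath, String exeName) {
        if (nativePath == null || nativePath.isEmpty()) {
            // Without /native on the @run action the test would reach this
            // point in runs that were started without a native path.
            throw new IllegalStateException("test.nativepath is not set");
        }
        return Paths.get(nativePath).resolve(exeName);
    }

    public static void main(String[] args) {
        // Falls back to the current directory only so the sketch runs standalone.
        String nativePath = System.getProperty("test.nativepath", ".");
        System.out.println(launcherPath(nativePath, "launcher"));
    }
}
```

With "@run main/native ..." in the test header, jtreg is expected to handle the unset case itself, so a check like the one above becomes a safety net rather than the primary guard.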

I also have a question regarding your statement that
>> The changes for obsolete binary files <...> are not included into the webrev. They needs to be removed manually.
Are you planning to remove these files as part of this patch?

Thanks,
-- Igor


> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs <daniel.fuchs at oracle.com> wrote:
> 
> Hi Alexander,
> 
> Fixes to JMX & management agent are reviewed on
> serviceability-dev (added in To:) these days.
> 
> best regards,
> 
> -- daniel
> 
> On 05/03/2020 13:17, Alexander Scherbatiy wrote:
>> Hello,
>> Could you review a small enhancement where the test CustomLauncherTest is updated to build the binary launcher file from the launcher.c file?
>> The file launcher.c is renamed to exelauncher.c to follow the naming convention for executable test files built by the JDK make system.
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604
>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00
>> The removal of the obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* is not included in the webrev. They need to be removed manually.
>> The test passes on Ubuntu 18.04 x86-64, Solaris SPARC v9 11.2, and Solaris x64 11.4 systems.
>> The test is excluded on Windows and Mac OS X systems.
>> Thanks,
>> Alexander.
> 


From daniil.x.titov at oracle.com  Tue Mar 17 18:40:32 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 17 Mar 2020 11:40:32 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
 <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
Message-ID: <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>

Hi Alex,

Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case.

Testing: Mach5 tests for sun/tools/jstatd/  successfully passed 100 times.  Tier1-tier3 tests successfully passed. 

[1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02  
[2] https://bugs.openjdk.java.net/browse/JDK-8240711

Thanks,
Daniil


From alexey.menkov at oracle.com  Tue Mar 17 18:58:48 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 17 Mar 2020 11:58:48 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
 <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
 <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
Message-ID: <8f18db0e-e988-ba8f-2f55-07584cb4b7e0@oracle.com>

LGTM

--alex

On 03/17/2020 11:40, Daniil Titov wrote:
> Hi Alex,
> 
> Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case.
> 
> Testing: Mach5 tests for sun/tools/jstatd/  successfully passed 100 times.  Tier1-tier3 tests successfully passed.
> 
> [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02
> [2] https://bugs.openjdk.java.net/browse/JDK-8240711
> 
> Thanks,
> Daniil
> 

From patricio.chilano.mateo at oracle.com  Tue Mar 17 20:14:14 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Tue, 17 Mar 2020 17:14:14 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
Message-ID: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>

Hi all,

Please review the following patch:

Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/

Calling closeConnection() on an already created/opened connection 
includes calls to CloseHandle() on objects that can still be used by 
other threads. This can lead to either undefined behavior or, as 
detailed in the bug comments, changes of state of unrelated objects. 
This issue was found while debugging the reason behind some jshell test 
failures seen after pushing 8230594. Less importantly, there are 
also calls to closeStream() from createStream()/openStream(), made when 
creating/opening a stream fails, that will return after executing 
"CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended 
resources. Then, calling closeConnection() could assert if the reason for 
the previous failure was that the stream's mutex failed to be 
created/opened. This patch aims to address these issues too.
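
As an illustration of the hazard only (the actual fix lives in the native shared memory transport and may work quite differently), the following Java sketch shows one common way to keep a close operation from racing with threads that are still using a resource. The class is hypothetical, and a read/write lock stands in for whatever synchronization the native code uses.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration; not the JDI shared memory connector code.
class GuardedHandle {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean closed = false; // guarded by lock

    /** Runs the action only while the handle is open; returns false otherwise. */
    boolean use(Runnable action) {
        lock.readLock().lock();      // many users may hold the read lock at once
        try {
            if (closed) {
                return false;        // never touch a handle that was closed
            }
            action.run();
            return true;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Waits until no thread is inside use(), then marks the handle closed. */
    void close() {
        lock.writeLock().lock();     // blocks while any user is still active
        try {
            closed = true;           // a real implementation would release the handle here
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The point of the sketch is the ordering: the release happens only after every in-flight user has left, and later users see the closed state instead of a stale handle.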

Tested in mach5 with the current baseline, tiers1-3 and several runs of 
open/test/langtools/:tier1, which includes the jshell tests where this 
connector is used. I also applied the patch 
http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
mentioned in the comments of the bug on top of the baseline and ran the 
langtools tests with and without this fix. Without the fix, around 
30 repetitions already show failures in tests 
jdk/jshell/FailOverExecutionControlTest.java and 
jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the fix 
I did several hundred runs and saw no failures. Let me know if there is 
any additional testing I should do.

As a side note, I see there are a couple of open issues related to 
jshell failures (8209848) which could be caused by this bug and 
therefore might be fixed by this patch.

Thanks,
Patricio


From chris.plummer at oracle.com  Wed Mar 18 04:52:59 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Mar 2020 21:52:59 -0700
Subject: RFR(XS) 8240906: Update ZGC ProblemList for
 serviceability/sa/TestJmapCoreMetaspace.java
Message-ID: <2545b3cf-3136-c990-dcfc-b6b6b7cc7d9f@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8240906

diff --git a/test/hotspot/jtreg/ProblemList-zgc.txt b/test/hotspot/jtreg/ProblemList-zgc.txt
--- a/test/hotspot/jtreg/ProblemList-zgc.txt
+++ b/test/hotspot/jtreg/ProblemList-zgc.txt
@@ -47,5 +47,5 @@
 serviceability/sa/TestJhsdbJstackLock.java 8220624   generic-all
 serviceability/sa/TestJhsdbJstackMixed.java 8220624   generic-all
 serviceability/sa/TestJmapCore.java 8220624   generic-all
-serviceability/sa/TestJmapCoreMetaspace.java 8219443   generic-all
+serviceability/sa/TestJmapCoreMetaspace.java 8220624   generic-all
 serviceability/sa/sadebugd/DebugdConnectTest.java 8220624   generic-all

8219443 [1] was closed as a dup of 8219405 [2], which is a very 
intermittent bug that occurs even without ZGC, so it should not be used to 
problem list this test for ZGC. However, the test should still be problem 
listed for ZGC due to 8220624 [3], just like TestJmapCore.java is.

[1] https://bugs.openjdk.java.net/browse/JDK-8219443
[2] https://bugs.openjdk.java.net/browse/JDK-8219405
[3] https://bugs.openjdk.java.net/browse/JDK-8220624

thanks,

Chris


From chris.plummer at oracle.com  Wed Mar 18 04:59:00 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Mar 2020 21:59:00 -0700
Subject: RFR(XS) 8227340: Modify problem list entry for
 javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java
Message-ID: <0c2e3ed0-e35c-c555-b8ed-ba64eb08f6ae@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8227340

diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt
+++ b/test/jdk/ProblemList.txt
@@ -587,7 +587,7 @@
 java/lang/management/ThreadMXBean/AllThreadIds.java 8131745 generic-all

 javax/management/monitor/DerivedGaugeMonitorTest.java 8042211 generic-all
-javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8042215 generic-all
+javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8227337 generic-all

8042215 [1] used to be the correct CR to problem list this test under, 
but it was accidentally used for the fix of a different bug. 8042215 [1] has 
now been cloned to 8227337 [2], so the problem list needs to be updated as well.

[1] https://bugs.openjdk.java.net/browse/JDK-8042215
[2] https://bugs.openjdk.java.net/browse/JDK-8227337

thanks,

Chris


From david.holmes at oracle.com  Wed Mar 18 05:11:27 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Mar 2020 15:11:27 +1000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To: <AM0PR02MB45007871015FB278E8406F5D9FFA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
References: <AM6PR02MB450135717FBE0EC6172A7E0D9F1C0@AM6PR02MB4501.eurprd02.prod.outlook.com>
 <01361a9d-2855-db67-a176-73731fada08f@oracle.com>
 <AM0PR02MB4500CE953024FA02ACFFC53A9F1B0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com>
 <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com>
 <d852dfe2-254b-d6c4-089b-13ffce8b8257@oracle.com>
 <AM0PR02MB4500BB43A864A446E094B23E9F100@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <c32fe0b4-af56-b5ec-da51-c720dba9b030@oracle.com>
 <e726a869-cc64-4d22-78f5-c77e702615e6@oss.nttdata.com>
 <AM0PR02MB4500D5AB3290D4CE47E718789F130@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <AM0PR02MB57144A77A755760F8BD4E75B8A120@AM0PR02MB5714.eurprd02.prod.outlook.com>
 <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com>
 <AM0PR02MB5714BFA6B42AF1AA86A3384D8AED0@AM0PR02MB5714.eurprd02.prod.outlook.com>
 <e8943a91-839b-3601-cc33-77f338aab96e@oracle.com>
 <AM0PR02MB45007871015FB278E8406F5D9FFA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
Message-ID: <d8b47052-67b6-5194-8fc2-a7f4802abbf6@oracle.com>

Hi Ralf,

On 13/03/2020 9:43 pm, Schmelter, Ralf wrote:
> Hi,
> 
> I have updated the webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.1/
> 
> It has the following significant changes:
> 
> - The jcmd now uses two separate flags. The -gz flag is now a boolean flag which toggles the compression on/off. And the new -gz-level flag can be used to change the compression level. I tried to change the jlong flag coding to allow the old behavior (only one flag, which acts both as a boolean flag and a jlong flag), but decided against it, since it changes the semantics of a jlong flag. And I don't expect the -gz-level flag to be used all that much.
> 
> - I no longer use my own threads. Instead I use the WorkGang returned from CollectedHeap::get_safepoint_workers(). This works fine, apart from Shenandoah GC, which runs into assertions when calling the CollectedHeap::object_iterate() method from a worker thread. I'm not sure if the assertion is too strong, but since the GC is currently experimental, I switch back to single threading in this case (as would be the case for serial GC or epsilon GC). Using the worker threads removes the problems the original code had regarding destruction of the monitor used.

I'm glad to see you are no longer using your own threads, and I 
apologise that I have not yet been able to look further into the thread 
lifecycle issues you encountered. However I'm not clear how this solves 
the problem of destroying the monitor while it may still be 
accessed - is the dumping occurring at a safepoint in the WorkGang threads?

Thanks,
David
-----

> - The reported number of bytes is now the one written to disk.
> 
> Best regards,
> Ralf
> 
> -----Original Message-----
> From: Ioi Lam <ioi.lam at oracle.com>
> Sent: Dienstag, 25. Februar 2020 18:03
> To: Langer, Christoph <christoph.langer at sap.com>; Schmelter, Ralf <ralf.schmelter at sap.com>; Yasumasa Suenaga <suenaga at oss.nttdata.com>; serguei.spitsyn at oracle.com; hotspot-runtime-dev at openjdk.java.net runtime <hotspot-runtime-dev at openjdk.java.net>
> Cc: serviceability-dev at openjdk.java.net
> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
> 
> Hi Christoph,
> 
> This sounds fair. I will remove my objection :-)
> 
> Thanks
> - Ioi
> 

From david.holmes at oracle.com  Wed Mar 18 05:16:47 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Mar 2020 15:16:47 +1000
Subject: RFR(XS) 8227340: Modify problem list entry for
 javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java
In-Reply-To: <0c2e3ed0-e35c-c555-b8ed-ba64eb08f6ae@oracle.com>
References: <0c2e3ed0-e35c-c555-b8ed-ba64eb08f6ae@oracle.com>
Message-ID: <0eb4d0c3-60fb-be22-238a-fd8f4b10ff9e@oracle.com>

Hi Chris,

On 18/03/2020 2:59 pm, Chris Plummer wrote:
> Hello,
> 
> Please review the following:
> 
> https://bugs.openjdk.java.net/browse/JDK-8227340
> 
> diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt
> --- a/test/jdk/ProblemList.txt
> +++ b/test/jdk/ProblemList.txt
> @@ -587,7 +587,7 @@
>  java/lang/management/ThreadMXBean/AllThreadIds.java 8131745 generic-all
> 
>  javax/management/monitor/DerivedGaugeMonitorTest.java 8042211 generic-all
> -javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8042215 generic-all
> +javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8227337 generic-all
> 
> 8042215 [1] used to be the correct CR to problem list this test under, 
> but it was accidentally used for the fix of a different bug. 8042215 [1] has 
> now been cloned to 8227337 [2], so the problem list needs to be updated 
> as well.

Okay. The bugs themselves are in a bit of a muddle but this issue is okay.

Thanks,
David

> [1] https://bugs.openjdk.java.net/browse/JDK-8042215
> [2] https://bugs.openjdk.java.net/browse/JDK-8227337
> 
> thanks,
> 
> Chris
> 
> 

From ralf.schmelter at sap.com  Wed Mar 18 06:39:36 2020
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Wed, 18 Mar 2020 06:39:36 +0000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To: <d8b47052-67b6-5194-8fc2-a7f4802abbf6@oracle.com>
References: <AM6PR02MB450135717FBE0EC6172A7E0D9F1C0@AM6PR02MB4501.eurprd02.prod.outlook.com>
 <01361a9d-2855-db67-a176-73731fada08f@oracle.com>
 <AM0PR02MB4500CE953024FA02ACFFC53A9F1B0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com>
 <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com>
 <d852dfe2-254b-d6c4-089b-13ffce8b8257@oracle.com>
 <AM0PR02MB4500BB43A864A446E094B23E9F100@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <c32fe0b4-af56-b5ec-da51-c720dba9b030@oracle.com>
 <e726a869-cc64-4d22-78f5-c77e702615e6@oss.nttdata.com>
 <AM0PR02MB4500D5AB3290D4CE47E718789F130@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <AM0PR02MB57144A77A755760F8BD4E75B8A120@AM0PR02MB5714.eurprd02.prod.outlook.com>
 <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com>
 <AM0PR02MB5714BFA6B42AF1AA86A3384D8AED0@AM0PR02MB5714.eurprd02.prod.outlook.com>
 <e8943a91-839b-3601-cc33-77f338aab96e@oracle.com>
 <AM0PR02MB45007871015FB278E8406F5D9FFA0@AM0PR02MB4500.eurprd02.prod.outlook.com>,
 <d8b47052-67b6-5194-8fc2-a7f4802abbf6@oracle.com>
Message-ID: <AM0PR02MB450049E339EC54D8990B08279FF70@AM0PR02MB4500.eurprd02.prod.outlook.com>

Hi David,

> However I'm not clear how this solves the problem of destroying
> the monitor while it can still be being accessed - is the dumping
> occurring at a safepoint in the WorkGang threads?

Because when the run_task() method returns, I can be sure none
of the work gang threads still use the mutex. They have to exit the
thread_loop() method to finish the task. And by exiting the method
they have released the mutex.
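
The argument above can be sketched in portable C++. This is only an
illustration of the lifetime reasoning, not the HotSpot code:
run_task_blocking() is a hypothetical stand-in for WorkGang::run_task(),
and std::mutex stands in for the monitor used by the dumper.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for WorkGang::run_task(): runs `task` on `workers`
// threads and returns only after every worker has finished.
template <typename Task>
void run_task_blocking(unsigned workers, Task task) {
    std::vector<std::thread> gang;
    for (unsigned id = 0; id < workers; ++id) {
        gang.emplace_back(task, id);
    }
    for (std::thread& t : gang) {
        t.join();  // after this loop no worker can still touch shared state
    }
}

// Returns the number of items recorded by all workers together.
int dump_with_workers(unsigned workers, int items_per_worker) {
    std::mutex lock;   // owned by the caller, like the dumper's monitor
    int written = 0;
    run_task_blocking(workers, [&](unsigned /*id*/) {
        for (int i = 0; i < items_per_worker; ++i) {
            std::lock_guard<std::mutex> guard(lock);  // released at end of scope
            ++written;
        }
    });
    // Every worker has left the task and released the mutex, so `lock`
    // can be destroyed when this function returns.
    return written;
}
```

Because the dispatcher does not return until all workers have left the
task, the mutex never outlives its last user.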

Best regards,
Ralf


From: David Holmes <david.holmes at oracle.com>
Sent: Wednesday, March 18, 2020 6:11 AM
To: Schmelter, Ralf <ralf.schmelter at sap.com>; Ioi Lam <ioi.lam at oracle.com>; Langer, Christoph <christoph.langer at sap.com>; Yasumasa Suenaga <suenaga at oss.nttdata.com>; serguei.spitsyn at oracle.com <serguei.spitsyn at oracle.com>; hotspot-runtime-dev at openjdk.java.net runtime <hotspot-runtime-dev at openjdk.java.net>
Cc: serviceability-dev at openjdk.java.net <serviceability-dev at openjdk.java.net>
Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump

Hi Ralf,

On 13/03/2020 9:43 pm, Schmelter, Ralf wrote:
> Hi,
> 
> I have updated the webrev:
> http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.1/
> 
> It has the following significant changes:
> 
> - The jcmd now uses two separate flags. The -gz flag is now a boolean
>   flag which toggles the compression on/off. And the new -gz-level flag
>   can be used to change the compression level. I tried to change the
>   jlong flag coding to allow the old behavior (only one flag, which acts
>   both as a boolean flag and a jlong flag), but decided against it, since
>   it changes the semantics of a jlong flag. And I don't expect the
>   -gz-level flag to be used all that much.
> 
> - I no longer use my own threads. Instead I use the WorkGang returned
>   from CollectedHeap::get_safepoint_workers(). This works fine, apart
>   from Shenandoah GC, which runs into assertions when calling the
>   CollectedHeap::object_iterate() method from a worker thread. I'm not
>   sure if the assertion is too strong, but since the GC is currently
>   experimental, I switch back to single threading in this case (as would
>   be the case for serial GC or epsilon GC). Using the worker threads
>   removes the problems the original code had regarding destruction of
>   the monitor used.

I'm glad to see you are no longer using your own threads, and I
apologise that I have not yet been able to look further into the thread
lifecycle issues you encountered. However I'm not clear how this solves
the problem of destroying the monitor while it can still be being
accessed - is the dumping occurring at a safepoint in the WorkGang threads?

Thanks,
David
-----


From david.holmes at oracle.com  Wed Mar 18 06:43:22 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Mar 2020 16:43:22 +1000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To: <AM0PR02MB450049E339EC54D8990B08279FF70@AM0PR02MB4500.eurprd02.prod.outlook.com>
References: <AM6PR02MB450135717FBE0EC6172A7E0D9F1C0@AM6PR02MB4501.eurprd02.prod.outlook.com>
 <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com>
 <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com>
 <d852dfe2-254b-d6c4-089b-13ffce8b8257@oracle.com>
 <AM0PR02MB4500BB43A864A446E094B23E9F100@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <c32fe0b4-af56-b5ec-da51-c720dba9b030@oracle.com>
 <e726a869-cc64-4d22-78f5-c77e702615e6@oss.nttdata.com>
 <AM0PR02MB4500D5AB3290D4CE47E718789F130@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <AM0PR02MB57144A77A755760F8BD4E75B8A120@AM0PR02MB5714.eurprd02.prod.outlook.com>
 <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com>
 <AM0PR02MB5714BFA6B42AF1AA86A3384D8AED0@AM0PR02MB5714.eurprd02.prod.outlook.com>
 <e8943a91-839b-3601-cc33-77f338aab96e@oracle.com>
 <AM0PR02MB45007871015FB278E8406F5D9FFA0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <d8b47052-67b6-5194-8fc2-a7f4802abbf6@oracle.com>
 <AM0PR02MB450049E339EC54D8990B08279FF70@AM0PR02MB4500.eurprd02.prod.outlook.com>
Message-ID: <05ae5818-f1b7-2e86-e4dd-c09ff240748e@oracle.com>

On 18/03/2020 4:39 pm, Schmelter, Ralf wrote:
> Hi David,
> 
>> However I'm not clear how this solves the problem of destroying
>> the monitor while it can still be being accessed - is the dumping
>> occurring at a safepoint in the WorkGang threads?
> 
> Because when the run_task() method returns, I can be sure none
> of the work gang threads still use the mutex. They have to exit the
> thread_loop() method to finish the task. And by exiting the method
> they have released the mutex.

All of which is happening via VM_HeapDumper::doit().

Got it.

Thanks,
David



From david.holmes at oracle.com  Wed Mar 18 07:27:30 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Mar 2020 17:27:30 +1000
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
Message-ID: <db15a209-c73d-c380-42e2-75e713392453@oracle.com>

Hi Patricio,

On 18/03/2020 6:14 am, Patricio Chilano wrote:
> Hi all,
> 
> Please review the following patch:
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
> 
> Calling closeConnection() on an already created/opened connection 
> includes calls to CloseHandle() on objects that can still be used by 
> other threads. This can lead to either undefined behavior or, as 
> detailed in the bug comments, changes of state of unrelated objects. 

This was a really great find!

> This issue was found while debugging the reason behind some jshell test 
> failures seen after pushing 8230594. Not as important, but there are 
> also calls to closeStream() from createStream()/openStream() when 
> failing to create/open a stream that will return after executing 
> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended 
> resources. Then, calling closeConnection() could assert if the reason for 
> the previous failure was that the stream's mutex failed to be 
> created/opened. This patch aims to address these issues too.

Patch looks good in general. The internal reference count guards 
deletion of the internal resources, and is itself safe because we never 
actually delete the connection. Thanks for adding the comment about this 
aspect.

A few items:

Please update copyright year before pushing.

Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way as 
STREAM_INVARIANT.

  170     unsigned int refcount;
  171     jint state;

I'm unclear about the use of stream->state and connection->state as 
guards - unless accessed under a mutex these would seem to need at least 
acquire/release semantics.

Additionally the reads of refcount would also seem to need some form 
of memory synchronization - though the Windows docs for the Interlocked* 
APIs do not show how to simply read such a variable! Though I note that 
the RtlFirstEntrySList method for the "Interlocked Singly Linked Lists" 
API does state "Access to the list is synchronized on a multiprocessor 
system." which suggests a read of such a variable does require some form 
of memory synchronization!
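
To make the concern concrete, here is a minimal sketch in portable C++
(std::atomic) rather than the Win32 Interlocked* API used by the patch.
The GuardedConnection struct, its fields and the state values are
hypothetical and only illustrate the idea: the state flag is published
with a release store and read with an acquire load, and the reference
count is read through an explicit atomic load instead of a plain read.

```cpp
#include <atomic>
#include <cassert>

const int STATE_CLOSED = 0;  // illustrative values, not those of the patch
const int STATE_OPEN = 1;

// Hypothetical, simplified record; not the layout used in the patch.
struct GuardedConnection {
    std::atomic<unsigned int> refcount{0};
    std::atomic<int> state{STATE_CLOSED};
    int payload = 0;  // plain data published together with `state`
};

void open_connection(GuardedConnection& c, int payload) {
    c.payload = payload;
    // Release: writes made before this store are visible to any thread
    // that later observes STATE_OPEN through an acquire load.
    c.state.store(STATE_OPEN, std::memory_order_release);
}

bool is_open(const GuardedConnection& c) {
    return c.state.load(std::memory_order_acquire) == STATE_OPEN;
}

void retain(GuardedConnection& c)  { c.refcount.fetch_add(1); }
void release(GuardedConnection& c) { c.refcount.fetch_sub(1); }

// An explicit atomic load is the portable way to "simply read" the counter.
unsigned int current_refcount(const GuardedConnection& c) {
    return c.refcount.load();
}
```

On Windows, one idiom sometimes used to read an interlocked variable is
InterlockedCompareExchange(&value, 0, 0), which returns the current value
without changing it; whether that fits the connector code is not
something this sketch tries to settle.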

  413     while (attempts>0) {

spaces around >

If the loop at 413 never encounters a zero reference_count then it 
doesn't close the events or the mutex but still returns SYS_OK. That 
seems wrong but I'm not sure what the right behaviour is here.
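
One possible shape for that loop, purely as a sketch of the question
being raised and not a statement of what the patch should do: report a
distinct status when the reference count never drops to zero. The
CloseResult codes, the PendingClose struct and close_when_unused() are
hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical status codes; the connector itself uses SYS_OK and friends.
enum CloseResult { CLOSE_OK, CLOSE_STILL_IN_USE };

struct PendingClose {
    std::atomic<unsigned int> refcount{0};
    bool handles_closed = false;  // stands in for closing events and mutex
};

// Tries up to `attempts` times to find the resources unused. Only then are
// they closed; otherwise the caller is told that nothing was released.
CloseResult close_when_unused(PendingClose& r, int attempts) {
    while (attempts > 0) {
        if (r.refcount.load() == 0) {
            r.handles_closed = true;
            return CLOSE_OK;
        }
        --attempts;
        std::this_thread::yield();  // give the other users a chance to leave
    }
    return CLOSE_STILL_IN_USE;
}
```

Returning a separate code at least lets the caller distinguish "closed"
from "gave up waiting"; what the caller should then do is the open
design question.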

And please wait for serviceability folk to review this.

Thanks,
David
-----

> Tested in mach5 with the current baseline, tiers1-3 and several runs of 
> open/test/langtools/:tier1 which includes the jshell tests where this 
> connector is used. I also applied patch 
> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
> mentioned in the comments of the bug, on top of the baseline and ran the 
> langtools tests with and without this fix. Without the fix, running around 
> 30 repetitions already shows failures in tests 
> jdk/jshell/FailOverExecutionControlTest.java and 
> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the fix 
> I ran several hundred runs and saw no failures. Let me know if there is 
> any additional testing I should do.
> 
> As a side note, I see there are a couple of open issues related with 
> jshell failures (8209848) which could be related to this bug and 
> therefore might be fixed by this patch.
> 
> Thanks,
> Patricio
> 

From stefan.karlsson at oracle.com  Wed Mar 18 09:00:54 2020
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 18 Mar 2020 10:00:54 +0100
Subject: RFR(XS) 8240906: Update ZGC ProblemList for
 serviceability/sa/TestJmapCoreMetaspace.java
In-Reply-To: <2545b3cf-3136-c990-dcfc-b6b6b7cc7d9f@oracle.com>
References: <2545b3cf-3136-c990-dcfc-b6b6b7cc7d9f@oracle.com>
Message-ID: <e15f694e-577d-5919-9c70-1d96fe6c4f94@oracle.com>

Looks good.

StefanK

On 2020-03-18 05:52, Chris Plummer wrote:
> Hello,
>
> Please review the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8240906
>
> diff --git a/test/hotspot/jtreg/ProblemList-zgc.txt b/test/hotspot/jtreg/ProblemList-zgc.txt
> --- a/test/hotspot/jtreg/ProblemList-zgc.txt
> +++ b/test/hotspot/jtreg/ProblemList-zgc.txt
> @@ -47,5 +47,5 @@
>  serviceability/sa/TestJhsdbJstackLock.java 8220624   generic-all
>  serviceability/sa/TestJhsdbJstackMixed.java 8220624   generic-all
>  serviceability/sa/TestJmapCore.java 8220624   generic-all
> -serviceability/sa/TestJmapCoreMetaspace.java 8219443 generic-all
> +serviceability/sa/TestJmapCoreMetaspace.java 8220624 generic-all
>  serviceability/sa/sadebugd/DebugdConnectTest.java 8220624 generic-all
>
> 8219443 [1] was closed as a dup of 8219405 [2], which is a very 
> intermittent bug that occurs even without ZGC, so should not be used 
> to problem list this test for ZGC. However it should be ZGC problem 
> listed due to 8220624 [3] just like TestJmapCore.java is.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8219443
> [2] https://bugs.openjdk.java.net/browse/JDK-8219405
> [3] https://bugs.openjdk.java.net/browse/JDK-8220624
>
> thanks,
>
> Chris
>


From alexander.scherbatiy at bell-sw.com  Wed Mar 18 11:57:28 2020
From: alexander.scherbatiy at bell-sw.com (Alexander Scherbatiy)
Date: Wed, 18 Mar 2020 14:57:28 +0300
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <DFCA7043-9F26-4FC6-940F-F448E2185151@oracle.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
 <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
 <DFCA7043-9F26-4FC6-940F-F448E2185151@oracle.com>
Message-ID: <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com>

Hello,

Could you review the updated fix:

  http://cr.openjdk.java.net/~alexsch/8240604/webrev.01

Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are added 
to the CustomLauncherTest.java test. I also included TEST_NATIVE_PATH to 
the Utils lib.

I have not found any history of the CustomLauncherTest.sh script in 
launcher.c, so I just updated the comment to "A minature launcher for use 
by CustomLauncherTest.java test" in the exelauncher.c file.


I wrote the comment about removing the linux-* and solaris-* binary 
files because it is not clear to me what the right way is to include 
removed binary files in a webrev.

Could I just use "hg remove binary-file" and run webrev to add the 
removed binary files to the webrev?


Thanks,

Alexander.

On 17.03.2020 20:11, Igor Ignatyev wrote:
> Hi Alexander,
>
> overall looks good to me, I have a few comments though:
> - you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH
> - CustomLauncherTest::findLibjvm can be simplified by using Platform::jvmLibDir
> - exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh, could you please update the comment?
> - you have to add the /native flag to the @run action, otherwise jtreg won't exclude this test from runs w/ test.nativepath being unset
>
> I also have a question regarding your statement that
>>> The changes for obsolete binary files <...> are not included into the webrev. They need to be removed manually.
> you are planning to remove these files as part of this patch, right?
>
> Thanks,
> -- Igor
>
>
>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs <daniel.fuchs at oracle.com> wrote:
>>
>> Hi Alexander,
>>
>> Fixes to JMX & management agent are reviewed on the
>> serviceability-dev (added in to:) these days.
>>
>> best regards,
>>
>> -- daniel
>>
>> On 05/03/2020 13:17, Alexander Scherbatiy wrote:
>>> Hello,
>>> Could you review a small enhancement where the test CustomLauncherTest is updated to build the binary launcher file from the launcher.c file.
>>> The file launcher.c is renamed to exelauncher.c to follow the naming convention for executable test files built by the JDK make system.
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604
>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00
>>> The changes for obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* are not included into the webrev. They need to be removed manually.
>>> The test passes on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and Solaris x64 11.4 systems.
>>> The test is excluded from Windows and Mac OS X systems.
>>> Thanks,
>>> Alexander.

From alexander.scherbatiy at bell-sw.com  Wed Mar 18 15:48:57 2020
From: alexander.scherbatiy at bell-sw.com (Alexander Scherbatiy)
Date: Wed, 18 Mar 2020 18:48:57 +0300
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
 <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
 <DFCA7043-9F26-4FC6-940F-F448E2185151@oracle.com>
 <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com>
Message-ID: <fb65a8f4-c727-c0e3-d724-aa924fd4d4cb@bell-sw.com>

On 18.03.2020 14:57, Alexander Scherbatiy wrote:

> Hello,
>
> Could you review the updated fix:
>
>   http://cr.openjdk.java.net/~alexsch/8240604/webrev.01
>
> Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are 
> added to the CustomLauncherTest.java test. I also included 
> TEST_NATIVE_PATH to the Utils lib.
>
> I have not found a history about CustomLauncherTest.sh script in 
> launcher.c so I just updated the comment as "A minature launcher for 
> use by CustomLauncherTest.java test" in the exelauncher.c file.

I also corrected the typo 'minature' to 'miniature'.

Thanks,

Alexander.


From igor.ignatyev at oracle.com  Wed Mar 18 16:00:49 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 18 Mar 2020 09:00:49 -0700
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
 <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
 <DFCA7043-9F26-4FC6-940F-F448E2185151@oracle.com>
 <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com>
Message-ID: <CC4C5454-0E13-44C5-9D5F-E34720110206@oracle.com>

Hi Alexander,

> I also included TEST_NATIVE_PATH to the Utils lib.
for the sake of clarity and ease of backporting, I'd prefer to have it added by a separate bug and commit.

> Could I just use "hg remove binary-file" and run webrev to add the removed binary files into webrev?
IIRC, webrev will just say 'a binary file got removed'; in any case I'll take it as a 'yes, I'm going to remove these files as part of 8240604', so thumbs up.

-- Igor
 



From alexander.scherbatiy at bell-sw.com  Wed Mar 18 16:54:27 2020
From: alexander.scherbatiy at bell-sw.com (Alexander Scherbatiy)
Date: Wed, 18 Mar 2020 19:54:27 +0300
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <CC4C5454-0E13-44C5-9D5F-E34720110206@oracle.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
 <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
 <DFCA7043-9F26-4FC6-940F-F448E2185151@oracle.com>
 <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com>
 <CC4C5454-0E13-44C5-9D5F-E34720110206@oracle.com>
Message-ID: <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com>

On 18.03.2020 19:00, Igor Ignatyev wrote:

> Hi Alexander,
>
>> I also included TEST_NATIVE_PATH to the Utils lib.
> for the sake of clarity and ease of backporting, I'd prefer to have it added by a separate bug and commit.

Here is the updated fix where TEST_NATIVE_PATH is not added to the Utils 
lib.

  http://cr.openjdk.java.net/~alexsch/8240604/webrev.02/


Thanks,

Alexander.


From rkennke at redhat.com  Wed Mar 18 16:57:26 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 18 Mar 2020 17:57:26 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
Message-ID: <e723b777-ec5d-1f1d-c7fe-c92b720dc223@redhat.com>

Hi Serguei,

Thanks for your review! A quick update on my progress:

The wrong condition was a good find! In fact, so much so that it led to the
whole implementation not reporting any unloaded classes. I changed that
back, and now it's so slow because of all those unloaded classes firing
events that I'm trying to understand where it's losing time.
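
For reference, the condition in question is the "not found" check after
walking a singly linked bucket list (quoted further down from
classTrack.c). A minimal stand-alone sketch of the corrected check, using
a simplified hypothetical Node type rather than the real KlassNode:

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for KlassNode; only the fields needed here.
struct Node {
    long tag;
    Node* next;
};

// Unlinks and returns the node carrying `tag`, or NULL if there is none.
Node* unlink_by_tag(Node** head, long tag) {
    Node** link = head;
    Node* node = *link;
    while (node != NULL && node->tag != tag) {
        link = &node->next;
        node = *link;
    }
    // Corrected check: bail out only when nothing was found. With
    // "node != NULL ||" this would return for every found node and
    // dereference NULL for every missing one.
    if (node == NULL || node->tag != tag) {
        return NULL;
    }
    *link = node->next;  // splice the node out of the list
    node->next = NULL;
    return node;
}
```

After the loop, node is either NULL or the match, so the second half of
the check is redundant but harmless.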

Some other findings:
- I can't keep the lock while calling into JVMTI, e.g. for GetTag() or
SetTag(), otherwise it risks a deadlock.
- The current implementation doesn't seem to report any unloaded classes
either (i.e. the bag returned by classTrack_processUnloads(JNIEnv *env)
is always empty or NULL), at least not in my testcase. I'm investigating
why this might be the case, or maybe I did something wrong.
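
The first finding is the usual "don't call out while holding a lock"
rule. A generic sketch of one way to respect it, assuming nothing about
the actual agent code: take a snapshot under the lock, release it, and
only then make the calls that may block or re-enter. PendingTags and its
methods are hypothetical; the callout parameter stands in for calls such
as GetTag()/SetTag().

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <vector>

class PendingTags {
public:
    void add(long tag) {
        std::lock_guard<std::mutex> guard(lock_);
        pending_.push_back(tag);
    }

    // Invokes `callout` once per pending tag WITHOUT holding lock_, so the
    // callout may block or call add() again without deadlocking.
    std::size_t drain(const std::function<void(long)>& callout) {
        std::vector<long> batch;
        {
            std::lock_guard<std::mutex> guard(lock_);
            batch.swap(pending_);  // take everything, leave the list empty
        }                          // lock released here
        for (long tag : batch) {
            callout(tag);
        }
        return batch.size();
    }

private:
    std::mutex lock_;
    std::vector<long> pending_;
};
```

The cost is that the callout sees a snapshot; entries added while
draining are handled by the next drain() call.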

Roman

> Sorry, forgot to complete my comments at the end (see below).
> 
> 
> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>> Hi Roman,
>>
>> Thank you for the update and sorry for the latency in review.
>>
>> Some comments are below.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>
>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>   88 {
>> 89 debugMonitorEnter(deletedSignatureLock);
>> 90 if (currentClassTag == -1) {
>> 91 // Class tracking not initialized, nobody's interested
>> 92 debugMonitorExit(deletedSignatureLock);
>> 93 return;
>>   94     }
>> Just a question:
>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>       the class tracking if class tracking has not been initialized?
>>
>> 70 static jlong currentClassTag;
>> I'm wondering whether a name like lastClassTag or highestClassTag
>> would be better.
>>
>> 99 KlassNode* klass = *klass_ptr;
>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103
>> klass_ptr = &klass->next; 104 klass = *klass_ptr;
>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // klass not
>> found - ignore.
>> 107 debugMonitorExit(deletedSignatureLock);
>> 108 return;
>>  109     }
>>  It seems to me that something is wrong in the condition at L106 above.
>>  Should it be:
>>      if (klass == NULL || klass->klass_tag != tag)
>>
>>  Otherwise, how can the second check ever work correctly, as the return
>> will always happen when (klass != NULL)?
>>
>>
>> There are several places in this file with the indent:
>> 90 if (currentClassTag == -1) {
>> 91 // Class tracking not initialized, nobody's interested
>> 92 debugMonitorExit(deletedSignatureLock);
>> 93 return;
>>   94     }
>>  ...
>> 152 if (currentClassTag == -1) {
>> 153 // Class tracking not initialized yet, nobody's interested
>> 154 debugMonitorExit(deletedSignatureLock);
>> 155 return;
>>  156     }
>>  ...
>> 161 if (error != JVMTI_ERROR_NONE) {
>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>  163     }
>> 164 if (tag != 0l) {
>> 165 debugMonitorExit(deletedSignatureLock);
>> 166 return; // Already added
>>  167     }
>>  ...
>> 281 cleanDeleted(void *signatureVoid, void *arg)
>> 282 {
>> 283 char* sig = (char*)signatureVoid;
>> 284 jvmtiDeallocate(sig);
>> 285 return JNI_TRUE;
>>  286 }
>>  ...
>>  291 void
>>  292 classTrack_reset(void)
>>  293 {
>> 294 int idx;
>> 295 debugMonitorEnter(deletedSignatureLock);
>> 296
>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>> 298 KlassNode* node = table[idx];
>> 299 while (node != NULL) {
>> 300 KlassNode* next = node->next;
>> 301 jvmtiDeallocate(node->signature);
>> 302 jvmtiDeallocate(node);
>> 303 node = next;
>> 304 }
>> 305 }
>> 306 jvmtiDeallocate(table);
>> 307
>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>> 309 bagDestroyBag(deletedSignatureBag);
>> 310
>> 311 currentClassTag = -1;
>> 312
>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>> 314 trackingEnv = NULL;
>> 315
>> 316 debugMonitorExit(deletedSignatureLock);
>>
>> Could you, please, fix several comments below?
>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>  The comma is not needed.
>>  Would it be better to replace: klass tags => klass_tag's ?
>>
>>
>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>> consistent
>>  Maybe: Lock to guard ... or lock to keep integrity of ...
>>
>> 84 * Callback when classes are freed, Finds the signature and
>> remembers it in deletedSignatureBag.
>> It would be better to use words like "store" or "record", and "Find"
>> should not start with a capital letter:
>> Invoke the callback when classes are freed, find and record the
>> signature in deletedSignatureBag.
>>
>> 96 // Find deleted KlassNode
>> 133 // Class tracking not initialized, nobody's interested
>> 153 // Class tracking not initialized yet, nobody's interested
>> 158 /* Check this is not a duplicate */
>> A dot is missing at the end.
>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>> Conversely, a dot is not needed here, as the comment does not start with a
>> capital letter.
>> 111 // At this point we have the KlassNode corresponding to the tag
>> 112 // in klass, and the pointer to it in klass_node.
> 
>  The comment above can be better. Maybe something like:
>    "At this point, we found the KlassNode matching the klass tag (and it is
> linked)."
> 
>> 113 // Remember the unloaded signature.
>  Better: Record the signature of the unloaded class and unlink it.
> 
> Thanks,
> Serguei
> 
>> Thanks,
>> Serguei 
>>
>> On 3/9/20 05:39, Roman Kennke wrote:
>>> Hello all,
>>>
>>> Can I please get reviews of this change? In the meantime, we've done
>>> more testing and also field-/torture-testing by a customer who is happy
>>> now. :-)
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>>> Hi Serguei,
>>>>
>>>> Thanks for reviewing!
>>>>
>>>> I updated the patch to reflect your suggestions, very good!
>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>> namely moving the setup of the trackingEnv and deletedSignatureBag to
>>>> _activate() to ensure we have those structures after re-connect.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>
>>>> Let me know what you think!
>>>> Roman
>>>>
>>>>> Hi Roman,
>>>>>
>>>>> Thank you for taking care about this scalability issue!
>>>>>
>>>>> I have a couple of quick comments.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>
>>>>> 72 /*
>>>>> 73 * Lock to protect deletedSignatureBag
>>>>> 74 */
>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>> 76
>>>>> 77 /*
>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>> 79 * deletedTagLock,
>>>>> 80 */
>>>>> 81 struct bag* deletedSignatureBag;
>>>>>
>>>>>   The comments contradict each other.
>>>>>   I guess the lock name at line 79 has to be deletedSignatureLock
>>>>> instead of deletedTagLock.
>>>>>   Also, the comma at the end must be replaced with a dot.
>>>>>
>>>>>
>>>>> 101 // Tag not found? Ignore.
>>>>> 102 if (klass == NULL) {
>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>> 104 return;
>>>>> 105 }
>>>>>  106 
>>>>> 107 // Scan linked-list.
>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>> 110 klass_ptr = &klass->next;
>>>>> 111 klass = *klass_ptr;
>>>>> 112 found_tag = klass->klass_tag;
>>>>>  113     }
>>>>> 114
>>>>> 115 // Tag not found? Ignore.
>>>>> 116 if (found_tag != tag) {
>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>> 118 return;
>>>>>  119     }
>>>>>
>>>>>
>>>>>  The code above can be simplified, so that lines 101-105 are not
>>>>> needed anymore.
>>>>>  It can be something like this:
>>>>>
>>>>> // Scan linked-list.
>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>> klass_ptr = &klass->next;
>>>>> klass = *klass_ptr;
>>>>>      }
>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>> debugMonitorExit(deletedSignatureLock);
>>>>> return;
>>>>>      }
>>>>>
>>>>> It will take more time when I get a chance to look at the rest.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>> Here comes an update that resolves some races that happen when
>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>> basically every operation, and also need to check whether or not
>>>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>>>> list) when we're not.
>>>>>>
>>>>>> Updated webrev:
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>> So, here comes the O(1) implementation:
>>>>>>>
>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>> indexed by tag % slot-count, and we simply prepend the new KlassNode*.
>>>>>>> This is an O(1) operation.
>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation
>>>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>>>> but not usually more. It should be ok.
>>>>>>> - when processUnloads() gets called, we simply hand out that bag, and
>>>>>>> allocate a new one.
>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or
>>>>>>> re-attached (was missing before).
>>>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>>>> before).
>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>>>> in the future?
>>>>>>>
>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>> The bottleneck now is clearly the actual synthesizing of the class-unload
>>>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>
>>>>>>> Updated webrev:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>
>>>>>>> Please let me know what you think of it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>>>
>>>>>>>> Thanks, Roman
>>>>>>>>
>>>>>>>>> Hi Chris,
>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>> Sure.
>>>>>>>>>
>>>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>
>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>>>>> and then adding new classes whenever a class gets loaded. When unloading
>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>> complexity.
>>>>>>>>>
>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>> prepared-classes-list) and their signatures put in the list that gets returned.
>>>>>>>>>
>>>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>
>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>>>> worth the effort).
>>>>>>>>>
>>>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>> Hello all,
>>>>>>>>>>>
>>>>>>>>>>> Issue:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>
>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>
>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>
>>>>>>>>>>> Webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>
>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>
>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>
>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>
>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>
> 

From igor.ignatyev at oracle.com  Wed Mar 18 17:02:04 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 18 Mar 2020 10:02:04 -0700
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
 <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
 <DFCA7043-9F26-4FC6-940F-F448E2185151@oracle.com>
 <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com>
 <CC4C5454-0E13-44C5-9D5F-E34720110206@oracle.com>
 <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com>
Message-ID: <BC07CAFB-0C57-4932-8818-08113A10856C@oracle.com>

+import static jdk.test.lib.Utils.TEST_CLASS_PATH;
I'm not a huge fan of 'import static', yet I don't insist on removing it either.

+            System.out.println("  libjvm    : " + jvmLibDir.toString());
jvmLibDir doesn't point to libjvm, so you need to either update the message prefix or use the actual value which will be used as the path to libjvm. I personally prefer the latter.
btw, you don't need to explicitly call toString in string concatenation.

-- Igor

> On Mar 18, 2020, at 9:54 AM, Alexander Scherbatiy <alexander.scherbatiy at bell-sw.com> wrote:
> 
> On 18.03.2020 19:00, Igor Ignatyev wrote:
> 
>> Hi Alexander,
>> 
>>> I also included TEST_NATIVE_PATH to the Utils lib.
>> for the sake of clarity and ease of backporting, I'd prefer to have it added by a separate bug and commit.
> 
> Here is the updated fix where TEST_NATIVE_PATH is not added to the Utils lib.
> 
>   http://cr.openjdk.java.net/~alexsch/8240604/webrev.02/
> 
> 
> Thanks,
> 
> Alexander.
> 
>>> Could I just use "hg remove binary-file" and run webrev to add the removed binary files into webrev?
>> IIRC, webrev will just say 'a binary file got removed'; in any case I'll take it as a 'yes, I'm going to remove these files as part of 8240604', so thumbs up.
>> 
>> -- Igor
>>  
>>> On Mar 18, 2020, at 4:57 AM, Alexander Scherbatiy <alexander.scherbatiy at bell-sw.com> wrote:
>>> 
>>> Hello,
>>> 
>>> Could you review the updated fix:
>>> 
>>>   http://cr.openjdk.java.net/~alexsch/8240604/webrev.01
>>> 
>>> Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are added to the CustomLauncherTest.java test. I also included TEST_NATIVE_PATH to the Utils lib.
>>> 
>>> I have not found a history about CustomLauncherTest.sh script in launcher.c so I just updated the comment as "A minature launcher for use by CustomLauncherTest.java test" in the exelauncher.c file.
>>> 
>>> 
>>> I wrote the comment about removing the linux-* and solaris-* binary files because it is not clear to me what the right way is to include removed binary files into webrev.
>>> 
>>> Could I just use "hg remove binary-file" and run webrev to add the removed binary files into webrev?
>>> 
>>> 
>>> Thanks,
>>> 
>>> Alexander.
>>> 
>>> On 17.03.2020 20:11, Igor Ignatyev wrote:
>>>> Hi Alexander,
>>>> 
>>>> overall looks good to me, I have a few comments though:
>>>>  - you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH
>>>> - CustomLauncherTest::findLibjvm can be simplified by using Platform::jvmLibDir
>>>> - exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh, could you please update the comment?
>>>> - you have to add /native flag to @run action, otherwise jtreg won't exclude this test from runs w/ test.nativepath being unset
>>>> 
>>>> I also have a question regarding your statement that
>>>>>> The changes for obsolete binary files <...> are not included into the webrev. They needs to be removed manually.
>>>> you are planning to remove these files as part of this patch, right?
>>>> 
>>>> Thanks,
>>>> -- Igor
>>>> 
>>>> 
>>>>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs <daniel.fuchs at oracle.com> wrote:
>>>>> 
>>>>> Hi Alexander,
>>>>> 
>>>>> Fixes to JMX & management agent are reviewed on the
>>>>> serviceability-dev list (added in To:) these days.
>>>>> 
>>>>> best regards,
>>>>> 
>>>>> -- daniel
>>>>> 
>>>>> On 05/03/2020 13:17, Alexander Scherbatiy wrote:
>>>>>> Hello,
>>>>>> Could you review a small enhancement where the test CustomLauncherTest is updated to build the binary launcher file from the launcher.c file.
>>>>>> The file launcher.c is renamed to exelauncher.c to follow the naming convention for executable test files built by the JDK make system.
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604
>>>>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00
>>>>>> The changes for obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* are not included into the webrev. They need to be removed manually.
>>>>>> The test passes on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and Solaris x64 11.4 systems.
>>>>>> The test is excluded from Windows and Mac OS X systems.
>>>>>> Thanks,
>>>>>> Alexander.

From chris.plummer at oracle.com  Wed Mar 18 17:16:33 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 10:16:33 -0700
Subject: RFR(XS) 8240906: Update ZGC ProblemList for
 serviceability/sa/TestJmapCoreMetaspace.java
In-Reply-To: <e15f694e-577d-5919-9c70-1d96fe6c4f94@oracle.com>
References: <2545b3cf-3136-c990-dcfc-b6b6b7cc7d9f@oracle.com>
 <e15f694e-577d-5919-9c70-1d96fe6c4f94@oracle.com>
Message-ID: <528e337b-9b7b-ffae-d28f-8baf03f6e3cd@oracle.com>

Thanks!

On 3/18/20 2:00 AM, Stefan Karlsson wrote:
> Looks good.
>
> StefanK
>
> On 2020-03-18 05:52, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8240906
>>
>> diff --git a/test/hotspot/jtreg/ProblemList-zgc.txt 
>> b/test/hotspot/jtreg/ProblemList-zgc.txt
>> --- a/test/hotspot/jtreg/ProblemList-zgc.txt
>> +++ b/test/hotspot/jtreg/ProblemList-zgc.txt
>> @@ -47,5 +47,5 @@
>>  serviceability/sa/TestJhsdbJstackLock.java 8220624 generic-all
>>  serviceability/sa/TestJhsdbJstackMixed.java 8220624 generic-all
>>  serviceability/sa/TestJmapCore.java 8220624   generic-all
>> -serviceability/sa/TestJmapCoreMetaspace.java 8219443 generic-all
>> +serviceability/sa/TestJmapCoreMetaspace.java 8220624 generic-all
>>  serviceability/sa/sadebugd/DebugdConnectTest.java 8220624 generic-all
>>
>> 8219443 [1] was closed as a dup of 8219405 [2], which is a very 
>> intermittent bug that occurs even without ZGC, so should not be used 
>> to problem list this test for ZGC. However it should be ZGC problem 
>> listed due to 8220624 [3] just like TestJmapCore.java is.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8219443
>> [2] https://bugs.openjdk.java.net/browse/JDK-8219405
>> [3] https://bugs.openjdk.java.net/browse/JDK-8220624
>>
>> thanks,
>>
>> Chris
>>
>


From daniil.x.titov at oracle.com  Wed Mar 18 17:20:39 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 18 Mar 2020 10:20:39 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <8f18db0e-e988-ba8f-2f55-07584cb4b7e0@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
 <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
 <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
 <8f18db0e-e988-ba8f-2f55-07584cb4b7e0@oracle.com>
Message-ID: <D7E3AEDD-9D8B-4873-8B5B-F02B292A2851@oracle.com>

Hi Alex,

Thank you for reviewing this change.

Best regards,
Daniil

On 3/17/20, 11:58 AM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:

    LGTM
    
    --alex
    
    On 03/17/2020 11:40, Daniil Titov wrote:
    > Hi Alex,
    > 
    > Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case.
    > 
    > Testing: Mach5 tests for sun/tools/jstatd/  successfully passed 100 times.  Tier1-tier3 tests successfully passed.
    > 
    > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02
    > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
    > 
    > Thanks,
    > Daniil
    > 
    > 
    > 
    > On 3/16/20, 5:38 PM, "Daniil Titov" <daniil.x.titov at oracle.com> wrote:
    > 
    >      Hi Alex,
    >      
    >      Yes, I did test the change by modifying the test to use an RMI port that is already in use
    >      (the stack trace in the original email was taken from this modified test) and then ensured that with the fix
    >      this issue is properly handled.
    >      
    >      I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case.
    >      
    >      Thanks!
    >      
    >      Best regards,
    >      Daniil
    >      
    >      
    >      
    >      
    >      On 3/16/20, 4:47 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:
    >      
    >          I don't agree.
    >          The code handles exactly the same "port in use" case for the same tool.
    >          So it either works or doesn't.
    >          And having 2 code blocks which are supposed to do the same thing makes the code messy.
    >          BTW did you test the change (I mean craft the test to get a "port in
    >          use" error)?
    >          
    >          --alex
    >          
    >          On 03/16/2020 16:17, Daniil Titov wrote:
    >          > Resending with the corrected subject ...
    >          >
    >          > Hi Alex,
    >          >
    >          > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use"
    >          > case but at least for this specific test  (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.
    >          >
    >          > Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports
    >          > might be subject to the "port in use" error and taking into account that it's hard to reproduce such case
    >          > I found it safer to leave the original code and just augment it with what was missing for this specific
    >          > case rather than completely replacing it.
    >          >
    >          > Best regards,
    >          > Daniil
    >          >
    >          > On 3/16/20, 4:02 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:
    >          >
    >          >      Hi Daniil,
    >          >
    >          >      Looks like the test is supposed to handle "port in use" issue (see lines
    >          >      103-114).
    >          >      I suppose in the "port in use" case jstatd exits, but
    >          >      ProcessTools.startProcess() continues to wait for the "jstatd started" message.
    >          >
    >          >      --alex
    >          >
    >          >      On 03/16/2020 12:00, Daniil Titov wrote:
    >          >      > Please review the change [1] that fixes the intermittent failure of the test.
    >          >      >
    >          >      > The problem here is that if the RMI port is in use then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case
    >          >      > it doesn't happen.
    >          >      >
    >          >      > 	at java.util.concurrent.CountDownLatch.await(java.base at 15-internal/CountDownLatch.java:232)
    >          >      > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
    >          >      > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
    >          >      > 	at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
    >          >      > 	at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
    >          >      > 	at jdk.test.lib.thread.XRun.run(XRun.java:40)
    >          >      > 	at java.lang.Thread.run(java.base at 15-internal/Thread.java:832)
    >          >      > 	at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
    >          >      >
    >          >      > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed.  Tier1-tier3 tests are still in progress.
    >          >      >
    >          >      > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
    >          >      > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
    >          >      >
    >          >      >
    >          >      > Thank you,
    >          >      > Daniil
    >          >      >
    >          >      >
    >          >      >
    >          >
    >          >
    >          >
    >          
    >      
    > 
    > 
    

From alexander.scherbatiy at bell-sw.com  Wed Mar 18 17:35:33 2020
From: alexander.scherbatiy at bell-sw.com (Alexander Scherbatiy)
Date: Wed, 18 Mar 2020 20:35:33 +0300
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <BC07CAFB-0C57-4932-8818-08113A10856C@oracle.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
 <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
 <DFCA7043-9F26-4FC6-940F-F448E2185151@oracle.com>
 <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com>
 <CC4C5454-0E13-44C5-9D5F-E34720110206@oracle.com>
 <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com>
 <BC07CAFB-0C57-4932-8818-08113A10856C@oracle.com>
Message-ID: <1c1cea2c-182f-2f9f-bdd3-6d9776723956@bell-sw.com>

On 18.03.2020 20:02, Igor Ignatyev wrote:

> +import static jdk.test.lib.Utils.TEST_CLASS_PATH;
> I'm not a huge fan of 'import static', yet I don't insist on removing it 
> either.
>
> + System.out.println(" libjvm : " + jvmLibDir.toString());
> jvmLibDir doesn't point to libjvm, so you need to either update the message 
> prefix or use the actual value which will be used as the path to libjvm. I 
> personally prefer the latter.
> btw, you don't need to explicitly call toString in string concatenation.
>
Here is the updated fix where the static import is removed and the libjvm
path is used:

  http://cr.openjdk.java.net/~alexsch/8240604/webrev.03/


Thanks,

Alexander.


> -- Igor
>
>> On Mar 18, 2020, at 9:54 AM, Alexander Scherbatiy 
>> <alexander.scherbatiy at bell-sw.com 
>> <mailto:alexander.scherbatiy at bell-sw.com>> wrote:
>>
>> On 18.03.2020 19:00, Igor Ignatyev wrote:
>>
>>> Hi Alexander,
>>>
>>>> I also included TEST_NATIVE_PATH to the Utils lib.
>>> for the sake of clarity and ease of backporting, I'd prefer to have 
>>> it added by a separate bug and commit.
>>
>> Here is the updated fix where TEST_NATIVE_PATH is not added to the 
>> Utils lib.
>>
>> http://cr.openjdk.java.net/~alexsch/8240604/webrev.02/
>>
>>
>> Thanks,
>>
>> Alexander.
>>
>>>> Could I just use "hg remove binary-file" and run webrev to add the 
>>>> removed binary files into webrev?
>>> IIRC, webrev will just say 'a binary file got removed'; in 
>>> any case I'll take it as a 'yes, I'm going to remove these files as 
>>> part of 8240604', so thumbs up.
>>>
>>> -- Igor
>>>
>>>> On Mar 18, 2020, at 4:57 AM, Alexander Scherbatiy 
>>>> <alexander.scherbatiy at bell-sw.com 
>>>> <mailto:alexander.scherbatiy at bell-sw.com>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Could you review the updated fix:
>>>>
>>>> http://cr.openjdk.java.net/~alexsch/8240604/webrev.01
>>>>
>>>> Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are 
>>>> added to the CustomLauncherTest.java test. I also included 
>>>> TEST_NATIVE_PATH to the Utils lib.
>>>>
>>>> I have not found a history about CustomLauncherTest.sh script in 
>>>> launcher.c so I just updated the comment as "A minature launcher 
>>>> for use by CustomLauncherTest.java test" in the exelauncher.c file.
>>>>
>>>>
>>>> I wrote the comment about removing the linux-* and solaris-* 
>>>> binary files because it is not clear to me what the right way is 
>>>> to include removed binary files into webrev.
>>>>
>>>> Could I just use "hg remove binary-file" and run webrev to add the 
>>>> removed binary files into webrev?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Alexander.
>>>>
>>>> On 17.03.2020 20:11, Igor Ignatyev wrote:
>>>>> Hi Alexander,
>>>>>
>>>>> overall looks good to me, I have a few comments though:
>>>>> - you can use Utils.TEST_CLASSPATH instead of 
>>>>> CustomLauncherTest.TEST_CLASSPATH
>>>>> - CustomLauncherTest::findLibjvm can be simplified by using 
>>>>> Platform::jvmLibDir
>>>>> - exelauncher.c has a comment which refers to the test as 
>>>>> CustomLauncherTest.sh, could you please update the comment?
>>>>> - you have to add /native flag to @run action, otherwise jtreg 
>>>>> won't exclude this test from runs w/ test.nativepath being unset
>>>>>
>>>>> I also have a question regarding your statement that
>>>>>>> The changes for obsolete binary files <...> are not included 
>>>>>>> into the webrev. They needs to be removed manually.
>>>>> you are planning to remove these files as part of this patch, right?
>>>>>
>>>>> Thanks,
>>>>> -- Igor
>>>>>
>>>>>
>>>>>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs <daniel.fuchs at oracle.com 
>>>>>> <mailto:daniel.fuchs at oracle.com>> wrote:
>>>>>>
>>>>>> Hi Alexander,
>>>>>>
>>>>>> Fixes to JMX & management agent are reviewed on the
>>>>>> serviceability-dev list (added in To:) these days.
>>>>>>
>>>>>> best regards,
>>>>>>
>>>>>> -- daniel
>>>>>>
>>>>>> On 05/03/2020 13:17, Alexander Scherbatiy wrote:
>>>>>>> Hello,
>>>>>>> Could you review a small enhancement where the test 
>>>>>>> CustomLauncherTest is updated to build the binary launcher file from 
>>>>>>> the launcher.c file.
>>>>>>> The file launcher.c is renamed to exelauncher.c to follow the 
>>>>>>> naming convention for executable test files built by the JDK make 
>>>>>>> system.
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604
>>>>>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00
>>>>>>> The changes for obsolete binary files from 
>>>>>>> sun/management/jmxremote/bootstrap/linux-* and solaris-* are not 
>>>>>>> included into the webrev. They need to be removed manually.
>>>>>>> The test passes on Ubuntu 18.04 x86-64, Solaris Sparc v9 
>>>>>>> 11.2, and Solaris x64 11.4 systems.
>>>>>>> The test is excluded from Windows and Mac OS X systems.
>>>>>>> Thanks,
>>>>>>> Alexander.
>

From chris.plummer at oracle.com  Wed Mar 18 17:51:50 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 10:51:50 -0700
Subject: RFR(XS) 8227340: Modify problem list entry for
 javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java
In-Reply-To: <0eb4d0c3-60fb-be22-238a-fd8f4b10ff9e@oracle.com>
References: <0c2e3ed0-e35c-c555-b8ed-ba64eb08f6ae@oracle.com>
 <0eb4d0c3-60fb-be22-238a-fd8f4b10ff9e@oracle.com>
Message-ID: <dd861af4-e2b3-0631-8e1a-4299864d4cb2@oracle.com>

Thanks!

Chris

On 3/17/20 10:16 PM, David Holmes wrote:
> Hi Chris,
>
> On 18/03/2020 2:59 pm, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8227340
>>
>> diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt
>> --- a/test/jdk/ProblemList.txt
>> +++ b/test/jdk/ProblemList.txt
>> @@ -587,7 +587,7 @@
>>  java/lang/management/ThreadMXBean/AllThreadIds.java 8131745 generic-all
>>
>>  javax/management/monitor/DerivedGaugeMonitorTest.java 8042211 generic-all
>> -javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8042215 generic-all
>> +javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java 8227337 generic-all
>>
>> 8042215 [1] used to be the correct CR to problem list this test
>> under, but it was accidentally used for the fix of a different bug.
>> 8042215 [1] has now been cloned to 8227337 [2], so the problem list
>> needs to be updated as well.
>
> Okay. The bugs themselves are in a bit of a muddle but this issue is 
> okay.
>
> Thanks,
> David

>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8042215
>> [2] https://bugs.openjdk.java.net/browse/JDK-8227337
>>
>> thanks,
>>
>> Chris
>>
>>


From igor.ignatyev at oracle.com  Wed Mar 18 18:00:32 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 18 Mar 2020 11:00:32 -0700
Subject: jmx-dev RFR 8240604: Rewrite
 sun/management/jmxremote/bootstrap/CustomLauncherTest.java test to make
 binaries from source file
In-Reply-To: <1c1cea2c-182f-2f9f-bdd3-6d9776723956@bell-sw.com>
References: <6b21ac03-0505-881e-b805-1681faab437b@bell-sw.com>
 <b9cf2538-275a-f6ee-aa8c-3e50d1cdef3b@oracle.com>
 <DFCA7043-9F26-4FC6-940F-F448E2185151@oracle.com>
 <4ac1b2eb-7d0d-7fa5-8437-b7b987df92fe@bell-sw.com>
 <CC4C5454-0E13-44C5-9D5F-E34720110206@oracle.com>
 <24e51dc7-714a-0932-a27f-778da35e8e29@bell-sw.com>
 <BC07CAFB-0C57-4932-8818-08113A10856C@oracle.com>
 <1c1cea2c-182f-2f9f-bdd3-6d9776723956@bell-sw.com>
Message-ID: <40924AA7-C53C-4F56-8CE4-25672C58540C@oracle.com>

thanks! LGTM.

-- Igor

> On Mar 18, 2020, at 10:35 AM, Alexander Scherbatiy <alexander.scherbatiy at bell-sw.com> wrote:
> 
> On 18.03.2020 20:02, Igor Ignatyev wrote:
> 
>> +import static jdk.test.lib.Utils.TEST_CLASS_PATH;
>> I'm not a huge fan of 'import static', yet I don't insist on removing it either.
>> 
>> +            System.out.println("  libjvm    : " + jvmLibDir.toString());
>> jvmLibDir doesn't point to libjvm, so you need to either update the message prefix or use the actual value which will be used as the path to libjvm. I personally prefer the latter.
>> btw, you don't need to explicitly call toString in string concatenation.
>> 
>   Here is the updated fix where the static import is removed and libjvm path is used:
> 
>   http://cr.openjdk.java.net/~alexsch/8240604/webrev.03/ <http://cr.openjdk.java.net/~alexsch/8240604/webrev.03/>
> 
>   Thanks,
> 
>   Alexander.
> 
> 
> 
>> -- Igor
>> 
>>> On Mar 18, 2020, at 9:54 AM, Alexander Scherbatiy <alexander.scherbatiy at bell-sw.com <mailto:alexander.scherbatiy at bell-sw.com>> wrote:
>>> 
>>> On 18.03.2020 19:00, Igor Ignatyev wrote:
>>> 
>>>> Hi Alexander,
>>>> 
>>>>> I also included TEST_NATIVE_PATH to the Utils lib.
>>>> for the sake of clarity and ease of backporting, I'd prefer to have it added by a separate bug and commit.
>>> 
>>> Here is the updated fix where TEST_NATIVE_PATH is not added to the Utils lib.
>>> 
>>>   http://cr.openjdk.java.net/~alexsch/8240604/webrev.02/ <http://cr.openjdk.java.net/~alexsch/8240604/webrev.02/>
>>> 
>>> 
>>> Thanks,
>>> 
>>> Alexander.
>>> 
>>>>> Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev?
>>>> IIRC correctly, webrev will just say 'a binary file got removed', in any case I'll take it as a 'yes, I'm going to remove these files as part of 8240604', so thumbs up.
>>>> 
>>>> -- Igor
>>>>  
>>>>> On Mar 18, 2020, at 4:57 AM, Alexander Scherbatiy <alexander.scherbatiy at bell-sw.com <mailto:alexander.scherbatiy at bell-sw.com>> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> Could you review the updated fix:
>>>>> 
>>>>>   http://cr.openjdk.java.net/~alexsch/8240604/webrev.01 <http://cr.openjdk.java.net/~alexsch/8240604/webrev.01>
>>>>> 
>>>>> Utils.TEST_CLASS_PATH, Platform.jvmLibDir(), and /native flag are added to the CustomLauncherTest.java test. I also included TEST_NATIVE_PATH to the Utils lib.
>>>>> 
>>>>> I have not found a history about CustomLauncherTest.sh script in launcher.c so I just updated the comment as "A minature launcher for use by CustomLauncherTest.java test" in the exelauncher.c file.
>>>>> 
>>>>> 
>>>>> The comment that I had about removing the linux-* and solaris-* binary files I wrote because it is not clear for what is the right way to include removed binary files into webrev.
>>>>> 
>>>>> Could I just use "hg remove binary-fie" and run webrev to add the removed binary files into webrev?
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Alexander.
>>>>> 
>>>>> On 17.03.2020 20:11, Igor Ignatyev wrote:
>>>>>> Hi Alexander,
>>>>>> 
>>>>>> overall looks good to me, I have a few comments though:
>>>>>>  - you can use Utils.TEST_CLASSPATH instead of CustomLauncherTest.TEST_CLASSPATH
>>>>>> - CustomLauncherTest::findLibjvm can be simplified by use Platform::jvmLibDir
>>>>>> - exelauncher.c has a comment which refers to the test as CustomLauncherTest.sh, could you please update the comment?
>>>>>> - you have to add /native flag to @run action, otherwise jtreg won't exclude this test from runs w/ test.nativepath being unset
>>>>>> 
>>>>>> I also have a question regarding your statement that
>>>>>>>> The changes for obsolete binary files <...> are not included into the webrev. They needs to be removed manually.
>>>>>> you are planning to remove these files as part of this patch, right?
>>>>>> 
>>>>>> Thanks,
>>>>>> -- Igor
>>>>>> 
>>>>>> 
>>>>>>> On Mar 5, 2020, at 6:27 AM, Daniel Fuchs <daniel.fuchs at oracle.com <mailto:daniel.fuchs at oracle.com>> wrote:
>>>>>>> 
>>>>>>> Hi Alexander,
>>>>>>> 
>>>>>>> Fixes to JMX & management agent are reviewed on the
>>>>>>> serviceability-dev (added in to:) these days.
>>>>>>> 
>>>>>>> best regards,
>>>>>>> 
>>>>>>> -- daniel
>>>>>>> 
>>>>>>> On 05/03/2020 13:17, Alexander Scherbatiy wrote:
>>>>>>>> Hello,
>>>>>>>> Could you review a small enhancement where the test CustomLauncherTest is updated to build binary launcher file from launcher.c file.
>>>>>>>> The file launcher.c is renamed to exelauncher.c to follow the name convention for executable test files building by jdk make system.
>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240604 <https://bugs.openjdk.java.net/browse/JDK-8240604>
>>>>>>>> Webrev: http://cr.openjdk.java.net/~alexsch/8240604/webrev.00 <http://cr.openjdk.java.net/~alexsch/8240604/webrev.00>
>>>>>>>> The changes for obsolete binary files from sun/management/jmxremote/bootstrap/linux-* and solaris-* are not included into the webrev. They needs to be removed manually.
>>>>>>>> The test is passed on Ubuntu 18.04 x86-64, Solaris Sparc v9 11.2, and Solaris x64 11.4 systems.
>>>>>>>> The test is excluded from Windows and Mac Os X systems.
>>>>>>>> Thanks,
>>>>>>>> Alexander.
>> 


From leonid.mesnik at oracle.com  Wed Mar 18 19:37:17 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 18 Mar 2020 12:37:17 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c and
 make creation of threads more flexible
Message-ID: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>

Hi

Could you please review the following fix, which slightly refactors the
vmTestbase stress test harness? This refactoring helps to add support
for testing virtual threads.

Wicket uses the plain synchronized/wait/notify mechanism, which causes
carrier thread starvation and should not be used in virtual threads.
ManagedThread is a subclass of Thread, so it can't be a virtual thread.


The fix changes Wicket to use locks/conditions so that it doesn't pin
the vthread to its carrier thread while starting testing.
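
To illustrate the locks/conditions approach described above, here is a
minimal sketch of a countdown gate built on j.u.c.l.ReentrantLock and
Condition. The class name SimpleWicket and its methods are hypothetical
and chosen for illustration only; this is not the vmTestbase Wicket
class and does not claim to match its API.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a Wicket-like gate (not the vmTestbase class).
class SimpleWicket {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition opened = lock.newCondition();
    private int count; // guarded by lock

    SimpleWicket(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be positive");
        }
        this.count = count;
    }

    // Decrements the counter; the unlock that reaches zero wakes all waiters.
    void unlock() {
        lock.lock();
        try {
            if (count > 0 && --count == 0) {
                opened.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    // Blocks until the counter reaches zero, without using
    // synchronized/wait/notify on an object monitor.
    void waitFor() {
        lock.lock();
        try {
            while (count > 0) {
                opened.awaitUninterruptibly();
            }
        } finally {
            lock.unlock();
        }
    }

    // Number of unlock() calls still needed to open the gate.
    int remaining() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }
}
```

Waiters loop on the condition, so spurious wakeups are harmless, and
once the gate is open waitFor() returns immediately.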

ManagedThread is changed to keep the execution thread as the thread
variable and to isolate its creation.

Test
vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java
was updated so that it no longer uses Wicket. (The lock holds a
reference to a thread, which affects the test.)

Wicket "finished" in class ThreadsRunner was changed to an
AtomicInteger/sleep combination to avoid an OOME in
j.u.c.l.Condition::await(), which might happen in stress GC tests.
webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/

bug: https://bugs.openjdk.java.net/browse/JDK-8241123


Leonid


From igor.ignatyev at oracle.com  Wed Mar 18 19:48:50 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 18 Mar 2020 12:48:50 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>
Message-ID: <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>

Hi Leonid,

I've started looking at your webrev, and so far have a couple questions:

> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
Can't you just use a volatile boolean field?

> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
Wouldn't j.u.c.CountDownLatch be a more appropriate and cleaner solution here?

I need more time to get a grasp of Wicket and your changes in it; I will come back to you after I understand them.

-- Igor

> On Mar 18, 2020, at 12:37 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
> 
> Hi
> 
> Could you please review following fix which slightly refactor vmTestbase stress test harness. This refactoring helps to add virtual threads testing support.
> 
> The Wicket uses plain sync/wait/notify mechanism which cause carrier thread starvation and should not be used in virtual threads. The ManagedThread is a subclass of Thread so it couldn't be virtual thread.
> 
> 
> Following fix changes Wicket to use locks/conditions to don't pin vthread to carrier thread while starting testing.
> 
> ManagedThread is fixed to keep execution thread as the thread variable and isolate it's creation.
> 
> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
> 
> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
> 
> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/
> 
> bug: https://bugs.openjdk.java.net/browse/JDK-8241123
> 
> 
> Leonid
> 


From chris.plummer at oracle.com  Wed Mar 18 20:05:55 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 13:05:55 -0700
Subject: RFR(XS) 8241162: ProblemList
 serviceability/sa/TestHeapDumpForInvokeDynamic.java on OSX
Message-ID: <c1267166-acfb-e37b-c21b-262ec4016a91@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8241162

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -131,7 +131,7 @@
 serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris-all
 serviceability/sa/TestDefaultMethods.java 8193639 solaris-all
 serviceability/sa/TestG1HeapRegion.java 8193639 solaris-all
-serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris-all
+serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639,8241158 solaris-all,macosx-x64
 serviceability/sa/TestHeapDumpForLargeArray.java 8193639 solaris-all
 serviceability/sa/TestInstanceKlassSize.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
 serviceability/sa/TestInstanceKlassSizeForInterface.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64

This test was recently re-enabled for OSX (along with a large number of 
other SA tests), but fails when using -XX:ArchiveRelocationMode=1. See 
JDK-8241158 [1]. Since a fix is not readily available, we need to 
problemlist for now.

[1] https://bugs.openjdk.java.net/browse/JDK-8241158

thanks,

Chris


From daniel.daugherty at oracle.com  Wed Mar 18 20:07:42 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Mar 2020 16:07:42 -0400
Subject: RFR(XS) 8241162: ProblemList
 serviceability/sa/TestHeapDumpForInvokeDynamic.java on OSX
In-Reply-To: <c1267166-acfb-e37b-c21b-262ec4016a91@oracle.com>
References: <c1267166-acfb-e37b-c21b-262ec4016a91@oracle.com>
Message-ID: <19d22a23-9ab0-f64f-1ac9-4cd3a8dd13ee@oracle.com>

Thumbs up. Also, this fix is trivial.

Dan


On 3/18/20 4:05 PM, Chris Plummer wrote:
> Hello,
>
> Please review the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8241162
>
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -131,7 +131,7 @@
>  serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris-all
>  serviceability/sa/TestDefaultMethods.java 8193639 solaris-all
>  serviceability/sa/TestG1HeapRegion.java 8193639 solaris-all
> -serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris-all
> +serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639,8241158 solaris-all,macosx-x64
>  serviceability/sa/TestHeapDumpForLargeArray.java 8193639 solaris-all
>  serviceability/sa/TestInstanceKlassSize.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
>  serviceability/sa/TestInstanceKlassSizeForInterface.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
>
> This test was recently re-enabled for OSX (along with a large number 
> of other SA tests), but fails when using -XX:ArchiveRelocationMode=1. 
> See JDK-8241158 [1]. Since a fix is not readily available, we need to 
> problemlist for now.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8241158
>
> thanks,
>
> Chris
>


From chris.plummer at oracle.com  Wed Mar 18 20:23:27 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 13:23:27 -0700
Subject: RFR(XS) 8241162: ProblemList
 serviceability/sa/TestHeapDumpForInvokeDynamic.java on OSX
In-Reply-To: <19d22a23-9ab0-f64f-1ac9-4cd3a8dd13ee@oracle.com>
References: <c1267166-acfb-e37b-c21b-262ec4016a91@oracle.com>
 <19d22a23-9ab0-f64f-1ac9-4cd3a8dd13ee@oracle.com>
Message-ID: <b0518fff-4177-4e43-9a9d-00de3fecdeb6@oracle.com>

Thanks!

On 3/18/20 1:07 PM, Daniel D. Daugherty wrote:
> Thumbs up. Also, this fix is trivial.
>
> Dan
>
>
> On 3/18/20 4:05 PM, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8241162
>>
>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt
>> +++ b/test/hotspot/jtreg/ProblemList.txt
>> @@ -131,7 +131,7 @@
>>  serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris-all
>>  serviceability/sa/TestDefaultMethods.java 8193639 solaris-all
>>  serviceability/sa/TestG1HeapRegion.java 8193639 solaris-all
>> -serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris-all
>> +serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639,8241158 solaris-all,macosx-x64
>>  serviceability/sa/TestHeapDumpForLargeArray.java 8193639 solaris-all
>>  serviceability/sa/TestInstanceKlassSize.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
>>  serviceability/sa/TestInstanceKlassSizeForInterface.java 8193639,8230664 solaris-all,linux-ppc64le,linux-ppc64
>>
>> This test was recently re-enabled for OSX (along with a large number 
>> of other SA tests), but fails when using -XX:ArchiveRelocationMode=1. 
>> See JDK-8241158 [1]. Since a fix is not readily available, we need to 
>> problemlist for now.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8241158
>>
>> thanks,
>>
>> Chris
>>
>


From leonid.mesnik at oracle.com  Wed Mar 18 20:29:22 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 18 Mar 2020 13:29:22 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>
 <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>
Message-ID: <5344eb3a-b17a-09c1-0f1f-8c1462899fe3@oracle.com>


On 3/18/20 12:48 PM, Igor Ignatyev wrote:
> Hi Leonid,
>
> I've started looking at your webrev, and so far have a couple questions:
>
>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
> can't you use just a volatile boolean field?
I can, but I don't see any benefit in using volatile fields instead of
atomics. I prefer to use Atomic* everywhere because of its clearer
semantics: explicit get/set and other similar accessors.
>
>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
> won't j.u.c.CountDownLatch be more appropriate and cleaner solution here?

Unfortunately no. CountDownLatch would be a nice solution, but it is
possible to get an OOME in gc/lock (and maybe other) tests. I replaced
Wicket for the same reason. Updating the AtomicInteger doesn't allocate
any memory and doesn't cause an OOME.
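
As a rough sketch of the AtomicInteger/sleep idea (not the actual
ThreadsRunner code), the waiting side could poll a counter that workers
decrement when they finish. The class and method names below are
hypothetical, and the timeout parameter is an addition for the sketch so
that it always terminates.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of waiting for workers by polling a counter.
class FinishedCounter {
    private final AtomicInteger remaining;

    FinishedCounter(int workers) {
        remaining = new AtomicInteger(workers);
    }

    // Called by each worker when it is done.
    void workerDone() {
        remaining.decrementAndGet();
    }

    // Polls the counter, sleeping between checks. Returns true if all
    // workers finished before the timeout, false on timeout or interrupt.
    boolean awaitAll(long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (remaining.get() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The trade-off is latency: the waiter notices completion only at the next
poll, which is usually acceptable in a stress test harness.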

Leonid

>
> I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them.
>
> -- Igor
>
>> On Mar 18, 2020, at 12:37 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>>
>> Hi
>>
>> Could you please review following fix which slightly refactor vmTestbase stress test harness. This refactoring helps to add virtual threads testing support.
>>
>> The Wicket uses plain sync/wait/notify mechanism which cause carrier thread starvation and should not be used in virtual threads. The ManagedThread is a subclass of Thread so it couldn't be virtual thread.
>>
>>
>> Following fix changes Wicket to use locks/conditions to don't pin vthread to carrier thread while starting testing.
>>
>> ManagedThread is fixed to keep execution thread as the thread variable and isolate it's creation.
>>
>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
>>
>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8241123
>>
>>
>> Leonid
>>

From daniel.daugherty at oracle.com  Wed Mar 18 20:30:43 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Mar 2020 16:30:43 -0400
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
Message-ID: <fcdb4587-6f99-7f24-5105-83ca4f46497c@oracle.com>

On 3/17/20 4:14 PM, Patricio Chilano wrote:
> Hi all,
>
> Please review the following patch:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/

src/jdk.jdi/share/native/libdt_shmem/shmemBase.c
    L411:     int attempts = 10;
    L420:         sysSleep(200);
        I presume that this is a 200 millisecond sleep so this new loop
        will delay a closeStream() call by at most 2 seconds. You may
        want those literals to be #define'ed values at the top of the
        file, e.g., like this one:

        #define MAX_GENERATION_RETRIES 20

        Your choice on the names of the new #defines if you choose to
        do that. You might even consider putting them close to
        "typedef struct SharedMemoryConnection".

        Update: Oh yuck! Now I see that there is existing code that
        does the same kind of looping with sysSleep() calls when the
        linger option is set. I revise my comment: You're following
        the existing style in the function so go with what you have.

    Don't forget to update the copyright year before you push.

    L379: closeStream(Stream *stream, jboolean linger, unsigned int *refcount )
        nit - please delete space before ')'.

    L412:     MemoryBarrier();     /* Prevent load of refcount to float above. */
        typo: s/to float/from floating/

    L413:     while (attempts>0) {
        nit - please add spaces around '>'.

    L415-418, L537, L541, L552:
        nit - indent should be four spaces instead of two spaces.

        The existing L546 and L549 should be indented four spaces instead
        of two spaces. Please fix since you're there.


I'm good with the code changes. I only have nits above so I don't need
to see another webrev.

Dan

> Calling closeConnection() on an already created/opened connection 
> includes calls to CloseHandle() on objects that can still be used by 
> other threads. This can lead to either undefined behavior or, as 
> detailed in the bug comments, changes of state of unrelated objects. 
> This issue was found while debugging the reason behind some jshell 
> test failures seen after pushing 8230594. Not as important, but there 
> are also calls to closeStream() from createStream()/openStream() when 
> failing to create/open a stream that will return after executing 
> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended 
> resources. Then, calling closeConnection() could assert if the reason 
> of the previous failure was that the stream's mutex failed to be 
> created/opened. This patch aims to address these issues too.
>
> Tested in mach5 with the current baseline, tiers1-3 and several runs 
> of open/test/langtools/:tier1 which includes the jshell tests where 
> this connector is used. I also applied patch 
> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
> mentioned in the comments of the bug, on top of the baseline and run 
> the langtool tests with and without this fix. Without the fix running 
> around 30 repetitions already shows failures in tests 
> jdk/jshell/FailOverExecutionControlTest.java and 
> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the 
> fix I run several hundred runs and saw no failures. Let me know if 
> there is any additional testing I should do.
>
> As a side note, I see there are a couple of open issues related with 
> jshell failures (8209848) which could be related to this bug and 
> therefore might be fixed by this patch.
>
> Thanks,
> Patricio
>


From patricio.chilano.mateo at oracle.com  Wed Mar 18 20:44:51 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Wed, 18 Mar 2020 17:44:51 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <db15a209-c73d-c380-42e2-75e713392453@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <db15a209-c73d-c380-42e2-75e713392453@oracle.com>
Message-ID: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com>

Hi David,

On 3/18/20 4:27 AM, David Holmes wrote:
> Hi Patricio,
>
> On 18/03/2020 6:14 am, Patricio Chilano wrote:
>> Hi all,
>>
>> Please review the following patch:
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>>
>> Calling closeConnection() on an already created/opened connection 
>> includes calls to CloseHandle() on objects that can still be used by 
>> other threads. This can lead to either undefined behavior or, as 
>> detailed in the bug comments, changes of state of unrelated objects. 
>
> This was a really great find!
Thanks! : )

>> This issue was found while debugging the reason behind some jshell 
>> test failures seen after pushing 8230594. Not as important, but there 
>> are also calls to closeStream() from createStream()/openStream() when 
>> failing to create/open a stream that will return after executing 
>> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended 
>> resources. Then, calling closeConnection() could assert if the reason 
>> of the previous failure was that the stream's mutex failed to be 
>> created/opened. This patch aims to address these issues too.
>
> Patch looks good in general. The internal reference count guards
> deletion of the internal resources, and is itself safe because we never
> actually delete the connection. Thanks for adding the comment about
> this aspect.
>
> A few items:
>
> Please update copyright year before pushing.
Done.

> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way as 
> STREAM_INVARIANT.
Done.

>  170 unsigned int refcount;
>  171     jint state;
>
> I'm unclear about the use of stream->state and connection->state as 
> guards - unless accessed under a mutex these would seem to at least 
> need acquire/release semantics.
>
> Additionally the reads of refcount would also seem to need to some 
> form of memory synchronization - though the Windows docs for the 
> Interlocked* API does not show how to simply read such a variable! 
> Though I note that the RtlFirstEntrySList method for the "Interlocked 
> Singly Linked Lists" API does state "Access to the list is 
> synchronized on a multiprocessor system." which suggests a read of 
> such a variable does require some form of memory synchronization!
In the case of the stream struct, the state field is protected by the 
mutex field. It is set to STATE_CLOSED while holding the mutex, and 
threads that read it must acquire the mutex first through 
sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't acquire 
the mutex, we will return something different than SYS_OK and the call 
will exit anyways. All this behaves as before, I didn't change it.

The refcount and state that I added to the SharedMemoryConnection struct
work together. For a thread closing the connection, setting the
connection state to STATE_CLOSED has to happen before reading the
refcount (more on the atomicity of that read later). That's why I added
the MemoryBarrier() call, although I now see it would be better to move
it to right after setting the connection state to closed. For the
threads accessing the connection, incrementing the refcount has to
happen before reading the connection state. That's already provided by
InterlockedIncrement(), which uses a full memory barrier. This way, if
the thread closing the connection reads a refcount of 0, then we know
it's safe to release the resources, since other threads accessing the
connection will see that the state is closed after incrementing the
refcount. If the refcount read is not 0, then a thread may or may not be
accessing the connection (it could have read a connection state of
STATE_CLOSED after incrementing the refcount); we don't know, so we
can't release anything. Similarly, if the thread accessing the
connection reads that the state is not closed, then we know it's safe to
access the stream, since anybody closing the connection will still have
to read the refcount, which will be at least 1.
As for the atomicity of the read of refcount,
https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access
states that "simple reads and writes to properly-aligned 32-bit
variables are atomic operations". Maybe I should declare refcount
explicitly as DWORD32?
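
To make the ordering argument above concrete, here is a small Java
analogue of the protocol. It is only a sketch, not the shmemBase.c
code. It assumes a volatile field stands in for the state write plus
MemoryBarrier(), and AtomicInteger.incrementAndGet() stands in for
InterlockedIncrement(). The class GuardedConnection, its method names,
and the decrement on a failed enter() are assumptions made for the
illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java analogue of the refcount/state handshake.
class GuardedConnection {
    static final int STATE_OPEN = 0;
    static final int STATE_CLOSED = 1;

    private volatile int state = STATE_OPEN;
    private final AtomicInteger refcount = new AtomicInteger();
    private volatile boolean released;

    // User side: increment the refcount first, then read the state.
    boolean enter() {
        refcount.incrementAndGet();
        if (state == STATE_CLOSED) {
            refcount.decrementAndGet(); // assumption: back out on failure
            return false;
        }
        return true;
    }

    void leave() {
        refcount.decrementAndGet();
    }

    // Closer side: publish STATE_CLOSED first, then read the refcount.
    // Resources are released only if a refcount of 0 is observed.
    boolean close(int attempts, long sleepMillis) {
        state = STATE_CLOSED;
        for (int i = 0; i < attempts; i++) {
            if (refcount.get() == 0) {
                released = true; // stands in for closing the handles
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false; // someone may still be inside; release nothing
    }

    boolean isReleased() {
        return released;
    }
}
```

Because both sides write their own variable before reading the other's,
at least one of them is guaranteed to see the other's write: either the
closer sees a non-zero refcount, or the user sees STATE_CLOSED.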

Instead of having a refcount, we could have done something similar to
the stream struct and protected access to the connection with a mutex.
To avoid serializing all threads we could have used SRW locks, and only
the one closing the connection would do AcquireSRWLockExclusive(). It
would change the state of the connection to STATE_CLOSED, close all
handles, and then release the mutex. ENTER_CONNECTION() and
LEAVE_CONNECTION() would acquire and release the mutex in shared mode.
But other than maybe being easier to read, I don't think that change
would be smaller.

>  413 while (attempts>0) {
>
> spaces around >
Done.

> If the loop at 413 never encounters a zero reference_count then it 
> doesn't close the events or the mutex but still returns SYS_OK. That 
> seems wrong but I'm not sure what the right behaviour is here.
I can change the return value to be SYS_ERR, but I don't think there is 
much we can do about it unless we want to wait forever until we can 
release those resources.

> And please wait for serviceability folk to review this.
Sounds good.


Thanks for looking at this David! I will move the MemoryBarrier() and 
change the refcount to be DWORD32 if you are okay with that.


Thanks,
Patricio
> Thanks,
> David
> -----
>
>> Tested in mach5 with the current baseline, tiers1-3 and several runs 
>> of open/test/langtools/:tier1 which includes the jshell tests where 
>> this connector is used. I also applied patch 
>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
>> mentioned in the comments of the bug, on top of the baseline and run 
>> the langtool tests with and without this fix. Without the fix running 
>> around 30 repetitions already shows failures in tests 
>> jdk/jshell/FailOverExecutionControlTest.java and 
>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the 
>> fix I run several hundred runs and saw no failures. Let me know if 
>> there is any additional testing I should do.
>>
>> As a side note, I see there are a couple of open issues related with 
>> jshell failures (8209848) which could be related to this bug and 
>> therefore might be fixed by this patch.
>>
>> Thanks,
>> Patricio
>>


From serguei.spitsyn at oracle.com  Wed Mar 18 21:03:53 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Mar 2020 14:03:53 -0700
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
Message-ID: <161fe95b-4d48-fd61-15ec-7daecb193868@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200318/b72cab00/attachment.htm>

From igor.ignatyev at oracle.com  Wed Mar 18 21:15:14 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 18 Mar 2020 14:15:14 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <5344eb3a-b17a-09c1-0f1f-8c1462899fe3@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>
 <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>
 <5344eb3a-b17a-09c1-0f1f-8c1462899fe3@oracle.com>
Message-ID: <50110939-1AEF-40B5-969C-C5313633B1F9@oracle.com>


> On Mar 18, 2020, at 1:29 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
> 
> 
> On 3/18/20 12:48 PM, Igor Ignatyev wrote:
>> Hi Leonid,
>> 
>> I've started looking at your webrev, and so far have a couple questions:
>> 
>>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
>> can't you use just a volatile boolean field?
> I can, but I don't see any benefit in using volatile fields instead of atomics. I prefer to use Atomic* everywhere because of its clearer semantics: explicit get/set and other similar accessors.
You aren't using any accessors other than plain get/set, which are semantically equivalent to reading/writing a volatile field, so I'm not sure how it's clearer. As for the benefits of a volatile field: the code is shorter (and arguably cleaner) and you save some heap space. Anyhow, I don't insist on using a volatile boolean over AtomicBoolean.
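
To make the comparison concrete, here is a minimal sketch of the two variants side by side. The class and method names (FlagWithVolatile, FlagWithAtomic, markReady, isReady) are made up for illustration and are not taken from the webrev. When only plain get/set is used, both give the same visibility guarantees; the AtomicBoolean variant costs one extra heap object per flag.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical example classes, not code from the webrev.

// Variant 1: a volatile field; reads and writes are plain field accesses.
class FlagWithVolatile {
    private volatile boolean ready;

    void markReady() { ready = true; }
    boolean isReady() { return ready; }
}

// Variant 2: an AtomicBoolean; get()/set() have the memory effects of a
// volatile read/write, but the flag lives in a separate heap object.
class FlagWithAtomic {
    private final AtomicBoolean ready = new AtomicBoolean();

    void markReady() { ready.set(true); }
    boolean isReady() { return ready.get(); }
}
```

AtomicBoolean would only pay off if the test needed compareAndSet or getAndSet, which neither variant above uses.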
>> 
>>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
>> won't j.u.c.CountDownLatch be more appropriate and cleaner solution here?
> 
> Unfortunately no. CountDownLatch would be a nice solution, but it is possible to get an OOME in gc/lock (and maybe other) tests. I replaced Wicket for the same reason. Updating the AtomicInteger doesn't allocate any memory and doesn't cause an OOME.
I see.
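
For readers following along, the atomicInt/sleep approach described above could look roughly like the sketch below. This is an assumption about its general shape, not the ThreadsRunner code from the webrev; FinishedCounter, threadFinished and waitForAll are hypothetical names. The point is that the waiting side only reads an AtomicInteger and sleeps, so it does not go through Condition::await().

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of an "atomic counter plus sleep" completion check.
class FinishedCounter {
    private final AtomicInteger remaining;

    FinishedCounter(int threads) {
        remaining = new AtomicInteger(threads);
    }

    // Called by each worker when it is done.
    void threadFinished() {
        remaining.decrementAndGet();
    }

    // Polls until all workers are done; returns false on timeout or interrupt.
    boolean waitForAll(long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (remaining.get() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

The trade-off is latency: the waiter notices completion only at the next poll, which is usually acceptable in a stress harness.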
> 
> Leonid
> 
>> 
>> I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them.
>> 
>> -- Igor
>> 
>>> On Mar 18, 2020, at 12:37 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>>> 
>>> Hi
>>> 
>>> Could you please review following fix which slightly refactor vmTestbase stress test harness. This refactoring helps to add virtual threads testing support.
>>> 
>>> The Wicket uses plain sync/wait/notify mechanism which cause carrier thread starvation and should not be used in virtual threads. The ManagedThread is a subclass of Thread so it couldn't be virtual thread.
>>> 
>>> 
>>> Following fix changes Wicket to use locks/conditions to don't pin vthread to carrier thread while starting testing.
>>> 
>>> ManagedThread is fixed to keep execution thread as the thread variable and isolate it's creation.
>>> 
>>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
>>> 
>>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
>>> 
>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/
>>> 
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241123
>>> 
>>> 
>>> Leonid
>>> 


From igor.ignatyev at oracle.com  Wed Mar 18 21:30:41 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 18 Mar 2020 14:30:41 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>
 <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>
Message-ID: <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com>

> I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them. 
OK, now that I believe I understand Wicket well enough, I have a few comments:
1.
>   68     private Lock lock = new ReentrantLock();
>   69     private Condition condition = lock.newCondition();
it's better to make these fields final.

2. as all writes and reads of Wicket::count are guarded by lock.lock, there is no need for it to be atomic.
3. adding lock to getWaiters will also remove need for Wicket::waiters to be atomic.

the rest looks good to me.
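
A minimal sketch of what comments 1-3 amount to, assuming a much simplified Wicket. WicketSketch is a hypothetical stand-in, not the class from the webrev, and awaitUninterruptibly() is used only to keep the sketch short. The lock and condition are final, and count and waiters are plain ints because every access happens while holding the lock.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified stand-in for Wicket; not the webrev code.
class WicketSketch {
    private final Lock lock = new ReentrantLock();          // comment 1: final
    private final Condition condition = lock.newCondition();
    private int count;    // comment 2: guarded by lock, no atomic needed
    private int waiters;  // comment 3: guarded by lock, no atomic needed

    WicketSketch(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be positive");
        }
        this.count = count;
    }

    // Blocks until the wicket has been unlocked 'count' times.
    void waitFor() {
        lock.lock();
        try {
            waiters++;
            try {
                while (count > 0) {
                    condition.awaitUninterruptibly();
                }
            } finally {
                waiters--;
            }
        } finally {
            lock.unlock();
        }
    }

    // Removes one lock; wakes all waiters when the last lock is removed.
    void unlock() {
        lock.lock();
        try {
            if (count > 0 && --count == 0) {
                condition.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    // Takes the lock so the plain int read is properly synchronized.
    int getWaiters() {
        lock.lock();
        try {
            return waiters;
        } finally {
            lock.unlock();
        }
    }
}
```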

Thanks,
-- Igor


> On Mar 18, 2020, at 12:48 PM, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
> 
> Hi Leonid,
> 
> I've started looking at your webrev, and so far have a couple questions:
> 
>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
> can't you use just a volatile boolean field?
> 
>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
> won't j.u.c.CountDownLatch be more appropriate and cleaner solution here?
> 
> I need more time to get grasp of Wicket and your changes in it; will come back to you after I understand them. 
> 
> -- Igor
> 
>> On Mar 18, 2020, at 12:37 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>> 
>> Hi
>> 
>> Could you please review following fix which slightly refactor vmTestbase stress test harness. This refactoring helps to add virtual threads testing support.
>> 
>> The Wicket uses plain sync/wait/notify mechanism which cause carrier thread starvation and should not be used in virtual threads. The ManagedThread is a subclass of Thread so it couldn't be virtual thread.
>> 
>> 
>> Following fix changes Wicket to use locks/conditions to don't pin vthread to carrier thread while starting testing.
>> 
>> ManagedThread is fixed to keep execution thread as the thread variable and isolate it's creation.
>> 
>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to don't use Wicket. (The lock has a reference to thread which affects test.)
>> 
>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happened in stress GC tests.
>> 
>> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/
>> 
>> bug: https://bugs.openjdk.java.net/browse/JDK-8241123
>> 
>> 
>> Leonid
>> 
> 


From daniel.daugherty at oracle.com  Wed Mar 18 21:37:30 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Mar 2020 17:37:30 -0400
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
Message-ID: <211b15ab-96da-7e1d-09dc-5e5f25388b09@oracle.com>

Patricio,

This is a separate follow up about the jshell tests. Since I have been
tracking these as part of my GK work and filed a number of those bugs,
I figured I would help analyze them...

JDK-8209848 test/langtools/jdk/jshell tests failed with Accept timed out
https://bugs.openjdk.java.net/browse/JDK-8209848

    This bug has been linked to sightings on Linux-X64, SPARC and Win-X64.
    While this fix should reduce the number of sightings on Win-X64, it
    won't help for Linux-X64 or SPARC.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8209848

JDK-8184445 JShell tests: fail intermittently if tests are run in high 
concurrent mode.
https://bugs.openjdk.java.net/browse/JDK-8184445

    The tests you mention below are also mentioned in this bug.

    This bug has been linked to sightings on Linux-X64, OSX, SPARC and
    Win-X64. While this fix should reduce the number of sightings on
    Win-X64, it won't help for Linux-X64, OSX, or SPARC.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8184445

JDK-8173079 JShell test: jdk/jshell/UserJdiUserRemoteTest.java fails 
intermittently
https://bugs.openjdk.java.net/browse/JDK-8173079

    I have high hopes that this bug (on Win*) will be addressed by
    this fix (8240902) because this test uses JDI...

    Most of the sightings for this bug are for Win-X64, with a few for
    SPARC. The bug also mentions Linux sightings, but I don't see any
    current links for Linux:

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8173079

    So this fix should help with Win* sightings, but not Linux or SPARC.

JDK-8190912 jdk/jshell/JdiHangingListenExecutionControlTest.java failed with
            timeout waiting for connection
https://bugs.openjdk.java.net/browse/JDK-8190912

    I have high hopes that this bug (on Win*) will be addressed by
    this fix (8240902) because this test uses JDI...

    Most of the sightings for this bug are for Win-X64; there is one
    Linux and one OSX.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8190912

    So this fix should help with Win* sightings, but not Linux or OSX.

JDK-8207166 langtools/jdk/jshell/JdiHangingLaunchExecutionControlTest.java
https://bugs.openjdk.java.net/browse/JDK-8207166

    I have high hopes that this bug will be addressed by this fix
    (8240902) because this test uses JDI...

    The one linked sighting with platform info is for Win-X64.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8207166

JDK-8235780 jdk/jshell/FailOverExecutionControlDyingLaunchTest.java 
fails during setup
https://bugs.openjdk.java.net/browse/JDK-8235780

    I have high hopes that this bug (on Win*) will be addressed by
    this fix (8240902) because this test uses JDI...

    This bug has been linked to sightings on Linux-X64 and Win-X64.
    While this fix should reduce the number of sightings on Win-X64,
    it won't help for Linux-X64.

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8235780

JDK-8239930 jdk/jshell/UserJdiUserRemoteTest.java fails due to agentvm 
mode timeout
https://bugs.openjdk.java.net/browse/JDK-8239930

    I have high hopes that this bug will be addressed by this fix
    (8240902) because this test uses JDI...

    Both sightings linked to this bug are for Win-X64:

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8239930

JDK-8240531 jshell/FailOverExecutionControlDyingLaunchTest.java fails 
due to agentvm mode timeout
https://bugs.openjdk.java.net/browse/JDK-8240531

    I have high hopes that this bug will be addressed by this fix
    (8240902) because this test uses JDI...

    The three sightings linked to this bug are for Win-X64:

https://mach5.us.oracle.com/mdash/bugHistory?search=bugId%3AJDK-8240531


Okay... that's it for the jshell bugs that I track that have been
spotted on Win-X64 (and usually other platforms too).

Dan


On 3/17/20 4:14 PM, Patricio Chilano wrote:
> Hi all, ...
>
> Please review the following patch:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>
> Calling closeConnection() on an already created/opened connection 
> includes calls to CloseHandle() on objects that can still be used by 
> other threads. This can lead to either undefined behavior or, as 
> detailed in the bug comments, changes of state of unrelated objects. 
> This issue was found while debugging the reason behind some jshell 
> test failures seen after pushing 8230594. Not as important, but there 
> are also calls to closeStream() from createStream()/openStream() when 
> failing to create/open a stream that will return after executing 
> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended 
> resources. Then, calling closeConnection() could assert if the reason 
> of the previous failure was that the stream's mutex failed to be 
> created/opened. These patch aims to address these issues too.
>
> Tested in mach5 with the current baseline, tiers1-3 and several runs 
> of open/test/langtools/:tier1 which includes the jshell tests where 
> this connector is used. I also applied patch 
> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
> mentioned in the comments of the bug, on top of the baseline and run 
> the langtool tests with and without this fix. Without the fix running 
> around 30 repetitions already shows failures in tests 
> jdk/jshell/FailOverExecutionControlTest.java and 
> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the 
> fix I run several hundred runs and saw no failures. Let me know if 
> there is any additional testing I should do.
>
> As a side note, I see there are a couple of open issues related with 
> jshell failures (8209848) which could be related to this bug and 
> therefore might be fixed by this patch.
>
> Thanks,
> Patricio
>


From daniel.daugherty at oracle.com  Wed Mar 18 21:40:28 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 18 Mar 2020 17:40:28 -0400
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <211b15ab-96da-7e1d-09dc-5e5f25388b09@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <211b15ab-96da-7e1d-09dc-5e5f25388b09@oracle.com>
Message-ID: <152abc4d-b5e4-b8d0-27c1-72b51f56d0a3@oracle.com>

serviceability-dev at ... was included in this email by mistake.
This was only supposed to go to Patricio since the Mach5 links
won't work outside of Oracle. Sigh...

Sorry about the noise folks!

Dan




From patricio.chilano.mateo at oracle.com  Wed Mar 18 21:48:45 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Wed, 18 Mar 2020 18:48:45 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <fcdb4587-6f99-7f24-5105-83ca4f46497c@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <fcdb4587-6f99-7f24-5105-83ca4f46497c@oracle.com>
Message-ID: <2fe1dc87-641c-44f1-aae2-6b5c9da19227@oracle.com>

Hi Dan,

On 3/18/20 5:30 PM, Daniel D. Daugherty wrote:
> On 3/17/20 4:14 PM, Patricio Chilano wrote:
>> Hi all,
>>
>> Please review the following patch:
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>
> src/jdk.jdi/share/native/libdt_shmem/shmemBase.c
>     L411:     int attempts = 10;
>     L420:         sysSleep(200);
>         I presume that this is a 200 millisecond sleep so this new loop
>         will delay a closeStream() call by at most 2 seconds. You may
>         want those literals to be #define'ed values at the top of the
>         file, e.g., like this one:
>
>         #define MAX_GENERATION_RETRIES 20
>
>         Your choice on the names of the new #defines if you choose to
>         do that. You might even consider putting them close to
>         "typedef struct SharedMemoryConnection".
>
>         Update: Oh yuck! Now I see that there is existing code that
>         does the same kind of looping with sysSleep() calls when the
>         linger option is set. I revise my comment: You're following
>         the existing style in the function so go with what you have.
Ok, I left the loop as it is now.

> Don't forget to update the copyright year before you push.
Done.

> L379: closeStream(Stream *stream, jboolean linger, unsigned int 
> *refcount )
>         nit - please delete space before ')'.
Done.

> L412:     MemoryBarrier();     /* Prevent load of refcount to float 
> above. */
>         typo: s/to float/from floating/
After replying to David's review I realized that the enterMutex() call in 
closeStream() already provides acquire semantics, so the read of 
refcount cannot float above it. I removed the barrier.

> L413:     while (attempts>0) {
>         nit - please add spaces around '>'.
Done.

> L415-418, L537, L541, L552:
>         nit - indent should be four spaces instead of two spaces.
Done.

> The existing L546 and L549 should be indented four spaces instead
>         of two spaces. Please fix since you are there.
Done.

> I'm good with the code changes. I only have nits above so I don't need
> to see another webrev.
Thanks for reviewing this Dan! I might send a v2 later.


Thanks,
Patricio
> Dan
>
>> Calling closeConnection() on an already created/opened connection 
>> includes calls to CloseHandle() on objects that can still be used by 
>> other threads. This can lead to either undefined behavior or, as 
>> detailed in the bug comments, changes of state of unrelated objects. 
>> This issue was found while debugging the reason behind some jshell 
>> test failures seen after pushing 8230594. Not as important, but there 
>> are also calls to closeStream() from createStream()/openStream() when 
>> failing to create/open a stream that will return after executing 
>> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended 
>> resources. Then, calling closeConnection() could assert if the reason 
>> of the previous failure was that the stream's mutex failed to be 
>> created/opened. These patch aims to address these issues too.
>>
>> Tested in mach5 with the current baseline, tiers1-3 and several runs 
>> of open/test/langtools/:tier1 which includes the jshell tests where 
>> this connector is used. I also applied patch 
>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
>> mentioned in the comments of the bug, on top of the baseline and run 
>> the langtool tests with and without this fix. Without the fix running 
>> around 30 repetitions already shows failures in tests 
>> jdk/jshell/FailOverExecutionControlTest.java and 
>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the 
>> fix I run several hundred runs and saw no failures. Let me know if 
>> there is any additional testing I should do.
>>
>> As a side note, I see there are a couple of open issues related with 
>> jshell failures (8209848) which could be related to this bug and 
>> therefore might be fixed by this patch.
>>
>> Thanks,
>> Patricio
>>
>


From patricio.chilano.mateo at oracle.com  Wed Mar 18 22:02:00 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Wed, 18 Mar 2020 19:02:00 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <161fe95b-4d48-fd61-15ec-7daecb193868@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <161fe95b-4d48-fd61-15ec-7daecb193868@oracle.com>
Message-ID: <d0cfe234-4b2a-c249-4b74-995c13061201@oracle.com>

Hi Serguei,

On 3/18/20 6:03 PM, serguei.spitsyn at oracle.com wrote:
> Hi Patricio,
>
> Good finding, thank you for taking care of this!
> The fix looks good in general.
>
> There are several spots with the wrong indent (must be 4, not 2):
> 64 #define ENTER_CONNECTION(connection) do { \
> 65 InterlockedIncrement(&connection->refcount); \
> 66 if (IS_STATE_CLOSED(connection->state)) { \
> 67 setLastErrorMsg("stream closed"); \
> 68 InterlockedDecrement(&connection->refcount); \
> 69 return SYS_ERR; \
> 70 } \
> 71 } while (0)
> 72
> 73 #define LEAVE_CONNECTION(connection) do { \
> 74 InterlockedDecrement(&connection->refcount); \
> 75 } while (0)
>   I'd also suggest moving the content left and using an indent of 4 from the side.
Done. I already aligned it the same way as STREAM_INVARIANT and I fixed 
the indent inside ENTER_CONNECTION().

> 414 if (*refcount == 0) {
> 415 sysEventClose(stream->hasData);
> 416 sysEventClose(stream->hasSpace);
> 417 sysIPMutexClose(stream->mutex);
> 418 break;
> 419 }
> ...
> 535 Stream * stream = &connection->outgoing;
> 536 if (stream->state == STATE_OPEN) {
> 537 (void)closeStream(stream, JNI_TRUE, &connection->refcount);
> 538 }
> 539 stream = &connection->incoming;
> 540 if (stream->state == STATE_OPEN) {
> 541 (void)closeStream(stream, JNI_FALSE, &connection->refcount);
> 542 }
> ...
> 551 if (connection->shutdown) {
> 552 sysEventClose(connection->shutdown);
> 553 }
> 554 }
> ...
> 1022 shmemBase_sendByte(SharedMemoryConnection *connection, jbyte data)
> 1023 {
> 1024 ENTER_CONNECTION(connection);
> 1025 jint rc = shmemBase_sendByte_internal(connection, data);
> 1026 LEAVE_CONNECTION(connection);
> 1027 return rc;
> 1028 }
>   ...
>
> 1055 jint
> 1056 shmemBase_receiveByte(SharedMemoryConnection *connection, jbyte 
> *data)
> 1057 {
> 1058 ENTER_CONNECTION(connection);
> 1059 jint rc = shmemBase_receiveByte_internal(connection, data);
> 1060 LEAVE_CONNECTION(connection);
> 1061 return rc;
> 1062 }
> ...
> 1136 jint
> 1137 shmemBase_sendPacket(SharedMemoryConnection *connection, const 
> jdwpPacket *packet)
> 1138 {
> 1139 ENTER_CONNECTION(connection);
> 1140 jint rc = shmemBase_sendPacket_internal(connection, packet);
> 1141 LEAVE_CONNECTION(connection);
> 1142 return rc;
> 1143 }
> ...
> 1229 jint
> 1230 shmemBase_receivePacket(SharedMemoryConnection *connection, 
> jdwpPacket *packet)
> 1231 {
> 1232 ENTER_CONNECTION(connection);
> 1233 jint rc = shmemBase_receivePacket_internal(connection, packet);
> 1234 LEAVE_CONNECTION(connection);
> 1235 return rc;
> 1236 }
Done. Fixed all of those.

> Some other nits were already commented by David and Dan.
>
> I'd suggest to test with tier-5 as well for more safety.
Thanks for looking at this Serguei! I'll give it a new run in mach5 and 
add tier5.


Thanks,
Patricio
> Thanks,
> Serguei
>
>
> On 3/17/20 13:14, Patricio Chilano wrote:
>> Hi all,
>>
>> Please review the following patch:
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>>
>> Calling closeConnection() on an already created/opened connection 
>> includes calls to CloseHandle() on objects that can still be used by 
>> other threads. This can lead to either undefined behavior or, as 
>> detailed in the bug comments, changes of state of unrelated objects. 
>> This issue was found while debugging the reason behind some jshell 
>> test failures seen after pushing 8230594. Not as important, but there 
>> are also calls to closeStream() from createStream()/openStream() when 
>> failing to create/open a stream that will return after executing 
>> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended 
>> resources. Then, calling closeConnection() could assert if the reason 
>> of the previous failure was that the stream's mutex failed to be 
>> created/opened. These patch aims to address these issues too.
>>
>> Tested in mach5 with the current baseline, tiers1-3 and several runs 
>> of open/test/langtools/:tier1 which includes the jshell tests where 
>> this connector is used. I also applied patch 
>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
>> mentioned in the comments of the bug, on top of the baseline and run 
>> the langtool tests with and without this fix. Without the fix running 
>> around 30 repetitions already shows failures in tests 
>> jdk/jshell/FailOverExecutionControlTest.java and 
>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the 
>> fix I run several hundred runs and saw no failures. Let me know if 
>> there is any additional testing I should do.
>>
>> As a side note, I see there are a couple of open issues related with 
>> jshell failures (8209848) which could be related to this bug and 
>> therefore might be fixed by this patch.
>>
>> Thanks,
>> Patricio
>>
>


From leonid.mesnik at oracle.com  Wed Mar 18 22:18:43 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 18 Mar 2020 15:18:43 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>
 <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>
 <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com>
Message-ID: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>


On 3/18/20 2:30 PM, Igor Ignatyev wrote:
>> I need more time to get grasp of Wicket and your changes in it; will 
>> come back to you after I understand them. 
> ok, now when I believe that I have enough understanding of Wicket, I 
> have a few comments:
> 1.
>> 68 private Lock lock = new ReentrantLock();
>> 69 private Condition condition = lock.newCondition();
> it's better to make these fields final.
>
> 2. as all writes and reads of Wicket::count are guarded by lock.lock, 
> there is no need for it to be atomic.
> 3. adding lock to getWaiters will also remove need for Wicket::waiters 
> to be atomic.

All 3 are fixed. Thanks for your suggestions.

Updated version:

http://cr.openjdk.java.net/~lmesnik/8241123/webrev.01/

Leonid

>
> the rest looks good to me.
>
> Thanks,
> -- Igor
>
>
>
>> On Mar 18, 2020, at 12:48 PM, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
>>
>> Hi Leonid,
>>
>> I've started looking at your webrev, and so far have a couple questions:
>>
>>> Test 
>>> vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java 
>>> was updated to don't use Wicket. (The lock has a reference to thread 
>>> which affects test.)
>> can't you use just a volatile boolean field?
>>
>>> Wicket "finished" in class ThreadsRunner was changed to 
>>> atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which 
>>> might happened in stress GC tests.
>> won't j.u.c.CountDownLatch be more appropriate and cleaner solution here?
>>
>> I need more time to get grasp of Wicket and your changes in it; will 
>> come back to you after I understand them.
>>
>> -- Igor
>>
>>> On Mar 18, 2020, at 12:37 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>>>
>>> Hi
>>>
>>> Could you please review following fix which slightly refactor 
>>> vmTestbase stress test harness. This refactoring helps to add 
>>> virtual threads testing support.
>>>
>>> The Wicket uses plain sync/wait/notify mechanism which cause carrier 
>>> thread starvation and should not be used in virtual threads. The 
>>> ManagedThread is a subclass of Thread so it couldn't be virtual thread.
>>>
>>>
>>> Following fix changes Wicket to use locks/conditions to don't pin 
>>> vthread to carrier thread while starting testing.
>>>
>>> ManagedThread is fixed to keep execution thread as the thread 
>>> variable and isolate it's creation.
>>>
>>> Test 
>>> vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java 
>>> was updated to don't use Wicket. (The lock has a reference to thread 
>>> which affects test.)
>>>
>>> Wicket "finished" in class ThreadsRunner was changed to 
>>> atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which 
>>> might happened in stress GC tests.
>>>
>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241123
>>>
>>>
>>> Leonid
>>>
>>
>
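
On the CountDownLatch question quoted above, the two ways of tracking
"finished" might be compared with a sketch like the following. All names
(FinishTracking, awaitByPolling, awaitByLatch, runAndWait) are
hypothetical and this is not the ThreadsRunner code. The thread does not
establish whether CountDownLatch.await() avoids the allocation that led
to the OOME in Condition::await(), so this only shows the shape of each
variant.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names throughout; this is not the ThreadsRunner code.
class FinishTracking {
    // Variant described in the RFR: workers decrement a counter and the
    // waiter polls it with sleep instead of blocking on a condition.
    static void awaitByPolling(AtomicInteger unfinished, long pollMillis)
            throws InterruptedException {
        while (unfinished.get() > 0) {
            Thread.sleep(pollMillis);
        }
    }

    // Variant raised in review: block on a CountDownLatch with a timeout.
    // Returns false if the timeout elapsed before the count reached zero.
    static boolean awaitByLatch(CountDownLatch finished, long timeoutMillis)
            throws InterruptedException {
        return finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Starts `workers` trivial threads and waits for them both ways.
    static boolean runAndWait(int workers) throws InterruptedException {
        AtomicInteger unfinished = new AtomicInteger(workers);
        CountDownLatch finished = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                unfinished.decrementAndGet();
                finished.countDown();
            }).start();
        }
        awaitByPolling(unfinished, 10);
        return awaitByLatch(finished, 5_000);
    }
}
```

The polling variant trades wake-up latency for not blocking inside
j.u.c code; the latch variant is shorter and wakes promptly.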

From igor.ignatyev at oracle.com  Wed Mar 18 22:22:56 2020
From: igor.ignatyev at oracle.com (Igor Ignatev)
Date: Wed, 18 Mar 2020 15:22:56 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
References: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
Message-ID: <F4131A20-48B7-41E3-A120-BF7F05027F3B@oracle.com>

Reviewed. 

-- Igor


From leonid.mesnik at oracle.com  Wed Mar 18 22:51:15 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 18 Mar 2020 15:51:15 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <F4131A20-48B7-41E3-A120-BF7F05027F3B@oracle.com>
References: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
 <F4131A20-48B7-41E3-A120-BF7F05027F3B@oracle.com>
Message-ID: <a67d3d24-341f-fc9d-efb7-1e1070ca5f74@oracle.com>

Thank you for the review and feedback.

Leonid


From david.holmes at oracle.com  Wed Mar 18 23:10:44 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 19 Mar 2020 09:10:44 +1000
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <db15a209-c73d-c380-42e2-75e713392453@oracle.com>
 <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com>
Message-ID: <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com>

Hi Patricio,

On 19/03/2020 6:44 am, Patricio Chilano wrote:
> Hi David,
> 
> On 3/18/20 4:27 AM, David Holmes wrote:
>> Hi Patricio,
>>
>> On 18/03/2020 6:14 am, Patricio Chilano wrote:
>>> Hi all,
>>>
>>> Please review the following patch:
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>>>
>>> Calling closeConnection() on an already created/opened connection 
>>> includes calls to CloseHandle() on objects that can still be used by 
>>> other threads. This can lead to either undefined behavior or, as 
>>> detailed in the bug comments, changes of state of unrelated objects. 
>>
>> This was a really great find!
> Thanks! : )
> 
>>> This issue was found while debugging the reason behind some jshell 
>>> test failures seen after pushing 8230594. Not as important, but there 
>>> are also calls to closeStream() from createStream()/openStream() when 
>>> failing to create/open a stream that will return after executing 
>>> "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended 
>>> resources. Then, calling closeConnection() could assert if the reason 
>>> of the previous failure was that the stream's mutex failed to be 
>>> created/opened. These patch aims to address these issues too.
>>
>> Patch looks good in general. The internal reference count guards 
>> deletion of the internal resources, and is itself safe because never 
>> actually delete the connection. Thanks for adding the comment about 
>> this aspect.
>>
>> A few items:
>>
>> Please update copyright year before pushing.
> Done.
> 
>> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way as 
>> STREAM_INVARIANT.
> Done.
> 
>>  170     unsigned int refcount;
>>  171     jint state;
>>
>> I'm unclear about the use of stream->state and connection->state as 
>> guards - unless accessed under a mutex these would seem to at least 
>> need acquire/release semantics.
>>
>> Additionally the reads of refcount would also seem to need to some 
>> form of memory synchronization - though the Windows docs for the 
>> Interlocked* API does not show how to simply read such a variable! 
>> Though I note that the RtlFirstEntrySList method for the "Interlocked 
>> Singly Linked Lists" API does state "Access to the list is 
>> synchronized on a multiprocessor system." which suggests a read of 
>> such a variable does require some form of memory synchronization!
> In the case of the stream struct, the state field is protected by the 
> mutex field. It is set to STATE_CLOSED while holding the mutex, and 
> threads that read it must acquire the mutex first through 
> sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't acquire 
> the mutex, we will return something different than SYS_OK and the call 
> will exit anyways. All this behaves as before, I didn't change it.

Thanks for clarifying.

> The refcount and state that I added to the SharedMemoryConnection struct 
> work together. For a thread closing the connection, setting the 
> connection state to STATE_CLOSED has to happen before reading the 
> refcount (more on the atomicity of that read later). That's why I added 
> the MemoryBarrier() call; which I see it's better if I just move it to 
> after setting the connection state to closed. For the threads accessing 
> the connection, incrementing the refcount has to happen before reading 
> the connection state. That's already provided by the 
> InterlockedIncrement() which uses a full memory barrier. In this way if 
> the thread closing the connection reads a refcount of 0, then we know 
> it's safe to release the resources, since other threads accessing the 
> connection will see that the state is closed after incrementing the 
> refcount. If the read of refcount is not 0, then it could be that a 
> thread is accessing the connection or not (it could have read a state 
> connection of STATE_CLOSED after incrementing the refcount), we don't 
> know, so we can't release anything. Similarly if the thread accessing 
> the connection reads that the state is not closed, then we know it's 
> safe to access the stream since anybody closing the connection will 
> still have to read refcount which will be at least 1.
> As for the atomicity of the read of refcount, from 
> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, 
> it states that "simple reads and writes to properly-aligned 32-bit 
> variables are atomic operations". Maybe I should declare refcount 
> explicitly as DWORD32?

The issue with the naked read isn't atomicity but visibility. Any
latency in the visibility of the store done by the Interlocked*()
function should be handled by the retry loop, but what is to stop the
C++ compiler from hoisting the read of refcount out of the loop? It
isn't even volatile (which has a stronger meaning in VS than in regular
C++).
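
The ordering argument quoted above, together with the visibility
concern, can be illustrated with a Java analogy. This is only a sketch:
the real code is C using the Windows Interlocked*() API, the names
ConnectionGate, enter, leave and tryClose are hypothetical, and the
decrement on the failed-enter path is an assumption about how the macro
behaves. In Java, volatile and AtomicInteger accesses are sequentially
consistent with each other, which plays the role of the full barrier
discussed here.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Java analogy of the refcount/state protocol; all names are hypothetical.
class ConnectionGate {
    private final AtomicInteger refcount = new AtomicInteger();

    // volatile: every read is a fresh read (it cannot be hoisted out of a
    // loop), and the write in tryClose() is ordered before the read of
    // refcount that follows it.
    private volatile boolean closed;

    // Analog of ENTER_CONNECTION: increment first, then check the state.
    boolean enter() {
        refcount.incrementAndGet();
        if (closed) {
            // Assumption: a thread that loses the race backs out again.
            refcount.decrementAndGet();
            return false;
        }
        return true;
    }

    // Analog of LEAVE_CONNECTION.
    void leave() {
        refcount.decrementAndGet();
    }

    // Closer: publish the closed state first, then read refcount.
    // Returns true only if no thread can still be using the resources.
    boolean tryClose() {
        closed = true;
        return refcount.get() == 0;
    }
}
```

If tryClose() observes zero, any later enter() must see closed == true
after its increment, so releasing is safe. If it observes a non-zero
count, the closer cannot tell whether a user is active and has to retry
or give up.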

> Instead of having a refcount we could have done something similar to the 
> stream struct and protect access to the connection through a mutex. To 
> avoid serializing all threads we could have used SRW locks and only the 
> one closing the connection would do AcquireSRWLockExclusive(). It would 
> change the state of the connection to STATE_CLOSED, close all handles, 
> and then release the mutex. ENTER_CONNECTION() and LEAVE_CONNECTION() 
> would acquire and release the mutex in shared mode. But other that maybe 
> be more easy to read I don't think the change will be smaller.
> 
>>  413 while (attempts>0) {
>>
>> spaces around >
> Done.
> 
>> If the loop at 413 never encounters a zero reference_count then it 
>> doesn't close the events or the mutex but still returns SYS_OK. That 
>> seems wrong but I'm not sure what the right behaviour is here.
> I can change the return value to be SYS_ERR, but I don't think there is 
> much we can do about it unless we want to wait forever until we can 
> release those resources.

SYS_ERR would look better, but I see now that the return value is 
completely ignored anyway. So we're just going to leak resources if the 
loop "times out". I guess this is the best we can do.
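
As a rough illustration of the bounded retry discussed here, again as a
Java analogy rather than the actual C code: the sketch below polls a
counter a fixed number of times and reports whether the release step
ran. BoundedClose, Status and closeWithRetries are hypothetical names,
and OK/ERR merely stand in for SYS_OK/SYS_ERR.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a close path that gives up after a few attempts.
class BoundedClose {
    enum Status { OK, ERR }   // stand-ins for SYS_OK / SYS_ERR

    // Polls `refcount` up to `attempts` times, sleeping after each
    // unsuccessful poll. `release` runs only if a zero count was observed.
    static Status closeWithRetries(AtomicInteger refcount, int attempts,
                                   long sleepMillis, Runnable release)
            throws InterruptedException {
        while (attempts > 0) {
            // AtomicInteger.get() is re-read on every iteration.
            if (refcount.get() == 0) {
                release.run();
                return Status.OK;
            }
            attempts--;
            Thread.sleep(sleepMillis);
        }
        // Timed out: the resources are deliberately left unreleased.
        return Status.ERR;
    }
}
```

Returning a distinct status on timeout at least makes the leak
observable to a caller that chooses to check it.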

Thanks,
David

> 
>> And please wait for serviceability folk to review this.
> Sounds good.
> 
> 
> Thanks for looking at this David! I will move the MemoryBarrier() and 
> change the refcount to be DWORD32 if you are okay with that.
> 
> 
> Thanks,
> Patricio
>> Thanks,
>> David
>> -----
>>
>>> Tested in mach5 with the current baseline, tiers1-3 and several runs 
>>> of open/test/langtools/:tier1 which includes the jshell tests where 
>>> this connector is used. I also applied patch 
>>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
>>> mentioned in the comments of the bug, on top of the baseline and run 
>>> the langtool tests with and without this fix. Without the fix running 
>>> around 30 repetitions already shows failures in tests 
>>> jdk/jshell/FailOverExecutionControlTest.java and 
>>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the 
>>> fix I run several hundred runs and saw no failures. Let me know if 
>>> there is any additional testing I should do.
>>>
>>> As a side note, I see there are a couple of open issues related with 
>>> jshell failures (8209848) which could be related to this bug and 
>>> therefore might be fixed by this patch.
>>>
>>> Thanks,
>>> Patricio
>>>
> 

From chris.plummer at oracle.com  Thu Mar 19 02:35:16 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 19:35:16 -0700
Subject: RFR(XS) 8240543: Update problem list entry for
 serviceability/sa/TestRevPtrsForInvokeDynamic.java to reference JDK-8241235
Message-ID: <f49a8507-4c95-d1f8-6384-c83816614292@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8240543

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -140,7 +140,7 @@
 serviceability/sa/TestJmapCore.java 8193639 solaris-all
 serviceability/sa/TestJmapCoreMetaspace.java 8193639 solaris-all
 serviceability/sa/TestPrintMdo.java 8193639 solaris-all
-serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
+serviceability/sa/TestRevPtrsForInvokeDynamic.java 8241235 generic-all
 serviceability/sa/TestType.java 8193639 solaris-all
 serviceability/sa/TestUniverse.java#id0 8193639 solaris-all

8191270 [1] no longer seems to reproduce. Because of that I was hoping
to remove this test from the problem list, but found in a tier4 run
that it fails for a different reason when a combination of
compiler-related flags is specified. I opened 8241235 [2] for that
failure and need to update ProblemList.txt to reference it instead. I
will close 8191270 [1] once this change is pushed.

[1] https://bugs.openjdk.java.net/browse/JDK-8191270
[2] https://bugs.openjdk.java.net/browse/JDK-8241235

thanks,

Chris


From david.holmes at oracle.com  Thu Mar 19 03:42:31 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Mar 2020 20:42:31 -0700 (PDT)
Subject: RFR(XS) 8240543: Update problem list entry for
 serviceability/sa/TestRevPtrsForInvokeDynamic.java to reference JDK-8241235
In-Reply-To: <f49a8507-4c95-d1f8-6384-c83816614292@oracle.com>
References: <f49a8507-4c95-d1f8-6384-c83816614292@oracle.com>
Message-ID: <9f82c4d1-9f43-20e3-3f08-41f5531c46e4@oracle.com>

Looks good Chris.

Thanks,
David


From chris.plummer at oracle.com  Thu Mar 19 04:17:32 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Mar 2020 21:17:32 -0700
Subject: RFR(XS) 8240543: Update problem list entry for
 serviceability/sa/TestRevPtrsForInvokeDynamic.java to reference JDK-8241235
In-Reply-To: <9f82c4d1-9f43-20e3-3f08-41f5531c46e4@oracle.com>
References: <f49a8507-4c95-d1f8-6384-c83816614292@oracle.com>
 <9f82c4d1-9f43-20e3-3f08-41f5531c46e4@oracle.com>
Message-ID: <258b9aca-58c7-dd1c-fb8e-aa69ea02706f@oracle.com>

Thanks!



From patricio.chilano.mateo at oracle.com  Thu Mar 19 06:18:07 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Thu, 19 Mar 2020 03:18:07 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <db15a209-c73d-c380-42e2-75e713392453@oracle.com>
 <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com>
 <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com>
Message-ID: <e1a2d907-3907-0f0e-31f3-221338549a5e@oracle.com>

Hi David,

On 3/18/20 8:10 PM, David Holmes wrote:
> Hi Patricio,
>
> On 19/03/2020 6:44 am, Patricio Chilano wrote:
>> Hi David,
>>
>> On 3/18/20 4:27 AM, David Holmes wrote:
>>> Hi Patricio,
>>>
>>> On 18/03/2020 6:14 am, Patricio Chilano wrote:
>>>> Hi all,
>>>>
>>>> Please review the following patch:
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>>>>
>>>> Calling closeConnection() on an already created/opened connection 
>>>> includes calls to CloseHandle() on objects that can still be used 
>>>> by other threads. This can lead to either undefined behavior or, as 
>>>> detailed in the bug comments, changes of state of unrelated objects. 
>>>
>>> This was a really great find!
>> Thanks! : )
>>
>>>> This issue was found while debugging the reason behind some jshell 
>>>> test failures seen after pushing 8230594. Not as important, but 
>>>> there are also calls to closeStream() from 
>>>> createStream()/openStream() when failing to create/open a stream 
>>>> that will return after executing "CHECK_ERROR(enterMutex(stream, 
>>>> NULL));" without closing the intended resources. Then, calling 
>>>> closeConnection() could assert if the reason of the previous 
>>>> failure was that the stream's mutex failed to be created/opened. 
>>>> These patch aims to address these issues too.
>>>
>>> Patch looks good in general. The internal reference count guards 
>>> deletion of the internal resources, and is itself safe because never 
>>> actually delete the connection. Thanks for adding the comment about 
>>> this aspect.
>>>
>>> A few items:
>>>
>>> Please update copyright year before pushing.
>> Done.
>>
>>> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way 
>>> as STREAM_INVARIANT.
>> Done.
>>
>>>  170     unsigned int refcount;
>>>  171     jint state;
>>>
>>> I'm unclear about the use of stream->state and connection->state as 
>>> guards - unless accessed under a mutex these would seem to at least 
>>> need acquire/release semantics.
>>>
>>> Additionally the reads of refcount would also seem to need to some 
>>> form of memory synchronization - though the Windows docs for the 
>>> Interlocked* API does not show how to simply read such a variable! 
>>> Though I note that the RtlFirstEntrySList method for the 
>>> "Interlocked Singly Linked Lists" API does state "Access to the list 
>>> is synchronized on a multiprocessor system." which suggests a read 
>>> of such a variable does require some form of memory synchronization!
>> In the case of the stream struct, the state field is protected by the 
>> mutex field. It is set to STATE_CLOSED while holding the mutex, and 
>> threads that read it must acquire the mutex first through 
>> sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't 
>> acquire the mutex, we will return something different than SYS_OK and 
>> the call will exit anyways. All this behaves as before, I didn't 
>> change it.
>
> Thanks for clarifying.
>
>> The refcount and state that I added to the SharedMemoryConnection 
>> struct work together. For a thread closing the connection, setting 
>> the connection state to STATE_CLOSED has to happen before reading the 
>> refcount (more on the atomicity of that read later). That's why I 
>> added the MemoryBarrier() call; which I see it's better if I just 
>> move it to after setting the connection state to closed. For the 
>> threads accessing the connection, incrementing the refcount has to 
>> happen before reading the connection state. That's already provided 
>> by the InterlockedIncrement() which uses a full memory barrier. In 
>> this way if the thread closing the connection reads a refcount of 0, 
>> then we know it's safe to release the resources, since other threads 
>> accessing the connection will see that the state is closed after 
>> incrementing the refcount. If the read of refcount is not 0, then it 
>> could be that a thread is accessing the connection or not (it could 
>> have read a state connection of STATE_CLOSED after incrementing the 
>> refcount), we don't know, so we can't release anything. Similarly if 
>> the thread accessing the connection reads that the state is not 
>> closed, then we know it's safe to access the stream since anybody 
>> closing the connection will still have to read refcount which will be 
>> at least 1.
>> As for the atomicity of the read of refcount, from 
>> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, 
>> it states that "simple reads and writes to properly-aligned 32-bit 
>> variables are atomic operations". Maybe I should declare refcount 
>> explicitly as DWORD32?
>
> It isn't the atomicity in question with the naked read but the 
> visibility. Any latency in the visibility of the store done by the 
> InterLocked*() function should be handled by the retry loop, but what 
> is to stop the C++ compiler from hoisting the read of refcount out of 
> the loop? It isn't even volatile (which has a stronger meaning in VS 
> than regular C+++).
I see what you mean now. I was thinking about atomicity and ordering of
operations but didn't consider the visibility of that read. Yes, if the
compiler decides to be smart and hoists the read out of the loop, we
might never notice that it is safe to release those resources and we
would leak them for no reason. I see from the Windows docs
(https://docs.microsoft.com/en-us/cpp/c-language/type-qualifiers) that
declaring it volatile, as you pointed out, should be enough to prevent
that.

>> Instead of having a refcount we could have done something similar to 
>> the stream struct and protect access to the connection through a 
>> mutex. To avoid serializing all threads we could have used SRW locks 
>> and only the one closing the connection would do 
>> AcquireSRWLockExclusive(). It would change the state of the 
>> connection to STATE_CLOSED, close all handles, and then release the 
>> mutex. ENTER_CONNECTION() and LEAVE_CONNECTION() would acquire and 
>> release the mutex in shared mode. But other that maybe be more easy 
>> to read I don't think the change will be smaller.
>>
>>>  413 while (attempts>0) {
>>>
>>> spaces around >
>> Done.
>>
>>> If the loop at 413 never encounters a zero reference_count then it 
>>> doesn't close the events or the mutex but still returns SYS_OK. That 
>>> seems wrong but I'm not sure what the right behaviour is here.
>> I can change the return value to be SYS_ERR, but I don't think there 
>> is much we can do about it unless we want to wait forever until we 
>> can release those resources.
>
> SYS_ERR would look better, but I see now that the return value is 
> completely ignored anyway. So we're just going to leak resources if 
> the loop "times out". I guess this is the best we can do.
Here is v2 with the corrections:

Full: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/webrev/
Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/
(not sure why the indent fixes are not highlighted as changes, but the
Frames view does show they changed)

I'll give it a run on mach5 adding tier5 as Serguei suggested.


Thanks,
Patricio
> Thanks,
> David
>
>>
>>> And please wait for serviceability folk to review this.
>> Sounds good.
>>
>>
>> Thanks for looking at this David! I will move the MemoryBarrier() and 
>> change the refcount to be DWORD32 if you are okay with that.
>>
>>
>> Thanks,
>> Patricio
>>> Thanks,
>>> David
>>> -----
>>>
>>>> Tested in mach5 with the current baseline, tiers1-3 and several 
>>>> runs of open/test/langtools/:tier1 which includes the jshell tests 
>>>> where this connector is used. I also applied patch 
>>>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
>>>> mentioned in the comments of the bug, on top of the baseline and 
>>>> run the langtool tests with and without this fix. Without the fix 
>>>> running around 30 repetitions already shows failures in tests 
>>>> jdk/jshell/FailOverExecutionControlTest.java and 
>>>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the 
>>>> fix I run several hundred runs and saw no failures. Let me know if 
>>>> there is any additional testing I should do.
>>>>
>>>> As a side note, I see there are a couple of open issues related 
>>>> with jshell failures (8209848) which could be related to this bug 
>>>> and therefore might be fixed by this patch.
>>>>
>>>> Thanks,
>>>> Patricio
>>>>
>>


From david.holmes at oracle.com  Thu Mar 19 07:50:55 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 19 Mar 2020 17:50:55 +1000
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <e1a2d907-3907-0f0e-31f3-221338549a5e@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <db15a209-c73d-c380-42e2-75e713392453@oracle.com>
 <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com>
 <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com>
 <e1a2d907-3907-0f0e-31f3-221338549a5e@oracle.com>
Message-ID: <acd59970-d497-1e7e-07de-3045bb070519@oracle.com>

Hi Patricio,

Incremental changes look good.

Thanks,
David

On 19/03/2020 4:18 pm, Patricio Chilano wrote:
> Hi David,
> 
> On 3/18/20 8:10 PM, David Holmes wrote:
>> Hi Patricio,
>>
>> On 19/03/2020 6:44 am, Patricio Chilano wrote:
>>> Hi David,
>>>
>>> On 3/18/20 4:27 AM, David Holmes wrote:
>>>> Hi Patricio,
>>>>
>>>> On 18/03/2020 6:14 am, Patricio Chilano wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review the following patch:
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>>>>>
>>>>> Calling closeConnection() on an already created/opened connection 
>>>>> includes calls to CloseHandle() on objects that can still be used 
>>>>> by other threads. This can lead to either undefined behavior or, as 
>>>>> detailed in the bug comments, changes of state of unrelated objects. 
>>>>
>>>> This was a really great find!
>>> Thanks! : )
>>>
>>>>> This issue was found while debugging the reason behind some jshell 
>>>>> test failures seen after pushing 8230594. Not as important, but 
>>>>> there are also calls to closeStream() from 
>>>>> createStream()/openStream() when failing to create/open a stream 
>>>>> that will return after executing "CHECK_ERROR(enterMutex(stream, 
>>>>> NULL));" without closing the intended resources. Then, calling 
>>>>> closeConnection() could assert if the reason of the previous 
>>>>> failure was that the stream's mutex failed to be created/opened. 
>>>>> This patch aims to address these issues too.
>>>>
>>>> Patch looks good in general. The internal reference count guards 
>>>> deletion of the internal resources, and is itself safe because we never 
>>>> actually delete the connection. Thanks for adding the comment about 
>>>> this aspect.
>>>>
>>>> A few items:
>>>>
>>>> Please update copyright year before pushing.
>>> Done.
>>>
>>>> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way 
>>>> as STREAM_INVARIANT.
>>> Done.
>>>
>>>>  170 unsigned int refcount;
>>>>  171     jint state;
>>>>
>>>> I'm unclear about the use of stream->state and connection->state as 
>>>> guards - unless accessed under a mutex these would seem to at least 
>>>> need acquire/release semantics.
>>>>
>>>> Additionally the reads of refcount would also seem to need some 
>>>> form of memory synchronization - though the Windows docs for the 
>>>> Interlocked* API does not show how to simply read such a variable! 
>>>> Though I note that the RtlFirstEntrySList method for the 
>>>> "Interlocked Singly Linked Lists" API does state "Access to the list 
>>>> is synchronized on a multiprocessor system." which suggests a read 
>>>> of such a variable does require some form of memory synchronization!
>>> In the case of the stream struct, the state field is protected by the 
>>> mutex field. It is set to STATE_CLOSED while holding the mutex, and 
>>> threads that read it must acquire the mutex first through 
>>> sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't 
>>> acquire the mutex, we will return something different than SYS_OK and 
>>> the call will exit anyways. All this behaves as before, I didn't 
>>> change it.
>>
>> Thanks for clarifying.
>>
>>> The refcount and state that I added to the SharedMemoryConnection 
>>> struct work together. For a thread closing the connection, setting 
>>> the connection state to STATE_CLOSED has to happen before reading the 
>>> refcount (more on the atomicity of that read later). That's why I 
>>> added the MemoryBarrier() call; which I see it's better if I just 
>>> move it to after setting the connection state to closed. For the 
>>> threads accessing the connection, incrementing the refcount has to 
>>> happen before reading the connection state. That's already provided 
>>> by the InterlockedIncrement() which uses a full memory barrier. In 
>>> this way if the thread closing the connection reads a refcount of 0, 
>>> then we know it's safe to release the resources, since other threads 
>>> accessing the connection will see that the state is closed after 
>>> incrementing the refcount. If the read of refcount is not 0, then it 
>>> could be that a thread is accessing the connection or not (it could 
>>> have read a state connection of STATE_CLOSED after incrementing the 
>>> refcount), we don't know, so we can't release anything. Similarly if 
>>> the thread accessing the connection reads that the state is not 
>>> closed, then we know it's safe to access the stream since anybody 
>>> closing the connection will still have to read refcount which will be 
>>> at least 1.
>>> As for the atomicity of the read of refcount, from 
>>> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, 
>>> it states that "simple reads and writes to properly-aligned 32-bit 
>>> variables are atomic operations". Maybe I should declare refcount 
>>> explicitly as DWORD32?
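
The ordering argument in the paragraph above can be sketched in portable C11. This is only an illustration under stated assumptions, not the code from the webrev: the names Connection, conn_enter, conn_leave and conn_try_close and the STATE_OPEN value are hypothetical, sequentially consistent C11 atomics stand in for InterlockedIncrement() and MemoryBarrier(), a boolean stands in for the CloseHandle() calls, and the sketch assumes a thread that observes STATE_CLOSED drops its reference immediately.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state values; the thread only names STATE_CLOSED. */
enum { STATE_OPEN = 0, STATE_CLOSED = 1 };

typedef struct {
    atomic_uint refcount;        /* number of threads inside the connection */
    atomic_int  state;           /* STATE_OPEN or STATE_CLOSED */
    bool resources_released;     /* stands in for the CloseHandle() calls */
} Connection;

static void conn_init(Connection *c) {
    atomic_init(&c->refcount, 0);
    atomic_init(&c->state, STATE_OPEN);
    c->resources_released = false;
}

/* Accessing thread: increment the refcount first, then read the state.
 * The seq_cst read-modify-write keeps the increment ordered before the load.
 * Returns true if the connection may be used; pair with conn_leave(). */
static bool conn_enter(Connection *c) {
    atomic_fetch_add(&c->refcount, 1);
    if (atomic_load(&c->state) == STATE_CLOSED) {
        atomic_fetch_sub(&c->refcount, 1);   /* assumption: back out at once */
        return false;
    }
    return true;
}

static void conn_leave(Connection *c) {
    atomic_fetch_sub(&c->refcount, 1);
}

/* Closing thread: publish STATE_CLOSED first, then read the refcount.
 * Releases the resources only if no thread is inside the connection.
 * Returns true if the resources were released by this attempt. */
static bool conn_try_close(Connection *c) {
    atomic_store(&c->state, STATE_CLOSED);
    if (atomic_load(&c->refcount) == 0) {
        c->resources_released = true;
        return true;
    }
    return false;                            /* someone may still be inside */
}
```

A closer that reads a refcount of 0 can release safely because any later conn_enter() will see STATE_CLOSED; a non-zero read proves nothing either way, so nothing is released on that attempt.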
>>
>> It isn't the atomicity in question with the naked read but the 
>> visibility. Any latency in the visibility of the store done by the 
>> InterLocked*() function should be handled by the retry loop, but what 
>> is to stop the C++ compiler from hoisting the read of refcount out of 
>> the loop? It isn't even volatile (which has a stronger meaning in VS 
>> than regular C++).
> I see what you mean now, I was thinking on atomicity and order of 
> operations but didn't consider the visibility of that read. Yes, if the 
> compiler decides to be smart and hoist the read out of the loop we might 
> never notice that it is safe to release those resources and we would 
> leak them for no reason. I see from the windows 
> docs(https://docs.microsoft.com/en-us/cpp/c-language/type-qualifiers) 
> that declaring it volatile as you pointed out should be enough to 
> prevent that.
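
The hoisting concern can be illustrated with a bounded polling loop. This is a sketch under assumptions, not the patch itself: wait_for_zero and max_attempts are hypothetical names, and a C11 atomic load is used in place of the volatile read discussed above, since an atomic load likewise has to be performed on every iteration and cannot be hoisted out of the loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Polls *refcount up to max_attempts times.
 * Returns true as soon as a zero is observed, false if the attempts run out
 * (in which case the caller would skip releasing the resources and leak them).
 * Real code would sleep between attempts; that is omitted here. */
static bool wait_for_zero(atomic_uint *refcount, int max_attempts) {
    int attempts = max_attempts;
    while (attempts > 0) {
        /* A fresh load on every iteration: with a plain, non-volatile,
         * non-atomic variable the compiler could legally read it once
         * before the loop and never notice a later decrement. */
        if (atomic_load(refcount) == 0) {
            return true;
        }
        attempts--;
    }
    return false;
}
```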
> 
>>> Instead of having a refcount we could have done something similar to 
>>> the stream struct and protect access to the connection through a 
>>> mutex. To avoid serializing all threads we could have used SRW locks 
>>> and only the one closing the connection would do 
>>> AcquireSRWLockExclusive(). It would change the state of the 
>>> connection to STATE_CLOSED, close all handles, and then release the 
>>> mutex. ENTER_CONNECTION() and LEAVE_CONNECTION() would acquire and 
>>> release the mutex in shared mode. But other than maybe being easier 
>>> to read, I don't think the change would be smaller.
>>>
>>>>  413 while (attempts>0) {
>>>>
>>>> spaces around >
>>> Done.
>>>
>>>> If the loop at 413 never encounters a zero reference_count then it 
>>>> doesn't close the events or the mutex but still returns SYS_OK. That 
>>>> seems wrong but I'm not sure what the right behaviour is here.
>>> I can change the return value to be SYS_ERR, but I don't think there 
>>> is much we can do about it unless we want to wait forever until we 
>>> can release those resources.
>>
>> SYS_ERR would look better, but I see now that the return value is 
>> completely ignored anyway. So we're just going to leak resources if 
>> the loop "times out". I guess this is the best we can do.
> Here is v2 with the corrections:
> 
> Full: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/webrev/
> Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/ 
> <http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/> (not sure 
> why the indent fixes are not highlighted as changes but the Frames view 
> does show they changed)
> 
> I'll give it a run on mach5 adding tier5 as Serguei suggested.
> 
> 
> Thanks,
> Patricio
>> Thanks,
>> David
>>
>>>
>>>> And please wait for serviceability folk to review this.
>>> Sounds good.
>>>
>>>
>>> Thanks for looking at this David! I will move the MemoryBarrier() and 
>>> change the refcount to be DWORD32 if you are okay with that.
>>>
>>>
>>> Thanks,
>>> Patricio
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Tested in mach5 with the current baseline, tiers1-3 and several 
>>>>> runs of open/test/langtools/:tier1 which includes the jshell tests 
>>>>> where this connector is used. I also applied patch 
>>>>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
>>>>> mentioned in the comments of the bug, on top of the baseline and 
>>>>> run the langtool tests with and without this fix. Without the fix 
>>>>> running around 30 repetitions already shows failures in tests 
>>>>> jdk/jshell/FailOverExecutionControlTest.java and 
>>>>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the 
>>>>> fix I run several hundred runs and saw no failures. Let me know if 
>>>>> there is any additional testing I should do.
>>>>>
>>>>> As a side note, I see there are a couple of open issues related 
>>>>> with jshell failures (8209848) which could be related to this bug 
>>>>> and therefore might be fixed by this patch.
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>
>>>
> 

From daniel.daugherty at oracle.com  Thu Mar 19 14:22:01 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 19 Mar 2020 10:22:01 -0400
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <e1a2d907-3907-0f0e-31f3-221338549a5e@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <db15a209-c73d-c380-42e2-75e713392453@oracle.com>
 <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com>
 <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com>
 <e1a2d907-3907-0f0e-31f3-221338549a5e@oracle.com>
Message-ID: <29500fec-2419-b49d-6493-dd66aca9caf2@oracle.com>

 > Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/
 > (not sure why the indent fixes are not highlighted as changes but
 > the Frames view does show they changed)

By default, webrev ignores leading and trailing whitespace changes. Use:

     -b: Do not ignore changes in the amount of white space.

if you want to see them. I'm okay that they are not there in most of
the views. If you want to see them, look at the patch.


src/jdk.jdi/share/native/libdt_shmem/shmemBase.c
     No comments.

Thumbs up.

Dan


On 3/19/20 2:18 AM, Patricio Chilano wrote:
> Hi David,
>
> On 3/18/20 8:10 PM, David Holmes wrote:
>> Hi Patricio,
>>
>> On 19/03/2020 6:44 am, Patricio Chilano wrote:
>>> Hi David,
>>>
>>> On 3/18/20 4:27 AM, David Holmes wrote:
>>>> Hi Patricio,
>>>>
>>>> On 18/03/2020 6:14 am, Patricio Chilano wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review the following patch:
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>>>>>
>>>>> Calling closeConnection() on an already created/opened connection 
>>>>> includes calls to CloseHandle() on objects that can still be used 
>>>>> by other threads. This can lead to either undefined behavior or, 
>>>>> as detailed in the bug comments, changes of state of unrelated 
>>>>> objects. 
>>>>
>>>> This was a really great find!
>>> Thanks! : )
>>>
>>>>> This issue was found while debugging the reason behind some jshell 
>>>>> test failures seen after pushing 8230594. Not as important, but 
>>>>> there are also calls to closeStream() from 
>>>>> createStream()/openStream() when failing to create/open a stream 
>>>>> that will return after executing "CHECK_ERROR(enterMutex(stream, 
>>>>> NULL));" without closing the intended resources. Then, calling 
>>>>> closeConnection() could assert if the reason of the previous 
>>>>> failure was that the stream's mutex failed to be created/opened. 
>>>>> This patch aims to address these issues too.
>>>>
>>>> Patch looks good in general. The internal reference count guards 
>>>> deletion of the internal resources, and is itself safe because 
>>>> we never actually delete the connection. Thanks for adding the comment 
>>>> about this aspect.
>>>>
>>>> A few items:
>>>>
>>>> Please update copyright year before pushing.
>>> Done.
>>>
>>>> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way 
>>>> as STREAM_INVARIANT.
>>> Done.
>>>
>>>>  170 unsigned int refcount;
>>>>  171     jint state;
>>>>
>>>> I'm unclear about the use of stream->state and connection->state as 
>>>> guards - unless accessed under a mutex these would seem to at least 
>>>> need acquire/release semantics.
>>>>
>>>> Additionally the reads of refcount would also seem to need some 
>>>> form of memory synchronization - though the Windows docs for the 
>>>> Interlocked* API does not show how to simply read such a variable! 
>>>> Though I note that the RtlFirstEntrySList method for the 
>>>> "Interlocked Singly Linked Lists" API does state "Access to the 
>>>> list is synchronized on a multiprocessor system." which suggests a 
>>>> read of such a variable does require some form of memory 
>>>> synchronization!
>>> In the case of the stream struct, the state field is protected by 
>>> the mutex field. It is set to STATE_CLOSED while holding the mutex, 
>>> and threads that read it must acquire the mutex first through 
>>> sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't 
>>> acquire the mutex, we will return something different than SYS_OK 
>>> and the call will exit anyways. All this behaves as before, I didn't 
>>> change it.
>>
>> Thanks for clarifying.
>>
>>> The refcount and state that I added to the SharedMemoryConnection 
>>> struct work together. For a thread closing the connection, setting 
>>> the connection state to STATE_CLOSED has to happen before reading 
>>> the refcount (more on the atomicity of that read later). That's why 
>>> I added the MemoryBarrier() call; which I see it's better if I just 
>>> move it to after setting the connection state to closed. For the 
>>> threads accessing the connection, incrementing the refcount has to 
>>> happen before reading the connection state. That's already provided 
>>> by the InterlockedIncrement() which uses a full memory barrier. In 
>>> this way if the thread closing the connection reads a refcount of 0, 
>>> then we know it's safe to release the resources, since other threads 
>>> accessing the connection will see that the state is closed after 
>>> incrementing the refcount. If the read of refcount is not 0, then it 
>>> could be that a thread is accessing the connection or not (it could 
>>> have read a state connection of STATE_CLOSED after incrementing the 
>>> refcount), we don't know, so we can't release anything. Similarly if 
>>> the thread accessing the connection reads that the state is not 
>>> closed, then we know it's safe to access the stream since anybody 
>>> closing the connection will still have to read refcount which will 
>>> be at least 1.
>>> As for the atomicity of the read of refcount, from 
>>> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, 
>>> it states that "simple reads and writes to properly-aligned 32-bit 
>>> variables are atomic operations". Maybe I should declare refcount 
>>> explicitly as DWORD32?
>>
>> It isn't the atomicity in question with the naked read but the 
>> visibility. Any latency in the visibility of the store done by the 
>> InterLocked*() function should be handled by the retry loop, but what 
>> is to stop the C++ compiler from hoisting the read of refcount out of 
>> the loop? It isn't even volatile (which has a stronger meaning in VS 
>> than regular C++).
> I see what you mean now, I was thinking on atomicity and order of 
> operations but didn't consider the visibility of that read. Yes, if 
> the compiler decides to be smart and hoist the read out of the loop we 
> might never notice that it is safe to release those resources and we 
> would leak them for no reason. I see from the windows 
> docs(https://docs.microsoft.com/en-us/cpp/c-language/type-qualifiers) 
> that declaring it volatile as you pointed out should be enough to 
> prevent that.
>
>>> Instead of having a refcount we could have done something similar to 
>>> the stream struct and protect access to the connection through a 
>>> mutex. To avoid serializing all threads we could have used SRW locks 
>>> and only the one closing the connection would do 
>>> AcquireSRWLockExclusive(). It would change the state of the 
>>> connection to STATE_CLOSED, close all handles, and then release the 
>>> mutex. ENTER_CONNECTION() and LEAVE_CONNECTION() would acquire and 
>>> release the mutex in shared mode. But other than maybe being easier 
>>> to read, I don't think the change would be smaller.
>>>
>>>>  413 while (attempts>0) {
>>>>
>>>> spaces around >
>>> Done.
>>>
>>>> If the loop at 413 never encounters a zero reference_count then it 
>>>> doesn't close the events or the mutex but still returns SYS_OK. 
>>>> That seems wrong but I'm not sure what the right behaviour is here.
>>> I can change the return value to be SYS_ERR, but I don't think there 
>>> is much we can do about it unless we want to wait forever until we 
>>> can release those resources.
>>
>> SYS_ERR would look better, but I see now that the return value is 
>> completely ignored anyway. So we're just going to leak resources if 
>> the loop "times out". I guess this is the best we can do.
> Here is v2 with the corrections:
>
> Full: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/webrev/
> Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/ 
> <http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/> (not sure 
> why the indent fixes are not highlighted as changes but the Frames 
> view does show they changed)
>
> I'll give it a run on mach5 adding tier5 as Serguei suggested.
>
>
> Thanks,
> Patricio
>> Thanks,
>> David
>>
>>>
>>>> And please wait for serviceability folk to review this.
>>> Sounds good.
>>>
>>>
>>> Thanks for looking at this David! I will move the MemoryBarrier() 
>>> and change the refcount to be DWORD32 if you are okay with that.
>>>
>>>
>>> Thanks,
>>> Patricio
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Tested in mach5 with the current baseline, tiers1-3 and several 
>>>>> runs of open/test/langtools/:tier1 which includes the jshell tests 
>>>>> where this connector is used. I also applied patch 
>>>>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
>>>>> mentioned in the comments of the bug, on top of the baseline and 
>>>>> run the langtool tests with and without this fix. Without the fix 
>>>>> running around 30 repetitions already shows failures in tests 
>>>>> jdk/jshell/FailOverExecutionControlTest.java and 
>>>>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With 
>>>>> the fix I run several hundred runs and saw no failures. Let me 
>>>>> know if there is any additional testing I should do.
>>>>>
>>>>> As a side note, I see there are a couple of open issues related 
>>>>> with jshell failures (8209848) which could be related to this bug 
>>>>> and therefore might be fixed by this patch.
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>
>>>
>


From patricio.chilano.mateo at oracle.com  Thu Mar 19 14:40:08 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Thu, 19 Mar 2020 11:40:08 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <acd59970-d497-1e7e-07de-3045bb070519@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <db15a209-c73d-c380-42e2-75e713392453@oracle.com>
 <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com>
 <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com>
 <e1a2d907-3907-0f0e-31f3-221338549a5e@oracle.com>
 <acd59970-d497-1e7e-07de-3045bb070519@oracle.com>
Message-ID: <77a0b7aa-aa26-cde2-b465-a074c0cae240@oracle.com>

Thanks David!

Patricio
On 3/19/20 4:50 AM, David Holmes wrote:
> Hi Patricio,
>
> Incremental changes look good.
>
> Thanks,
> David
>
> On 19/03/2020 4:18 pm, Patricio Chilano wrote:
>> Hi David,
>>
>> On 3/18/20 8:10 PM, David Holmes wrote:
>>> Hi Patricio,
>>>
>>> On 19/03/2020 6:44 am, Patricio Chilano wrote:
>>>> Hi David,
>>>>
>>>> On 3/18/20 4:27 AM, David Holmes wrote:
>>>>> Hi Patricio,
>>>>>
>>>>> On 18/03/2020 6:14 am, Patricio Chilano wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review the following patch:
>>>>>>
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>>>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>>>>>>
>>>>>> Calling closeConnection() on an already created/opened connection 
>>>>>> includes calls to CloseHandle() on objects that can still be used 
>>>>>> by other threads. This can lead to either undefined behavior or, 
>>>>>> as detailed in the bug comments, changes of state of unrelated 
>>>>>> objects. 
>>>>>
>>>>> This was a really great find!
>>>> Thanks! : )
>>>>
>>>>>> This issue was found while debugging the reason behind some 
>>>>>> jshell test failures seen after pushing 8230594. Not as 
>>>>>> important, but there are also calls to closeStream() from 
>>>>>> createStream()/openStream() when failing to create/open a stream 
>>>>>> that will return after executing "CHECK_ERROR(enterMutex(stream, 
>>>>>> NULL));" without closing the intended resources. Then, calling 
>>>>>> closeConnection() could assert if the reason of the previous 
>>>>>> failure was that the stream's mutex failed to be created/opened. 
>>>>>> This patch aims to address these issues too.
>>>>>
>>>>> Patch looks good in general. The internal reference count guards 
>>>>> deletion of the internal resources, and is itself safe because 
>>>>> we never actually delete the connection. Thanks for adding the 
>>>>> comment about this aspect.
>>>>>
>>>>> A few items:
>>>>>
>>>>> Please update copyright year before pushing.
>>>> Done.
>>>>
>>>>> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way 
>>>>> as STREAM_INVARIANT.
>>>> Done.
>>>>
>>>>>  170 unsigned int refcount;
>>>>>  171     jint state;
>>>>>
>>>>> I'm unclear about the use of stream->state and connection->state 
>>>>> as guards - unless accessed under a mutex these would seem to at 
>>>>> least need acquire/release semantics.
>>>>>
>>>>> Additionally the reads of refcount would also seem to need some 
>>>>> form of memory synchronization - though the Windows docs for the 
>>>>> Interlocked* API does not show how to simply read such a variable! 
>>>>> Though I note that the RtlFirstEntrySList method for the 
>>>>> "Interlocked Singly Linked Lists" API does state "Access to the 
>>>>> list is synchronized on a multiprocessor system." which suggests a 
>>>>> read of such a variable does require some form of memory 
>>>>> synchronization!
>>>> In the case of the stream struct, the state field is protected by 
>>>> the mutex field. It is set to STATE_CLOSED while holding the mutex, 
>>>> and threads that read it must acquire the mutex first through 
>>>> sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't 
>>>> acquire the mutex, we will return something different than SYS_OK 
>>>> and the call will exit anyways. All this behaves as before, I 
>>>> didn't change it.
>>>
>>> Thanks for clarifying.
>>>
>>>> The refcount and state that I added to the SharedMemoryConnection 
>>>> struct work together. For a thread closing the connection, setting 
>>>> the connection state to STATE_CLOSED has to happen before reading 
>>>> the refcount (more on the atomicity of that read later). That's why 
>>>> I added the MemoryBarrier() call; which I see it's better if I just 
>>>> move it to after setting the connection state to closed. For the 
>>>> threads accessing the connection, incrementing the refcount has to 
>>>> happen before reading the connection state. That's already provided 
>>>> by the InterlockedIncrement() which uses a full memory barrier. In 
>>>> this way if the thread closing the connection reads a refcount of 
>>>> 0, then we know it's safe to release the resources, since other 
>>>> threads accessing the connection will see that the state is closed 
>>>> after incrementing the refcount. If the read of refcount is not 0, 
>>>> then it could be that a thread is accessing the connection or not 
>>>> (it could have read a state connection of STATE_CLOSED after 
>>>> incrementing the refcount), we don't know, so we can't release 
>>>> anything. Similarly if the thread accessing the connection reads 
>>>> that the state is not closed, then we know it's safe to access the 
>>>> stream since anybody closing the connection will still have to read 
>>>> refcount which will be at least 1.
>>>> As for the atomicity of the read of refcount, from 
>>>> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, 
>>>> it states that "simple reads and writes to properly-aligned 32-bit 
>>>> variables are atomic operations". Maybe I should declare refcount 
>>>> explicitly as DWORD32?
>>>
>>> It isn't the atomicity in question with the naked read but the 
>>> visibility. Any latency in the visibility of the store done by the 
>>> InterLocked*() function should be handled by the retry loop, but 
>>> what is to stop the C++ compiler from hoisting the read of refcount 
>>> out of the loop? It isn't even volatile (which has a stronger 
>>> meaning in VS than regular C++).
>> I see what you mean now, I was thinking on atomicity and order of 
>> operations but didn't consider the visibility of that read. Yes, if 
>> the compiler decides to be smart and hoist the read out of the loop 
>> we might never notice that it is safe to release those resources and 
>> we would leak them for no reason. I see from the windows 
>> docs(https://docs.microsoft.com/en-us/cpp/c-language/type-qualifiers) 
>> that declaring it volatile as you pointed out should be enough to 
>> prevent that.
>>
>>>> Instead of having a refcount we could have done something similar 
>>>> to the stream struct and protect access to the connection through a 
>>>> mutex. To avoid serializing all threads we could have used SRW 
>>>> locks and only the one closing the connection would do 
>>>> AcquireSRWLockExclusive(). It would change the state of the 
>>>> connection to STATE_CLOSED, close all handles, and then release the 
>>>> mutex. ENTER_CONNECTION() and LEAVE_CONNECTION() would acquire and 
>>>> release the mutex in shared mode. But other than maybe being easier 
>>>> to read, I don't think the change would be smaller.
>>>>
>>>>>  413 while (attempts>0) {
>>>>>
>>>>> spaces around >
>>>> Done.
>>>>
>>>>> If the loop at 413 never encounters a zero reference_count then it 
>>>>> doesn't close the events or the mutex but still returns SYS_OK. 
>>>>> That seems wrong but I'm not sure what the right behaviour is here.
>>>> I can change the return value to be SYS_ERR, but I don't think 
>>>> there is much we can do about it unless we want to wait forever 
>>>> until we can release those resources.
>>>
>>> SYS_ERR would look better, but I see now that the return value is 
>>> completely ignored anyway. So we're just going to leak resources if 
>>> the loop "times out". I guess this is the best we can do.
>> Here is v2 with the corrections:
>>
>> Full: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/webrev/
>> Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/ 
>> <http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/> (not 
>> sure why the indent fixes are not highlighted as changes but the 
>> Frames view does show they changed)
>>
>> I'll give it a run on mach5 adding tier5 as Serguei suggested.
>>
>>
>> Thanks,
>> Patricio
>>> Thanks,
>>> David
>>>
>>>>
>>>>> And please wait for serviceability folk to review this.
>>>> Sounds good.
>>>>
>>>>
>>>> Thanks for looking at this David! I will move the MemoryBarrier() 
>>>> and change the refcount to be DWORD32 if you are okay with that.
>>>>
>>>>
>>>> Thanks,
>>>> Patricio
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>> Tested in mach5 with the current baseline, tiers1-3 and several 
>>>>>> runs of open/test/langtools/:tier1 which includes the jshell 
>>>>>> tests where this connector is used. I also applied patch 
>>>>>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
>>>>>> mentioned in the comments of the bug, on top of the baseline and 
>>>>>> run the langtool tests with and without this fix. Without the fix 
>>>>>> running around 30 repetitions already shows failures in tests 
>>>>>> jdk/jshell/FailOverExecutionControlTest.java and 
>>>>>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With 
>>>>>> the fix I run several hundred runs and saw no failures. Let me 
>>>>>> know if there is any additional testing I should do.
>>>>>>
>>>>>> As a side note, I see there are a couple of open issues related 
>>>>>> with jshell failures (8209848) which could be related to this bug 
>>>>>> and therefore might be fixed by this patch.
>>>>>>
>>>>>> Thanks,
>>>>>> Patricio
>>>>>>
>>>>
>>


From patricio.chilano.mateo at oracle.com  Thu Mar 19 14:43:27 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Thu, 19 Mar 2020 11:43:27 -0300
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <29500fec-2419-b49d-6493-dd66aca9caf2@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <db15a209-c73d-c380-42e2-75e713392453@oracle.com>
 <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com>
 <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com>
 <e1a2d907-3907-0f0e-31f3-221338549a5e@oracle.com>
 <29500fec-2419-b49d-6493-dd66aca9caf2@oracle.com>
Message-ID: <c29d6145-08fc-add7-2269-636a63978568@oracle.com>


On 3/19/20 11:22 AM, Daniel D. Daugherty wrote:
> > Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/
> > (not sure why the indent fixes are not highlighted as changes but
> > the Frames view does show they changed)
>
> By default, webrev ignores leading and trailing whitespace changes. Use:
>
>     -b: Do not ignore changes in the amount of white space.
>
> if you want to see them. I'm okay that they are not there in most of
> the views. If you want to see them, look at the patch.
Thanks! I didn't know that option.
>
> src/jdk.jdi/share/native/libdt_shmem/shmemBase.c
>     No comments.
>
> Thumbs up.
Thanks for looking at this Dan!


Patricio
> Dan
>
>
> On 3/19/20 2:18 AM, Patricio Chilano wrote:
>> Hi David,
>>
>> On 3/18/20 8:10 PM, David Holmes wrote:
>>> Hi Patricio,
>>>
>>> On 19/03/2020 6:44 am, Patricio Chilano wrote:
>>>> Hi David,
>>>>
>>>> On 3/18/20 4:27 AM, David Holmes wrote:
>>>>> Hi Patricio,
>>>>>
>>>>> On 18/03/2020 6:14 am, Patricio Chilano wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review the following patch:
>>>>>>
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
>>>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/
>>>>>>
>>>>>> Calling closeConnection() on an already created/opened connection 
>>>>>> includes calls to CloseHandle() on objects that can still be used 
>>>>>> by other threads. This can lead to either undefined behavior or, 
>>>>>> as detailed in the bug comments, changes of state of unrelated 
>>>>>> objects. 
>>>>>
>>>>> This was a really great find!
>>>> Thanks! : )
>>>>
>>>>>> This issue was found while debugging the reason behind some 
>>>>>> jshell test failures seen after pushing 8230594. Not as 
>>>>>> important, but there are also calls to closeStream() from 
>>>>>> createStream()/openStream() when failing to create/open a stream 
>>>>>> that will return after executing "CHECK_ERROR(enterMutex(stream, 
>>>>>> NULL));" without closing the intended resources. Then, calling 
>>>>>> closeConnection() could assert if the reason of the previous 
>>>>>> failure was that the stream's mutex failed to be created/opened. 
>>>>>> This patch aims to address these issues too.
>>>>>
>>>>> Patch looks good in general. The internal reference count guards 
>>>>> deletion of the internal resources, and is itself safe because 
>>>>> we never actually delete the connection. Thanks for adding the 
>>>>> comment about this aspect.
>>>>>
>>>>> A few items:
>>>>>
>>>>> Please update copyright year before pushing.
>>>> Done.
>>>>
>>>>> Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way 
>>>>> as STREAM_INVARIANT.
>>>> Done.
>>>>
>>>>>  170     unsigned int refcount;
>>>>>  171     jint state;
>>>>>
>>>>> I'm unclear about the use of stream->state and connection->state 
>>>>> as guards - unless accessed under a mutex these would seem to at 
>>>>> least need acquire/release semantics.
>>>>>
>>>>> Additionally the reads of refcount would also seem to need some 
>>>>> form of memory synchronization - though the Windows docs for the 
>>>>> Interlocked* API do not show how to simply read such a variable! 
>>>>> Though I note that the RtlFirstEntrySList method for the 
>>>>> "Interlocked Singly Linked Lists" API does state "Access to the 
>>>>> list is synchronized on a multiprocessor system." which suggests a 
>>>>> read of such a variable does require some form of memory 
>>>>> synchronization!
>>>> In the case of the stream struct, the state field is protected by 
>>>> the mutex field. It is set to STATE_CLOSED while holding the mutex, 
>>>> and threads that read it must acquire the mutex first through 
>>>> sysIPMutexEnter(). For the cases where sysIPMutexEnter() didn't 
>>>> acquire the mutex, we will return something different than SYS_OK 
>>>> and the call will exit anyways. All this behaves as before, I 
>>>> didn't change it.
>>>
>>> Thanks for clarifying.
>>>
>>>> The refcount and state that I added to the SharedMemoryConnection 
>>>> struct work together. For a thread closing the connection, setting 
>>>> the connection state to STATE_CLOSED has to happen before reading 
>>>> the refcount (more on the atomicity of that read later). That's why 
>>>> I added the MemoryBarrier() call, which I now see would be better placed 
>>>> right after setting the connection state to closed. For the 
>>>> threads accessing the connection, incrementing the refcount has to 
>>>> happen before reading the connection state. That's already provided 
>>>> by the InterlockedIncrement() which uses a full memory barrier. In 
>>>> this way if the thread closing the connection reads a refcount of 
>>>> 0, then we know it's safe to release the resources, since other 
>>>> threads accessing the connection will see that the state is closed 
>>>> after incrementing the refcount. If the read of refcount is not 0, 
>>>> then it could be that a thread is accessing the connection or not 
>>>> (it could have read a connection state of STATE_CLOSED after 
>>>> incrementing the refcount), we don't know, so we can't release 
>>>> anything. Similarly if the thread accessing the connection reads 
>>>> that the state is not closed, then we know it's safe to access the 
>>>> stream since anybody closing the connection will still have to read 
>>>> refcount which will be at least 1.
>>>> As for the atomicity of the read of refcount, from 
>>>> https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access, 
>>>> it states that "simple reads and writes to properly-aligned 32-bit 
>>>> variables are atomic operations". Maybe I should declare refcount 
>>>> explicitly as DWORD32?
>>>
>>> It isn't the atomicity in question with the naked read but the 
>>> visibility. Any latency in the visibility of the store done by the 
>>> InterLocked*() function should be handled by the retry loop, but 
>>> what is to stop the C++ compiler from hoisting the read of refcount 
>>> out of the loop? It isn't even volatile (which has a stronger 
>>> meaning in VS than regular C++).
>> I see what you mean now. I was thinking about atomicity and order of 
>> operations but didn't consider the visibility of that read. Yes, if 
>> the compiler decides to be smart and hoist the read out of the loop 
>> we might never notice that it is safe to release those resources and 
>> we would leak them for no reason. I see from the Windows 
>> docs (https://docs.microsoft.com/en-us/cpp/c-language/type-qualifiers) 
>> that declaring it volatile as you pointed out should be enough to 
>> prevent that.
>>
>>>> Instead of having a refcount we could have done something similar 
>>>> to the stream struct and protect access to the connection through a 
>>>> mutex. To avoid serializing all threads we could have used SRW 
>>>> locks and only the one closing the connection would do 
>>>> AcquireSRWLockExclusive(). It would change the state of the 
>>>> connection to STATE_CLOSED, close all handles, and then release the 
>>>> mutex. ENTER_CONNECTION() and LEAVE_CONNECTION() would acquire and 
>>>> release the mutex in shared mode. But other than maybe being easier 
>>>> to read, I don't think the change would be smaller.
>>>>
>>>>>  413 while (attempts>0) {
>>>>>
>>>>> spaces around >
>>>> Done.
>>>>
>>>>> If the loop at 413 never encounters a zero reference_count then it 
>>>>> doesn't close the events or the mutex but still returns SYS_OK. 
>>>>> That seems wrong but I'm not sure what the right behaviour is here.
>>>> I can change the return value to be SYS_ERR, but I don't think 
>>>> there is much we can do about it unless we want to wait forever 
>>>> until we can release those resources.
>>>
>>> SYS_ERR would look better, but I see now that the return value is 
>>> completely ignored anyway. So we're just going to leak resources if 
>>> the loop "times out". I guess this is the best we can do.
>> Here is v2 with the corrections:
>>
>> Full: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/webrev/
>> Inc: http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/webrev/ 
>> <http://cr.openjdk.java.net/~pchilanomate/8240902/v2/inc/> (not 
>> sure why the indent fixes are not highlighted as changes but the 
>> Frames view does show they changed)
>>
>> I'll give it a run on mach5 adding tier5 as Serguei suggested.
>>
>>
>> Thanks,
>> Patricio
>>> Thanks,
>>> David
>>>
>>>>
>>>>> And please wait for serviceability folk to review this.
>>>> Sounds good.
>>>>
>>>>
>>>> Thanks for looking at this David! I will move the MemoryBarrier() 
>>>> and change the refcount to be DWORD32 if you are okay with that.
>>>>
>>>>
>>>> Thanks,
>>>> Patricio
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>> Tested in mach5 with the current baseline, tiers1-3 and several 
>>>>>> runs of open/test/langtools/:tier1 which includes the jshell 
>>>>>> tests where this connector is used. I also applied patch 
>>>>>> http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev 
>>>>>> mentioned in the comments of the bug, on top of the baseline and 
>>>>>> ran the langtools tests with and without this fix. Without the fix 
>>>>>> running around 30 repetitions already shows failures in tests 
>>>>>> jdk/jshell/FailOverExecutionControlTest.java and 
>>>>>> jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With 
>>>>>> the fix I ran several hundred runs and saw no failures. Let me 
>>>>>> know if there is any additional testing I should do.
>>>>>>
>>>>>> As a side note, I see there are a couple of open issues related 
>>>>>> with jshell failures (8209848) which could be related to this bug 
>>>>>> and therefore might be fixed by this patch.
>>>>>>
>>>>>> Thanks,
>>>>>> Patricio
>>>>>>
>>>>
>>
>
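
As a rough illustration of the refcount/state protocol discussed in the quoted thread above, here is a minimal Java sketch. It is an analogy, not the shared-memory transport code: the class GuardedConnection and its methods enter(), leave() and close() are hypothetical names, the volatile field stands in for the MemoryBarrier() call and the volatile refcount read, and AtomicInteger stands in for InterlockedIncrement()/InterlockedDecrement().

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java analogue of the refcount/state guard; not the webrev code.
class GuardedConnection {
    static final int STATE_OPEN = 0;
    static final int STATE_CLOSED = 1;

    private volatile int state = STATE_OPEN;
    private final AtomicInteger refcount = new AtomicInteger();
    private volatile boolean resourcesReleased;

    // Analogue of ENTER_CONNECTION: increment the refcount first, then read the state.
    boolean enter() {
        refcount.incrementAndGet();
        if (state == STATE_CLOSED) {
            refcount.decrementAndGet();
            return false;              // connection is closing; do not touch resources
        }
        return true;                   // safe: any closer will still see refcount >= 1
    }

    // Analogue of LEAVE_CONNECTION.
    void leave() {
        refcount.decrementAndGet();
    }

    // Publish STATE_CLOSED first, then poll the refcount a bounded number of times.
    boolean close(int attempts) {
        state = STATE_CLOSED;
        while (attempts > 0) {
            if (refcount.get() == 0) {
                resourcesReleased = true;   // stands in for the CloseHandle() calls
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            attempts--;
        }
        return false;                  // timed out: the resources are leaked
    }

    boolean isReleased() {
        return resourcesReleased;
    }
}
```

A thread that gets true from enter() may use the resources and must call leave() afterwards. When close() returns false the resources are simply leaked, which matches the "best we can do" outcome described in the thread.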


From coleen.phillimore at oracle.com  Thu Mar 19 19:46:09 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 19 Mar 2020 15:46:09 -0400
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field is
 unused in the SA
Message-ID: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>

Summary: remove unused code that is changing in Hotspot for hidden classes.

Ran tier1-3 tests. See bug for more details.

open webrev at http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8241320

Thanks,
Coleen

From lois.foltan at oracle.com  Thu Mar 19 20:12:04 2020
From: lois.foltan at oracle.com (Lois Foltan)
Date: Thu, 19 Mar 2020 16:12:04 -0400
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field
 is unused in the SA
In-Reply-To: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
Message-ID: <fa18776d-fe6f-6b93-e322-e6695587944c@oracle.com>

Looks good.
Lois

On 3/19/2020 3:46 PM, coleen.phillimore at oracle.com wrote:
> Summary: remove unused code that is changing in Hotspot for hidden 
> classes.
>
> Ran tier1-3 tests. See bug for more details.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>
> Thanks,
> Coleen


From coleen.phillimore at oracle.com  Thu Mar 19 21:06:35 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 19 Mar 2020 17:06:35 -0400
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field
 is unused in the SA
In-Reply-To: <fa18776d-fe6f-6b93-e322-e6695587944c@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
 <fa18776d-fe6f-6b93-e322-e6695587944c@oracle.com>
Message-ID: <0a68d9e5-8500-3158-123c-e4253f07129f@oracle.com>

Thanks Lois!
Coleen

On 3/19/20 4:12 PM, Lois Foltan wrote:
> Looks good.
> Lois
>
> On 3/19/2020 3:46 PM, coleen.phillimore at oracle.com wrote:
>> Summary: remove unused code that is changing in Hotspot for hidden 
>> classes.
>>
>> Ran tier1-3 tests. See bug for more details.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>>
>> Thanks,
>> Coleen
>


From david.holmes at oracle.com  Thu Mar 19 22:43:08 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 20 Mar 2020 08:43:08 +1000
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field
 is unused in the SA
In-Reply-To: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
Message-ID: <a562a190-eb44-b556-166c-460ed991a691@oracle.com>

Hi Coleen,

On 20/03/2020 5:46 am, coleen.phillimore at oracle.com wrote:
> Summary: remove unused code that is changing in Hotspot for hidden classes.

I'm not sure how to identify unused code in the SA given that it exposes 
a Java API for querying the JVM internals. You say 
getisUnsafeAnonymous() is unused because nothing in the SA calls it. But 
the same would seem to be true for other parts of the CLD API - for example

- ClassLoaderData::dictionary() is called from
   - ClassLoaderData::allEntriesDo, is called from
     - ClassLoaderDataGraph::allEntriesDo, is called from
       - nowhere ???

David
-----

> Ran tier1-3 tests. See bug for more details.
> 
> open webrev at http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
> 
> Thanks,
> Coleen

From serguei.spitsyn at oracle.com  Fri Mar 20 01:03:10 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Mar 2020 18:03:10 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>
 <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>
 <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com>
 <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
Message-ID: <240d6e82-d229-8a2d-6be2-3042a1537c11@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200319/7fd24cb2/attachment.htm>

From leonid.mesnik at oracle.com  Fri Mar 20 01:10:28 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Thu, 19 Mar 2020 18:10:28 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <240d6e82-d229-8a2d-6be2-3042a1537c11@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>
 <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>
 <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com>
 <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
 <240d6e82-d229-8a2d-6be2-3042a1537c11@oracle.com>
Message-ID: <E18735A7-ACB8-4AB3-8170-99CA1FF7DD70@oracle.com>

Hi

Thank you for review and feedback. See my comments inline.

> On Mar 19, 2020, at 6:03 PM, serguei.spitsyn at oracle.com wrote:
> 
> Hi Leonid,
> 
> It looks good in general.
> Just a couple of comments.
> 
> 
> http://cr.openjdk.java.net/~lmesnik/8241123/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/share/Wicket.java.frames.html
>  168     public int waitFor(long timeout) {
>  169         if (timeout < 0)
>  170             throw new IllegalArgumentException(
>  171                     "timeout value is negative: " + timeout);
>  172 
>  173         long id = System.currentTimeMillis();
>  174 
>  175         try {
>  176             lock.lock();
>  177             --waiters;
>  178             if (debugOutput != null) {
>  179                 debugOutput.printf("Wicket %d %s: waitFor(). There are %d waiters totally now.\n", id, name, waiters);
>  180             }
>  181 
>  182             long waitTime = timeout;
>  183             long startTime = System.currentTimeMillis();
>  184 
>  185             while (count > 0  && waitTime > 0) {
>  186                 try {
>  187                     condition.await(waitTime, TimeUnit.MILLISECONDS);
>  188                 } catch (InterruptedException e) {
>  189                 }
>  190                 waitTime = timeout - (System.currentTimeMillis() - startTime);
>  191             }
>  192             --waiters;
>  193             return count;
>  194         } finally {
>  195             lock.unlock();
>  196         }
>  197     }
> 
>  The waiters probably needs to be incremented instead of decremented at line:
>         177             --waiters;
Thank you, fixed.
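
For reference, a minimal sketch of what the corrected method could look like, assuming the only change is the increment at line 177. WicketSketch is a simplified, hypothetical stand-in for nsk.share.Wicket (no debug output, and unlock()/getWaiters() are reduced to the bare minimum), so it is not the code in the webrev. The sketch also takes the lock before the try block, which is the usual idiom and a deviation from the quoted listing.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Simplified, hypothetical stand-in for nsk.share.Wicket; not the webrev code.
class WicketSketch {
    private final Lock lock = new ReentrantLock();
    private final Condition condition = lock.newCondition();
    private int count;     // guarded by lock
    private int waiters;   // guarded by lock

    WicketSketch(int count) {
        this.count = count;
    }

    // Waits until count drops to zero or the timeout (in ms) expires; returns count.
    int waitFor(long timeout) {
        if (timeout < 0)
            throw new IllegalArgumentException("timeout value is negative: " + timeout);
        lock.lock();
        try {
            ++waiters;                       // the fix: increment on entry
            long waitTime = timeout;
            long startTime = System.currentTimeMillis();
            while (count > 0 && waitTime > 0) {
                try {
                    condition.await(waitTime, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    // ignored as in the quoted code; the loop re-checks the condition
                }
                waitTime = timeout - (System.currentTimeMillis() - startTime);
            }
            --waiters;                       // decrement on exit
            return count;
        } finally {
            lock.unlock();
        }
    }

    // Opens one lock of the wicket and wakes up all waiters.
    void unlock() {
        lock.lock();
        try {
            if (count > 0) {
                count--;
                condition.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    int getWaiters() {
        lock.lock();
        try {
            return waiters;
        } finally {
            lock.unlock();
        }
    }
}
```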
> 
> http://cr.openjdk.java.net/~lmesnik/8241123/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/share/runner/ThreadsRunner.java.udiff.html
>          private void waitForOtherThreads() {
>              if (shouldWait) {
>                  shouldWait = false;
> -                finished.unlock();
> -                finished.waitFor();
> +                finished.decrementAndGet();
> +                while (finished.get() != 0) {
> +                    try {
> +                        Thread.sleep(1000);
> +                    } catch (InterruptedException ie) {
> +                    }
> +                }
>              } else {
>                  throw new TestBug("Waiting a second time is not premitted");
>              }
>          }
> 
>  Should we use a shorter sleep, something like Thread.sleep(100)?
> 
These tests run for 30 or 60 seconds by default now, so sleeping 1 sec doesn't increase the overall time. But I am fine with changing it to 100; that should also work fine.
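
A minimal sketch of the polling approach with a configurable sleep, assuming the same decrement-then-poll logic as in the diff above. FinishBarrierSketch, arriveAndWait() and remaining() are hypothetical names used only for illustration; they are not part of ThreadsRunner.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of the AtomicInteger/sleep polling; not the webrev code.
class FinishBarrierSketch {
    private final AtomicInteger finished;
    private final long pollMillis;

    FinishBarrierSketch(int threads, long pollMillis) {
        this.finished = new AtomicInteger(threads);
        this.pollMillis = pollMillis;
    }

    // Each thread announces that it is done, then polls until all threads are done.
    void arriveAndWait() {
        finished.decrementAndGet();
        while (finished.get() != 0) {
            try {
                Thread.sleep(pollMillis);   // e.g. 100 instead of 1000
            } catch (InterruptedException ie) {
                // ignored as in the diff; keep polling
            }
        }
    }

    int remaining() {
        return finished.get();
    }
}
```

With a 100 ms poll interval the last waiter notices completion at most about 100 ms late, compared with up to a second for the 1000 ms version.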

Leonid

> 
> Thanks,
> Serguei
> 
> 
> On 3/18/20 15:18, Leonid Mesnik wrote:
>> 
>> On 3/18/20 2:30 PM, Igor Ignatyev wrote:
>>>> I need more time to get a grasp of Wicket and your changes in it; I will come back to you after I understand them. 
>>> ok, now that I believe I have enough understanding of Wicket, I have a few comments:
>>> 1.
>>>>   68     private Lock lock = new ReentrantLock();
>>>>   69     private Condition condition = lock.newCondition();
>>> it's better to make these fields final.
>>> 
>>> 2. as all writes and reads of Wicket::count are guarded by lock.lock, there is no need for it to be atomic.
>>> 3. adding a lock to getWaiters will also remove the need for Wicket::waiters to be atomic.
>> All 3 are fixed. Thanks for your suggestions.
>> 
>> Updated version:
>> 
>> http://cr.openjdk.java.net/~lmesnik/8241123/webrev.01/
>> Leonid
>> 
>>> 
>>> the rest looks good to me.
>>> 
>>> Thanks,
>>> -- Igor
>>> 
>>> 
>>> 
>>>> On Mar 18, 2020, at 12:48 PM, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
>>>> 
>>>> Hi Leonid,
>>>> 
>>>> I've started looking at your webrev, and so far have a couple questions:
>>>> 
>>>>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to not use Wicket. (The lock has a reference to the thread, which affects the test.)
>>>> can't you use just a volatile boolean field?
>>>> 
>>>>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happen in stress GC tests.
>>>> won't j.u.c.CountDownLatch be a more appropriate and cleaner solution here?
>>>> 
>>>> I need more time to get a grasp of Wicket and your changes in it; I will come back to you after I understand them. 
>>>> 
>>>> -- Igor
>>>> 
>>>>> On Mar 18, 2020, at 12:37 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>>>>> 
>>>>> Hi
>>>>> 
>>>>> Could you please review the following fix, which slightly refactors the vmTestbase stress test harness? This refactoring helps to add virtual thread testing support.
>>>>> 
>>>>> Wicket uses the plain sync/wait/notify mechanism, which causes carrier thread starvation and should not be used in virtual threads. ManagedThread is a subclass of Thread, so it can't be a virtual thread.
>>>>> 
>>>>> 
>>>>> The fix changes Wicket to use locks/conditions so that a vthread is not pinned to its carrier thread while starting testing.
>>>>> 
>>>>> ManagedThread is fixed to keep the execution thread as the thread variable and isolate its creation.
>>>>> 
>>>>> Test vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects003/referringObjects003a.java was updated to not use Wicket. (The lock has a reference to the thread, which affects the test.)
>>>>> 
>>>>> Wicket "finished" in class ThreadsRunner was changed to atomicInt/sleep to avoid OOME in j.u.c.l.Condition::await() which might happen in stress GC tests.
>>>>> 
>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241123/webrev.00/
>>>>> 
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241123
>>>>> 
>>>>> 
>>>>> Leonid
>>>>> 
>>>> 
>>> 
> 


From serguei.spitsyn at oracle.com  Fri Mar 20 02:05:24 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Mar 2020 19:05:24 -0700
Subject: RFR: 8241123: Refactor vmTestbase stress framework to use j.u.c
 and make creation of threads more flexible
In-Reply-To: <E18735A7-ACB8-4AB3-8170-99CA1FF7DD70@oracle.com>
References: <0d73d306-2eff-375c-65e1-67142b2c6c59@oracle.com>
 <FF8B4FCD-C583-41AF-AFE7-7D7411D4A4AC@oracle.com>
 <4E0F364A-47F3-428D-9C08-6B1ADFCB9D24@oracle.com>
 <504b0902-9fd1-ea8c-399a-185a4ceaa9e0@oracle.com>
 <240d6e82-d229-8a2d-6be2-3042a1537c11@oracle.com>
 <E18735A7-ACB8-4AB3-8170-99CA1FF7DD70@oracle.com>
Message-ID: <afd2d820-82b2-6869-7622-d49b7736590c@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200319/a0e4ee57/attachment.htm>

From serguei.spitsyn at oracle.com  Fri Mar 20 02:32:37 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Mar 2020 19:32:37 -0700
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field
 is unused in the SA
In-Reply-To: <fa18776d-fe6f-6b93-e322-e6695587944c@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
 <fa18776d-fe6f-6b93-e322-e6695587944c@oracle.com>
Message-ID: <c26d1ec7-8e64-959f-eb1d-ea2acc14a104@oracle.com>

+1

Thanks,
Serguei

On 3/19/20 13:12, Lois Foltan wrote:
> Looks good.
> Lois
>
> On 3/19/2020 3:46 PM, coleen.phillimore at oracle.com wrote:
>> Summary: remove unused code that is changing in Hotspot for hidden 
>> classes.
>>
>> Ran tier1-3 tests. See bug for more details.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>>
>> Thanks,
>> Coleen
>


From chris.plummer at oracle.com  Fri Mar 20 03:25:45 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Mar 2020 20:25:45 -0700
Subject: RFR(XS) 8241335: ProblemList serviceability/sa/ClhsdbPstack.java due
 to JDK-8240956
Message-ID: <6f4dcbfb-9174-44f4-1e1a-3e678778807e@oracle.com>

Hello,

Please review the following:

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -115,7 +115,7 @@
 serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
 serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
 serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
-serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
+serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
 serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
 serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
 serviceability/sa/ClhsdbSource.java 8193639 solaris-all

I'm still waiting for a tier1 run to make sure the test isn't run on 
linux. I'll push once that is done if I've had a review by then.

thanks,

Chris


From mikael.vidstedt at oracle.com  Fri Mar 20 03:27:31 2020
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Thu, 19 Mar 2020 20:27:31 -0700
Subject: RFR(XS) 8241335: ProblemList serviceability/sa/ClhsdbPstack.java
 due to JDK-8240956
In-Reply-To: <6f4dcbfb-9174-44f4-1e1a-3e678778807e@oracle.com>
References: <6f4dcbfb-9174-44f4-1e1a-3e678778807e@oracle.com>
Message-ID: <FC3C1A67-A4CB-415F-810B-0CA903ACBE04@oracle.com>


Looks good, thanks for doing this!

Cheers,
Mikael

> On Mar 19, 2020, at 8:25 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
> 
> Hello,
> 
> Please review the following:
> 
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -115,7 +115,7 @@
>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
> 
> I'm still waiting for a tier1 run to make sure the test isn't run on linux. I'll push once that is done if I've had a review by then.
> 
> thanks,
> 
> Chris
> 


From serguei.spitsyn at oracle.com  Fri Mar 20 04:41:35 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Mar 2020 21:41:35 -0700
Subject: RFR(XS) 8241335: ProblemList serviceability/sa/ClhsdbPstack.java
 due to JDK-8240956
In-Reply-To: <FC3C1A67-A4CB-415F-810B-0CA903ACBE04@oracle.com>
References: <6f4dcbfb-9174-44f4-1e1a-3e678778807e@oracle.com>
 <FC3C1A67-A4CB-415F-810B-0CA903ACBE04@oracle.com>
Message-ID: <ecb9a340-f51e-0d91-c074-b250b0db8073@oracle.com>

+1

Thanks,
Serguei

On 3/19/20 20:27, Mikael Vidstedt wrote:
> Looks good, thanks for doing this!
>
> Cheers,
> Mikael
>
>> On Mar 19, 2020, at 8:25 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
>>
>> Hello,
>>
>> Please review the following:
>>
>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt
>> +++ b/test/hotspot/jtreg/ProblemList.txt
>> @@ -115,7 +115,7 @@
>>   serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>   serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>   serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>   serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>   serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>   serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>
>> I'm still waiting for a tier1 run to make sure the test isn't run on linux. I'll push once that is done if I've had a review by then.
>>
>> thanks,
>>
>> Chris
>>


From chris.plummer at oracle.com  Fri Mar 20 04:55:34 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Mar 2020 21:55:34 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
Message-ID: <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>

Hi Yasumasa,

The test has been problem listed so please add undoing this to your 
webrev. Here's the diff that problem listed it:

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -115,7 +115,7 @@
 serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
 serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
 serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
-serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
+serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
 serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
 serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
 serviceability/sa/ClhsdbSource.java 8193639 solaris-all

thanks,

Chris

On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
> Hi all,
>
> This webrev has passed submit repo 
> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional 
> tests.
> So please review it:
>
>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>> Thank you so much, David!
>>
>> Yasumasa
>>
>>
>> On 2020/03/16 21:01, David Holmes wrote:
>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>> I missed a loop condition, so I fixed it and pushed to submit repo.
>>>>> Could you try again?
>>>>>
>>>>>    http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>
>>>>> webrev is here:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>
>>>> Test job resubmitted. Will advise results if it completes before I 
>>>> go to bed :)
>>>
>>> Seems to have passed okay.
>>>
>>> David
>>>
>>>> David
>>>>
>>>>>
>>>>> Thanks a lot!
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>> Sorry it is still crashing.
>>>>>>
>>>>>> #
>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>> #
>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>> #
>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug 
>>>>>> build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, 
>>>>>> sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>> # Problematic frame:
>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned 
>>>>>> long)+0x4e
>>>>>> #
>>>>>>
>>>>>> Same as before.
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and 
>>>>>>>>> run additional internal tests (and even more builds) using 
>>>>>>>>> that job.
>>>>>>>
>>>>>>> Thanks for that tip Chris!
>>>>>>>
>>>>>>>> I've pushed the change to submit repo, but I've not yet 
>>>>>>>> received the result.
>>>>>>>> I will share it with you when I get the job ID.
>>>>>>>
>>>>>>> We can see the id. Just need to wait for the builds to complete 
>>>>>>> before submitting the additional tests.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi David,
>>>>>>>>>>
>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>
>>>>>>>>>> I updated the webrev to avoid bailing out to the Java frame when DWARF 
>>>>>>>>>> has a language personality routine or LSDA.
>>>>>>>>>> Could you try it?
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>
>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7.
>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>
>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>> Correction ...
>>>>>>>>>>>
>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I can't review this as I know nothing about the code, but 
>>>>>>>>>>>>> I'm putting the patch through our internal testing.
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>
>>>>>>>>>>>> #
>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime 
>>>>>>>>>>>> Environment:
>>>>>>>>>>>> #
>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, 
>>>>>>>>>>>> tid=16949
>>>>>>>>>>>> #
>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) 
>>>>>>>>>>>> (fastdebug build 
>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, 
>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 gc, 
>>>>>>>>>>>> linux-amd64)
>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>> # C  [libsaproc.so+0x494e] 
>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>
>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>>>
>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the 
>>>>>>>>>>> test in linux-x64. I don't see a pattern as to where it 
>>>>>>>>>>> fails versus passes.
>>>>>>>>>>>
>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>    webrev: 
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding 
>>>>>>>>>>>>>> native frames in jstack mixed mode.
>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>>>>>>    B: Language personality routine and Language Specific 
>>>>>>>>>>>>>> Data Area (LSDA) are not considered
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore 
>>>>>>>>>>>>>> the personality routine and LSDA in this webrev.
>>>>>>>>>>>>>> I also added bailout code for the case where DWARF processing 
>>>>>>>>>>>>>> fails due to these concerns.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 
>>>>>>>>>>>>>> container.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>


From suenaga at oss.nttdata.com  Fri Mar 20 09:45:19 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 20 Mar 2020 18:45:19 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
Message-ID: <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>

Hi Chris,

I uploaded a new webrev, which also reverts the ProblemList change:

   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/

I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
but it failed in ClhsdbJstackXcompStress.java.
However, I think the failure is not caused by this change: ClhsdbJstackXcompStress.java tests `jhsdb jstack` without mixed mode, so it would not parse DWARF.


Thanks,

Yasumasa


On 2020/03/20 13:55, Chris Plummer wrote:
> Hi Yasumasa,
> 
> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it:
> 
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -115,7 +115,7 @@
>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
> 
> thanks,
> 
> Chris
> 
> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
>> So please review it:
>>
>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>> Thank you so much, David!
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/16 21:01, David Holmes wrote:
>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> I missed loop condition, so I fixed it and pushed to submit repo.
>>>>>> Could you try again?
>>>>>>
>>>>>>    http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>
>>>>>> webrev is here:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>
>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :)
>>>>
>>>> Seems to have passed okay.
>>>>
>>>> David
>>>>
>>>>> David
>>>>>
>>>>>>
>>>>>> Thanks a lot!
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>> Sorry it is still crashing.
>>>>>>>
>>>>>>> #
>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>> #
>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>> #
>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>> # Problematic frame:
>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>> #
>>>>>>>
>>>>>>> Same as before.
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>>>>
>>>>>>>> Thanks for that tip Chris!
>>>>>>>>
>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result.
>>>>>>>>> I will share you when I get job ID.
>>>>>>>>
>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>
>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>>>>>>>> Could you try it?
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>
>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>
>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>> #
>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>
>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>>>>
>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>>>>
>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>>>>>>>    B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>>>>>>>>>>>>> I also added bailout code for the case where DWARF processing fails due to these concerns.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
> 
> 

From coleen.phillimore at oracle.com  Fri Mar 20 11:25:57 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 20 Mar 2020 07:25:57 -0400
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field
 is unused in the SA
In-Reply-To: <c26d1ec7-8e64-959f-eb1d-ea2acc14a104@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
 <fa18776d-fe6f-6b93-e322-e6695587944c@oracle.com>
 <c26d1ec7-8e64-959f-eb1d-ea2acc14a104@oracle.com>
Message-ID: <4f42fa70-77b8-76f7-85af-57ad15419812@oracle.com>

Thanks Serguei!
Coleen

On 3/19/20 10:32 PM, serguei.spitsyn at oracle.com wrote:
> +1
>
> Thanks,
> Serguei
>
> On 3/19/20 13:12, Lois Foltan wrote:
>> Looks good.
>> Lois
>>
>> On 3/19/2020 3:46 PM, coleen.phillimore at oracle.com wrote:
>>> Summary: remove unused code that is changing in Hotspot for hidden 
>>> classes.
>>>
>>> Ran tier1-3 tests.  See bug for more details.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>>>
>>> Thanks,
>>> Coleen
>>
>


From coleen.phillimore at oracle.com  Fri Mar 20 11:28:26 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 20 Mar 2020 07:28:26 -0400
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field
 is unused in the SA
In-Reply-To: <a562a190-eb44-b556-166c-460ed991a691@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
 <a562a190-eb44-b556-166c-460ed991a691@oracle.com>
Message-ID: <496b7e2c-09f3-913c-df19-dd8a475e4b67@oracle.com>


On 3/19/20 6:43 PM, David Holmes wrote:
> Hi Coleen,
>
> On 20/03/2020 5:46 am, coleen.phillimore at oracle.com wrote:
>> Summary: remove unused code that is changing in Hotspot for hidden 
>> classes.
>
> I'm not sure how to identify unused code in the SA given that it 
> exposes a Java API for querying the JVM internals. You say 
> getisUnsafeAnonymous() is unused because nothing in the SA calls it. 
> But the same would seem to be true for other parts of the CLD API - 
> for example
>
> - ClassLoaderData::dictionary() is called from
>   - ClassLoaderData::allEntriesDo, is called from
>     - ClassLoaderDataGraph::allEntriesDo, is called from
>       - nowhere ???

Actually I had a look at that too because, of course, I was trying to 
remove more.  I think there is a caller for that:

utilities/soql/sa.js: 
sa.sysDict["allEntriesDo(sun.jvm.hotspot.classfile.ClassLoaderDataGraph.ClassAndLoaderVisitor)"](visitor);

But I don't know what the JavaScript interface to the SA is, so I thought 
I'd leave it for now.  It might actually be useful, theoretically.

Thanks,
Coleen

>
> David
> -----
>
>> Ran tier1-3 tests.  See bug for more details.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>>
>> Thanks,
>> Coleen


From serguei.spitsyn at oracle.com  Fri Mar 20 15:11:53 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 20 Mar 2020 08:11:53 -0700
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field
 is unused in the SA
In-Reply-To: <496b7e2c-09f3-913c-df19-dd8a475e4b67@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
 <a562a190-eb44-b556-166c-460ed991a691@oracle.com>
 <496b7e2c-09f3-913c-df19-dd8a475e4b67@oracle.com>
Message-ID: <3eb26fb1-2292-f800-5245-31edebaa27b2@oracle.com>


On 3/20/20 04:28, coleen.phillimore at oracle.com wrote:
>
>
> On 3/19/20 6:43 PM, David Holmes wrote:
>> Hi Coleen,
>>
>> On 20/03/2020 5:46 am, coleen.phillimore at oracle.com wrote:
>>> Summary: remove unused code that is changing in Hotspot for hidden 
>>> classes.
>>
>> I'm not sure how to identify unused code in the SA given that it 
>> exposes a Java API for querying the JVM internals. You say 
>> getisUnsafeAnonymous() is unused because nothing in the SA calls it. 
>> But the same would seem to be true for other parts of the CLD API - 
>> for example
>>
>> - ClassLoaderData::dictionary() is called from
>>   - ClassLoaderData::allEntriesDo, is called from
>>     - ClassLoaderDataGraph::allEntriesDo, is called from
>>       - nowhere ???
>
> Actually I had a look at that too because, of course, I was trying to 
> remove more.  I think there is a caller for that:
>
> utilities/soql/sa.js: 
> sa.sysDict["allEntriesDo(sun.jvm.hotspot.classfile.ClassLoaderDataGraph.ClassAndLoaderVisitor)"](visitor);
>
> But I don't know what the JavaScript interface to the SA is, so I 
> thought I'd leave it for now.  It might actually be useful, theoretically.

We have a plan to remove the JavaScript support from the SA.
Chris P. investigated this and can probably tell you more.

Thanks,
Serguei

>
> Thanks,
> Coleen
>
>>
>> David
>> -----
>>
>>> Ran tier1-3 tests.  See bug for more details.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>>>
>>> Thanks,
>>> Coleen
>


From rkennke at redhat.com  Fri Mar 20 15:30:24 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 20 Mar 2020 16:30:24 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
Message-ID: <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>

I believe I came up with a much simpler solution that also solves the
problems of the existing one, and the ones I proposed earlier.

It turns out that we can take advantage of the fact that we can use
*anything* as tags in JVMTI, even pointers to stuff (this is explicitly
mentioned in the JVMTI spec). This means we can simply stick a pointer
to the signature of a class into the tag, and pull it out again when we
get notified that the class gets unloaded.

This means we don't need an extra data-structure to keep track of
classes and signatures, and it also makes the story around locking
*much* simpler. Performance-wise this is O(1), i.e. no scanning of all
classes needed (as in the current implementation) and no searching of
table needed (like in my previous attempts).
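
To illustrate the idea only (this is a minimal sketch, not the code from the webrev): JVMTI tags are 64-bit jlong values, so a pointer to a heap-allocated copy of the class signature can be stored in the tag and recovered later from the tag passed to the ObjectFree callback. The sketch below shows just that pointer round trip without depending on jvmti.h. The names tag_t, signature_to_tag and tag_to_signature are hypothetical stand-ins, and plain malloc/free is assumed in place of the agent's own allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for JVMTI's 64-bit jlong tag type. */
typedef int64_t tag_t;

/*
 * Copy the signature and encode the address of the copy as a tag.
 * Returns 0 (which JVMTI treats as "not tagged") if allocation fails.
 */
tag_t signature_to_tag(const char *signature)
{
    size_t len = strlen(signature) + 1;
    char *copy = (char *)malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    /* The round trip is only lossless if a pointer fits into the tag. */
    assert(sizeof(void *) <= sizeof(tag_t));
    return (tag_t)(intptr_t)copy;
}

/*
 * Decode a tag back into the signature it carries. In a real agent this
 * would run when the unload notification delivers the tag. The caller
 * takes ownership of the returned string and must free it.
 */
char *tag_to_signature(tag_t tag)
{
    return (char *)(intptr_t)tag;
}
```

With this shape, registering a class costs one allocation plus one tag assignment, and handling an unload is a single cast, which is where the O(1) behaviour comes from. The string has to stay alive until the tag has been decoded, and whoever decodes it is responsible for freeing it.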

Please review this new revision:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/

(Notice that there still appears to be a performance bottleneck with
class-unloading when an actual debugger is attached. This doesn't seem
to be related to the classTrack.c implementation though, but looks like
a consequence of getting all those class-unload notifications over the
wire. My testcase generates 1000s of them, and it's clogging up the
buffers.

I am not sure why jdb always needs to enable the class-unload listener. A
simple hack disables it, and performance is brilliant, even when jdb is
attached:
http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch

But this is not in the scope of this bug.)

Roman


On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
> Sorry, forgot to complete my comments at the end (see below).
> 
> 
> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>> Hi Roman,
>>
>> Thank you for the update and sorry for the latency in review.
>>
>> Some comments are below.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>
>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>   88 {
>> 89 debugMonitorEnter(deletedSignatureLock);
>> 90 if (currentClassTag == -1) {
>> 91 // Class tracking not initialized, nobody's interested
>> 92 debugMonitorExit(deletedSignatureLock);
>> 93 return;
>>   94     }
>> Just a question:
>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>       the class tracking if class tracking has not been initialized?
>>
>> 70 static jlong currentClassTag;
>> I wonder whether a name like lastClassTag or highestClassTag would be
>> better.
>>
>>  99 KlassNode* klass = *klass_ptr;
>> 100
>> 102 while (klass != NULL && klass->klass_tag != tag) {
>> 103 klass_ptr = &klass->next;
>> 104 klass = *klass_ptr;
>> 105 }
>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>> 107 debugMonitorExit(deletedSignatureLock);
>> 108 return;
>> 109 }
>>  It seems to me that something is wrong in the condition at L106 above.
>>  Should it be:
>>     if (klass == NULL || klass->klass_tag != tag)
>>
>>  Otherwise, how can the second check ever work correctly, as the return
>> will always happen when (klass != NULL)?
>>
>>
>> There are several places in this file with the indent:
>> 90 if (currentClassTag == -1) {
>> 91 // Class tracking not initialized, nobody's interested
>> 92 debugMonitorExit(deletedSignatureLock);
>> 93 return;
>>   94     }
>>  ...
>> 152 if (currentClassTag == -1) {
>> 153 // Class tracking not initialized yet, nobody's interested
>> 154 debugMonitorExit(deletedSignatureLock);
>> 155 return;
>>  156     }
>>  ...
>> 161 if (error != JVMTI_ERROR_NONE) {
>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>  163     }
>> 164 if (tag != 0l) {
>> 165 debugMonitorExit(deletedSignatureLock);
>> 166 return; // Already added
>>  167     }
>>  ...
>> 281 cleanDeleted(void *signatureVoid, void *arg)
>> 282 {
>> 283 char* sig = (char*)signatureVoid;
>> 284 jvmtiDeallocate(sig);
>> 285 return JNI_TRUE;
>>  286 }
>>  ...
>>  291 void
>>  292 classTrack_reset(void)
>>  293 {
>> 294 int idx;
>> 295 debugMonitorEnter(deletedSignatureLock);
>> 296
>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>> 298 KlassNode* node = table[idx];
>> 299 while (node != NULL) {
>> 300 KlassNode* next = node->next;
>> 301 jvmtiDeallocate(node->signature);
>> 302 jvmtiDeallocate(node);
>> 303 node = next;
>> 304 }
>> 305 }
>> 306 jvmtiDeallocate(table);
>> 307
>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>> 309 bagDestroyBag(deletedSignatureBag);
>> 310
>> 311 currentClassTag = -1;
>> 312
>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>> 314 trackingEnv = NULL;
>> 315
>> 316 debugMonitorExit(deletedSignatureLock);
>>
>> Could you, please, fix several comments below?
>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>  The comma is not needed.
>>  Would it be better to replace: klass tags => klass_tag's ?
>>
>>
>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>> consistent
>>  Maybe: Lock to guard ... or lock to keep integrity of ...
>>
>> 84 * Callback when classes are freed, Finds the signature and
>> remembers it in deletedSignatureBag.
>>  It would be better to use words like "store" or "record", and "Find"
>>  should not start with a capital letter:
>>    Invoke the callback when classes are freed, find and record the
>>    signature in deletedSignatureBag.
>>
>> 96 // Find deleted KlassNode
>> 133 // Class tracking not initialized, nobody's interested
>> 153 // Class tracking not initialized yet, nobody's interested
>> 158 /* Check this is not a duplicate */
>>  The dot at the end is missing.
>>
>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>  Conversely, the dot is not needed here, as the comment does not start
>>  with a capital letter.
>>
>> 111 // At this point we have the KlassNode corresponding to the tag
>> 112 // in klass, and the pointer to it in klass_node.
> 
>  The comment above could be better. Maybe something like:
>    "At this point, we have found the KlassNode matching the klass tag
>    (and it is linked)."
> 
>> 113 // Remember the unloaded signature.
>  Better: Record the signature of the unloaded class and unlink it.
> 
> Thanks,
> Serguei
> 
>> Thanks,
>> Serguei 
>>
>> On 3/9/20 05:39, Roman Kennke wrote:
>>> Hello all,
>>>
>>> Can I please get reviews of this change? In the meantime, we've done
>>> more testing and also field-/torture-testing by a customer who is happy
>>> now. :-)
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>>> Hi Serguei,
>>>>
>>>> Thanks for reviewing!
>>>>
>>>> I updated the patch to reflect your suggestions, very good!
>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>> _activate() to ensure have those structures after re-connect.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>
>>>> Let me know what you think!
>>>> Roman
>>>>
>>>>> Hi Roman,
>>>>>
>>>>> Thank you for taking care about this scalability issue!
>>>>>
>>>>> I have a couple of quick comments.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>
>>>>> 72 /*
>>>>> 73 * Lock to protect deletedSignatureBag
>>>>> 74 */
>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>> 76
>>>>> 77 /*
>>>>> 78 * A bag containing all the deleted classes' signatures. Must be
>>>>> accessed under
>>>>> 79 * deletedTagLock,
>>>>> 80 */
>>>>> 81 struct bag* deletedSignatureBag;
>>>>>
>>>>>   The comments contradict each other.
>>>>>   I guess the lock name at line 79 has to be deletedSignatureLock
>>>>> instead of deletedTagLock.
>>>>>   Also, the comma at the end must be replaced with a dot.
>>>>>
>>>>>
>>>>> 101 // Tag not found? Ignore.
>>>>> 102 if (klass == NULL) {
>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>> 104 return;
>>>>> 105 }
>>>>>  106 
>>>>> 107 // Scan linked-list.
>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>> 110 klass_ptr = &klass->next;
>>>>> 111 klass = *klass_ptr;
>>>>> 112 found_tag = klass->klass_tag;
>>>>>  113     }
>>>>> 114
>>>>> 115 // Tag not found? Ignore.
>>>>> 116 if (found_tag != tag) {
>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>> 118 return;
>>>>>  119     }
>>>>>
>>>>>
>>>>>  The code above can be simplified, so that lines 101-105 are not
>>>>> needed anymore.
>>>>>  It can be something like this:
>>>>>
>>>>> // Scan linked-list.
>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>> klass_ptr = &klass->next;
>>>>> klass = *klass_ptr;
>>>>>      }
>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>> debugMonitorExit(deletedSignatureLock);
>>>>> return;
>>>>>      }
>>>>>
>>>>> It will take more time when I get a chance to look at the rest.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>> Here comes an update that resolves some races that happen when
>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>> basically every operation, and also need to check whether or not
>>>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>>>> list) when we're not.
>>>>>>
>>>>>> Updated webrev:
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>> So, here comes the O(1) implementation:
>>>>>>>
>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>> - Prepared classes are kept in a datastructure that is a table, which
>>>>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>>>>> indexed by tag % slot-count, and then simply prepend the new KlassNode*.
>>>>>>> This is O(1) operation.
>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation
>>>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>>>> but not usually more. It should be ok.
>>>>>>> - when processUnloads() gets called, we simply hand out that bag, and
>>>>>>> allocate a new one.
>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or
>>>>>>> re-attached (was missing before).
>>>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>>>> before).
>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>>>> in the future?
>>>>>>>
>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>> The bottleneck now is clearly actual synthesizing the class-unload
>>>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>
>>>>>>> Updated webrev:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>
>>>>>>> Please let me know what you think of it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>>>
>>>>>>>> Thanks, Roman
>>>>>>>>
>>>>>>>>> Hi Chris,
>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>> Sure.
>>>>>>>>>
>>>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>
>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>>>>> and then add new classes whenever a class gets loaded. When unloading
>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>> complexity.
>>>>>>>>>
>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>>>>>>>
>>>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>
>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>>>> worth the effort).
>>>>>>>>>
>>>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>> Hello all,
>>>>>>>>>>>
>>>>>>>>>>> Issue:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>
>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>
>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>
>>>>>>>>>>> Webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>
>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>
>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>
>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>
>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>
> 


From chris.plummer at oracle.com  Fri Mar 20 19:23:19 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Mar 2020 12:23:19 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
Message-ID: <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>

Hi Yasumasa,

The failure is due to JDK-8231634, so not something you need to worry about.

thanks,

Chris

On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> I uploaded a new webrev which also reverts the ProblemList change:
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>
> I tested it on submit repo
> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
> but it failed in ClhsdbJstackXcompStress.java.
> However, I think that is not caused by this change:
> ClhsdbJstackXcompStress.java tests `jhsdb jstack` without mixed mode, so it
> would not parse DWARF.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/03/20 13:55, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> The test has been problem listed so please add undoing this to your 
>> webrev. Here's the diff that problem listed it:
>>
>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt
>> +++ b/test/hotspot/jtreg/ProblemList.txt
>> @@ -115,7 +115,7 @@
>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>
>> thanks,
>>
>> Chris
>>
>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> This webrev has passed submit repo 
>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and 
>>> additional tests.
>>> So please review it:
>>>
>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>> Thank you so much, David!
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> I missed a loop condition, so I fixed it and pushed to submit repo.
>>>>>>> Could you try again?
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>
>>>>>>> webrev is here:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>
>>>>>> Test job resubmitted. Will advise results if it completes before 
>>>>>> I go to bed :)
>>>>>
>>>>> Seems to have passed okay.
>>>>>
>>>>> David
>>>>>
>>>>>> David
>>>>>>
>>>>>>>
>>>>>>> Thanks a lot!
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>> Sorry it is still crashing.
>>>>>>>>
>>>>>>>> #
>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>> #
>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>> #
>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>> # Problematic frame:
>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>> #
>>>>>>>>
>>>>>>>> Same as before.
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and 
>>>>>>>>>>> run additional internal tests (and even more builds) using 
>>>>>>>>>>> that job.
>>>>>>>>>
>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>
>>>>>>>>>> I've pushed the change to submit repo, but I've not yet 
>>>>>>>>>> received the result.
>>>>>>>>>> I will share the job ID with you when I get it.
>>>>>>>>>
>>>>>>>>> We can see the id. Just need to wait for the builds to 
>>>>>>>>> complete before submitting the additional tests.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>
>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF 
>>>>>>>>>>>> has language personality routine or LSDA.
>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>>>
>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>
>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, 
>>>>>>>>>>>>>>> but I'm putting the patch through our internal testing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> #
>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>> #
>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>>> #
>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash 
>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the 
>>>>>>>>>>>>> test in linux-x64. I don't see a pattern as to where it 
>>>>>>>>>>>>> fails versus passes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding
>>>>>>>>>>>>>>>> native frames in jstack mixed mode.
>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>>>>>>>>    B: Language personality routine and Language
>>>>>>>>>>>>>>>> Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the
>>>>>>>>>>>>>>>> personality routine and LSDA in this webrev.
>>>>>>>>>>>>>>>> I also added bailout code for when DWARF processing fails
>>>>>>>>>>>>>>>> due to these concerns.
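
To illustrate concern A, here is a minimal sketch of a range-checked read, assuming the .eh_frame bytes are held in a buffer with a known end. It is not the code from the webrev: the eh_cursor type and read_uleb128() are hypothetical names chosen for this example. ULEB128 is the variable-length integer encoding used for several CIE/FDE fields, so a truncated or misparsed section can otherwise run the reader past the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a copy of the .eh_frame bytes (not the SA's type). */
typedef struct {
    const uint8_t *pos;  /* next byte to read */
    const uint8_t *end;  /* one past the last valid byte */
} eh_cursor;

/*
 * Decode one ULEB128 value at the cursor.
 * Returns false, without reading past 'end', if the value is truncated.
 */
static bool read_uleb128(eh_cursor *c, uint64_t *out) {
    assert(c != NULL && out != NULL);
    uint64_t result = 0;
    unsigned shift = 0;
    while (c->pos < c->end) {            /* the range check */
        uint8_t b = *c->pos++;
        if (shift < 64) {                /* ignore bits beyond 64 */
            result |= (uint64_t)(b & 0x7f) << shift;
        }
        shift += 7;
        if ((b & 0x80) == 0) {           /* high bit clear: last byte */
            *out = result;
            return true;
        }
    }
    return false;  /* ran out of data: caller should bail out */
}
```

A caller would treat a false return as "DWARF processing failed" and fall back to whatever bailout path it has, rather than continuing with a garbage value.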
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 
>>>>>>>>>>>>>>>> 7.7 container.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>
>>


From coleen.phillimore at oracle.com  Fri Mar 20 19:28:36 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 20 Mar 2020 15:28:36 -0400
Subject: RFR (T) 8241320: The ClassLoaderData::_is_unsafe_anonymous field
 is unused in the SA
In-Reply-To: <3eb26fb1-2292-f800-5245-31edebaa27b2@oracle.com>
References: <8c45be74-c33d-c4d3-76ff-98766112b6e7@oracle.com>
 <a562a190-eb44-b556-166c-460ed991a691@oracle.com>
 <496b7e2c-09f3-913c-df19-dd8a475e4b67@oracle.com>
 <3eb26fb1-2292-f800-5245-31edebaa27b2@oracle.com>
Message-ID: <599f317f-2330-248e-d456-3fb506090abf@oracle.com>


On 3/20/20 11:11 AM, serguei.spitsyn at oracle.com wrote:
>
> On 3/20/20 04:28, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 3/19/20 6:43 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 20/03/2020 5:46 am, coleen.phillimore at oracle.com wrote:
>>>> Summary: remove unused code that is changing in Hotspot for hidden 
>>>> classes.
>>>
>>> I'm not sure how to identify unused code in the SA given that it 
>>> exposes a Java API for querying the JVM internals. You say 
>>> getisUnsafeAnonymous() is unused because nothing in the SA calls it. 
>>> But the same would seem to be true for other parts of the CLD API - 
>>> for example
>>>
>>> - ClassLoaderData::dictionary() is called from
>>>   - ClassLoaderData::allEntriesDo, is called from
>>>     - ClassLoaderDataGraph::allEntriesDo, is called from
>>>       - nowhere ???
>>
>> Actually I had a look at that too because, of course, I was trying to 
>> remove more.  I think there is a caller for that:
>>
>> utilities/soql/sa.js: 
>> sa.sysDict["allEntriesDo(sun.jvm.hotspot.classfile.ClassLoaderDataGraph.ClassAndLoaderVisitor)"](visitor);
>>
>> But I don't know what the java script interface to SA is.  So I 
>> thought I'd leave it for now.  It might actually be useful 
>> theoretically.
>
I had second thoughts about it being useful from SA.  I think if we 
wanted to see what classes were loaded in the system dictionary for each 
loader, we could write a pretty simple python script from within gdb to 
do so.

Coleen

> We have a plan to remove the java script support from SA.
> Chris P. investigated this and, probably, can tell more.
>
> Thanks,
> Serguei
>
>>
>> Thanks,
>> Coleen
>>
>>>
>>> David
>>> -----
>>>
>>>> Ran tier1-3 tests.  See bug for more details.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2020/8241320.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8241320
>>>>
>>>> Thanks,
>>>> Coleen
>>
>


From chris.plummer at oracle.com  Fri Mar 20 19:52:30 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Mar 2020 12:52:30 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
Message-ID: <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>

On 3/20/20 8:30 AM, Roman Kennke wrote:
> I believe I came up with a much simpler solution that also solves the
> problems of the existing one, and the ones I proposed earlier.
>
> It turns out that we can take advantage of the fact that we can use
> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
> mentioned in the JVMTI spec). This means we can simply stick a pointer
> to the signature of a class into the tag, and pull it out again when we
> get notified that the class gets unloaded.
>
> This means we don't need an extra data-structure to keep track of
> classes and signatures, and it also makes the story around locking
> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
> classes needed (as in the current implementation) and no searching of
> table needed (like in my previous attempts).
>
> Please review this new revision:
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
I'll have a look at this.
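
For anyone following along, the pointer-in-the-tag idea can be sketched roughly as below. This is a standalone illustration and not the code in webrev.06: class_tag stands in for JVMTI's 64-bit jlong, and the two helper names are made up for this example. In the agent, the tag would be attached with JVMTI SetTag and handed back to the ObjectFree callback (cbTrackingObjectFree in the earlier webrevs).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int64_t class_tag;  /* stand-in for JVMTI's jlong tag */

/*
 * Copy the signature and pack the address of the copy into a tag.
 * Returns 0 on allocation failure; JVMTI treats a zero tag as "untagged".
 */
static class_tag tag_from_signature(const char *signature) {
    assert(signature != NULL);
    size_t len = strlen(signature) + 1;
    char *copy = (char *)malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return (class_tag)(intptr_t)copy;
}

/*
 * Recover the signature from a tag reported on unload.
 * The caller takes ownership and must free() it.
 */
static char *signature_from_tag(class_tag tag) {
    return (char *)(intptr_t)tag;
}
```

With this shape there is nothing to look up on unload: the tag itself is the signature pointer, which is where the O(1) claim comes from.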
>
> (Notice that there still appears to be a performance bottleneck with
> class-unloading when an actual debugger is attached. This doesn't seem
> to be related to the classTrack.c implementation though, but looks like
> a consequence of getting all those class-unload notifications over the
> wire. My testcase generates 1000s of them, and it's clogging up the
> buffers.)
At least this is only a one-shot hit when the classes are unloaded, and 
the performance hit is based on the number of classes being unloaded. 
The main issue is happening every GC, and is O(n) where n is the number 
of loaded classes.
> I am not sure why jdb needs to enable class-unload listener always. A
> simple hack disables it, and performance is brilliant, even when jdb is
> attached:
> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
This is JDI, not jdb. It looks like it needs ClassUnload events so it 
can maintain typesBySignature, which is used by public APIs like 
allClasses(). So we have caching of loaded classes both in the debug 
agent and in JDI.

Chris
> But this is not in the scope of this bug.)
>
> Roman
>
>
> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>> Sorry, forgot to complete my comments at the end (see below).
>>
>>
>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>> Hi Roman,
>>>
>>> Thank you for the update and sorry for the latency in review.
>>>
>>> Some comments are below.
>>>
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>
>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>    88 {
>>> 89 debugMonitorEnter(deletedSignatureLock);
>>> 90 if (currentClassTag == -1) {
>>> 91 // Class tracking not initialized, nobody's interested
>>> 92 debugMonitorExit(deletedSignatureLock);
>>> 93 return;
>>>    94     }
>>> Just a question:
>>>    Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>        the class tracking if class tracking has not been initialized?
>>>
>>> 70 static jlong currentClassTag;
>>>  I wonder whether a name like lastClassTag or highestClassTag would be
>>>  better.
>>>
>>>  99 KlassNode* klass = *klass_ptr;
>>> 100
>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>> 103 klass_ptr = &klass->next;
>>> 104 klass = *klass_ptr;
>>> 105 }
>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>> 107 debugMonitorExit(deletedSignatureLock);
>>> 108 return;
>>> 109 }
>>>  It seems to me, something is wrong in the condition at L106 above.
>>>  Should it be:
>>>     if (klass == NULL || klass->klass_tag != tag)
>>>
>>>  Otherwise, how can the second check ever work correctly as the return
>>> will always happen when (klass != NULL)?
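
A small standalone sketch of the scan with the corrected not-found test, for reference. The KlassNode here is a cut-down stand-in (int64_t instead of jlong, no signature field) and find_link() is a hypothetical helper, not a function from classTrack.c. With `klass != NULL ||` the early return fires whenever a node is found, and a NULL klass gets dereferenced when none is; after the loop, klass is either NULL or the match, so the NULL test alone is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct KlassNode {
    int64_t klass_tag;          /* stand-in for jlong */
    struct KlassNode *next;
} KlassNode;

/*
 * Walk one chain and return the address of the link that points at the
 * node carrying 'tag', or NULL if no node matches. Returning the link
 * (not the node) lets the caller unlink with a single assignment.
 */
static KlassNode **find_link(KlassNode **head, int64_t tag) {
    assert(head != NULL);
    KlassNode **klass_ptr = head;
    KlassNode *klass = *klass_ptr;
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    /* Loop exit means: klass == NULL (not found) or tags match (found). */
    if (klass == NULL) {
        return NULL;
    }
    return klass_ptr;
}
```

Unlinking the found node is then `*link = (*link)->next;`.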
>>>
>>>   
>>> There are several places in this file with inconsistent indentation:
>>> 90 if (currentClassTag == -1) {
>>> 91 // Class tracking not initialized, nobody's interested
>>> 92 debugMonitorExit(deletedSignatureLock);
>>> 93 return;
>>>    94     }
>>>   ...
>>> 152 if (currentClassTag == -1) {
>>> 153 // Class tracking not initialized yet, nobody's interested
>>> 154 debugMonitorExit(deletedSignatureLock);
>>> 155 return;
>>>   156     }
>>>   ...
>>> 161 if (error != JVMTI_ERROR_NONE) {
>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>   163     }
>>> 164 if (tag != 0l) {
>>> 165 debugMonitorExit(deletedSignatureLock);
>>> 166 return; // Already added
>>>   167     }
>>>   ...
>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>> 282 {
>>> 283 char* sig = (char*)signatureVoid;
>>> 284 jvmtiDeallocate(sig);
>>> 285 return JNI_TRUE;
>>>   286 }
>>>   ...
>>>   291 void
>>>   292 classTrack_reset(void)
>>>   293 {
>>> 294 int idx;
>>> 295 debugMonitorEnter(deletedSignatureLock);
>>> 296
>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>> 298 KlassNode* node = table[idx];
>>> 299 while (node != NULL) {
>>> 300 KlassNode* next = node->next;
>>> 301 jvmtiDeallocate(node->signature);
>>> 302 jvmtiDeallocate(node);
>>> 303 node = next;
>>> 304 }
>>> 305 }
>>> 306 jvmtiDeallocate(table);
>>> 307
>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>> 309 bagDestroyBag(deletedSignatureBag);
>>> 310
>>> 311 currentClassTag = -1;
>>> 312
>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>> 314 trackingEnv = NULL;
>>> 315
>>> 316 debugMonitorExit(deletedSignatureLock);
>>>
>>> Could you please fix several comments below?
>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>   The comma is not needed.
>>>   Would it be better to replace: klass tags => klass_tag's ?
>>>
>>>
>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>>> consistent
>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>
>>> 84 * Callback when classes are freed, Finds the signature and
>>> remembers it in deletedSignatureBag.
>>>   It would be better to use words like "store" or "record", and "Find"
>>>   should not start with a capital letter:
>>>     Invoke the callback when classes are freed, find and record the
>>>     signature in deletedSignatureBag.
>>>
>>> 96 // Find deleted KlassNode
>>> 133 // Class tracking not initialized, nobody's interested
>>> 153 // Class tracking not initialized yet, nobody's interested
>>> 158 /* Check this is not a duplicate */
>>>   A period is missing at the end of each of these comments.
>>>
>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>   Conversely, the period is not needed here, as the comment does not
>>>   start with a capital letter.
>>>
>>> 111 // At this point we have the KlassNode corresponding to the tag
>>> 112 // in klass, and the pointer to it in klass_node.
>>  The comment above could be better. Maybe something like:
>>    "At this point, we found the KlassNode matching the klass tag (and it is
>>    linked)."
>>
>>> 113 // Remember the unloaded signature.
>>  Better: Record the signature of the unloaded class and unlink it.
>>
>> Thanks,
>> Serguei
>>
>>> Thanks,
>>> Serguei
>>>
>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>> Hello all,
>>>>
>>>> Can I please get reviews of this change? In the meantime, we've done
>>>> more testing and also field-/torture-testing by a customer who is happy
>>>> now. :-)
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>
>>>>> Hi Serguei,
>>>>>
>>>>> Thanks for reviewing!
>>>>>
>>>>> I updated the patch to reflect your suggestions, very good!
>>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>>> namely moving the setup of the trackingEnv and deletedSignatureBag to
>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>
>>>>> Let me know what you think!
>>>>> Roman
>>>>>
>>>>>> Hi Roman,
>>>>>>
>>>>>> Thank you for taking care of this scalability issue!
>>>>>>
>>>>>> I have a couple of quick comments.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>
>>>>>> 72 /*
>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>> 74 */
>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>> 76
>>>>>> 77 /*
>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>> 79 * deletedTagLock,
>>>>>> 80 */
>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>
>>>>>>    The comments contradict each other.
>>>>>>    I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>> instead of deletedTagLock.
>>>>>>    Also, the comma at the end must be replaced with a period.
>>>>>>
>>>>>>
>>>>>> 101 // Tag not found? Ignore.
>>>>>> 102 if (klass == NULL) {
>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>> 104 return;
>>>>>> 105 }
>>>>>>   106
>>>>>> 107 // Scan linked-list.
>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>> 110 klass_ptr = &klass->next;
>>>>>> 111 klass = *klass_ptr;
>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>   113     }
>>>>>> 114
>>>>>> 115 // Tag not found? Ignore.
>>>>>> 116 if (found_tag != tag) {
>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>> 118 return;
>>>>>>   119     }
>>>>>>
>>>>>>
>>>>>>   The code above can be simplified, so that the lines 101-105 are not
>>>>>> needed anymore.
>>>>>>   It can be something like this:
>>>>>>
>>>>>> // Scan linked-list.
>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>> klass_ptr = &klass->next;
>>>>>> klass = *klass_ptr;
>>>>>>       }
>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>> return;
>>>>>>       }
>>>>>>
>>>>>> It will take more time when I get a chance to look at the rest.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>>> basically every operation, and also need to check whether or not
>>>>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>>>>> list) when we're not.
>>>>>>>
>>>>>>> Updated webrev:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>
>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>>>>>> indexed by tag % slot-count, and we then simply prepend the new KlassNode*.
>>>>>>>> This is an O(1) operation.
>>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation
>>>>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>>>>> but not usually more. It should be ok.
>>>>>>>> - when processUnloads() gets called, we simply hand out that bag, and
>>>>>>>> allocate a new one.
>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or
>>>>>>>> re-attached (was missing before).
>>>>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>>>>> before).
>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>>>>> in the future?
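
The table-of-chains scheme in the list above can be sketched as follows. This is a simplified standalone model under stated assumptions, not the webrev.03 code: SLOT_COUNT is a hypothetical size (the value of CT_SLOT_COUNT is not shown in this thread), int64_t stands in for jlong, track_prepared() and take_unloaded() are made-up names, the unloaded signature is returned to the caller instead of being put in a bag, and the locking the real code needs is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263  /* hypothetical table size for this sketch */

typedef struct KlassNode {
    int64_t klass_tag;          /* stand-in for jlong */
    char *signature;            /* owned copy of the class signature */
    struct KlassNode *next;
} KlassNode;

static KlassNode *table[SLOT_COUNT];  /* each slot heads one chain */
static int64_t next_tag = 1;          /* 0 is reserved for "untagged" */

/* Class prepared: hand out a fresh tag and prepend a node. O(1). */
static int64_t track_prepared(const char *signature) {
    assert(signature != NULL);
    KlassNode *node = (KlassNode *)malloc(sizeof(*node));
    if (node == NULL) {
        return 0;
    }
    size_t len = strlen(signature) + 1;
    node->signature = (char *)malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = next_tag++;
    size_t slot = (size_t)(node->klass_tag % SLOT_COUNT);
    node->next = table[slot];
    table[slot] = node;
    return node->klass_tag;
}

/*
 * Class unloaded: unlink the node for 'tag' and return its signature
 * (caller frees it), or NULL if the tag is unknown. Cost is the length
 * of one chain, not the number of tracked classes.
 */
static char *take_unloaded(int64_t tag) {
    if (tag <= 0) {
        return NULL;
    }
    KlassNode **link = &table[(size_t)(tag % SLOT_COUNT)];
    while (*link != NULL && (*link)->klass_tag != tag) {
        link = &(*link)->next;
    }
    KlassNode *node = *link;
    if (node == NULL) {
        return NULL;
    }
    *link = node->next;
    char *signature = node->signature;
    free(node);
    return signature;
}
```

Because tags are handed out sequentially, consecutive classes land in consecutive slots, which fits the short chain depths reported above.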
>>>>>>>>
>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>> The bottleneck now is clearly the actual synthesizing of the class-unload
>>>>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>>
>>>>>>>> Updated webrev:
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>
>>>>>>>> Please let me know what you think of it.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>>>>
>>>>>>>>> Thanks, Roman
>>>>>>>>>
>>>>>>>>>   Hi Chris,
>>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>> Sure.
>>>>>>>>>>
>>>>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>
>>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>>>>>> and then add new classes whenever a class gets loaded. When unloading
>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>>> complexity.
>>>>>>>>>>
>>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>>>>>>>>
>>>>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>
>>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>>>>> worth the effort).
>>>>>>>>>>
>>>>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> Issue:
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>
>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>
>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>
>>>>>>>>>>>> Webrev:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>
>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>
>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>
>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>
>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>
>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Roman
>>>>>>>>>>>>


From suenaga at oss.nttdata.com  Sat Mar 21 00:55:04 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 21 Mar 2020 09:55:04 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
Message-ID: <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>

Thanks Chris!
I'm waiting for reviewers for this change.


Yasumasa


On 2020/03/21 4:23, Chris Plummer wrote:
> Hi Yasumasa,
> 
> The failure is due to JDK-8231634, so not something you need to worry about.
> 
> thanks,
> 
> Chris
> 
> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>> Hi Chris,
>>
>> I uploaded a new webrev which also reverts the ProblemList change:
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>
>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>> but it failed in ClhsdbJstackXcompStress.java.
>> However, I think that is not caused by this change: ClhsdbJstackXcompStress.java tests `jhsdb jstack` without mixed mode, so it would not parse DWARF.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/20 13:55, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it:
>>>
>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>> @@ -115,7 +115,7 @@
>>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> This webrev has passed the submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
>>>> So please review it:
>>>>
>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>> Thank you so much, David!
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> I missed a loop condition, so I fixed it and pushed the fix to the submit repo.
>>>>>>>> Could you try again?
>>>>>>>>
>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>
>>>>>>>> webrev is here:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>
>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :)
>>>>>>
>>>>>> Seems to have passed okay.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks a lot!
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>
>>>>>>>>> #
>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>> #
>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>> #
>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>> # Problematic frame:
>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>> #
>>>>>>>>>
>>>>>>>>> Same as before.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>>>>>>
>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>
>>>>>>>>>>> I've pushed the change to the submit repo, but I've not yet received the result.
>>>>>>>>>>> I will share the job ID with you when I get it.
>>>>>>>>>>
>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I updated the webrev to avoid bailing out to the Java frame when the DWARF data has a language personality routine or LSDA.
>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>>>>>>>>>    B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and this webrev ignores the personality routine and LSDA.
>>>>>>>>>>>>>>>>> I also added bailout code for the case where DWARF processing fails due to these concerns.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>
>>>
> 
> 
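Editorial illustration of concern A from the original review request (range check on the .eh_frame buffer, with a bailout when processing fails). The actual parser is C++ code in libsaproc; the Java sketch below only illustrates the idea. The class and method names are hypothetical, and the entry layout is reduced to an assumed 4-byte length field followed by one ULEB128 value.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: hypothetical names, not the libsaproc DwarfParser.
final class EhFrameRangeCheckSketch {

    /** Decodes one unsigned LEB128 value; ByteBuffer.get() throws if the data ends early. */
    static long readUleb128(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        while (true) {
            int b = buf.get() & 0xff;            // bounds-checked read of one byte
            if (shift < 64) {
                result |= (long) (b & 0x7f) << shift;
            }
            shift += 7;
            if ((b & 0x80) == 0) {               // high bit clear: last byte of the value
                return result;
            }
        }
    }

    /**
     * Returns true if the length field and one ULEB128 value can be read
     * without leaving the section data; false tells the caller to bail out.
     */
    static boolean canParseEntry(byte[] sectionData) {
        ByteBuffer buf = ByteBuffer.wrap(sectionData).order(ByteOrder.LITTLE_ENDIAN);
        try {
            long length = buf.getInt() & 0xffffffffL;   // unsigned 32-bit length
            if (length > buf.remaining()) {
                return false;                           // entry claims more data than exists
            }
            readUleb128(buf);
            return true;
        } catch (BufferUnderflowException e) {
            return false;                               // ran off the end of the buffer
        }
    }

    public static void main(String[] args) {
        byte[] complete = {3, 0, 0, 0, (byte) 0xe5, (byte) 0x8e, 0x26};
        byte[] truncated = {3, 0, 0, 0, (byte) 0xe5, (byte) 0x8e};
        System.out.println("complete entry parses: " + canParseEntry(complete));
        System.out.println("truncated entry parses: " + canParseEntry(truncated));
    }
}
```

The point of the sketch is that every read is checked against the end of the buffer and a failed check is reported to the caller instead of dereferencing memory outside the section.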

From daniil.x.titov at oracle.com  Sun Mar 22 22:29:27 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Sun, 22 Mar 2020 15:29:27 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <c00b118a-8619-3984-10d7-63134d7210a0@oss.nttdata.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
 <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
 <c00b118a-8619-3984-10d7-63134d7210a0@oss.nttdata.com>
Message-ID: <FBCAD683-2A99-415D-8926-8AEF560EE55A@oracle.com>

Hi Yasumasa, Serguei and Alex,

Please review a new version of the webrev [1] that merges SADebugDTest.java with the changes done in [2].

Also, the CSR [3] and the help message for the debug server in SALauncher.java were updated to specify that the '--hostname'
option can be a hostname or an IPv4/IPv6 address.

 >  Ok, but I think it might be simpler with TestLibrary.
 >   For example, can you use TestLibrary::getUnusedRandomPort? It is used in test/jdk/java/rmi/testlibrary/RMID.java.

TestLibrary::getUnusedRandomPort() doesn't allow specifying which ports are reserved, and it uses a hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, the test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be used directly in test/hotspot/jtreg/serviceability/* tests (it doesn't compile).

Nevertheless, to simplify the test itself, I moved findUnreservedFreePort(int... reservedPorts) from SADebugDTest.java to jdk.test.lib.Utils in /test/lib.
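
As a rough illustration of what such a helper can look like, here is a minimal sketch. It is not the code from the webrev or from jdk.test.lib.Utils; the class name, the retry limit and the exception types are assumptions made for the example. It asks the OS for a free ephemeral port and retries while the result is one of the reserved ports.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.Arrays;

// Hypothetical sketch; not the actual jdk.test.lib.Utils implementation.
final class FreePortSketch {
    private static final int MAX_ATTEMPTS = 100;   // assumed retry limit

    /** Returns a port that was free when probed and is not in reservedPorts. */
    static int findUnreservedFreePort(int... reservedPorts) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            final int candidate;
            // Binding to port 0 lets the OS pick any free ephemeral port.
            try (ServerSocket socket = new ServerSocket(0)) {
                candidate = socket.getLocalPort();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            // Reject ports the caller has reserved for something else.
            if (Arrays.stream(reservedPorts).noneMatch(p -> p == candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No unreserved free port found");
    }

    public static void main(String[] args) {
        int registryPort = 1099;   // default RMI registry port, reserved in this example
        int rmiPort = findUnreservedFreePort(registryPort);
        System.out.println("RMI connector port candidate: " + rmiPort);
    }
}
```

The probe socket is closed before the port is returned, so another process can still take the port before the test binds it. A retry in the test is therefore still useful.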

Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.

[1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/ 
[2] https://bugs.openjdk.java.net/browse/JDK-8238268 
[3] https://bugs.openjdk.java.net/browse/JDK-8239831  

Thank you,
Daniil

On 3/13/20, 7:23 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:

    Hi Daniil,
    
    On 2020/03/14 7:05, Daniil Titov wrote:
    > Hi Yasumasa, Serguei and Alex,
    > 
    > Please review a new version of the webrev that includes the changes Yasumasa suggested.
    > 
    >> Shutdown hook is already registered in c'tor of HotSpotAgent.
    >>     It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
    > 
    > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
    > the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression.
    > 
    > public HotSpotAgent() {
    >     // for non-server add shutdown hook to clean-up debugger in case
    >     // of forced exit. For remote server, shutdown hook is added by
    >     // DebugServer.
    >     Runtime.getRuntime().addShutdownHook(new java.lang.Thread(
    >     new Runnable() {
    >         public void run() {
    >             synchronized (HotSpotAgent.this) {
    >                 if (!isServer) {
    >                     detach();
    >                 }
    >             }
    >         }
    >     }));
    > }
    
    I missed it, thanks!
    
    
    >>>     Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains
    >>> `exclusiveAccess.dirs=.` to avoid concurrent execution
    > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests.  Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays.
    
    Ok, but I think it might be simpler with TestLibrary.
    For example, can you use TestLibrary::getUnusedRandomPort? It is used in test/jdk/java/rmi/testlibrary/RMID.java.
    
    
    Thanks,
    
    Yasumasa
    
    
    > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    > 
    > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/
    > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751
    > 
    > Thank you,
    > Daniil
    > 
    > On 3/6/20, 6:15 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    > 
    >      Hi Daniil,
    >      
    >      On 2020/03/07 3:38, Daniil Titov wrote:
    >      > Hi Yasumasa,
    >      >
    >      >   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
    >      > I think that having a piece of code that invokes  a method  named "buildAttachArgs" with a copy of the argument map  just for its side-effect ( it throws an exception if parameters are incorrect)  and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name .
    >      
    >      Ok, but I would prefer that you leave a comment on it.
    >      
    >      
    >      >   > SADebugDTest
    >      >   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
    >      > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final.
    >      > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array.
    >      
    >      Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
    >      If you drop this error check, the test code becomes simpler.
    >      
    >      
    >      > I will include your other suggestion in the new version of the webrev.
    >      
    >      Sorry, I have one more comment:
    >      
    >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
    >      
    >      Shutdown hook is already registered in c'tor of HotSpotAgent.
    >      It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
    >      
    >      
    >      Thanks,
    >      
    >      Yasumasa
    >      
    >      
    >      > Thanks!
    >      > Daniil
    >      >
    >      > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >
    >      >      Hi Daniil,
    >      >
    >      >
    >      >      - SALauncher.java
    >      >           - checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
    >      >           - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex.
    >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
    >      >
    >      >      - SADebugDTest.java
    >      >           - Please add bug ID to @bug.
    >      >           - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
    >      >
    >      >
    >      >      Thanks,
    >      >
    >      >      Yasumasa
    >      >
    >      >
    >      >      On 2020/03/06 10:15, Daniil Titov wrote:
    >      >      > Hi Yasumasa, Serguei and Alex,
    >      >      >
    >      >      > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
    >      >      > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
    >      >      > last two settings could be specified using the system properties but the system properties have the following disadvantages
    >      >      > comparing to the command line options:
    >      >      >     -  It's hard to know about them: they are not listed in the tool's help.
    >      >      >     -  They have long names that are hard to remember.
    >      >      >     -  It is easy to mistype them on the command line, and you will not get any warning about it.
    >      >      >
    >      >      > The CSR [2] was also updated and needs to be reviewed.
    >      >      >
    >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >      >      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >      >
    >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
    >      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >      >
    >      >      > Thank you,
    >      >      > Daniil
    >      >      >
    >      >      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >      >
    >      >      >      Hi Daniil,
    >      >      >
    >      >      >         - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments.
    >      >      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
    >      >      >
    >      >      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
    >      >      >           But you can use same port number as RMI registry (1099).
    >      >      >           It is same as relation between jmxremote.port and jmxremote.rmi.port.
    >      >      >
    >      >      >
    >      >      >      Thanks,
    >      >      >
    >      >      >      Yasumasa
    >      >      >
    >      >      >
    >      >      >      On 2020/02/24 13:21, Daniil Titov wrote:
    >      >      >      > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
    >      >      >      > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
    >      >      >      >
    >      >      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
    >      >      >      >
    >      >      >      > Man pages for jhsdb will be updated in a separate issue.
    >      >      >      >
    >      >      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
    >      >      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
    >      >      >      >
    >      >      >      >                // delegate to the actual SA debug server.
    >      >      >      >   367         DebugServer.main(newArgArray.toArray(new String[0]));
    >      >      >      >
    >      >      >      > However,  sun.jvm.hotspot.DebugServer  doesn't support named options and that prevents from efficiently adding new options to the tool.
    >      >      >      > I found it more suitable to start Hotspot agent directly in  SALauncher rather than  adding a new option in  both sun.jvm.hotspot.SALauncher
    >      >      >      >   and sun.jvm.hotspot.DebugServer and  delegating the call.  With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated
    >      >      >      > but I would prefer to address it in a separate issue.
    >      >      >      >
    >      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >      >      >      >                  container  and connecting  to it with the GUI debugger.
    >      >      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >      >      >
    >      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
    >      >      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >      >      >
    >      >      >      > Thank you,
    >      >      >      > Daniil
    >      >      >      >
    >      >      >      >
    >      >      >
    >      >      >
    >      >      >
    >      >
    >      >
    >      >
    >      
    > 
    > 
    

From suenaga at oss.nttdata.com  Mon Mar 23 06:13:53 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 23 Mar 2020 15:13:53 +0900
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <FBCAD683-2A99-415D-8926-8AEF560EE55A@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
 <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
 <c00b118a-8619-3984-10d7-63134d7210a0@oss.nttdata.com>
 <FBCAD683-2A99-415D-8926-8AEF560EE55A@oracle.com>
Message-ID: <e7835112-28dd-9a36-4e7d-2b07bcb824e8@oss.nttdata.com>

Hi Daniil,

Looks good!


Yasumasa


On 2020/03/23 7:29, Daniil Titov wrote:
> Hi Yasumasa, Serguei and Alex,
> 
> Please review a new version of the webrev [1] that merges SADebugDTest.java with the changes done in [2].
> 
> Also, the CSR [3] and the help message for the debug server in SALauncher.java were updated to specify that the '--hostname'
> option can be a hostname or an IPv4/IPv6 address.
> 
>   >  Ok, but I think it might be more simply with TestLibrary.
>   >   For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
> 
> TestLibrary:: getUnusedRandomPort() doesn't allow to specify what ports are reserved and it uses some hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides,  test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be directly used in test/hotspot/jtreg/serviceability/* tests (it doesn't compile).
> 
> Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int .. reservedPorts) from SADebugTest.java to jdk.test.lib.Utils in /test/lib.
> 
> Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> 
> [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/
> [2] https://bugs.openjdk.java.net/browse/JDK-8238268
> [3] https://bugs.openjdk.java.net/browse/JDK-8239831
> 
> Thank you,
> Daniil
> 
> On 3/13/20, 7:23 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
> 
>      Hi Daniil,
>      
>      On 2020/03/14 7:05, Daniil Titov wrote:
>      > Hi Yasumasa, Serguei and Alex,
>      >
>      > Please review a new version of the webrev that includes the changes Yasumasa suggested.
>      >
>      >> Shutdown hook is already registered in c'tor of HotSpotAgent.
>      >>     It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
>      >
>      > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
>      > the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression.
>      >
>      > 101     public HotSpotAgent() {
>      >   102         // for non-server add shutdown hook to clean-up debugger in case
>      >   103         // of forced exit. For remote server, shutdown hook is added by
>      >   104         // DebugServer.
>      >   105         Runtime.getRuntime().addShutdownHook(new java.lang.Thread(
>      >   106         new Runnable() {
>      >   107             public void run() {
>      >   108                 synchronized (HotSpotAgent.this) {
>      >   109                     if (!isServer) {
>      >   110                         detach();
>      >   111                     }
>      >   112                 }
>      >   113             }
>      >   114         }));
>      >   115     }
>      
>      I missed it, thanks!
>      
>      
>      >>>     Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains
>      >>> `exclusiveAccess.dirs=.` to avoid concurrent execution
>      > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests.  Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays.
>      
>      Ok, but I think it might be more simply with TestLibrary.
>      For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
>      
>      
>      Thanks,
>      
>      Yasumasa
>      
>      
>      > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >
>      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/
>      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >
>      > Thank you,
>      > Daniil
>      >
>      > On 3/6/20, 6:15 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >
>      >      Hi Daniil,
>      >
>      >      On 2020/03/07 3:38, Daniil Titov wrote:
>      >      > Hi Yasumasa,
>      >      >
>      >      >   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
>      >      > I think that having a piece of code that invokes  a method  named "buildAttachArgs" with a copy of the argument map  just for its side-effect ( it throws an exception if parameters are incorrect)  and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name .
>      >
>      >      Ok, but I prefer to leave comment it.
>      >
>      >
>      >      >   > SADebugDTest
>      >      >   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
>      >      > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final.
>      >      > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array.
>      >
>      >      Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
>      >      If you do not think this error check, test code is more simply.
>      >
>      >
>      >      > I will include your other suggestion in the new version of the webrev.
>      >
>      >      Sorry, I have one more comment:
>      >
>      >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
>      >
>      >      Shutdown hook is already registered in c'tor of HotSpotAgent.
>      >      It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
>      >
>      >
>      >      Thanks,
>      >
>      >      Yasumasa
>      >
>      >
>      >      > Thanks!
>      >      > Daniil
>      >      >
>      >      > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >      >
>      >      >      Hi Daniil,
>      >      >
>      >      >
>      >      >      - SALauncher.java
>      >      >           - checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
>      >      >           - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex.
>      >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
>      >      >
>      >      >      - SADebugDTest.java
>      >      >           - Please add bug ID to @bug.
>      >      >           - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
>      >      >
>      >      >
>      >      >      Thanks,
>      >      >
>      >      >      Yasumasa
>      >      >
>      >      >
>      >      >      On 2020/03/06 10:15, Daniil Titov wrote:
>      >      >      > Hi Yasumasa, Serguei and Alex,
>      >      >      >
>      >      >      > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
>      >      >      > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
>      >      >      > last two settings could be specified using the system properties but the system properties have the following disadvantages
>      >      >      > comparing to the command line options:
>      >      >      >     -  It's hard to know about them: they are not listed in the tool's help.
>      >      >      >     -  They have long names that are hard to remember.
>      >      >      >     -  It is easy to mistype them on the command line, and you will not get any warning about it.
>      >      >      >
>      >      >      > The CSR [2] was also updated and needs to be reviewed.
>      >      >      >
>      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      >      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >      >
>      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
>      >      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      >      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >      >
>      >      >      > Thank you,
>      >      >      > Daniil
>      >      >      >
>      >      >      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >      >      >
>      >      >      >      Hi Daniil,
>      >      >      >
>      >      >      >         - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments.
>      >      >      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be more simply.
>      >      >      >
>      >      >      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
>      >      >      >           But you can use same port number as RMI registry (1099).
>      >      >      >           It is same as relation between jmxremote.port and jmxremote.rmi.port.
>      >      >      >
>      >      >      >
>      >      >      >      Thanks,
>      >      >      >
>      >      >      >      Yasumasa
>      >      >      >
>      >      >      >
>      >      >      >      On 2020/02/24 13:21, Daniil Titov wrote:
>      >      >      >      > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
>      >      >      >      > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
>      >      >      >      >
>      >      >      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
>      >      >      >      >
>      >      >      >      > Man pages for jhsdb will be updated in a separate issue.
>      >      >      >      >
>      >      >      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
>      >      >      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
>      >      >      >      >
>      >      >      >      >                // delegate to the actual SA debug server.
>      >      >      >      >   367         DebugServer.main(newArgArray.toArray(new String[0]));
>      >      >      >      >
>      >      >      >      > However, sun.jvm.hotspot.DebugServer doesn't support named options, which makes it hard to add new options to the tool efficiently.
>      >      >      >      > I found it more suitable to start the Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher
>      >      >      >      > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as deprecated,
>      >      >      >      > but I would prefer to address it in a separate issue.
>      >      >      >      >
>      >      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      >      >      >                  container  and connecting  to it with the GUI debugger.
>      >      >      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >      >      >
>      >      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
>      >      >      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      >      >      >
>      >      >      >      > Thank you,
>      >      >      >      > Daniil
>      >      >      >      >
>      >      >      >      >
>      >      >      >
>      >      >      >
>      >      >      >
>      >      >
>      >      >
>      >      >
>      >
>      >
>      >
>      
> 
> 

From serguei.spitsyn at oracle.com  Mon Mar 23 17:18:37 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 10:18:37 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <4a9ce5cc-34d1-0dc6-17ed-2337915ddd21@oracle.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
Message-ID: <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200323/f33d7e3c/attachment.htm>

From serguei.spitsyn at oracle.com  Mon Mar 23 18:22:38 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 11:22:38 -0700
Subject: RFR 8240902: JDI shared memory connector can use already closed
 Handles
In-Reply-To: <e1a2d907-3907-0f0e-31f3-221338549a5e@oracle.com>
References: <b644413f-060c-e856-5de5-bb7d69783a8e@oracle.com>
 <db15a209-c73d-c380-42e2-75e713392453@oracle.com>
 <2212664f-930b-cdfc-7ae6-70c7205cdaca@oracle.com>
 <8928590e-6516-051e-aa84-098a6fdc9d45@oracle.com>
 <e1a2d907-3907-0f0e-31f3-221338549a5e@oracle.com>
Message-ID: <68e2e091-7bfd-8501-f8bf-60b55d64b9af@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200323/d27923f4/attachment-0001.htm>

From serguei.spitsyn at oracle.com  Mon Mar 23 18:45:08 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 11:45:08 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
 <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
 <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
Message-ID: <d8845b8f-3e8c-23e7-293b-b542f9ff2028@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200323/42aa89aa/attachment.htm>

From magnus.ihse.bursie at oracle.com  Mon Mar 23 19:03:26 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Mon, 23 Mar 2020 20:03:26 +0100
Subject: RFR: JDK-8241463 Move build tools to respective modules
Message-ID: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>

The build tools (small Java tools that are run during the build to
generate source code, or data, needed in the JDK) have historically been
placed in the "make" directory. This may have made sense a long time ago,
but it no longer does.

Instead, the build tools source code should move to the module that
needs them. For instance, compilefontconfig should move to java.desktop,
etc.

There are multiple reasons for this:

* Currently we build *all* build tools at once, which means that we
cannot compile java.base until e.g. the compilefontconfig tool is
compiled, even though it is not needed.

* If a build tool, e.g. compilefontconfig, is modified, all build tools
are recompiled, which triggers a rebuild of more or less the entire JDK.
This makes development of the build tools unnecessarily tedious.

* When the build tools are modified, the group owning the corresponding 
module is the proper review instance, not the build team. But since they 
reside under "make", the review mails often include build-dev, but this 
is mostly noise for us. With this move, the ownership is made clear.

In this patch, I have not modified how and when the build tools are 
compiled, but this shuffle is the prerequisite for continuing with that 
in a follow-up patch.

I have also moved the build tools to the org.openjdk.buildtools.* 
package name space (inspired by Skara), instead of the strangely named 
build.tools.* name space.

A few build tools are not moved in this patch. Two of them, 
charsetmapping and cldrconverter, are shared between two modules. (I 
think they should move to modules nevertheless, but they need some more 
thought to make sure I do this right.) The rest are tools that are 
needed for the build in general, like linking or javadoc support. I'll 
move this to a better location too, but in a separate patch.

Bug: https://bugs.openjdk.java.net/browse/JDK-8241463
WebRev: 
http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01

/Magnus


From daniil.x.titov at oracle.com  Mon Mar 23 19:05:21 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 23 Mar 2020 12:05:21 -0700
Subject: 8240711: TestJstatdPort.java failed due to "ExportException: Port
 already in use:"
In-Reply-To: <d8845b8f-3e8c-23e7-293b-b542f9ff2028@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
 <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
 <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
 <d8845b8f-3e8c-23e7-293b-b542f9ff2028@oracle.com>
Message-ID: <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com>

Hi Serguei,

 
In this case tryToSetupJstatdProcess() on line 346 returns null, and the test will try to find a new pair of ports and start the jstatd process.
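
To illustrate the "return null when the port is in use" convention that the loop relies on, here is a minimal sketch for the registry case. The class and helper names (RegistryProbe, startRegistryOrNull) are hypothetical and are not taken from the test; the sketch only assumes that LocateRegistry.createRegistry() reports a busy port as an ExportException, which matches the exception named in the subject of this thread.

```java
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.ExportException;

class RegistryProbe {
    /**
     * Hypothetical helper: tries to create an RMI registry on the given port.
     * Returns null when the registry cannot be exported on that port
     * (typically "Port already in use"), so the caller can pick another port.
     */
    static Registry startRegistryOrNull(int port) throws RemoteException {
        try {
            return LocateRegistry.createRegistry(port);
        } catch (ExportException e) {
            // The port is taken; signal the caller to retry with a new one.
            return null;
        }
    }
}
```

A caller would treat a null result the same way the loop treats a null jstatdThread: choose a new port and try again.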

 
Best regards,

Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Monday, March 23, 2020 at 11:45 AM
To: Daniil Titov <daniil.x.titov at oracle.com>, Alex Menkov <alexey.menkov at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>
Subject: Re: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"

 
Hi Daniil,

It looks Okay in general.
But I've got a question.
 329             while (jstatdThread == null) {
 330                 if (!useDefaultPort) {
 331                     port = String.valueOf(Utils.getFreePort());
 332                 }
 333 
 334                 if (!useDefaultRmiPort) {
 335                     rmiPort = String.valueOf(Utils.getFreePort());
 336                 }
 337 
 338                 if (withExternalRegistry) {
 339                     Registry registry = startRegistry();
 340                     if (registry == null) {
 341                         // The port is already in use. Cancel and try with a new one.
 342                         continue;
 343                     }
 344                 }
 345 
 346                 jstatdThread = tryToSetupJstatdProcess();
 347             }

What is going to happen if all ports that we try are already in use?
Does the test report this situation?

Thanks,
Serguei


On 3/17/20 11:40, Daniil Titov wrote:
Hi Alex,
 
Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case.
 
Testing: Mach5 tests for sun/tools/jstatd/ successfully passed 100 times. Tier1-tier3 tests successfully passed.
 
[1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02
[2] https://bugs.openjdk.java.net/browse/JDK-8240711
 
Thanks,
Daniil
 
 
On 3/16/20, 5:38 PM, "Daniil Titov" <daniil.x.titov at oracle.com> wrote:
 
    Hi Alex,
    
    Yes, I did test the change by modifying the test to use the RMI port that is already in use
    (the stack trace in the original email was taken from exactly this modified test) and then ensured that with the fix
    such an issue is properly handled.
    
    I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case.
    
    Thanks!
    
    Best regards,
    Daniil
    
    
    On 3/16/20, 4:47 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:
    
        I don't agree.
        The code handles exactly the same "port in use" case for the same tool.
        So it either works or doesn't.
        And having 2 code blocks which are supposed to do the same makes the code messy.
        BTW did you test the change (I mean, craft the test to get a "port in
        use" error)?
        
        --alex
        
        On 03/16/2020 16:17, Daniil Titov wrote:
        > Resending with the corrected subject ...
        > 
        > Hi Alex,
        > 
        > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use"
        > case but at least for this specific test  (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.
        > 
        > Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports
        > might be subject to the "port in use" error and taking into account that it's hard to reproduce such case
        > I found it safer to leave the original code and just augment it with what was missing for this specific
        > case rather than completely replacing it.
        > 
        > Best regards,
        > Daniil
        > 
        > On 3/16/20, 4:02 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:
        > 
        >      Hi Daniil,
        >      
        >      Looks like the test is supposed to handle the "port in use" issue (see lines
        >      103-114).
        >      I suppose that in the "port in use" case jstatd exits, but
        >      ProcessTools.startProcess() continues to wait for the "jstatd started" message.
        >      
        >      --alex
        >      
        >      On 03/16/2020 12:00, Daniil Titov wrote:
        >      > Please review the change [1] that fixes the intermittent failure of the test.
        >      >
        >      > The problem here is that if the RMI port is in use then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case
        >      > it doesn't happen.
        >      >
        >      >         at java.util.concurrent.CountDownLatch.await(java.base@15-internal/CountDownLatch.java:232)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
        >      >         at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
        >      >         at jdk.test.lib.thread.XRun.run(XRun.java:40)
        >      >         at java.lang.Thread.run(java.base@15-internal/Thread.java:832)
        >      >         at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
        >      >
        >      > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed.  Tier1-tier3 tests are still in progress.
        >      >
        >      > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
        >      > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
        >      >
        >      >
        >      > Thank you,
        >      > Daniil
        >      >
        >      >
        >      >
        >      
        > 
        > 
        
    
 
 

From serguei.spitsyn at oracle.com  Mon Mar 23 19:13:12 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 12:13:12 -0700
Subject: 8240711: TestJstatdPort.java failed due to "ExportException: Port
 already in use:"
In-Reply-To: <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
 <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
 <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
 <d8845b8f-3e8c-23e7-293b-b542f9ff2028@oracle.com>
 <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com>
Message-ID: <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200323/e57f1a3a/attachment.htm>

From daniil.x.titov at oracle.com  Mon Mar 23 19:32:21 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 23 Mar 2020 12:32:21 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
 <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
 <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
 <d8845b8f-3e8c-23e7-293b-b542f9ff2028@oracle.com>
 <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com>
 <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com>
Message-ID: <1A634ECF-141A-4144-8777-A87A5FC41234@oracle.com>

Hi Serguei,

 
I don't think that in any real environment the loop would be unable to find a pair of free ports before it is killed by JTREG due to a timeout. But if you think that we need to limit the number of attempts here, I could create a new issue for that.
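
For illustration only, limiting the number of attempts could look roughly like the sketch below. Everything in it is an assumption rather than code from the webrev: the class name BoundedPortRetry, the MAX_ATTEMPTS value, and the tryStart callback (which stands in for tryToSetupJstatdProcess() and returns null when the port turned out to be in use). Free ports are probed with a plain ServerSocket instead of the test library's Utils.getFreePort().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.function.IntFunction;

class BoundedPortRetry {
    // Hypothetical limit; the thread does not propose a concrete number.
    static final int MAX_ATTEMPTS = 10;

    /** Asks the OS for a currently free port by binding to port 0. */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Calls tryStart with a fresh port until it returns a non-null result.
     * Gives up with an explicit error after MAX_ATTEMPTS instead of looping
     * until the test harness times out.
     */
    static <T> T startWithRetry(IntFunction<T> tryStart) {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            T result = tryStart.apply(findFreePort());
            if (result != null) {
                return result;
            }
        }
        throw new IllegalStateException(
                "No usable port found after " + MAX_ATTEMPTS + " attempts");
    }
}
```

With such a limit, the failure would be reported as "no usable port found" rather than as a generic timeout.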

 
Thanks!

--Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Monday, March 23, 2020 at 12:13 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, Alex Menkov <alexey.menkov at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>
Subject: Re: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"

 
On 3/23/20 12:05, Daniil Titov wrote:

Hi Serguei,

 
In this case tryToSetupJstatdProcess() on line 346 returns null, and the test will try to find a new pair of ports and start the jstatd process.


I understand this.
My question is whether this loop can be endless.
What happens if there is no new pair of ports that we have not checked yet?
Do we fail with a timeout in such a case?
If so, would it be better to report that an unused free port was not found?
Is it possible to detect this situation?

Thanks,
Serguei
  

Best regards,

Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Monday, March 23, 2020 at 11:45 AM
To: Daniil Titov <daniil.x.titov at oracle.com>, Alex Menkov <alexey.menkov at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>
Subject: Re: RFR: 8240711: TestJstatdPort.java failed due to "ExportException: Port already in use:"

 
Hi Daniil,

It looks Okay in general.
But I've got a question.
 329             while (jstatdThread == null) {
 330                 if (!useDefaultPort) {
 331                     port = String.valueOf(Utils.getFreePort());
 332                 }
 333 
 334                 if (!useDefaultRmiPort) {
 335                     rmiPort = String.valueOf(Utils.getFreePort());
 336                 }
 337 
 338                 if (withExternalRegistry) {
 339                     Registry registry = startRegistry();
 340                     if (registry == null) {
 341                         // The port is already in use. Cancel and try with a new one.
 342                         continue;
 343                     }
 344                 }
 345 
 346                 jstatdThread = tryToSetupJstatdProcess();
 347             }

What is going to happen if all ports that we try are already in use?
Does the test report this situation?

Thanks,
Serguei


On 3/17/20 11:40, Daniil Titov wrote:
Hi Alex,
 
Please review a new version of the fix that removes the old version of the code that tried to handle the "port in use" case.
 
Testing: Mach5 tests for sun/tools/jstatd/  successfully passed 100 times.  Tier1-tier3 tests successfully passed. 
 
[1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.02  
[2] https://bugs.openjdk.java.net/browse/JDK-8240711
 
Thanks,
Daniil
 
 
On 3/16/20, 5:38 PM, "Daniil Titov" <daniil.x.titov at oracle.com> wrote:
 
    Hi Alex,
    
    Yes, I did test the change by modifying the test to use the RMI port that is already in use
    (the stack trace in the original email was taken from exactly this modified test) and then ensured that with the fix
    such an issue is properly handled.
    
    I will send a new version of the webrev that removes the old version of the code that tried to handle the "port in use" case.
    
    Thanks!
    
    Best regards,
    Daniil
    
    
    On 3/16/20, 4:47 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:
    
        I don't agree.
        The code handles exactly the same "port in use" case for the same tool.
        So it either works or doesn't.
        And having 2 code blocks which are supposed to do the same makes the code messy.
        BTW did you test the change (I mean, craft the test to get a "port in
        use" error)?
        
        --alex
        
        On 03/16/2020 16:17, Daniil Titov wrote:
        > Resending with the corrected subject ...
        > 
        > Hi Alex,
        > 
        > Yes, you are right, class JstatdTest has the code that is supposed to handle the "port in use"
        > case but at least for this specific test  (sun/tools/jstatd/TestJstatdPort.java) it doesn't work.
        > 
        > Since there are multiple tests in sun/tools/jstatd/* folder that use this class and different ports
        > might be subject to the "port in use" error and taking into account that it's hard to reproduce such case
        > I found it safer to leave the original code and just augment it with what was missing for this specific
        > case rather than completely replacing it.
        > 
        > Best regards,
        > Daniil
        > 
        > On 3/16/20, 4:02 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:
        > 
        >      Hi Daniil,
        >      
        >      Looks like the test is supposed to handle the "port in use" issue (see lines
        >      103-114).
        >      I suppose that in the "port in use" case jstatd exits, but
        >      ProcessTools.startProcess() continues to wait for the "jstatd started" message.
        >      
        >      --alex
        >      
        >      On 03/16/2020 12:00, Daniil Titov wrote:
        >      > Please review the change [1] that fixes the intermittent failure of the test.
        >      >
        >      > The problem here is that if the RMI port is in use then the test keeps waiting for "jstatd started (bound to " to appear in the process output, and in this case
        >      > it doesn't happen.
        >      >
        >      >         at java.util.concurrent.CountDownLatch.await(java.base@15-internal/CountDownLatch.java:232)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:205)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:133)
        >      >         at jdk.test.lib.process.ProcessTools.startProcess(ProcessTools.java:254)
        >      >         at jdk.test.lib.thread.ProcessThread$ProcessRunnable.xrun(ProcessThread.java:153)
        >      >         at jdk.test.lib.thread.XRun.run(XRun.java:40)
        >      >         at java.lang.Thread.run(java.base@15-internal/Thread.java:832)
        >      >         at jdk.test.lib.thread.TestThread.run(TestThread.java:123)
        >      >
        >      > Testing: Mach5 tests for sun/tools/jstatd/ successfully passed.  Tier1-tier3 tests are still in progress.
        >      >
        >      > [1] http://cr.openjdk.java.net/~dtitov/8240711/webrev.01/
        >      > [2] https://bugs.openjdk.java.net/browse/JDK-8240711
        >      >
        >      >
        >      > Thank you,
        >      > Daniil
        >      >
        >      >
        >      >
        >      
        > 
        > 
        
    

From serguei.spitsyn at oracle.com  Mon Mar 23 19:48:52 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 12:48:52 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <1A634ECF-141A-4144-8777-A87A5FC41234@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
 <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
 <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
 <d8845b8f-3e8c-23e7-293b-b542f9ff2028@oracle.com>
 <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com>
 <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com>
 <1A634ECF-141A-4144-8777-A87A5FC41234@oracle.com>
Message-ID: <d87e7c4a-8edd-2774-4eec-bd9340af2961@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200323/7d1cd247/attachment.htm>

From erik.joelsson at oracle.com  Mon Mar 23 19:54:44 2020
From: erik.joelsson at oracle.com (Erik Joelsson)
Date: Mon, 23 Mar 2020 12:54:44 -0700
Subject: RFR: JDK-8241463 Move build tools to respective modules
In-Reply-To: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>
References: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>
Message-ID: <705f74a1-8f64-166c-63d1-7174c89443cd@oracle.com>

Looks good.

/Erik

On 2020-03-23 12:03, Magnus Ihse Bursie wrote:
> The build tools (small java tools that are run during the build to 
> generate source code, or data, needed in the JDK) have historically 
> been placed in the "make" directory. This may have made sense a long time
> ago, but it no longer does.
>
> Instead, the build tools source code should move to the module that
> needs them. For instance, compilefontconfig should move to 
> java.desktop, etc.
>
> There are multiple reasons for this:
>
> * Currently we build *all* build tools at once, which means that we
> cannot compile java.base until e.g. the compilefontconfig tool is 
> compiled, even though it is not needed.
>
> * If a build tool, e.g. compilefontconfig is modified, all build tools 
> are recompiled, which triggers a rebuild of more or less the entire 
> JDK. This makes development of the build tools unnecessarily tedious.
>
> * When the build tools are modified, the group owning the 
> corresponding module is the proper review instance, not the build 
> team. But since they reside under "make", the review mails often 
> include build-dev, but this is mostly noise for us. With this move, 
> the ownership is made clear.
>
> In this patch, I have not modified how and when the build tools are 
> compiled, but this shuffle is the prerequisite for continuing with 
> that in a follow-up patch.
>
> I have also moved the build tools to the org.openjdk.buildtools.* 
> package name space (inspired by Skara), instead of the strangely named 
> build.tools.* name space.
>
> A few build tools are not moved in this patch. Two of them, 
> charsetmapping and cldrconverter, are shared between two modules. (I 
> think they should move to modules nevertheless, but they need some 
> more thought to make sure I do this right.) The rest are tools that 
> are needed for the build in general, like linking or javadoc support. 
> I'll move this to a better location too, but in a separate patch.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8241463
> WebRev: 
> http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01
>
> /Magnus
>

From mandy.chung at oracle.com  Mon Mar 23 20:19:13 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 23 Mar 2020 13:19:13 -0700
Subject: RFR: JDK-8241463 Move build tools to respective modules
In-Reply-To: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>
References: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>
Message-ID: <53d7119b-5e7e-22fe-97c1-0382f2d94fbc@oracle.com>

Hi Magnus,

Modularizing the build tools is a good move. This patch proposes to
place the build tools under
    src/$MODULE/share/tools/$PACKAGE/*.java

I think the modular source location of the build tools needs more
discussion, and jigsaw-dev should be included in it.

The JDK source layout as specified in JEP 201 is:
    src/$MODULE/{share,$OS}/classes/$PACKAGE/*.java

The source files compiled from the `src` directory are the intermediate
input for building the resulting image. Build tools are used to generate
additional intermediate input (that is not part of the `src` directory)
for building the image. So I wonder if make/$MODULE/share/tools or
make/tools/$MODULE may be a better location for the build tools.
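
To make the three candidate locations concrete, here is a small sketch that expands them for one tool. The class BuildToolLayouts and its method names are made up for illustration, and the $PACKAGE sub-path under the two make/ variants is my assumption, since only the directory prefix is named above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class BuildToolLayouts {
    /** Turns a package name such as "a.b.c" into the relative path a/b/c. */
    static Path packageDir(String pkg) {
        return Paths.get(pkg.replace('.', '/'));
    }

    /** Layout proposed by the patch: src/$MODULE/share/tools/$PACKAGE */
    static Path inSrcTree(String module, String pkg) {
        return Paths.get("src", module, "share", "tools").resolve(packageDir(pkg));
    }

    /** Alternative: make/$MODULE/share/tools/$PACKAGE (package part assumed) */
    static Path inMakeByModule(String module, String pkg) {
        return Paths.get("make", module, "share", "tools").resolve(packageDir(pkg));
    }

    /** Alternative: make/tools/$MODULE/$PACKAGE (package part assumed) */
    static Path inMakeTools(String module, String pkg) {
        return Paths.get("make", "tools", module).resolve(packageDir(pkg));
    }
}
```

For example, with module java.desktop and a package named org.openjdk.buildtools.compilefontconfig (also an assumption, derived from the new name space in the patch description), inSrcTree() gives a directory under src/java.desktop/share/tools, while the other two stay under make/.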

Mandy

On 3/23/20 12:03 PM, Magnus Ihse Bursie wrote:
> The build tools (small java tools that are run during the build to 
> generate source code, or data, needed in the JDK) have historically 
> been placed in the "make" directory. This may have made sense a long time
> ago, but it no longer does.
>
> Instead, the build tools source code should move to the module that
> needs them. For instance, compilefontconfig should move to 
> java.desktop, etc.
>
> There are multiple reasons for this:
>
> * Currently we build *all* build tools at once, which means that we
> cannot compile java.base until e.g. the compilefontconfig tool is 
> compiled, even though it is not needed.
>
> * If a build tool, e.g. compilefontconfig is modified, all build tools 
> are recompiled, which triggers a rebuild of more or less the entire 
> JDK. This makes development of the build tools unnecessarily tedious.
>
> * When the build tools are modified, the group owning the 
> corresponding module is the proper review instance, not the build 
> team. But since they reside under "make", the review mails often 
> include build-dev, but this is mostly noise for us. With this move, 
> the ownership is made clear.
>
> In this patch, I have not modified how and when the build tools are 
> compiled, but this shuffle is the prerequisite for continuing with 
> that in a follow-up patch.
>
> I have also moved the build tools to the org.openjdk.buildtools.* 
> package name space (inspired by Skara), instead of the strangely named 
> build.tools.* name space.
>
> A few build tools are not moved in this patch. Two of them, 
> charsetmapping and cldrconverter, are shared between two modules. (I 
> think they should move to modules nevertheless, but they need some 
> more thought to make sure I do this right.) The rest are tools that 
> are needed for the build in general, like linking or javadoc support. 
> I'll move this to a better location too, but in a separate patch.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8241463
> WebRev: 
> http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01
>
> /Magnus
>


From Alan.Bateman at oracle.com  Mon Mar 23 20:33:50 2020
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 23 Mar 2020 20:33:50 +0000
Subject: RFR: JDK-8241463 Move build tools to respective modules
In-Reply-To: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>
References: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>
Message-ID: <20158b4b-f6f1-92e1-59da-d0f6c07a85ca@oracle.com>


On 23/03/2020 19:03, Magnus Ihse Bursie wrote:
> The build tools (small java tools that are run during the build to 
> generate source code, or data, needed in the JDK) have historically 
> been placed in the "make" directory. This may have made sense a long time
> ago, but it no longer does.
>
> Instead, the build tools source code should move to the module that
> needs them. For instance, compilefontconfig should move to 
> java.desktop, etc.
>
> There are multiple reasons for this:
>
> * Currently we build *all* build tools at once, which means that we
> cannot compile java.base until e.g. the compilefontconfig tool is 
> compiled, even though it is not needed.
>
> * If a build tool, e.g. compilefontconfig is modified, all build tools 
> are recompiled, which triggers a rebuild of more or less the entire 
> JDK. This makes development of the build tools unnecessarily tedious.
>
> * When the build tools are modified, the group owning the 
> corresponding module is the proper review instance, not the build 
> team. But since they reside under "make", the review mails often 
> include build-dev, but this is mostly noise for us. With this move, 
> the ownership is made clear.
>
> In this patch, I have not modified how and when the build tools are 
> compiled, but this shuffle is the prerequisite for continuing with 
> that in a follow-up patch.
>
> I have also moved the build tools to the org.openjdk.buildtools.* 
> package name space (inspired by Skara), instead of the strangely named 
> build.tools.* name space.
>
> A few build tools are not moved in this patch. Two of them, 
> charsetmapping and cldrconverter, are shared between two modules. (I 
> think they should move to modules nevertheless, but they need some 
> more thought to make sure I do this right.) The rest are tools that 
> are needed for the build in general, like linking or javadoc support. 
> I'll move this to a better location too, but in a separate patch.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8241463
> WebRev: 
> http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01
I think this will require further discussion, maybe even an update to 
JEP 201. I think it would be useful to see what other options were 
explored, in particular options that organize the tools by module in 
the make tree (as it will confuse people to put them in the src tree).

-Alan

From serguei.spitsyn at oracle.com  Mon Mar 23 21:59:56 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 14:59:56 -0700
Subject: RFR: 8240711: TestJstatdPort.java failed due to "ExportException:
 Port already in use:"
In-Reply-To: <d87e7c4a-8edd-2774-4eec-bd9340af2961@oracle.com>
References: <9949DE34-401E-4558-A870-8A39EC1F764E@oracle.com>
 <829edcd9-f14e-b2b5-3005-79cf538a0f31@oracle.com>
 <7D3EB454-76AA-47A7-91A6-417ED92C6F56@oracle.com>
 <38b055df-37cd-cc11-79c9-323fb55077ff@oracle.com>
 <3F070550-DAB3-431E-89AD-66532B366FEA@oracle.com>
 <597AFBA6-42DB-4499-BF0C-8160155BABF4@oracle.com>
 <d8845b8f-3e8c-23e7-293b-b542f9ff2028@oracle.com>
 <1575A797-0E97-4473-B803-B3FBDA888B2A@oracle.com>
 <5fb5bca9-8052-6168-042f-dc0b7192daa5@oracle.com>
 <1A634ECF-141A-4144-8777-A87A5FC41234@oracle.com>
 <d87e7c4a-8edd-2774-4eec-bd9340af2961@oracle.com>
Message-ID: <c4784606-da36-5239-ce3d-b254709d343a@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200323/85867088/attachment.htm>

From naoto.sato at oracle.com  Mon Mar 23 22:15:31 2020
From: naoto.sato at oracle.com (naoto.sato at oracle.com)
Date: Mon, 23 Mar 2020 15:15:31 -0700
Subject: RFR: JDK-8241463 Move build tools to respective modules
In-Reply-To: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>
References: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>
Message-ID: <bea03feb-9ead-7261-0385-01d18a70688e@oracle.com>

Hi Magnus,

I looked at i18n related changes:

make/CopyInterimTZDB.gmk
make/ToolsJdk.gmk
make/gendata/Gendata-java.base.gmk
make/gendata/GendataBreakIterator.gmk
make/gendata/GendataTZDB.gmk
make/gensrc/GensrcCharacterData.gmk
make/gensrc/GensrcEmojiData.gmk

They look ok to me.

The *.java changes should have a copyright year update.

As to charsetmapping and cldrconverter, I believe they can reside in 
java.base, as jdk.charsets and jdk.localedata modules depend on it.

Naoto

On 3/23/20 12:03 PM, Magnus Ihse Bursie wrote:
> The build tools (small java tools that are run during the build to 
> generate source code, or data, needed in the JDK) have historically been 
> placed in the "make" directory. This may have made sense a long time ago, 
> but it no longer does.
> 
> Instead, the build tools source code should move to the module that 
> needs them. For instance, compilefontconfig should move to java.desktop, 
> etc.
> 
> There are multiple reasons for this:
> 
> * Currently we build *all* build tools at once, which means that we 
> cannot compile java.base until e.g. the compilefontconfig tool is 
> compiled, even though it is not needed.
> 
> * If a build tool, e.g. compilefontconfig, is modified, all build tools 
> are recompiled, which triggers a rebuild of more or less the entire JDK. 
> This makes development of the build tools unnecessarily tedious.
> 
> * When the build tools are modified, the group owning the corresponding 
> module is the proper review instance, not the build team. But since they 
> reside under "make", the review mails often include build-dev, but this 
> is mostly noise for us. With this move, the ownership is made clear.
> 
> In this patch, I have not modified how and when the build tools are 
> compiled, but this shuffle is the prerequisite for continuing with that 
> in a follow-up patch.
> 
> I have also moved the build tools to the org.openjdk.buildtools.* 
> package name space (inspired by Skara), instead of the strangely named 
> build.tools.* name space.
> 
> A few build tools are not moved in this patch. Two of them, 
> charsetmapping and cldrconverter, are shared between two modules. (I 
> think they should move to modules nevertheless, but they need some more 
> thought to make sure I do this right.) The rest are tools that are 
> needed for the build in general, like linking or javadoc support. I'll 
> move this to a better location too, but in a separate patch.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8241463
> WebRev: 
> http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01 
> 
> 
> /Magnus
> 

From serguei.spitsyn at oracle.com  Mon Mar 23 22:39:51 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 15:39:51 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <28d38995-b4e6-f6db-ed33-dd2d2e8a11eb@oracle.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
Message-ID: <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200323/647914c0/attachment.htm>

From suenaga at oss.nttdata.com  Tue Mar 24 00:08:09 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 24 Mar 2020 09:08:09 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <f6d2ef25-becc-3060-f10a-445f275ee69d@oracle.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
 <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
Message-ID: <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>

Hi Serguei,

Thanks for your comment!
I uploaded a new webrev:

   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/

I also pushed it to the submit repo:

   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1

On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
> Hi Yasumasa,
> 
> The mach5 tier5 testing looks good.
> The serviceability/sa/ClhsdbPstack.java test fails without the fix and passes with it.
> 
> Thanks,
> Serguei
> 
> 
> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> I looked at your changes.
>> It is hard to understand if this fully solves the issue.
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>
>> @@ -34,10 +34,11 @@
>>   
>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>         Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>         DwarfParser dwarf = null;
>> + boolean unsupportedDwarf = false;
>>   
>>         if (libptr != null) { // Native frame
>>           try {
>>             dwarf = new DwarfParser(libptr);
>>             dwarf.processDwarf(rip);
>> ------------------------------------------------------------------------
>> @@ -45,24 +46,33 @@
>>                    !dwarf.isBPOffsetAvailable())
>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>                                .addOffsetTo(dwarf.getCFAOffset());
>>           } catch (DebuggerException e) {
>> - // Bail out to Java frame case
>> + if (dwarf != null) {
>> + // DWARF processing should succeed when the frame is native
>> + // but it might fail if CIE has language personality routine
>> + // and/or LSDA.
>> + dwarf = null;
>> + unsupportedDwarf = true;
>> + } else {
>> + throw e;
>> + }
>>           }
>>         }
>>   
>>         return (cfa == null) ? null
>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>      }
>>
>> @@ -121,13 +131,25 @@
>>        }
>>   
>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>      }
>>   
>> - private DwarfParser getNextDwarf(Address nextPC) {
>> - DwarfParser nextDwarf = null;
>> + @Override
>> + public CFrame sender(ThreadProxy thread) {
>> + if (!possibleNext) {
>> + return null;
>> + }
>> +
>> + ThreadContext context = thread.getContext();
>> +
>> + Address nextPC = getNextPC(dwarf != null);
>> + if (nextPC == null) {
>> + return null;
>> + }
>>   
>> + DwarfParser nextDwarf = null;
>> + boolean unsupportedDwarf = false;
>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>          nextDwarf = dwarf;
>>        } else {
>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>          if (libptr != null) {
>> ------------------------------------------------------------------------
>> @@ -138,33 +160,29 @@
>>            }
>>          }
>>        }
>>   
>>        if (nextDwarf != null) {
>> + try {
>>          nextDwarf.processDwarf(nextPC);
>> + } catch (DebuggerException e) {
>> + // DWARF processing should succeed when the frame is native
>> + // but it might fail if CIE has language personality routine
>> + // and/or LSDA.
>> + nextDwarf = null;
>> + unsupportedDwarf = true;
>>        }
>>
>> This fix looks like a hack.
>> Should we just propagate the Debugging exception instead of trying to maintain unsupportedDwarf flag?

DwarfParser::processDwarf throws DebuggerException if it cannot find DWARF information for the given PC.
The PC at this point belongs to the next frame, so the current frame (the `this` object) is still valid and should be processed.


>> Also, I don't like that DWARF-specific abbreviations (like CIE, FDE, LSDA, etc.) are used without any comments explaining them.
>> The code has to be generally readable without looking into the DWARF spec each time.

I added comments for them in this webrev.


Thanks,

Yasumasa


>> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>> Thanks Chris!
>>> I'm waiting for reviewers for this change.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> The failure is due to JDK-8231634, so not something you need to worry about.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> I uploaded new webrev which includes reverting change for ProblemList:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>
>>>>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>> However, I think it is not caused by this change: ClhsdbJstackXcompStress.java tests `jhsdb jstack` without mixed mode, so it would not parse DWARF.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it:
>>>>>>
>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>> @@ -115,7 +115,7 @@
>>>>>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
>>>>>>> So please review it:
>>>>>>>
>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>> Thank you so much, David!
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit repo.
>>>>>>>>>>> Could you try again?
>>>>>>>>>>>
>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>
>>>>>>>>>>> webrev is here:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>
>>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :)
>>>>>>>>>
>>>>>>>>> Seems to have passed okay.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>
>>>>>>>>>>>> #
>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>> #
>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>>>>> #
>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>> #
>>>>>>>>>>>>
>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>> -----
>>>>>>>>>>>>
>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result.
>>>>>>>>>>>>>> I will share it when I get the job ID.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>>>>>>>>>>>>    B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>>>>>>>>>>>>>>>>>> I also added bailout code for the case where DWARF processing fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
> 

From chris.plummer at oracle.com  Tue Mar 24 06:34:38 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 23 Mar 2020 23:34:38 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
Message-ID: <b542b056-6de2-1ad0-b54d-9029c80d5164@oracle.com>

Hi Roman,

I assume JVMTI maintains separate tagging data for each agent so having 
two agents doing tagging won't result in confusion. I didn't actually 
find this in the spec. Would be nice to confirm that it is the case. 
However, your implementation does seem to conflict with other uses of 
tagging in the debug agent:

- During the execution of ObjectReference.ReferringObjects, the object 
being checked is tagged. If this happens to be a Class instance, the tag 
you set up will end up being cleared.

- During the execution of VirtualMachine.InstanceCounts, each Class 
instance being counted is tagged. So that means your tag is cleared for 
any Class passed to this API.

- SetTag is used in commonRef.c. I believe that for any Object for which 
an objectID is created and sent to the front end (debugger), a weakref 
to that object is created and tagged with a RefNode*. So you will have many 
Weakref objects with tags. When these are freed, they are passed to 
cbTrackingObjectFree() and these tags are incorrectly added to 
deletedSignatures(). This means you end up treating a RefNode* as a 
char* in synthesizeUnloadEvent(), and a ClassUnload event gets created 
with garbage for the classname. I also think this could cause issues 
when eventually this RefNode* is passed to jvmtiDeallocate(). However, I 
think you have a bug where you never actually free up signatures for 
Classes that get unloaded. Only signatures for loaded classes seem to 
get deleted, and that is done when the agent detaches.
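
A minimal sketch of the hazard described in the last bullet, assuming nothing about the real agent code: JVMTI tags are plain 64-bit integers (jlong), so a tag that holds a pointer carries no record of what it points to. The names sketch_tag, RefNodeSketch, ptr_to_tag, tag_to_ptr and assume_signature are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for JVMTI's jlong tag type (a 64-bit signed integer). */
typedef int64_t sketch_tag;

/* Hypothetical stand-in for a second kind of tagged data (not the real RefNode). */
typedef struct RefNodeSketch {
    int64_t seqNum;
    int     count;
} RefNodeSketch;

/* Store a pointer in a tag, the way a signature pointer could be stored. */
static sketch_tag ptr_to_tag(const void *p) {
    /* The pointer must fit in the 64-bit tag for the round trip to work. */
    assert(sizeof(intptr_t) <= sizeof(sketch_tag));
    return (sketch_tag)(intptr_t)p;
}

/* Recover the pointer from a tag. Nothing in the tag says what it points to. */
static void *tag_to_ptr(sketch_tag tag) {
    return (void *)(intptr_t)tag;
}

/* What a free callback does if it assumes that every tag is a signature string. */
static const char *assume_signature(sketch_tag tag) {
    return (const char *)tag_to_ptr(tag);
}
```

A callback that receives only the tag value, as ObjectFree does, gets back the same address whether it was stored as a char* signature or as a node pointer, so it needs some other way to tell the two apart.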

What would cause classTrack_addPreparedClass() to be called for a Class 
you've already seen? I don't understand the need for the "tag != 0l" check.

thanks,

Chris

On 3/20/20 12:52 PM, Chris Plummer wrote:
> On 3/20/20 8:30 AM, Roman Kennke wrote:
>> I believe I came up with a much simpler solution that also solves the
>> problems of the existing one, and the ones I proposed earlier.
>>
>> It turns out that we can take advantage of the fact that we can use
>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>> to the signature of a class into the tag, and pull it out again when we
>> get notified that the class gets unloaded.
>>
>> This means we don't need an extra data-structure to keep track of
>> classes and signatures, and it also makes the story around locking
>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>> classes needed (as in the current implementation) and no searching of
>> table needed (like in my previous attempts).
>>
>> Please review this new revision:
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
> I'll have a look at this.
>>
>> (Notice that there still appears to be a performance bottleneck with
>> class-unloading when an actual debugger is attached. This doesn't seem
>> to be related to the classTrack.c implementation though, but looks like
>> a consequence of getting all those class-unload notifications over the
>> wire. My testcase generates 1000s of them, and it's clogging up the
>> buffers.)
> At least this is only a one-shot hit when the classes are unloaded, 
> and the performance hit is based on the number of classes being 
> unloaded. The main issue is happening every GC, and is O(n) where n is 
> the number of loaded classes.
>> I am not sure why jdb needs to enable class-unload listener always. A
>> simple hack disables it, and performance is brilliant, even when jdb is
>> attached:
>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
> This is JDI, not jdb. It looks like it needs ClassUnload events so it 
> can maintain typesBySignature, which is used by public APIs like 
> allClasses(). So we have caching of loaded classes both in the debug 
> agent and in JDI.
>
> Chris
>> But this is not in the scope of this bug.)
>>
>> Roman
>>
>>
>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>> Sorry, forgot to complete my comments at the end (see below).
>>>
>>>
>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>> Hi Roman,
>>>>
>>>> Thank you for the update and sorry for the latency in review.
>>>>
>>>> Some comments are below.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>
>>>>
>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>> 88 {
>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>> 90 if (currentClassTag == -1) {
>>>> 91 // Class tracking not initialized, nobody's interested
>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>> 93 return;
>>>> 94 }
>>>> Just a question:
>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>       the class tracking if class tracking has not been initialized?
>>>>
>>>> 70 static jlong currentClassTag;
>>>> I'm thinking if the name is better to be something like:
>>>> lastClassTag or highestClassTag.
>>>> 99 KlassNode* klass = *klass_ptr;
>>>> 100
>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>> 103 klass_ptr = &klass->next;
>>>> 104 klass = *klass_ptr;
>>>> 105 }
>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>> 108 return;
>>>> 109 }
>>>>   It seems to me, something is wrong in the condition at L106 above.
>>>>   Should it be:
>>>>      if (klass == NULL || klass->klass_tag != tag)
>>>>
>>>>   Otherwise, how can the second check ever work correctly as the return
>>>> will always happen when (klass != NULL)?
>>>>
>>>>   There are several places in this file with the wrong indent:
>>>> 90 if (currentClassTag == -1) {
>>>> 91 // Class tracking not initialized, nobody's interested
>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>> 93 return;
>>>> 94 }
>>>>   ...
>>>> 152 if (currentClassTag == -1) {
>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>> 155 return;
>>>> 156 }
>>>>   ...
>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>> 163 }
>>>> 164 if (tag != 0l) {
>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>> 166 return; // Already added
>>>> 167 }
>>>>   ...
>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>> 282 {
>>>> 283 char* sig = (char*)signatureVoid;
>>>> 284 jvmtiDeallocate(sig);
>>>> 285 return JNI_TRUE;
>>>> 286 }
>>>>   ...
>>>> 291 void
>>>> 292 classTrack_reset(void)
>>>> 293 {
>>>> 294 int idx;
>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>> 296
>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>> 298 KlassNode* node = table[idx];
>>>> 299 while (node != NULL) {
>>>> 300 KlassNode* next = node->next;
>>>> 301 jvmtiDeallocate(node->signature);
>>>> 302 jvmtiDeallocate(node);
>>>> 303 node = next;
>>>> 304 }
>>>> 305 }
>>>> 306 jvmtiDeallocate(table);
>>>> 307
>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>> 310
>>>> 311 currentClassTag = -1;
>>>> 312
>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>> 314 trackingEnv = NULL;
>>>> 315
>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>
>>>> Could you, please, fix several comments below?
>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>   The comma is not needed.
>>>>   Would it be better to replace: klass tags => klass_tag's ?
>>>>
>>>>
>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>
>>>> 84 * Callback when classes are freed, Finds the signature and
>>>> remembers it in deletedSignatureBag.
>>>>   It would be better to use words like "store" or "record", and "Find"
>>>>   should not start with a capital letter:
>>>>   Invoke the callback when classes are freed, find and record the
>>>>   signature in deletedSignatureBag.
>>>>
>>>> 96 // Find deleted KlassNode
>>>> 133 // Class tracking not initialized, nobody's interested
>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>> 158 /* Check this is not a duplicate */
>>>>   The period at the end is missing.
>>>>
>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>   Conversely, the period is not needed here, as the comment does not
>>>>   start with a capital letter.
>>>>
>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>> 112 // in klass, and the pointer to it in klass_node.
>>>   The comment above could be better. Maybe something like:
>>>     "At this point, we found the KlassNode matching the klass tag
>>>     (and it is linked)."
>>>
>>>> 113 // Remember the unloaded signature.
>>>   Better: Record the signature of the unloaded class and unlink it.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>> Hello all,
>>>>>
>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>> more testing and also field-/torture-testing by a customer who is 
>>>>> happy
>>>>> now. :-)
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>
>>>>>> Hi Serguei,
>>>>>>
>>>>>> Thanks for reviewing!
>>>>>>
>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>> It also includes a fix to allow re-connecting an agent after 
>>>>>> disconnect,
>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>> _activate() to ensure have those structures after re-connect.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>
>>>>>> Let me know what you think!
>>>>>> Roman
>>>>>>
>>>>>>> Hi Roman,
>>>>>>>
>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>
>>>>>>> I have a couple of quick comments.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>
>>>>>>>
>>>>>>> 72 /*
>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>> 74 */
>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>> 76
>>>>>>> 77 /*
>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>> 79 * deletedTagLock,
>>>>>>> 80 */
>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>
>>>>>>>    The comments contradict each other.
>>>>>>>    I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>    instead of deletedTagLock.
>>>>>>>    Also, the comma at the end must be replaced with a period.
>>>>>>>
>>>>>>>
>>>>>>> 101 // Tag not found? Ignore.
>>>>>>> 102 if (klass == NULL) {
>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>> 104 return;
>>>>>>> 105 }
>>>>>>>   106
>>>>>>> 107 // Scan linked-list.
>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>> 111 klass = *klass_ptr;
>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>   113     }
>>>>>>> 114
>>>>>>> 115 // Tag not found? Ignore.
>>>>>>> 116 if (found_tag != tag) {
>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>> 118 return;
>>>>>>>   119     }
>>>>>>>
>>>>>>>
>>>>>>>   The code above can be simplified so that lines 101-105 are no
>>>>>>>   longer needed.
>>>>>>>   It can be something like this:
>>>>>>>
>>>>>>>     // Scan linked-list.
>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>         klass_ptr = &klass->next;
>>>>>>>         klass = *klass_ptr;
>>>>>>>     }
>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>         return;
>>>>>>>     }
>>>>>>>
>>>>>>> It will take more time before I get a chance to look at the rest.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>> class-tracking is active and return an appropriate result (e.g. 
>>>>>>>> an empty
>>>>>>>> list) when we're not.
>>>>>>>>
>>>>>>>> Updated webrev:
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>
>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, 
>>>>>>>>> and we
>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>> - Prepared classes are kept in a data structure that is a table,
>>>>>>>>> with each entry being the head of a linked list of KlassNode*. The
>>>>>>>>> table is indexed by tag % slot-count, and we simply prepend the new
>>>>>>>>> KlassNode*. This is an O(1) operation.
>>>>>>>>> - When we get notified of unloading a class, we look up the 
>>>>>>>>> signature of
>>>>>>>>> the reported tag in that table, and remember it in a bag. The 
>>>>>>>>> KlassNode*
>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) 
>>>>>>>>> operation
>>>>>>>>> too, depending on the depth of the table. In my testcase which 
>>>>>>>>> hammered
>>>>>>>>> the code with class-loads and unloads, I usually see depths of 
>>>>>>>>> like 2-3,
>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>> - when processUnloads() gets called, we simply hand out that 
>>>>>>>>> bag, and
>>>>>>>>> allocate a new one.
>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid 
>>>>>>>>> leaking the
>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached 
>>>>>>>>> and/or
>>>>>>>>> re-attached (was missing before).
>>>>>>>>> - I also added locks around data-structure-manipulation (was 
>>>>>>>>> missing
>>>>>>>>> before).
>>>>>>>>> - Also, I only activate this whole process when an actual 
>>>>>>>>> listener gets
>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when 
>>>>>>>>> attaching a
>>>>>>>>> jdb, not sure why jdb does that though. This may be something 
>>>>>>>>> to improve
>>>>>>>>> in the future?
>>>>>>>>>
>>>>>>>>> In my tests, the performance of class-tracking itself looks 
>>>>>>>>> really good.
>>>>>>>>> The bottleneck now is clearly actual synthesizing the 
>>>>>>>>> class-unload
>>>>>>>>> events. I don't see how this can be helped when the debug 
>>>>>>>>> agent asks for it?
>>>>>>>>>
>>>>>>>>> Updated webrev:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>
>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing 
>>>>>>>>>> the even more
>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for 
>>>>>>>>>> now.
>>>>>>>>>>
>>>>>>>>>> Thanks, Roman
>>>>>>>>>>
>>>>>>>>>> Hi Chris,
>>>>>>>>>>>> I'll have a look at this, although it might not be for a 
>>>>>>>>>>>> few days. In
>>>>>>>>>>>> the meantime, maybe you can describe your new 
>>>>>>>>>>>> implementation in
>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>> Sure.
>>>>>>>>>>>
>>>>>>>>>>> The purpose of this class-tracking is to be able to 
>>>>>>>>>>> determine the
>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading 
>>>>>>>>>>> happened, so that
>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>
>>>>>>>>>>> The current implementation does so by maintaining a table of 
>>>>>>>>>>> currently
>>>>>>>>>>> prepared classes by building that table when classTrack is 
>>>>>>>>>>> initialized,
>>>>>>>>>>> and then add new classes whenever a class gets loaded. When 
>>>>>>>>>>> unloading
>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared 
>>>>>>>>>>> with the
>>>>>>>>>>> old table, and whatever is in the old, but not in the new 
>>>>>>>>>>> table gets
>>>>>>>>>>> returned. The problem is that when GCs happen frequently 
>>>>>>>>>>> and/or many
>>>>>>>>>>> classes get loaded+unloaded, this amounts to 
>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>> complexity.
>>>>>>>>>>>
>>>>>>>>>>> The new implementation keeps a linked-list of prepared 
>>>>>>>>>>> classes, and also
>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). 
>>>>>>>>>>> Whenever an
>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, 
>>>>>>>>>>> and classes
>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus 
>>>>>>>>>>> maintaining the
>>>>>>>>>>> prepared-classes-list) and its signature put in the list 
>>>>>>>>>>> that gets returned.
>>>>>>>>>>>
>>>>>>>>>>> The implementation is not perfect. In order to determine 
>>>>>>>>>>> whether or not
>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. 
>>>>>>>>>>> That process is
>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here 
>>>>>>>>>>> is that
>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this 
>>>>>>>>>>> seems to be
>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>
>>>>>>>>>>> (I have some ideas how to improve the implementation to 
>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>> would be considerably more complex: have to maintain a 
>>>>>>>>>>> (hash)table that
>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, 
>>>>>>>>>>> and build the
>>>>>>>>>>> unloaded-signatures list there, but I don't currently see 
>>>>>>>>>>> that it's
>>>>>>>>>>> worth the effort).
>>>>>>>>>>>
>>>>>>>>>>> In addition to all that, this process is only activated when 
>>>>>>>>>>> there's an
>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. 
>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps 
>>>>>>>>>>>>> track of
>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an 
>>>>>>>>>>>>> agent
>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and 
>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>
>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>
>
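
The table-based tracking described in the quoted webrev.03 notes above (a table indexed by tag % slot-count, new nodes prepended, nodes unlinked on unload) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the webrev: SLOT_COUNT, track_add() and track_remove() are hypothetical names, int64_t stands in for jlong, malloc/free stand in for the JVMTI allocator, and the locking around the table is left out. The lookup loop uses the corrected "klass == NULL" check suggested in the review.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263  /* hypothetical table size, chosen for illustration */

typedef struct KlassNode {
    int64_t klass_tag;        /* stand-in for the jlong tag */
    char *signature;          /* heap copy owned by the node */
    struct KlassNode *next;   /* next node in the same slot */
} KlassNode;

static KlassNode *table[SLOT_COUNT];

/* Register a prepared class: prepend to the slot's list, which is O(1).
 * Returns 1 on success and 0 if allocation fails. */
static int track_add(int64_t tag, const char *signature) {
    assert(tag >= 0);
    size_t len = strlen(signature) + 1;
    KlassNode *node = malloc(sizeof *node);
    char *copy = malloc(len);
    if (node == NULL || copy == NULL) {
        free(node);
        free(copy);
        return 0;
    }
    memcpy(copy, signature, len);
    size_t slot = (size_t)(tag % SLOT_COUNT);
    node->klass_tag = tag;
    node->signature = copy;
    node->next = table[slot];
    table[slot] = node;
    return 1;
}

/* Handle an unload notification: find the node for the tag, unlink it,
 * and hand its signature to the caller (who must free it).
 * Returns NULL when the tag is unknown. */
static char *track_remove(int64_t tag) {
    assert(tag >= 0);
    KlassNode **klass_ptr = &table[(size_t)(tag % SLOT_COUNT)];
    KlassNode *klass = *klass_ptr;
    /* Scan the linked list of this slot. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {
        return NULL;  /* klass not found - ignore */
    }
    char *signature = klass->signature;
    *klass_ptr = klass->next;  /* unlink from the slot's list */
    free(klass);
    return signature;
}
```

The cost of track_remove() depends on the length of the list in one slot, which matches the "~O(1), depending on the depth of the table" remark in the quoted description.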


From serguei.spitsyn at oracle.com  Tue Mar 24 06:56:49 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Mar 2020 23:56:49 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <FBCAD683-2A99-415D-8926-8AEF560EE55A@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
 <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
 <c00b118a-8619-3984-10d7-63134d7210a0@oss.nttdata.com>
 <FBCAD683-2A99-415D-8926-8AEF560EE55A@oracle.com>
Message-ID: <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com>

Hi Daniil,

It looks pretty good in general.

It looks like you removed the last call site of DebugServer.main.
Do we need to remove the DebugServer.java as well?

Thanks,
Serguei


On 3/22/20 15:29, Daniil Titov wrote:
> Hi Yasumasa, Serguei and Alex,
>
> Please review a new version of the webrev that merges SADebugDTest.java with the changes done in [2].
>
> Also the CSR [3] and the help message for the debug server in SALauncher.java were updated to specify that the '--hostname'
> option could be a hostname or an IPv4/IPv6 address.
>
>   >  Ok, but I think it might be simpler with TestLibrary.
>   >   For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
>
> TestLibrary::getUnusedRandomPort() doesn't allow specifying which ports are reserved; it uses a hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as the reserved ports. Besides, the test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be used directly in test/hotspot/jtreg/serviceability/* tests (it doesn't compile).
>
> Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int... reservedPorts) from SADebugDTest.java to jdk.test.lib.Utils in /test/lib.
>
> Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>
> [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/
> [2] https://bugs.openjdk.java.net/browse/JDK-8238268
> [3] https://bugs.openjdk.java.net/browse/JDK-8239831
>
> Thank you,
> Daniil
>
> On 3/13/20, 7:23 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>
>      Hi Daniil,
>      
>      On 2020/03/14 7:05, Daniil Titov wrote:
>      > Hi Yasumasa, Serguei and Alex,
>      >
>      > Please review a new version of the webrev that includes the changes Yasumasa suggested.
>      >
>      >> Shutdown hook is already registered in c'tor of HotSpotAgent.
>      >>     It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
>      >
>      > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
>      > the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression.
>      >
>      >     public HotSpotAgent() {
>      >         // for non-server add shutdown hook to clean-up debugger in case
>      >         // of forced exit. For remote server, shutdown hook is added by
>      >         // DebugServer.
>      >         Runtime.getRuntime().addShutdownHook(new java.lang.Thread(
>      >         new Runnable() {
>      >             public void run() {
>      >                 synchronized (HotSpotAgent.this) {
>      >                     if (!isServer) {
>      >                         detach();
>      >                     }
>      >                 }
>      >             }
>      >         }));
>      >     }
>      
>      I missed it, thanks!
>      
>      
>      >>>     Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains
>      >>> `exclusiveAccess.dirs=.` to avoid concurrent execution
>      > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests.  Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays.
>      
>      Ok, but I think it might be simpler with TestLibrary.
>      For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
>      
>      
>      Thanks,
>      
>      Yasumasa
>      
>      
>      > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >
>      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/
>      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >
>      > Thank you,
>      > Daniil
>      >
>      > On 3/6/20, 6:15 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >
>      >      Hi Daniil,
>      >
>      >      On 2020/03/07 3:38, Daniil Titov wrote:
>      >      > Hi Yasumasa,
>      >      >
>      >      >   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
>      >      > I think that having a piece of code that invokes  a method  named "buildAttachArgs" with a copy of the argument map  just for its side-effect ( it throws an exception if parameters are incorrect)  and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name .
>      >
>      >      Ok, but I would prefer to leave a comment on it.
>      >
>      >
>      >      >   > SADebugDTest
>      >      >   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
>      >      > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final.
>      >      > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array.
>      >
>      >      Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
>      >      If you drop this error check, the test code becomes simpler.
>      >
>      >
>      >      > I will include your other suggestion in the new version of the webrev.
>      >
>      >      Sorry, I have one more comment:
>      >
>      >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
>      >
>      >      Shutdown hook is already registered in c'tor of HotSpotAgent.
>      >      It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
>      >
>      >
>      >      Thanks,
>      >
>      >      Yasumasa
>      >
>      >
>      >      > Thanks!
>      >      > Daniil
>      >      >
>      >      > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >      >
>      >      >      Hi Daniil,
>      >      >
>      >      >
>      >      >      - SALauncher.java
>      >      >           - checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
>      >      >           - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex.
>      >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
>      >      >
>      >      >      - SADebugDTest.java
>      >      >           - Please add bug ID to @bug.
>      >      >           - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
>      >      >
>      >      >
>      >      >      Thanks,
>      >      >
>      >      >      Yasumasa
>      >      >
>      >      >
>      >      >      On 2020/03/06 10:15, Daniil Titov wrote:
>      >      >      > Hi Yasumasa, Serguei and Alex,
>      >      >      >
>      >      >      > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
>      >      >      > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
>      >      >      > last two settings could be specified using the system properties but the system properties have the following disadvantages
>      >      >      > compared to the command line options:
>      >      >      >     -  It's hard to know about them: they are not listed in the tool's help.
>      >      >      >     -  They have long names that are hard to remember.
>      >      >      >     -  It is easy to mistype them on the command line, and you will not get any warning about it.
>      >      >      >
>      >      >      > The CSR [2] was also updated and needs to be reviewed.
>      >      >      >
>      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      >      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >      >
>      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
>      >      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      >      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >      >
>      >      >      > Thank you,
>      >      >      > Daniil
>      >      >      >
>      >      >      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >      >      >
>      >      >      >      Hi Daniil,
>      >      >      >
>      >      >      >         - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments.
>      >      >      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
>      >      >      >
>      >      >      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
>      >      >      >           But you can use same port number as RMI registry (1099).
>      >      >      >           It is same as relation between jmxremote.port and jmxremote.rmi.port.
>      >      >      >
>      >      >      >
>      >      >      >      Thanks,
>      >      >      >
>      >      >      >      Yasumasa
>      >      >      >
>      >      >      >
>      >      >      >      On 2020/02/24 13:21, Daniil Titov wrote:
>      >      >      >      > Please review a change that adds a new command line option to the jhsdb tool for the debugd mode to specify an RMI connector port.
>      >      >      >      > Currently a random port is used, which prevents the debug server from being used behind a firewall or in a container.
>      >      >      >      >
>      >      >      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
>      >      >      >      >
>      >      >      >      > Man pages for jhsdb will be updated in a separate issue.
>      >      >      >      >
>      >      >      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
>      >      >      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
>      >      >      >      >
>      >      >      >      >                // delegate to the actual SA debug server.
>      >      >      >      >   367         DebugServer.main(newArgArray.toArray(new String[0]));
>      >      >      >      >
>      >      >      >      > However, sun.jvm.hotspot.DebugServer doesn't support named options, which makes it hard to add new options to the tool efficiently.
>      >      >      >      > I found it more suitable to start Hotspot agent directly in  SALauncher rather than  adding a new option in  both sun.jvm.hotspot.SALauncher
>      >      >      >      >   and sun.jvm.hotspot.DebugServer and  delegating the call.  With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated
>      >      >      >      > but I would prefer to address it in a separate issue.
>      >      >      >      >
>      >      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      >      >      >                  container  and connecting  to it with the GUI debugger.
>      >      >      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >      >      >
>      >      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
>      >      >      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      >      >      >
>      >      >      >      > Thank you,
>      >      >      >      > Daniil
>      >      >      >      >
>      >      >      >      >
>      >      >      >
>      >      >      >
>      >      >      >
>      >      >
>      >      >
>      >      >
>      >
>      >
>      >
>      
>
>


From rkennke at redhat.com  Tue Mar 24 08:56:17 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 24 Mar 2020 09:56:17 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <b542b056-6de2-1ad0-b54d-9029c80d5164@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
 <b542b056-6de2-1ad0-b54d-9029c80d5164@oracle.com>
Message-ID: <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com>

Hi Chris,

> I assume JVMTI maintains separate tagging data for each agent so having
> two agents doing tagging won't result in confusion. I didn't actually
> find this in the spec. Would be nice to confirm that it is the case.
> However, your implementation does seem to conflict with other uses of
> tagging in the debug agent:

The tagging data is per-jvmtiEnv. We create and use our own env (private
to class-tracking), so this wouldn't conflict with other uses of tags.
Could it be a problem that we have a single trackingEnv per JVM, though?
/me scratches head.

> What would cause classTrack_addPreparedClass() to be called for a Class
> you've already seen? I don't understand the need for the "tag != 0l" check.

It's probably not needed; it may be a leftover from previous iterations
of this implementation. I will check it and turn it into an assert or so.

Thanks,
Roman
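
For readers following the thread, the pointer-as-tag idea from the webrev.06 description quoted below can be sketched as follows. This is a rough illustration only, not the code from the webrev: int64_t stands in for JVMTI's jlong, signature_to_tag() and tag_to_signature() are hypothetical helper names, and malloc/free stand in for the agent's allocator. In the real agent the tag would be handed to SetTag on the private tracking env and come back in the ObjectFree callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JVMTI's jlong tag type (a 64-bit signed integer). */
typedef int64_t tag_t;

/* Copy the signature and store the address of the copy in the tag.
 * Returns 0 (the "untagged" value) if allocation fails. */
static tag_t signature_to_tag(const char *signature) {
    /* The round trip only works if a pointer fits into the tag. */
    assert(sizeof(intptr_t) <= sizeof(tag_t));
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return (tag_t)(intptr_t)copy;
}

/* Recover the signature when the unload notification delivers the tag.
 * The caller takes ownership and must free the returned string. */
static char *tag_to_signature(tag_t tag) {
    return (char *)(intptr_t)tag;
}
```

Because the tag itself carries the signature, no lookup table is needed when a class is unloaded; the callback only has to convert the tag back and record the string.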

> thanks,
> 
> Chris
> 
> On 3/20/20 12:52 PM, Chris Plummer wrote:
>> On 3/20/20 8:30 AM, Roman Kennke wrote:
>>> I believe I came up with a much simpler solution that also solves the
>>> problems of the existing one, and the ones I proposed earlier.
>>>
>>> It turns out that we can take advantage of the fact that we can use
>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>> to the signature of a class into the tag, and pull it out again when we
>>> get notified that the class gets unloaded.
>>>
>>> This means we don't need an extra data-structure to keep track of
>>> classes and signatures, and it also makes the story around locking
>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>> classes needed (as in the current implementation) and no searching of
>>> table needed (like in my previous attempts).
>>>
>>> Please review this new revision:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>> I'll have a look at this.
>>>
>>> (Notice that there still appears to be a performance bottleneck with
>>> class-unloading when an actual debugger is attached. This doesn't seem
>>> to be related to the classTrack.c implementation though, but looks like
>>> a consequence of getting all those class-unload notifications over the
>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>> buffers.)
>> At least this is only a one-shot hit when the classes are unloaded,
>> and the performance hit is based on the number of classes being
>> unloaded. The main issue is happening every GC, and is O(n) where n is
>> the number of loaded classes.
>>> I am not sure why jdb needs to enable class-unload listener always. A
>>> simple hack disables it, and performance is brilliant, even when jdb is
>>> attached:
>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>> This is JDI, not jdb. It looks like it needs ClassUnload events so it
>> can maintain typesBySignature, which is used by public APIs like
>> allClasses(). So we have caching of loaded classes both in the debug
>> agent and in JDI.
>>
>> Chris
>>> But this is not in the scope of this bug.)
>>>
>>> Roman
>>>
>>>
>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>
>>>>
>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Roman,
>>>>>
>>>>> Thank you for the update and sorry for the latency in review.
>>>>>
>>>>> Some comments are below.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>
>>>>>
>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>    88 {
>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>> 90 if (currentClassTag == -1) {
>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>> 93 return;
>>>>>    94     }
>>>>> Just a question:
>>>>>    Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>        the class tracking if class tracking has not been initialized?
>>>>>
>>>>> 70 static jlong currentClassTag;
>>>>>   I wonder whether a name like lastClassTag or highestClassTag would be
>>>>>   better.
>>>>>
>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>> 100
>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>> 103 klass_ptr = &klass->next;
>>>>> 104 klass = *klass_ptr;
>>>>> 105 }
>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>> 108 return;
>>>>>   109     }
>>>>>   It seems to me that something is wrong in the condition at L106 above.
>>>>>   Should it be:
>>>>>      if (klass == NULL || klass->klass_tag != tag)
>>>>>
>>>>>   Otherwise, how can the second check ever work correctly, as the return
>>>>>   will always happen when (klass != NULL)?
>>>>>
>>>>>   There are several places in this file with the indent:
>>>>> 90 if (currentClassTag == -1) {
>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>> 93 return;
>>>>>    94     }
>>>>>   ...
>>>>> 152 if (currentClassTag == -1) {
>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>> 155 return;
>>>>>   156     }
>>>>>   ...
>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>   163     }
>>>>> 164 if (tag != 0l) {
>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>> 166 return; // Already added
>>>>>   167     }
>>>>>   ...
>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>> 282 {
>>>>> 283 char* sig = (char*)signatureVoid;
>>>>> 284 jvmtiDeallocate(sig);
>>>>> 285 return JNI_TRUE;
>>>>>   286 }
>>>>>   ...
>>>>>   291 void
>>>>>   292 classTrack_reset(void)
>>>>>   293 {
>>>>> 294 int idx;
>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>> 296
>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>> 298 KlassNode* node = table[idx];
>>>>> 299 while (node != NULL) {
>>>>> 300 KlassNode* next = node->next;
>>>>> 301 jvmtiDeallocate(node->signature);
>>>>> 302 jvmtiDeallocate(node);
>>>>> 303 node = next;
>>>>> 304 }
>>>>> 305 }
>>>>> 306 jvmtiDeallocate(table);
>>>>> 307
>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>> 310
>>>>> 311 currentClassTag = -1;
>>>>> 312
>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>> 314 trackingEnv = NULL;
>>>>> 315
>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>
>>>>> Could you, please, fix several comments below?
>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for
>>>>> class-unloads
>>>>>   The comma is not needed.
>>>>>   Would it be better to replace: klass tags => klass_tag's?
>>>>>
>>>>>
>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>>>>> consistent
>>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>
>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>> remembers it in deletedSignatureBag.
>>>>>   Would be better to use words like "store" or "record"; "Find"
>>>>> should not start with a capital letter:
>>>>>   Invoke the callback when classes are freed, find and record the
>>>>> signature in deletedSignatureBag.
>>>>>
>>>>> 96 // Find deleted KlassNode
>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>> 158 /* Check this is not a duplicate */
>>>>>   Missed dot at the end.
>>>>>
>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>   On the contrary, the dot is not needed here, as the comment does
>>>>> not start with a capital letter.
>>>>>
>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>   The comment above can be better. Maybe, something like:
>>>>     "At this point, we found the KlassNode matching the klass
>>>> tag (and it is linked)."
>>>>
>>>>> 113 // Remember the unloaded signature.
>>>>   Better: Record the signature of the unloaded class and unlink it.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>> Hello all,
>>>>>>
>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>> more testing and also field-/torture-testing by a customer who is
>>>>>> happy
>>>>>> now. :-)
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> Thanks for reviewing!
>>>>>>>
>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>> disconnect,
>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>
>>>>>>> Let me know what you think!
>>>>>>> Roman
>>>>>>>
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>
>>>>>>>> I have a couple of quick comments.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>
>>>>>>>>
>>>>>>>> 72 /*
>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>> 74 */
>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>> 76
>>>>>>>> 77 /*
>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>> 79 * deletedTagLock,
>>>>>>>> 80 */
>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>
>>>>>>>>    The comments contradict each other.
>>>>>>>>    I guess, the lock name at line 79 has to be deletedSignatureLock
>>>>>>>> instead of deletedTagLock.
>>>>>>>>    Also, the comma at the end must be replaced with a dot.
>>>>>>>>
>>>>>>>>
>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>> 102 if (klass == NULL) {
>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 104 return;
>>>>>>>> 105 }
>>>>>>>> 106
>>>>>>>> 107 // Scan linked-list.
>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>   113     }
>>>>>>>> 114
>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 118 return;
>>>>>>>>   119     }
>>>>>>>>
>>>>>>>>
>>>>>>>>   The code above can be simplified, so that lines 101-105 are not
>>>>>>>> needed anymore.
>>>>>>>>   It can be something like this:
>>>>>>>>
>>>>>>>> // Scan linked-list.
>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>> klass_ptr = &klass->next;
>>>>>>>> klass = *klass_ptr;
>>>>>>>>       }
>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>>>> return;
>>>>>>>>       }
>>>>>>>>
>>>>>>>> It will take more time when I get a chance to look at the rest.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>> class-tracking is active and return an appropriate result (e.g.
>>>>>>>>> an empty
>>>>>>>>> list) when we're not.
>>>>>>>>>
>>>>>>>>> Updated webrev:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>
>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag,
>>>>>>>>>> and we
>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>> - Prepared classes are kept in a data structure that is a
>>>>>>>>>> table, with
>>>>>>>>>> each entry being the head of a linked list of KlassNode*. The
>>>>>>>>>> table is
>>>>>>>>>> indexed by tag % slot-count, and the new KlassNode* is simply
>>>>>>>>>> prepended.
>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>> - When we get notified of unloading a class, we look up the
>>>>>>>>>> signature of
>>>>>>>>>> the reported tag in that table, and remember it in a bag. The
>>>>>>>>>> KlassNode*
>>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1)
>>>>>>>>>> operation
>>>>>>>>>> too, depending on the depth of the table. In my testcase which
>>>>>>>>>> hammered
>>>>>>>>>> the code with class-loads and unloads, I usually see depths of
>>>>>>>>>> like 2-3,
>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>> - when processUnloads() gets called, we simply hand out that
>>>>>>>>>> bag, and
>>>>>>>>>> allocate a new one.
>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>> leaking the
>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached
>>>>>>>>>> and/or
>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>> - I also added locks around data-structure-manipulation (was
>>>>>>>>>> missing
>>>>>>>>>> before).
>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>> listener gets
>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when
>>>>>>>>>> attaching a
>>>>>>>>>> jdb, not sure why jdb does that though. This may be something
>>>>>>>>>> to improve
>>>>>>>>>> in the future?
>>>>>>>>>>
>>>>>>>>>> In my tests, the performance of class-tracking itself looks
>>>>>>>>>> really good.
>>>>>>>>>> The bottleneck now is clearly actually synthesizing the
>>>>>>>>>> class-unload
>>>>>>>>>> events. I don't see how this can be helped when the debug
>>>>>>>>>> agent asks for it?
>>>>>>>>>>
>>>>>>>>>> Updated webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>
>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>> the even more
>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for
>>>>>>>>>>> now.
>>>>>>>>>>>
>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>
>>>>>>>>>>>   Hi Chris,
>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>> Sure.
>>>>>>>>>>>>
>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>> determine the
>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>> happened, so that
>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>
>>>>>>>>>>>> The current implementation does so by maintaining a table of
>>>>>>>>>>>> currently
>>>>>>>>>>>> prepared classes by building that table when classTrack is
>>>>>>>>>>>> initialized,
>>>>>>>>>>>> and then adding new classes whenever a class gets loaded. When
>>>>>>>>>>>> unloading
>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared
>>>>>>>>>>>> with the
>>>>>>>>>>>> old table, and whatever is in the old, but not in the new
>>>>>>>>>>>> table gets
>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>> and/or many
>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>> complexity.
>>>>>>>>>>>>
>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>> classes, and also
>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>> Whenever an
>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned,
>>>>>>>>>>>> and classes
>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>> maintaining the
>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>
>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>> whether or not
>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>> That process is
>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>> is that
>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>> seems to be
>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>
>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>> and build the
>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>> that it's
>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>
>>>>>>>>>>>> In addition to all that, this process is only activated when
>>>>>>>>>>>> there's an
>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c.
>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>
>>
> 
> 
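
The table-of-linked-lists scheme described in the quoted thread (index by
tag % slot-count, prepend on class prepare, unlink on unload) can be
sketched in plain C roughly as follows. This is only a minimal illustration
under stated assumptions, not the code from the webrev: SLOT_COUNT,
track_add and track_remove are hypothetical names, tags are modelled as
unsigned long instead of jlong, malloc/free stand in for the JVMTI
allocator, and the locking discussed in the review is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical slot count; the value used by the actual patch is not shown here. */
#define SLOT_COUNT 263

typedef struct KlassNode {
    unsigned long tag;        /* stands in for the jlong JVMTI tag */
    char *signature;          /* heap copy of the class signature */
    struct KlassNode *next;   /* next node in the same slot */
} KlassNode;

static KlassNode *table[SLOT_COUNT];

/* Register a prepared class by prepending to its slot: O(1).
 * Returns 1 on success, 0 if allocation fails. */
static int track_add(unsigned long tag, const char *signature) {
    assert(signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    size_t len = strlen(signature) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->tag = tag;
    node->next = table[tag % SLOT_COUNT];
    table[tag % SLOT_COUNT] = node;
    return 1;
}

/* On unload: find the node for the tag, unlink and free it.
 * Returns the signature (caller frees it) or NULL if the tag is unknown. */
static char *track_remove(unsigned long tag) {
    KlassNode **link = &table[tag % SLOT_COUNT];
    KlassNode *node = *link;
    /* Scan the slot's list; stop at the matching tag or at the end. */
    while (node != NULL && node->tag != tag) {
        link = &node->next;
        node = *link;
    }
    if (node == NULL) {
        return NULL;          /* tag not found: ignore */
    }
    *link = node->next;       /* unlink from the list */
    char *signature = node->signature;
    free(node);
    return signature;
}
```

Because the loop only exits with node == NULL or with a matching tag, the
single NULL test after it is enough, which is the same point the review
makes about the condition at line 106. The cost of track_remove depends on
the chain length in one slot, matching the "~O(1)" estimate in the thread.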


From magnus.ihse.bursie at oracle.com  Tue Mar 24 12:12:45 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Tue, 24 Mar 2020 13:12:45 +0100
Subject: RFR: JDK-8241463 Move build tools to respective modules
In-Reply-To: <bea03feb-9ead-7261-0385-01d18a70688e@oracle.com>
References: <ff05910f-9656-94e6-2aaa-ce722443017f@oracle.com>
 <bea03feb-9ead-7261-0385-01d18a70688e@oracle.com>
Message-ID: <01a1c812-1adb-a4a2-3db0-327f76c0b21c@oracle.com>

On 2020-03-23 23:15, naoto.sato at oracle.com wrote:
> Hi Magnus,
>
> I looked at i18n related changes:
>
> make/CopyInterimTZDB.gmk
> make/ToolsJdk.gmk
> make/gendata/Gendata-java.base.gmk
> make/gendata/GendataBreakIterator.gmk
> make/gendata/GendataTZDB.gmk
> make/gensrc/GensrcCharacterData.gmk
> make/gensrc/GensrcEmojiData.gmk
>
> They look ok to me.
Thank you!
>
> The *.java changes should have copyright year update.
Ok, I'll update them.
>
> As to charsetmapping and cldrconverter, I believe they can reside in 
> java.base, as jdk.charsets and jdk.localedata modules depend on it.
Okay. It's not ideal, but I think you're right. I'll move them as well.

I'll publish an updated webrev with these changes when there's agreement 
on where in the source code tree to move the files.

/Magnus
>
> Naoto
>
> On 3/23/20 12:03 PM, Magnus Ihse Bursie wrote:
>> The build tools (small java tools that are run during the build to 
>> generate source code, or data, needed in the JDK) have historically 
>> been placed in the "make" directory. This may have made sense a long time 
>> ago, but does not do so anymore.
>>
>> Instead, the build tools source code should move to the module that 
>> needs them. For instance, compilefontconfig should move to 
>> java.desktop, etc.
>>
>> There are multiple reasons for this:
>>
>> * Currently we build *all* build tools at once, which means that we 
>> cannot compile java.base until e.g. the compilefontconfig tool is 
>> compiled, even though it is not needed.
>>
>> * If a build tool, e.g. compilefontconfig is modified, all build 
>> tools are recompiled, which triggers a rebuild of more or less the 
>> entire JDK. This makes development of the build tools unnecessarily 
>> tedious.
>>
>> * When the build tools are modified, the group owning the 
>> corresponding module is the proper review instance, not the build 
>> team. But since they reside under "make", the review mails often 
>> include build-dev, but this is mostly noise for us. With this move, 
>> the ownership is made clear.
>>
>> In this patch, I have not modified how and when the build tools are 
>> compiled, but this shuffle is the prerequisite for continuing with 
>> that in a follow-up patch.
>>
>> I have also moved the build tools to the org.openjdk.buildtools.* 
>> package name space (inspired by Skara), instead of the strangely 
>> named build.tools.* name space.
>>
>> A few build tools are not moved in this patch. Two of them, 
>> charsetmapping and cldrconverter, are shared between two modules. (I 
>> think they should move to modules nevertheless, but they need some 
>> more thought to make sure I do this right.) The rest are tools that 
>> are needed for the build in general, like linking or javadoc support. 
>> I'll move this to a better location too, but in a separate patch.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8241463
>> WebRev: 
>> http://cr.openjdk.java.net/~ihse/JDK-8241463-move-build-tools-to-modules/webrev.01 
>>
>>
>> /Magnus
>>


From serguei.spitsyn at oracle.com  Tue Mar 24 16:39:34 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Mar 2020 09:39:34 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <623695a0-e6ae-cac3-571b-775d6f80462a@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
 <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
 <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>
Message-ID: <f45b2bc7-a0ca-d85b-5998-1e30e99d0d36@oracle.com>

Hi Yasumasa,

I'm okay with this update.
My mach5 test run for this patch has passed.

Thanks,
Serguei


On 3/23/20 17:08, Yasumasa Suenaga wrote:
> Hi Serguei,
>
> Thanks for your comment!
> I uploaded new webrev:
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>
> Also I pushed it to submit repo:
>
>   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>
> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> The mach5 tier5 testing looks good.
>> The serviceability/sa/ClhsdbPstack.java test fails without the fix and 
>> passes with it.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>> Hi Yasumasa,
>>>
>>> I looked at your changes.
>>> It is hard to understand if this fully solves the issue.
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>
>>>
>>> @@ -34,10 +34,11 @@
>>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, 
>>> Address rip, ThreadContext context) {
>>>        Address libptr = dbg.findLibPtrByAddress(rip);
>>>        Address cfa = 
>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>        DwarfParser dwarf = null;
>>> + boolean unsupportedDwarf = false;
>>>        if (libptr != null) { // Native frame
>>>          try {
>>>            dwarf = new DwarfParser(libptr);
>>>            dwarf.processDwarf(rip);
>>>
>>> @@ -45,24 +46,33 @@
>>>                   !dwarf.isBPOffsetAvailable())
>>>                      ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>                      : context.getRegisterAsAddress(dwarf.getCFARegister())
>>> .addOffsetTo(dwarf.getCFAOffset());
>>>          } catch (DebuggerException e) {
>>> - // Bail out to Java frame case
>>> + if (dwarf != null) {
>>> + // DWARF processing should succeed when the frame is native
>>> + // but it might fail if CIE has language personality routine
>>> + // and/or LSDA.
>>> + dwarf = null;
>>> + unsupportedDwarf = true;
>>> + } else {
>>> + throw e;
>>> + }
>>>          }
>>>        }
>>>        return (cfa == null) ? null
>>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>      }
>>>
>>> @@ -121,13 +131,25 @@
>>>        }
>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>      }
>>> - private DwarfParser getNextDwarf(Address nextPC) {
>>> - DwarfParser nextDwarf = null;
>>> + @Override
>>> + public CFrame sender(ThreadProxy thread) {
>>> + if (!possibleNext) {
>>> + return null;
>>> + }
>>> +
>>> + ThreadContext context = thread.getContext();
>>> +
>>> + Address nextPC = getNextPC(dwarf != null);
>>> + if (nextPC == null) {
>>> + return null;
>>> + }
>>> + DwarfParser nextDwarf = null;
>>> + boolean unsupportedDwarf = false;
>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>          nextDwarf = dwarf;
>>>        } else {
>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>          if (libptr != null) {
>>>
>>> @@ -138,33 +160,29 @@
>>>            }
>>>          }
>>>        }
>>>        if (nextDwarf != null) {
>>> + try {
>>>          nextDwarf.processDwarf(nextPC);
>>> + } catch (DebuggerException e) {
>>> + // DWARF processing should succeed when the frame is native
>>> + // but it might fail if CIE has language personality routine
>>> + // and/or LSDA.
>>> + nextDwarf = null;
>>> + unsupportedDwarf = true;
>>>        }
>>>
>>> This fix looks like a hack.
>>> Should we just propagate the Debugging exception instead of trying 
>>> to maintain unsupportedDwarf flag?
>
> DwarfParser::processDwarf would throw DebuggerException if it cannot 
> find DWARF which relates to PC.
> PC at this point is for next frame. So current frame (`this` object) 
> is valid, and it should be processed.
>
>
>>> Also, I don't like that DWARF-specific abbreviations (like CIE, 
>>> FDE, LSDA, etc.) are used without any comments explaining them.
>>> The code has to be generally readable without looking into the DWARF 
>>> spec each time.
>
> I added comments for them in this webrev.
>
>
> Thanks,
>
> Yasumasa
>
>
>>> I'm submitting mach5 jobs to make sure the issue has been resolved 
>>> with your fix.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>> Thanks Chris!
>>>> I'm waiting for reviewers for this change.
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> The failure is due to JDK-8231634, so not something you need to 
>>>>> worry about.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> I uploaded new webrev which includes reverting change for 
>>>>>> ProblemList:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>
>>>>>> I tested it on submit repo 
>>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>> However I think it is not caused by this change because 
>>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed 
>>>>>> mode, it would not parse DWARF.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> The test has been problem listed so please add undoing this to 
>>>>>>> your webrev. Here's the diff that problem listed it:
>>>>>>>
>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt 
>>>>>>> b/test/hotspot/jtreg/ProblemList.txt
>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>   serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>   serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>   serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 
>>>>>>> solaris-all,linux-all
>>>>>>>   serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 
>>>>>>> 8193639 solaris-all
>>>>>>>   serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 
>>>>>>> solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>   serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> This webrev has passed submit repo 
>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and 
>>>>>>>> additional tests.
>>>>>>>> So please review it:
>>>>>>>>
>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>   webrev: 
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>> Thank you so much, David!
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>
>>>>>>>>>>>> I missed a loop condition, so I fixed it and pushed to submit 
>>>>>>>>>>>> repo.
>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>
>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>
>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> Test job resubmitted. Will advise results if it completes 
>>>>>>>>>>> before I go to bed :)
>>>>>>>>>>
>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime 
>>>>>>>>>>>>> Environment:
>>>>>>>>>>>>> #
>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, 
>>>>>>>>>>>>> tid=13704
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) 
>>>>>>>>>>>>> (fastdebug build 
>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed 
>>>>>>>>>>>>> mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] 
>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>> #
>>>>>>>>>>>>>
>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then 
>>>>>>>>>>>>>>>> go and run additional internal tests (and even more 
>>>>>>>>>>>>>>>> builds) using that job.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet 
>>>>>>>>>>>>>>> received the result.
>>>>>>>>>>>>>>> I will share it with you when I get the job ID.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to 
>>>>>>>>>>>>>> complete before submitting the additional tests.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when 
>>>>>>>>>>>>>>>>> DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the 
>>>>>>>>>>>>>>>>>>>> code, but I'm putting the patch through our 
>>>>>>>>>>>>>>>>>>>> internal testing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java 
>>>>>>>>>>>>>>>>>>> Runtime Environment:
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, 
>>>>>>>>>>>>>>>>>>> pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment 
>>>>>>>>>>>>>>>>>>> (15.0) (fastdebug build 
>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM 
>>>>>>>>>>>>>>>>>>> (fastdebug 
>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, 
>>>>>>>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 gc, 
>>>>>>>>>>>>>>>>>>> linux-amd64)
>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] 
>>>>>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always 
>>>>>>>>>>>>>>>>>>> crash now.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs 
>>>>>>>>>>>>>>>>>> of the test in linux-x64. I don't see a pattern as to 
>>>>>>>>>>>>>>>>>> where it fails versus passes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for 
>>>>>>>>>>>>>>>>>>>>> unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently 
>>>>>>>>>>>>>>>>>>>>> since then.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two 
>>>>>>>>>>>>>>>>>>>>> concerns:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    A: lack of a range check on the buffer 
>>>>>>>>>>>>>>>>>>>>>       (.eh_frame section data)
>>>>>>>>>>>>>>>>>>>>>    B: Language personality routine and Language 
>>>>>>>>>>>>>>>>>>>>>       Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and 
>>>>>>>>>>>>>>>>>>>>> ignore the personality routine and LSDA in this webrev.
>>>>>>>>>>>>>>>>>>>>> I also added bailout code for when DWARF processing 
>>>>>>>>>>>>>>>>>>>>> fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle 
>>>>>>>>>>>>>>>>>>>>> Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>


From daniil.x.titov at oracle.com  Tue Mar 24 17:00:29 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 24 Mar 2020 10:00:29 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
 <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
 <c00b118a-8619-3984-10d7-63134d7210a0@oss.nttdata.com>
 <FBCAD683-2A99-415D-8926-8AEF560EE55A@oracle.com>
 <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com>
Message-ID: <202C5C21-BA69-4ACF-9421-A9B5D6704C8C@oracle.com>

Hi Serguei,

>    It looks like you removed the last call site of DebugServer.main.

Yes, that is correct.

>    Do we need to remove the DebugServer.java as well?
I was considering this, but since it is a public class I think it needs to be deprecated first. I also think it would be better to do that in a separate issue,
since a CSR for the deprecation needs to be filed. If you agree, I will create a new issue for it.

Thanks,
Daniil


On 3/23/20, 11:56 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:

    Hi Daniil,
    
    It looks pretty good in general.
    
    It looks like you removed the last call site of DebugServer.main.
    Do we need to remove the DebugServer.java as well?
    
    Thanks,
    Serguei
    
    
    On 3/22/20 15:29, Daniil Titov wrote:
    > Hi Yasumasa, Serguei and Alex,
    >
    > Please review a new version of the webrev [1] that merges SADebugDTest.java with the changes done in [2].
    >
    > Also the CSR [3] and the help message for the debug server in SALauncher.java were updated to specify that the '--hostname'
    > option could be a hostname or an IPv4/IPv6 address.
    >
    >   >  Ok, but I think it might be simpler with TestLibrary.
    >   >   For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
    >
    > TestLibrary::getUnusedRandomPort() doesn't allow specifying which ports are reserved, and it uses a hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, the test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be used directly in test/hotspot/jtreg/serviceability/* tests (it doesn't compile).
    >
    > Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int... reservedPorts) from SADebugDTest.java to jdk.test.lib.Utils in /test/lib.
    >
    > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >
    > [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/
    > [2] https://bugs.openjdk.java.net/browse/JDK-8238268
    > [3] https://bugs.openjdk.java.net/browse/JDK-8239831
    >
    > Thank you,
    > Daniil
    >
    > On 3/13/20, 7:23 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >
    >      Hi Daniil,
    >      
    >      On 2020/03/14 7:05, Daniil Titov wrote:
    >      > Hi Yasumasa, Serguei and Alex,
    >      >
    >      > Please review a new version of the webrev that includes the changes Yasumasa suggested.
    >      >
    >      >> Shutdown hook is already registered in c'tor of HotSpotAgent.
    >      >>     It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
    >      >
    >      > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
    >      > the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression.
    >      >
    >      >     public HotSpotAgent() {
    >      >         // for non-server add shutdown hook to clean-up debugger in case
    >      >         // of forced exit. For remote server, shutdown hook is added by
    >      >         // DebugServer.
    >      >         Runtime.getRuntime().addShutdownHook(new java.lang.Thread(
    >      >         new Runnable() {
    >      >             public void run() {
    >      >                 synchronized (HotSpotAgent.this) {
    >      >                     if (!isServer) {
    >      >                         detach();
    >      >                     }
    >      >                 }
    >      >             }
    >      >         }));
    >      >     }
    >      
    >      I missed it, thanks!
    >      
    >      
    >      >>>     Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains
    >      >>> `exclusiveAccess.dirs=.` to avoid concurrent execution
    >      > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests.  Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays.
    >      
    >      Ok, but I think it might be simpler with TestLibrary.
    >      For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
    >      
    >      
    >      Thanks,
    >      
    >      Yasumasa
    >      
    >      
    >      > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >
    >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/
    >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >
    >      > Thank you,
    >      > Daniil
    >      >
    >      > On 3/6/20, 6:15 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >
    >      >      Hi Daniil,
    >      >
    >      >      On 2020/03/07 3:38, Daniil Titov wrote:
    >      >      > Hi Yasumasa,
    >      >      >
    >      >      >   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
    >      >      > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side effect (it throws an exception if parameters are incorrect) and ignores its return value might look confusing. Thus, I found it more appropriate to wrap it inside a method with a more relevant name.
    >      >
    >      >      Ok, but I would prefer to leave a comment on it.
    >      >
    >      >
    >      >      >   > SADebugDTest
    >      >      >   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
    >      >      > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final.
    >      >      > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array.
    >      >
    >      >      Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
    >      >      If you drop this error check, the test code becomes simpler.
    >      >
    >      >
    >      >      > I will include your other suggestion in the new version of the webrev.
    >      >
    >      >      Sorry, I have one more comment:
    >      >
    >      >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
    >      >
    >      >      Shutdown hook is already registered in c'tor of HotSpotAgent.
    >      >      It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
    >      >
    >      >
    >      >      Thanks,
    >      >
    >      >      Yasumasa
    >      >
    >      >
    >      >      > Thanks!
    >      >      > Daniil
    >      >      >
    >      >      > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >      >
    >      >      >      Hi Daniil,
    >      >      >
    >      >      >
    >      >      >      - SALauncher.java
    >      >      >           - checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
    >      >      >           - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex.
    >      >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
    >      >      >
    >      >      >      - SADebugDTest.java
    >      >      >           - Please add bug ID to @bug.
    >      >      >           - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
    >      >      >
    >      >      >
    >      >      >      Thanks,
    >      >      >
    >      >      >      Yasumasa
    >      >      >
    >      >      >
    >      >      >      On 2020/03/06 10:15, Daniil Titov wrote:
    >      >      >      > Hi Yasumasa, Serguei and Alex,
    >      >      >      >
    >      >      >      > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
    >      >      >      > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
    >      >      >      > last two settings could be specified using the system properties but the system properties have the following disadvantages
    >      >      >      > comparing to the command line options:
    >      >      >      >     -  It's hard to know about them: they are not listed in the tool's help.
    >      >      >      >     -  They have long names that are hard to remember.
    >      >      >      >     -  It is easy to mistype them on the command line, and you will not get any warning about it.
    >      >      >      >
    >      >      >      > The CSR [2] was also updated and needs to be reviewed.
    >      >      >      >
    >      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >      >      >      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >      >      >
    >      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
    >      >      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >      >      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >      >      >
    >      >      >      > Thank you,
    >      >      >      > Daniil
    >      >      >      >
    >      >      >      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >      >      >
    >      >      >      >      Hi Daniil,
    >      >      >      >
    >      >      >      >         - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments.
    >      >      >      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
    >      >      >      >
    >      >      >      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
    >      >      >      >           But you can use same port number as RMI registry (1099).
    >      >      >      >           It is same as relation between jmxremote.port and jmxremote.rmi.port.
    >      >      >      >
    >      >      >      >
    >      >      >      >      Thanks,
    >      >      >      >
    >      >      >      >      Yasumasa
    >      >      >      >
    >      >      >      >
    >      >      >      >      On 2020/02/24 13:21, Daniil Titov wrote:
    >      >      >      >      > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
    >      >      >      >      > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
    >      >      >      >      >
    >      >      >      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
    >      >      >      >      >
    >      >      >      >      > Man pages for jhsdb will be updated in a separate issue.
    >      >      >      >      >
    >      >      >      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
    >      >      >      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
    >      >      >      >      >
    >      >      >      >      >         // delegate to the actual SA debug server.
    >      >      >      >      >         DebugServer.main(newArgArray.toArray(new String[0]));
    >      >      >      >      >
    >      >      >      >      > However,  sun.jvm.hotspot.DebugServer  doesn't support named options and that prevents from efficiently adding new options to the tool.
    >      >      >      >      > I found it more suitable to start Hotspot agent directly in  SALauncher rather than  adding a new option in  both sun.jvm.hotspot.SALauncher
    >      >      >      >      >   and sun.jvm.hotspot.DebugServer and  delegating the call.  With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated
    >      >      >      >      > but I would prefer to address it in a separate issue.
    >      >      >      >      >
    >      >      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >      >      >      >      >                  container  and connecting  to it with the GUI debugger.
    >      >      >      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >      >      >      >
    >      >      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
    >      >      >      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >      >      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >      >      >      >
    >      >      >      >      > Thank you,
    >      >      >      >      > Daniil
    >      >      >      >      >
    >      >      >      >      >
    >      >      >      >
    >      >      >      >
    >      >      >      >
    >      >      >
    >      >      >
    >      >      >
    >      >
    >      >
    >      >
    >      
    >
    >
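For reference, the findUnreservedFreePort(int... reservedPorts) helper mentioned in the quoted thread could look roughly like the sketch below. This is not the code from the webrev or from jdk.test.lib.Utils: the class name FreePortSketch and the retry limit are assumptions made purely for illustration. It asks the OS for an unused port by binding a ServerSocket to port 0 and rejects any port in the reserved set.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.Arrays;

// Hypothetical helper; the class name and retry limit are assumptions, not JDK code.
class FreePortSketch {
    private static final int MAX_ATTEMPTS = 100; // arbitrary bound for this sketch

    // Returns a port that was free when probed and is not listed in reservedPorts.
    static int findUnreservedFreePort(int... reservedPorts) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            final int candidate;
            // Binding to port 0 lets the OS pick a currently unused port.
            try (ServerSocket socket = new ServerSocket(0)) {
                candidate = socket.getLocalPort();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (Arrays.stream(reservedPorts).noneMatch(p -> p == candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no unreserved free port found");
    }

    public static void main(String[] args) {
        // 1099 is the default RMI registry port discussed in the thread.
        System.out.println(findUnreservedFreePort(1099));
    }
}
```

The socket is closed before the port number is returned, so another process can still grab the port in the meantime. That is why a retry mechanism in the test remains useful even with such a helper.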
    
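As an illustration of the shutdown hook discussion in the quoted thread, here is a minimal sketch of the same pattern written with a lambda instead of an anonymous Runnable. AgentSketch is a hypothetical stand-in, not the real HotSpotAgent or SALauncher code; its fields and accessor names are assumptions.

```java
// Hypothetical stand-in for an agent; not the real HotSpotAgent.
class AgentSketch {
    private final boolean isServer;
    private boolean detached;
    private final Thread hook;

    AgentSketch(boolean isServer) {
        this.isServer = isServer;
        // The lambda replaces the anonymous Runnable. Inside a lambda,
        // 'this' refers to the enclosing AgentSketch instance.
        this.hook = new Thread(() -> {
            synchronized (this) {
                if (!this.isServer) {
                    detach();
                }
            }
        });
        Runtime.getRuntime().addShutdownHook(hook);
    }

    synchronized void detach() {
        detached = true;
    }

    synchronized boolean isDetached() {
        return detached;
    }

    // Exposed only so the hook body can be exercised without exiting the JVM.
    Thread shutdownHook() {
        return hook;
    }

    public static void main(String[] args) {
        AgentSketch agent = new AgentSketch(false);
        System.out.println("hook registered: " + (agent.shutdownHook() != null));
    }
}
```

As in the quoted constructor, the hook only detaches when the agent is not a server, which is why a separate hook is still needed for the remote server case.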
    
From daniel.daugherty at oracle.com  Tue Mar 24 17:01:29 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 24 Mar 2020 13:01:29 -0400
Subject: RFR(T): 8241532: ProblemList tests from 8241530 on OSX
Message-ID: <59631faa-ee70-c95f-7c9f-91f24cd0d1de@oracle.com>

Greetings,

I have a trivial review for ProblemListing some tests.

We're having some network issues with the new OSX 10.15 machines that
are being addressed. In the meantime, I'm trying to reduce the noise
in the CI in Tier5 and Tier6 so I'm ProblemListing the affected tests:

$ hg diff
diff -r 23dab0354eb0 test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt    Tue Mar 24 17:39:52 2020 +0100
+++ b/test/jdk/ProblemList.txt    Tue Mar 24 12:57:43 2020 -0400
@@ -604,6 +604,10 @@
 com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java 8030957 aix-all
 com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java 8030957 aix-all

+sun/management/jdp/JdpDefaultsTest.java 8241530 macosx-all
+sun/management/jdp/JdpJmxRemoteDynamicPortTest.java 8241530 macosx-all
+sun/management/jdp/JdpSpecificAddressTest.java 8241530 macosx-all
+
 ############################################################################

 # jdk_jmx
@@ -924,6 +928,9 @@

 com/sun/jdi/InvokeHangTest.java 8218463 linux-all

+com/sun/jdi/JdwpAttachTest.java 8241530 macosx-all
+com/sun/jdi/JdwpListenTest.java 8241530 macosx-all
+
 ############################################################################

 # jdk_time

Thanks, in advance, for any comments, questions or suggestions.

Dan


From christian.tornqvist at oracle.com  Tue Mar 24 17:03:31 2020
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Tue, 24 Mar 2020 10:03:31 -0700
Subject: RFR(T): 8241532: ProblemList tests from 8241530 on OSX
In-Reply-To: <59631faa-ee70-c95f-7c9f-91f24cd0d1de@oracle.com>
References: <59631faa-ee70-c95f-7c9f-91f24cd0d1de@oracle.com>
Message-ID: <F7C23387-123D-42B8-BBE0-CE1E0205FC81@oracle.com>

Looks good, thanks for doing this.

Thanks,
Christian

> On Mar 24, 2020, at 10:01 AM, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
> 
> Greetings,
> 
> I have a trivial review for ProblemListing some tests.
> 
> We're having some network issues with the new OSX 10.15 machines that
> are being addressed. In the mean time, I'm trying to reduce the noise
> in the CI in Tier5 and Tier6 so I'm ProblemListing the affected tests:
> 
> $ hg diff
> diff -r 23dab0354eb0 test/jdk/ProblemList.txt
> --- a/test/jdk/ProblemList.txt    Tue Mar 24 17:39:52 2020 +0100
> +++ b/test/jdk/ProblemList.txt    Tue Mar 24 12:57:43 2020 -0400
> @@ -604,6 +604,10 @@
>  com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java 8030957 aix-all
>  com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java 8030957 aix-all
> 
> +sun/management/jdp/JdpDefaultsTest.java 8241530 macosx-all
> +sun/management/jdp/JdpJmxRemoteDynamicPortTest.java 8241530 macosx-all
> +sun/management/jdp/JdpSpecificAddressTest.java 8241530 macosx-all
> +
>  ############################################################################
> 
>  # jdk_jmx
> @@ -924,6 +928,9 @@
> 
>  com/sun/jdi/InvokeHangTest.java 8218463 linux-all
> 
> +com/sun/jdi/JdwpAttachTest.java 8241530 macosx-all
> +com/sun/jdi/JdwpListenTest.java 8241530 macosx-all
> +
>  ############################################################################
> 
>  # jdk_time
> 
> Thanks, in advance, for any comments, questions or suggestions.
> 
> Dan
> 


From daniel.daugherty at oracle.com  Tue Mar 24 17:04:23 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 24 Mar 2020 13:04:23 -0400
Subject: RFR(T): 8241532: ProblemList tests from 8241530 on OSX
In-Reply-To: <F7C23387-123D-42B8-BBE0-CE1E0205FC81@oracle.com>
References: <59631faa-ee70-c95f-7c9f-91f24cd0d1de@oracle.com>
 <F7C23387-123D-42B8-BBE0-CE1E0205FC81@oracle.com>
Message-ID: <5e7bf2c8-ad77-949d-c984-b63cf0ace03a@oracle.com>

Thanks for the fast review!

Dan


On 3/24/20 1:03 PM, Christian Tornqvist wrote:
> Looks good, thanks for doing this.
>
> Thanks,
> Christian
>
>> On Mar 24, 2020, at 10:01 AM, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
>>
>> Greetings,
>>
>> I have a trivial review for ProblemListing some tests.
>>
>> We're having some network issues with the new OSX 10.15 machines that
>> are being addressed. In the mean time, I'm trying to reduce the noise
>> in the CI in Tier5 and Tier6 so I'm ProblemListing the affected tests:
>>
>> $ hg diff
>> diff -r 23dab0354eb0 test/jdk/ProblemList.txt
>> --- a/test/jdk/ProblemList.txt    Tue Mar 24 17:39:52 2020 +0100
>> +++ b/test/jdk/ProblemList.txt    Tue Mar 24 12:57:43 2020 -0400
>> @@ -604,6 +604,10 @@
>>   com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java 8030957 aix-all
>>   com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java 8030957 aix-all
>>
>> +sun/management/jdp/JdpDefaultsTest.java 8241530 macosx-all
>> +sun/management/jdp/JdpJmxRemoteDynamicPortTest.java 8241530 macosx-all
>> +sun/management/jdp/JdpSpecificAddressTest.java 8241530 macosx-all
>> +
>>   ############################################################################
>>
>>   # jdk_jmx
>> @@ -924,6 +928,9 @@
>>
>>   com/sun/jdi/InvokeHangTest.java 8218463 linux-all
>>
>> +com/sun/jdi/JdwpAttachTest.java 8241530 macosx-all
>> +com/sun/jdi/JdwpListenTest.java 8241530 macosx-all
>> +
>>   ############################################################################
>>
>>   # jdk_time
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>


From chris.plummer at oracle.com  Tue Mar 24 20:35:01 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Mar 2020 13:35:01 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
 <b542b056-6de2-1ad0-b54d-9029c80d5164@oracle.com>
 <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com>
Message-ID: <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com>

Hi Roman,

On 3/24/20 1:56 AM, Roman Kennke wrote:
> Hi Chris,
>
>> I assume JVMTI maintains separate tagging data for each agent so having
>> two agents doing tagging won't result in confusion. I didn't actually
>> find this in the spec. Would be nice to confirm that it is the case.
>> However, your implementation does seem to conflict with other uses of
>> tagging in the debug agent:
> The tagging data is per-jvmtiEnv. We create and use our own env (private
> to class-tracking), so this wouldn't conflict with other uses of tags.
> Could it be a problem that we have a single trackingEnv per JVM, though?
> /me scratches head.
Ok. This is an area I'm not familiar with, but the spec does say:

"Each call to GetEnv creates a new JVM TI connection and thus a new JVM 
TI environment."

So it looks like what you are doing should be ok. I still think you have 
a bug where you are not deallocating signatures of classes that are 
unloaded. If you think otherwise please point out where this is done.

thanks,

Chris
>> What would cause classTrack_addPreparedClass() to be called for a Class
>> you've already seen? I don't understand the need for the "tag != 0l" check.
> It's probably not needed, may be a left-over from previous installments
> of this implementation. I will check it, and turn into an assert or so.
>
> Thanks,
> Roman
>
>> thanks,
>>
>> Chris
>>
>> On 3/20/20 12:52 PM, Chris Plummer wrote:
>>> On 3/20/20 8:30 AM, Roman Kennke wrote:
>>>> I believe I came up with a much simpler solution that also solves the
>>>> problems of the existing one, and the ones I proposed earlier.
>>>>
>>>> It turns out that we can take advantage of the fact that we can use
>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>> to the signature of a class into the tag, and pull it out again when we
>>>> get notified that the class gets unloaded.
>>>>
>>>> This means we don't need an extra data-structure to keep track of
>>>> classes and signatures, and it also makes the story around locking
>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>> classes needed (as in the current implementation) and no searching of
>>>> table needed (like in my previous attempts).
>>>>
>>>> Please review this new revision:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>> I'll have a look at this.
>>>> (Notice that there still appears to be a performance bottleneck with
>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>> to be related to the classTrack.c implementation though, but looks like
>>>> a consequence of getting all those class-unload notifications over the
>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>> buffers.)
>>> At least this is only a one-shot hit when the classes are unloaded,
>>> and the performance hit is based on the number of classes being
>>> unloaded. The main issue is happening every GC, and is O(n) where n is
>>> the number of loaded classes.
>>>> I am not sure why jdb needs to enable class-unload listener always. A
>>>> simple hack disables it, and performance is brilliant, even when jdb is
>>>> attached:
>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>> This is JDI, not jdb. It looks like it needs ClassUnload events so it
>>> can maintain typesBySignature, which is used by public APIs like
>>> allClasses(). So we have caching of loaded classes both in the debug
>>> agent and in JDI.
>>>
>>> Chris
>>>> But this is not in the scope of this bug.)
>>>>
>>>> Roman
>>>>
>>>>
>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>
>>>>>
>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Roman,
>>>>>>
>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>
>>>>>> Some comments are below.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>
>>>>>>
>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>> 88 {
>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>> 90 if (currentClassTag == -1) {
>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>> 93 return;
>>>>>> 94     }
>>>>>> Just a question:
>>>>>>    Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>>        the class tracking if class tracking has not been initialized?
>>>>>>
>>>>>> 70 static jlong currentClassTag;
>>>>>>    I wonder whether a name like lastClassTag or highestClassTag would be better.
>>>>>>
>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>> 100
>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>> 103 klass_ptr = &klass->next;
>>>>>> 104 klass = *klass_ptr;
>>>>>> 105 }
>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>> 108 return;
>>>>>> 109     }
>>>>>>    It seems to me that something is wrong in the condition at L106 above.
>>>>>>    Should it be:
>>>>>>      if (klass == NULL || klass->klass_tag != tag)
>>>>>>
>>>>>>    Otherwise, how can the second check ever work correctly, as the return
>>>>>>    will always happen when (klass != NULL)?
>>>>>>
>>>>>>   There are several places in this file with the indent:
>>>>>> 90 if (currentClassTag == -1) {
>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>> 93 return;
>>>>>> 94     }
>>>>>>   ...
>>>>>> 152 if (currentClassTag == -1) {
>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>> 155 return;
>>>>>> 156     }
>>>>>>   ...
>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>> 163     }
>>>>>> 164 if (tag != 0l) {
>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>> 166 return; // Already added
>>>>>> 167     }
>>>>>>   ...
>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>> 282 {
>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>> 284 jvmtiDeallocate(sig);
>>>>>> 285 return JNI_TRUE;
>>>>>> 286 }
>>>>>>   ...
>>>>>> 291 void
>>>>>> 292 classTrack_reset(void)
>>>>>> 293 {
>>>>>> 294 int idx;
>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>> 296
>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>> 298 KlassNode* node = table[idx];
>>>>>> 299 while (node != NULL) {
>>>>>> 300 KlassNode* next = node->next;
>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>> 302 jvmtiDeallocate(node);
>>>>>> 303 node = next;
>>>>>> 304 }
>>>>>> 305 }
>>>>>> 306 jvmtiDeallocate(table);
>>>>>> 307
>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>> 310
>>>>>> 311 currentClassTag = -1;
>>>>>> 312
>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>> 314 trackingEnv = NULL;
>>>>>> 315
>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>
>>>>>> Could you, please, fix several comments below?
>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for
>>>>>> class-unloads
>>>>>>    The comma is not needed.
>>>>>>    Would it be better to replace: klass tags => klass_tag's ?
>>>>>>
>>>>>>
>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>>>>>> consistent
>>>>>>    Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>
>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>> remembers it in deletedSignatureBag.
>>>>>>    It would be better to use words like "store" or "record", and
>>>>>>    "Find" should not start with a capital letter:
>>>>>>    Invoke the callback when classes are freed, find and record the
>>>>>>    signature in deletedSignatureBag.
>>>>>>
>>>>>> 96 // Find deleted KlassNode
>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>> 158 /* Check this is not a duplicate */
>>>>>>    A period is missing at the end of these comments.
>>>>>>
>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>    Conversely, a period is not needed here, as the comment does not
>>>>>>    start with a capital letter.
>>>>>>
>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>    The comment above could be better. Maybe something like:
>>>>>      "At this point, we found the KlassNode matching the klass tag
>>>>>      (and it is linked)."
>>>>>
>>>>>> 113 // Remember the unloaded signature.
>>>>>    Better: Record the signature of the unloaded class and unlink it.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>> Hello all,
>>>>>>>
>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>> more testing and also field-/torture-testing by a customer who is
>>>>>>> happy
>>>>>>> now. :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> Thanks for reviewing!
>>>>>>>>
>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>> disconnect,
>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>>>> _activate() to ensure have those structures after re-connect.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>
>>>>>>>> Let me know what you think!
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>> Hi Roman,
>>>>>>>>>
>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>
>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 72 /*
>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>> 74 */
>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>> 76
>>>>>>>>> 77 /*
>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be
>>>>>>>>> accessed under
>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>> 80 */
>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>
>>>>>>>>>   The comments contradict each other.
>>>>>>>>>   I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>> instead of deletedTagLock.
>>>>>>>>>   Also, the comma at the end must be replaced with a dot.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 104 return;
>>>>>>>>> 105 }
>>>>>>>>> 106
>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>> 113     }
>>>>>>>>> 114
>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 118 return;
>>>>>>>>> 119     }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   The code above can be simplified so that lines 101-105 are not
>>>>>>>>> needed anymore.
>>>>>>>>>   It can be something like this:
>>>>>>>>>
>>>>>>>>>     // Scan linked-list.
>>>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>         klass_ptr = &klass->next;
>>>>>>>>>         klass = *klass_ptr;
>>>>>>>>>     }
>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>         return;
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> It will take more time before I get a chance to look at the rest.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>>> class-tracking is active and return an appropriate result (e.g.
>>>>>>>>>> an empty
>>>>>>>>>> list) when we're not.
>>>>>>>>>>
>>>>>>>>>> Updated webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>
>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>>>> set up a listener to get notified when it is unloaded.
>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>>>>>> indexed by tag % slot-count, and we simply prepend the new KlassNode*.
>>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>>> - When we get notified of a class being unloaded, we look up the
>>>>>>>>>>> signature of the reported tag in that table and remember it in a bag.
>>>>>>>>>>> The KlassNode* is then unlinked from the table and deallocated. This is
>>>>>>>>>>> a ~O(1) operation too, depending on the depth of the table. In my
>>>>>>>>>>> testcase, which hammered the code with class loads and unloads, I
>>>>>>>>>>> usually see depths of about 2-3, rarely more. It should be ok.
>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that bag and
>>>>>>>>>>> allocate a new one.
>>>>>>>>>>> - I also added cleanup code in classTrack_reset() to avoid leaking the
>>>>>>>>>>> signatures, KlassNode* etc. when the debug agent gets detached and/or
>>>>>>>>>>> re-attached (this was missing before).
>>>>>>>>>>> - I also added locks around data-structure manipulation (these were
>>>>>>>>>>> missing before).
>>>>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching
>>>>>>>>>>> jdb; I am not sure why jdb does that, though. This may be something to
>>>>>>>>>>> improve in the future?
>>>>>>>>>>>
>>>>>>>>>>> In my tests, the performance of class-tracking itself looks
>>>>>>>>>>> really good.
>>>>>>>>>>> The bottleneck now is clearly actually synthesizing the
>>>>>>>>>>> class-unload
>>>>>>>>>>> events. I don't see how this can be helped when the debug
>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>
>>>>>>>>>>> Updated webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>
>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>> the even more
>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for
>>>>>>>>>>>> now.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>> determine the
>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The current implementation does so by maintaining a table of
>>>>>>>>>>>>> currently
>>>>>>>>>>>>> prepared classes by building that table when classTrack is
>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When
>>>>>>>>>>>>> unloading
>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared
>>>>>>>>>>>>> with the
>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new
>>>>>>>>>>>>> table gets
>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned,
>>>>>>>>>>>>> and classes
>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>> That process is
>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>>> is that
>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>>> and build the
>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>> that it's
>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>
>>>>>>>>>>>>> In addition to all that, this process is only activated when
>>>>>>>>>>>>> there's an
>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c.
>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>> unpatched: no debug: 84s  with debug: 225s
>>>>>>>>>>>>>>> patched:   no debug: 85s  with debug: 95s
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>
>>


From rkennke at redhat.com  Tue Mar 24 20:45:16 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 24 Mar 2020 21:45:16 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
 <b542b056-6de2-1ad0-b54d-9029c80d5164@oracle.com>
 <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com>
 <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com>
Message-ID: <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com>

>>> I assume JVMTI maintains separate tagging data for each agent so having
>>> two agents doing tagging won't result in confusion. I didn't actually
>>> find this in the spec. Would be nice to confirm that it is the case.
>>> However, your implementation does seem to conflict with other uses of
>>> tagging in the debug agent:
>> The tagging data is per-jvmtiEnv. We create and use our own env (private
>> to class-tracking), so this wouldn't conflict with other uses of tags.
>> Could it be a problem that we have a single trackingEnv per JVM, though?
>> /me scratches head.
> Ok. This is an area I'm not familiar with, but the spec does say:
> 
> "Each call to GetEnv creates a new JVM TI connection and thus a new JVM
> TI environment."
> 
> So it looks like what you are doing should be ok. I still think you have
> a bug where you are not deallocating signatures of classes that are
> unloaded. If you think otherwise please point out where this is done.

Signatures that make it out of processUnloading() are deallocated in
eventHandler.c, in synthesizeUnload(), right after they have been used.

http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527

Pending signatures on debug-agent-disconnect are deallocated in
classTrack.c, in the reset() routine.

Thanks,
Roman

> thanks,
> 
> Chris
>>> What would cause classTrack_addPreparedClass() to be called for a Class
>>> you've already seen? I don't understand the need for the "tag != 0l"
>>> check.
>> It's probably not needed; it may be a left-over from previous iterations
>> of this implementation. I will check it, and turn it into an assert or so.
>>
>> Thanks,
>> Roman
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/20/20 12:52 PM, Chris Plummer wrote:
>>>> On 3/20/20 8:30 AM, Roman Kennke wrote:
>>>>> I believe I came up with a much simpler solution that also solves the
>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>
>>>>> It turns out that we can take advantage of the fact that we can use
>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>> explicitly
>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>> to the signature of a class into the tag, and pull it out again
>>>>> when we
>>>>> get notified that the class gets unloaded.
>>>>>
>>>>> This means we don't need an extra data-structure to keep track of
>>>>> classes and signatures, and it also makes the story around locking
>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>>> classes needed (as in the current implementation) and no searching of
>>>>> table needed (like in my previous attempts).
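The pointer-in-tag idea described above can be sketched in plain C as
follows. This is only an illustration under stated assumptions: it does
not call JVMTI at all, tag_t stands in for jlong, and make_tag(),
signature_from_tag(), tag_for_signature() and on_object_free() are
hypothetical names, not functions from classTrack.c or webrev.06.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JNI's jlong (a signed 64-bit integer); assumption for this sketch. */
typedef int64_t tag_t;

/* Pack a heap-allocated signature pointer into a tag. */
static tag_t make_tag(char *signature) {
    return (tag_t)(intptr_t)signature;
}

/* Recover the signature pointer from a tag. */
static char *signature_from_tag(tag_t tag) {
    return (char *)(intptr_t)tag;
}

/*
 * Hypothetical helper: copy the signature and return a tag that carries
 * the copy. Returns 0 on allocation failure; in JVMTI a tag of zero
 * means "not tagged".
 */
static tag_t tag_for_signature(const char *signature) {
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return make_tag(copy);
}

/*
 * Hypothetical unload callback: it receives only the tag, yet can hand
 * back the signature without consulting any table. The caller owns the
 * returned string and must free() it.
 */
static char *on_object_free(tag_t tag) {
    char *sig = signature_from_tag(tag);
    assert(sig != NULL);
    return sig;
}
```

In the real agent the tag would be attached to the class object with
SetTag on the tracking jvmtiEnv and delivered back through the
ObjectFree event; those calls are left out here so that the sketch
stays self-contained.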
>>>>>
>>>>> Please review this new revision:
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>> I'll have a look at this.
>>>>> (Notice that there still appears to be a performance bottleneck with
>>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>>> to be related to the classTrack.c implementation though, but looks
>>>>> like
>>>>> a consequence of getting all those class-unload notifications over the
>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>> buffers.)
>>>> At least this is only a one-shot hit when the classes are unloaded,
>>>> and the performance hit is based on the number of classes being
>>>> unloaded. The main issue is happening every GC, and is O(n) where n is
>>>> the number of loaded classes.
>>>>> I am not sure why jdb needs to enable class-unload listener always. A
>>>>> simple hack disables it, and performance is brilliant, even when
>>>>> jdb is
>>>>> attached:
>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>> This is JDI, not jdb. It looks like it needs ClassUnload events so it
>>>> can maintain typesBySignature, which is used by public APIs like
>>>> allClasses(). So we have caching of loaded classes both in the debug
>>>> agent and in JDI.
>>>>
>>>> Chris
>>>>> But this is not in the scope of this bug.)
>>>>>
>>>>> Roman
>>>>>
>>>>>
>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>
>>>>>>
>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Roman,
>>>>>>>
>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>
>>>>>>> Some comments are below.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>> 88 {
>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>> 93 return;
>>>>>>> 94     }
>>>>>>> Just a question:
>>>>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that
>>>>>>>       does the class tracking if class tracking has not been
>>>>>>>       initialized?
>>>>>>>
>>>>>>> 70 static jlong currentClassTag;
>>>>>>>   I wonder whether a better name would be something like
>>>>>>>   lastClassTag or highestClassTag.
>>>>>>>
>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>> 100
>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>> 104 klass = *klass_ptr;
>>>>>>> 105 }
>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>> 108 return;
>>>>>>> 109     }
>>>>>>>   It seems to me that something is wrong in the condition at L106
>>>>>>> above.
>>>>>>>   Should it be:
>>>>>>>       if (klass == NULL || klass->klass_tag != tag)
>>>>>>>
>>>>>>>   Otherwise, how can the second check ever work correctly, as the
>>>>>>> return will always happen when (klass != NULL)?
>>>>>>>
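To make the point about the condition at L106 concrete, here is a small
self-contained sketch of the scan with the corrected test. It is an
illustration only: KlassNode is reduced to the two fields used in the
quoted code, int64_t stands in for jlong, and find_link() is a
hypothetical helper that does not exist in classTrack.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced node type for this sketch; the real KlassNode has more fields. */
typedef struct KlassNode {
    int64_t klass_tag;          /* stand-in for jlong */
    struct KlassNode *next;
} KlassNode;

/*
 * Walk the list starting at *head_ptr and return the address of the
 * link that points at the node carrying `tag`, or NULL if no node
 * matches. Returning the link (not the node) lets a caller unlink the
 * node with a single assignment.
 */
static KlassNode **find_link(KlassNode **head_ptr, int64_t tag) {
    assert(head_ptr != NULL);
    KlassNode **klass_ptr = head_ptr;
    KlassNode *klass = *klass_ptr;
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    /*
     * After the loop, klass is either NULL (miss) or the matching node.
     * Testing `klass != NULL || ...` here would reject every hit and
     * dereference NULL on every miss.
     */
    if (klass == NULL) {
        return NULL;
    }
    return klass_ptr;
}
```

Because the loop only exits on a NULL node or a matching tag, the
single `klass == NULL` check is enough in this sketch; the second half
of the suggested condition is redundant but harmless.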
>>>>>>>   There are several places in this file with the wrong indent:
>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>> 93 return;
>>>>>>> 94     }
>>>>>>> ...
>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>> 155 return;
>>>>>>> 156     }
>>>>>>> ...
>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>> 163     }
>>>>>>> 164 if (tag != 0l) {
>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>> 166 return; // Already added
>>>>>>> 167     }
>>>>>>> ...
>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>> 282 {
>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>> 285 return JNI_TRUE;
>>>>>>> 286 }
>>>>>>> ...
>>>>>>> 291 void
>>>>>>> 292 classTrack_reset(void)
>>>>>>> 293 {
>>>>>>> 294 int idx;
>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>> 296
>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>> 299 while (node != NULL) {
>>>>>>> 300 KlassNode* next = node->next;
>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>> 303 node = next;
>>>>>>> 304 }
>>>>>>> 305 }
>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>> 307
>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>> 310
>>>>>>> 311 currentClassTag = -1;
>>>>>>> 312
>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>> 314 trackingEnv = NULL;
>>>>>>> 315
>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>
>>>>>>> Could you please fix several comments below?
>>>>>>>
>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for
>>>>>>> class-unloads
>>>>>>>   The comma is not needed.
>>>>>>>   Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>
>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>>>>>>> consistent
>>>>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>
>>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>>> remembers it in deletedSignatureBag.
>>>>>>>   It would be better to use words like "store" or "record", and
>>>>>>>   "Find" should not start with a capital letter:
>>>>>>>   Invoke the callback when classes are freed, find and record the
>>>>>>>   signature in deletedSignatureBag.
>>>>>>>
>>>>>>> 96 // Find deleted KlassNode
>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>   The dot at the end is missing.
>>>>>>>
>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>   Conversely, the dot is not needed here, as the comment does not
>>>>>>>   start with a capital letter.
>>>>>>>
>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>   The comment above can be better. Maybe something like:
>>>>>>   "At this point, we found the KlassNode matching the klass tag
>>>>>>   (and it is linked)."
>>>>>>
>>>>>>> 113 // Remember the unloaded signature.
>>>>>>   Better: Record the signature of the unloaded class and unlink it.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> Can I please get reviews of this change? In the meantime, we've
>>>>>>>> done
>>>>>>>> more testing and also field-/torture-testing by a customer who is
>>>>>>>> happy
>>>>>>>> now. :-)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hi Serguei,
>>>>>>>>>
>>>>>>>>> Thanks for reviewing!
>>>>>>>>>
>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>>> disconnect,
>>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>>>>> _activate() to ensure have those structures after re-connect.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>
>>>>>>>>> Let me know what you think!
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>> Hi Roman,
>>>>>>>>>>
>>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>>
>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 72 /*
>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>> 74 */
>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>> 76
>>>>>>>>>> 77 /*
>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures.
>>>>>>>>>> Must be
>>>>>>>>>> accessed under
>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>> 80 */
>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>
>>>>>>>>>>   The comments contradict each other.
>>>>>>>>>>   I guess the lock name at line 79 has to be
>>>>>>>>>> deletedSignatureLock
>>>>>>>>>> instead of deletedTagLock.
>>>>>>>>>>   Also, the comma at the end must be replaced with a dot.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 104 return;
>>>>>>>>>> 105 }
>>>>>>>>>> 106
>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>> 113     }
>>>>>>>>>> 114
>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 118 return;
>>>>>>>>>> 119     }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   The code above can be simplified so that lines 101-105 are not
>>>>>>>>>> needed anymore.
>>>>>>>>>>   It can be something like this:
>>>>>>>>>>
>>>>>>>>>>     // Scan linked-list.
>>>>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>         klass_ptr = &klass->next;
>>>>>>>>>>         klass = *klass_ptr;
>>>>>>>>>>     }
>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>         return;
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> It will take more time before I get a chance to look at the rest.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>> lock on
>>>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>>>> class-tracking is active and return an appropriate result (e.g.
>>>>>>>>>>> an empty
>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>
>>>>>>>>>>> Updated webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>
>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>>>>> set up a listener to get notified when it is unloaded.
>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>>>>>>> indexed by tag % slot-count, and we simply prepend the new KlassNode*.
>>>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>>>> - When we get notified of a class being unloaded, we look up the
>>>>>>>>>>>> signature of the reported tag in that table and remember it in a bag.
>>>>>>>>>>>> The KlassNode* is then unlinked from the table and deallocated. This is
>>>>>>>>>>>> a ~O(1) operation too, depending on the depth of the table. In my
>>>>>>>>>>>> testcase, which hammered the code with class loads and unloads, I
>>>>>>>>>>>> usually see depths of about 2-3, rarely more. It should be ok.
>>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that bag and
>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>> - I also added cleanup code in classTrack_reset() to avoid leaking the
>>>>>>>>>>>> signatures, KlassNode* etc. when the debug agent gets detached and/or
>>>>>>>>>>>> re-attached (this was missing before).
>>>>>>>>>>>> - I also added locks around data-structure manipulation (these were
>>>>>>>>>>>> missing before).
>>>>>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching
>>>>>>>>>>>> jdb; I am not sure why jdb does that, though. This may be something to
>>>>>>>>>>>> improve in the future?
>>>>>>>>>>>>
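The table described in the list above (webrev.03/04) can be sketched as
follows. This is a minimal illustration under stated assumptions, not
the code from the webrev: SLOT_COUNT is an arbitrary value (the real
CT_SLOT_COUNT is not given in this thread), malloc/free stand in for
the agent's allocator, tags are assumed to be positive counters,
table_add() and table_remove() are hypothetical names, and the locking
discussed in this thread is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263  /* hypothetical bucket count, chosen for the sketch */

typedef struct KlassNode {
    int64_t klass_tag;          /* stand-in for jlong */
    char *signature;            /* owned by the node until it is removed */
    struct KlassNode *next;
} KlassNode;

/* Each slot is the head of a singly linked list of KlassNode. */
static KlassNode *table[SLOT_COUNT];

/* Register a prepared class: O(1), the node is prepended to its bucket. */
static int table_add(int64_t tag, const char *signature) {
    assert(tag > 0);
    KlassNode *node = malloc(sizeof(*node));
    if (node == NULL) {
        return 0;
    }
    size_t len = strlen(signature) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = tag;
    size_t slot = (size_t)(tag % SLOT_COUNT);
    node->next = table[slot];
    table[slot] = node;
    return 1;
}

/*
 * Handle an unload: find the node in its bucket, unlink and free it,
 * and hand the signature to the caller (who must free() it).
 * Cost is proportional to the bucket depth. Returns NULL on a miss.
 */
static char *table_remove(int64_t tag) {
    assert(tag > 0);
    KlassNode **link = &table[(size_t)(tag % SLOT_COUNT)];
    while (*link != NULL && (*link)->klass_tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return NULL;
    }
    KlassNode *node = *link;
    char *sig = node->signature;
    *link = node->next;
    free(node);
    return sig;
}
```

In the design described above, the signature returned on removal would
be put into the bag that processUnloads() later hands out.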
>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks
>>>>>>>>>>>> really good.
>>>>>>>>>>>> The bottleneck now is clearly actually synthesizing the
>>>>>>>>>>>> class-unload
>>>>>>>>>>>> events. I don't see how this can be helped when the debug
>>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>>
>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>
>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>>> the even more
>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for
>>>>>>>>>>>>> now.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The current implementation does so by maintaining a table of
>>>>>>>>>>>>>> currently
>>>>>>>>>>>>>> prepared classes by building that table when classTrack is
>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When
>>>>>>>>>>>>>> unloading
>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared
>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new
>>>>>>>>>>>>>> table gets
>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned,
>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In addition to all that, this process is only activated when
>>>>>>>>>>>>>> there's an
>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c.
>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>
>>>
> 
> 


From chris.plummer at oracle.com  Tue Mar 24 21:39:46 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Mar 2020 14:39:46 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
 <b542b056-6de2-1ad0-b54d-9029c80d5164@oracle.com>
 <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com>
 <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com>
 <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com>
Message-ID: <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com>

On 3/24/20 1:45 PM, Roman Kennke wrote:
>>>> I assume JVMTI maintains separate tagging data for each agent so having
>>>> two agents doing tagging won't result in confusion. I didn't actually
>>>> find this in the spec. Would be nice to confirm that it is the case.
>>>> However, your implementation does seem to conflict with other uses of
>>>> tagging in the debug agent:
>>> The tagging data is per-jvmtiEnv. We create and use our own env (private
>>> to class-tracking), so this wouldn't conflict with other uses of tags.
>>> Could it be a problem that we have a single trackingEnv per JVM, though?
>>> /me scratches head.
>> Ok. This is an area I'm not familiar with, but the spec does say:
>>
>> "Each call to GetEnv creates a new JVM TI connection and thus a new JVM
>> TI environment."
>>
>> So it looks like what you are doing should be ok. I still think you have
>> a bug where you are not deallocating signatures of classes that are
>> unloaded. If you think otherwise please point out where this is done.
> Signatures that make it out of processUnloading() are deallocated in
> eventHandler.c, in synthesizeUnload(), right after it has been used.
>
> http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527
Ok. Good to know. Not the best of designs, but that's not your fault. 
I'll make another pass over the changes, but I think in general it looks 
good. I don't think I've seen another reviewer yet, so hopefully someone 
jumps in.

Chris
> Pending signatures on debug-agent-disconnect are deallocated in
> classTrack.c, in the reset() routine.
>
> Thanks,
> Roman
>
>> thanks,
>>
>> Chris
>>>> What would cause classTrack_addPreparedClass() to be called for a Class
>>>> you've already seen? I don't understand the need for the "tag != 0l"
>>>> check.
>>> It's probably not needed, may be a left-over from previous installments
>>> of this implementation. I will check it, and turn into an assert or so.
>>>
>>> Thanks,
>>> Roman
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 3/20/20 12:52 PM, Chris Plummer wrote:
>>>>> On 3/20/20 8:30 AM, Roman Kennke wrote:
>>>>>> I believe I came up with a much simpler solution that also solves the
>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>
>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>>> explicitly
>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>> to the signature of a class into the tag, and pull it out again
>>>>>> when we
>>>>>> get notified that the class gets unloaded.
>>>>>>
>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>> classes and signatures, and it also makes the story around locking
>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>>>> classes needed (as in the current implementation) and no searching of
>>>>>> table needed (like in my previous attempts).
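
To illustrate the pointer-as-tag idea described above, here is a minimal
sketch in plain C. It is not taken from the webrev: tag64 stands in for
jlong so that it compiles without jni.h, and make_tag(),
signature_from_tag() and on_object_free() are hypothetical names. In a real
agent the tag would presumably be attached with JVMTI SetTag and handed back
by the ObjectFree callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JNI's 64-bit jlong, so the sketch compiles without jni.h. */
typedef int64_t tag64;

/* Hypothetical helper: copy a signature to the heap and return a tag that
 * carries the pointer. Returns 0 (JVMTI's "untagged" value) on failure. */
static tag64 make_tag(const char *signature)
{
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return (tag64)(intptr_t)copy;
}

/* Hypothetical helper: recover the signature pointer stored in a tag. */
static char *signature_from_tag(tag64 tag)
{
    return (char *)(intptr_t)tag;
}

/* Hypothetical stand-in for an ObjectFree-style callback: the tag is all
 * we get, and it is all we need. The caller owns the returned string. */
static char *on_object_free(tag64 tag)
{
    assert(tag != 0);
    return signature_from_tag(tag);
}
```

Because the tag itself carries the signature, the unload path needs no
table lookup; whoever receives the signature is responsible for freeing it.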
>>>>>>
>>>>>> Please review this new revision:
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>> I'll have a look at this.
>>>>>> (Notice that there still appears to be a performance bottleneck with
>>>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>>>> to be related to the classTrack.c implementation though, but looks
>>>>>> like
>>>>>> a consequence of getting all those class-unload notifications over the
>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>> buffers.)
>>>>> At least this is only a one-shot hit when the classes are unloaded,
>>>>> and the performance hit is based on the number of classes being
>>>>> unloaded. The main issue is happening every GC, and is O(n) where n is
>>>>> the number of loaded classes.
>>>>>> I am not sure why jdb needs to enable class-unload listener always. A
>>>>>> simple hack disables it, and performance is brilliant, even when
>>>>>> jdb is
>>>>>> attached:
>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>> This is JDI, not jdb. It looks like it needs ClassUnload events so it
>>>>> can maintain typesBySignature, which is used by public APIs like
>>>>> allClasses(). So we have caching of loaded classes both in the debug
>>>>> agent and in JDI.
>>>>>
>>>>> Chris
>>>>>> But this is not in the scope of this bug.)
>>>>>>
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>
>>>>>>>
>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>
>>>>>>>> Some comments are below.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>> 88 {
>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 93 return;
>>>>>>>> 94     }
>>>>>>>> Just a question:
>>>>>>>>    Q1: Should the ObjectFree events be disabled for the jvmtiEnv that
>>>>>>>>        does the class tracking if class tracking has not been
>>>>>>>>        initialized?
>>>>>>>>
>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>> I wonder whether a better name would be something like
>>>>>>>> lastClassTag or highestClassTag.
>>>>>>>>
>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>> 100
>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>> 105 }
>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 108 return;
>>>>>>>> 109     }
>>>>>>>>    It seems to me, something is wrong in the condition at L106 above.
>>>>>>>>    Should it be:
>>>>>>>>      if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>
>>>>>>>>    Otherwise, how can the second check ever work correctly as the
>>>>>>>>    return will always happen when (klass != NULL)?
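
The condition discussed above can be checked in isolation with a small,
self-contained sketch. It assumes a simplified KlassNode with only a tag and
a next pointer (int64_t stands in for jlong), and unlink_by_tag() is a
hypothetical helper, not a function from classTrack.c. After the scan loop,
klass is either NULL or the matching node, so testing for NULL alone is
enough to detect "not found".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct KlassNode {
    int64_t klass_tag;        /* stand-in for jlong */
    struct KlassNode *next;
} KlassNode;

/* Hypothetical helper: unlink the node carrying `tag` from the list that
 * starts at *head and return it, or return NULL if no node matches. */
static KlassNode *unlink_by_tag(KlassNode **head, int64_t tag)
{
    KlassNode **klass_ptr = head;
    KlassNode *klass;

    assert(head != NULL);
    klass = *klass_ptr;

    /* Scan linked-list, remembering the link that points at klass. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {
        return NULL;          /* klass not found - ignore */
    }
    *klass_ptr = klass->next; /* unlink without special-casing the head */
    klass->next = NULL;
    return klass;
}
```

With the original "klass != NULL ||" form, every successful lookup would
return early, and a failed lookup would dereference a NULL pointer.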
>>>>>>>>
>>>>>>>>    There are several places in this file with the indent:
>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 93 return;
>>>>>>>> 94     }
>>>>>>>>    ...
>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 155 return;
>>>>>>>> 156     }
>>>>>>>>    ...
>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>> 163     }
>>>>>>>> 164 if (tag != 0l) {
>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 166 return; // Already added
>>>>>>>> 167     }
>>>>>>>>    ...
>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>> 282 {
>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>> 285 return JNI_TRUE;
>>>>>>>> 286 }
>>>>>>>>    ...
>>>>>>>> 291 void
>>>>>>>> 292 classTrack_reset(void)
>>>>>>>> 293 {
>>>>>>>> 294 int idx;
>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>> 296
>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>> 299 while (node != NULL) {
>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>> 303 node = next;
>>>>>>>> 304 }
>>>>>>>> 305 }
>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>> 307
>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>> 310
>>>>>>>> 311 currentClassTag = -1;
>>>>>>>> 312
>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>> 315
>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>
>>>>>>>> Could you please fix several comments below?
>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>    The comma is not needed.
>>>>>>>>    Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>
>>>>>>>>
>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>    Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>
>>>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>>>>      remembers it in deletedSignatureBag.
>>>>>>>>    It would be better to use words like "store" or "record", and
>>>>>>>>    "Find" should not start with a capital letter:
>>>>>>>>      Invoke the callback when classes are freed, find and record the
>>>>>>>>      signature in deletedSignatureBag.
>>>>>>>>
>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>    A period is missing at the end.
>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>    Conversely, a period is not needed here, as the comment does not
>>>>>>>>    start with a capital letter.
>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>    The comment above can be better. Maybe something like:
>>>>>>>      "At this point, we found the KlassNode matching the klass tag
>>>>>>>      (and it is linked)."
>>>>>>>
>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>    Better: Record the signature of the unloaded class and unlink it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> Can I please get reviews of this change? In the meantime, we've
>>>>>>>>> done
>>>>>>>>> more testing and also field-/torture-testing by a customer who is
>>>>>>>>> happy
>>>>>>>>> now. :-)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi Serguei,
>>>>>>>>>>
>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>
>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>>>> disconnect,
>>>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>
>>>>>>>>>> Let me know what you think!
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>>>>
>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 72 /*
>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>> 74 */
>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>> 76
>>>>>>>>>>> 77 /*
>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>> 80 */
>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>
>>>>>>>>>>>    The comments contradict each other.
>>>>>>>>>>>    I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>    instead of deletedTagLock.
>>>>>>>>>>>    Also, the comma at the end must be replaced with a period.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 104 return;
>>>>>>>>>>> 105 }
>>>>>>>>>>> 106
>>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>>> 113     }
>>>>>>>>>>> 114
>>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 118 return;
>>>>>>>>>>> 119     }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    The code above can be simplified, so that the lines 101-105
>>>>>>>>>>>    are not needed anymore.
>>>>>>>>>>>    It can be something like this:
>>>>>>>>>>>
>>>>>>>>>>>        // Scan linked-list.
>>>>>>>>>>>        while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>            klass_ptr = &klass->next;
>>>>>>>>>>>            klass = *klass_ptr;
>>>>>>>>>>>        }
>>>>>>>>>>>        if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>            debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>            return;
>>>>>>>>>>>        }
>>>>>>>>>>>
>>>>>>>>>>> It will take some more time before I get a chance to look at the rest.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>> lock on
>>>>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>>>>> class-tracking is active and return an appropriate result (e.g.
>>>>>>>>>>>> an empty
>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>
>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag,
>>>>>>>>>>>>> and we
>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a
>>>>>>>>>>>>> table, with
>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The
>>>>>>>>>>>>> table is
>>>>>>>>>>>>> indexed by tag % slot-count, and we then simply prepend the new
>>>>>>>>>>>>> KlassNode*.
>>>>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the
>>>>>>>>>>>>> signature of
>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The
>>>>>>>>>>>>> KlassNode*
>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1)
>>>>>>>>>>>>> operation
>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase which
>>>>>>>>>>>>> hammered
>>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths of
>>>>>>>>>>>>> like 2-3,
>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that
>>>>>>>>>>>>> bag, and
>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>>>>> leaking the
>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached
>>>>>>>>>>>>> and/or
>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was
>>>>>>>>>>>>> missing
>>>>>>>>>>>>> before).
>>>>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>>>>> listener gets
>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when
>>>>>>>>>>>>> attaching a
>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something
>>>>>>>>>>>>> to improve
>>>>>>>>>>>>> in the future?
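
The table scheme from the list above (index by tag % slot-count, prepend on
prepare, unlink on unload) can be sketched as follows. This is an
illustration under assumptions, not the webrev code: SLOT_COUNT, table_add()
and table_remove() are hypothetical names, int64_t stands in for jlong, and
the sketch does not own or free the signature strings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOT_COUNT 8 /* hypothetical; small so that collisions are easy to see */

typedef struct KlassNode {
    int64_t klass_tag;       /* stand-in for jlong */
    const char *signature;   /* not owned by the table in this sketch */
    struct KlassNode *next;
} KlassNode;

static KlassNode *table[SLOT_COUNT];

/* Prepend a node to the slot chosen by tag % SLOT_COUNT: O(1).
 * Returns 1 on success, 0 if allocation fails. */
static int table_add(int64_t tag, const char *signature)
{
    KlassNode *node;

    assert(tag > 0);
    node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    node->klass_tag = tag;
    node->signature = signature;
    node->next = table[tag % SLOT_COUNT];
    table[tag % SLOT_COUNT] = node;
    return 1;
}

/* Unlink the node for `tag` and return its signature, or NULL if absent.
 * The cost is proportional to the length of one slot's chain. */
static const char *table_remove(int64_t tag)
{
    KlassNode **link;
    KlassNode *found;
    const char *signature;

    assert(tag > 0);
    link = &table[tag % SLOT_COUNT];
    while (*link != NULL && (*link)->klass_tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return NULL;
    }
    found = *link;
    signature = found->signature;
    *link = found->next;
    free(found);
    return signature;
}
```

Removal is only ~O(1): it walks one chain, which matches the observation
above that the depth stays small when tags are spread evenly over the slots.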
>>>>>>>>>>>>>
>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks
>>>>>>>>>>>>> really good.
>>>>>>>>>>>>> The bottleneck now is clearly actually synthesizing the
>>>>>>>>>>>>> class-unload
>>>>>>>>>>>>> events. I don't see how this can be helped when the debug
>>>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for
>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The current implementation does so by maintaining a table of
>>>>>>>>>>>>>>> currently
>>>>>>>>>>>>>>> prepared classes by building that table when classTrack is
>>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When
>>>>>>>>>>>>>>> unloading
>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared
>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new
>>>>>>>>>>>>>>> table gets
>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned,
>>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In addition to all that, this process is only activated when
>>>>>>>>>>>>>>> there's an
>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c.
>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>>


From serguei.spitsyn at oracle.com  Tue Mar 24 21:46:33 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Mar 2020 14:46:33 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
 <b542b056-6de2-1ad0-b54d-9029c80d5164@oracle.com>
 <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com>
 <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com>
 <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com>
 <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com>
Message-ID: <8e6f2f0c-ceee-2ef1-50b5-cea01bb31893@oracle.com>

On 3/24/20 14:39, Chris Plummer wrote:
> On 3/24/20 1:45 PM, Roman Kennke wrote:
>>>>> I assume JVMTI maintains separate tagging data for each agent so 
>>>>> having
>>>>> two agents doing tagging won't result in confusion. I didn't actually
>>>>> find this in the spec. Would be nice to confirm that it is the case.
>>>>> However, your implementation does seem to conflict with other uses of
>>>>> tagging in the debug agent:
>>>> The tagging data is per-jvmtiEnv. We create and use our own env 
>>>> (private
>>>> to class-tracking), so this wouldn't conflict with other uses of tags.
>>>> Could it be a problem that we have a single trackingEnv per JVM, 
>>>> though?
>>>> /me scratches head.
>>> Ok. This is an area I'm not familiar with, but the spec does say:
>>>
>>> "Each call to GetEnv creates a new JVM TI connection and thus a new JVM
>>> TI environment."
>>>
>>> So it looks like what you are doing should be ok. I still think you 
>>> have
>>> a bug where you are not deallocating signatures of classes that are
>>> unloaded. If you think otherwise please point out where this is done.
>> Signatures that make it out of processUnloading() are deallocated in
>> eventHandler.c, in synthesizeUnload(), right after it has been used.
>>
>> http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527 
>>
> Ok. Good to know. Not the best of designs, but that's not your fault. 
> I'll make another pass over the changes, but I think in general it 
> looks good. I don't think I've seen another reviewer yet, so hopefully 
> someone jumps in.

As I understand, Roman already resolved my previous comments.
So, I will do another pass for v6.

Thanks,
Serguei

>
> Chris
>> Pending signatures on debug-agent-disconnect are deallocated in
>> classTrack.c, in the reset() routine.
>>
>> Thanks,
>> Roman
>>
>>> thanks,
>>>
>>> Chris
>>>>> What would cause classTrack_addPreparedClass() to be called for a 
>>>>> Class
>>>>> you've already seen? I don't understand the need for the "tag != 0l"
>>>>> check.
>>>> It's probably not needed, may be a left-over from previous 
>>>> installments
>>>> of this implementation. I will check it, and turn into an assert or 
>>>> so.
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/20/20 12:52 PM, Chris Plummer wrote:
>>>>>> On 3/20/20 8:30 AM, Roman Kennke wrote:
>>>>>>> I believe I came up with a much simpler solution that also 
>>>>>>> solves the
>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>
>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>>>> explicitly
>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a 
>>>>>>> pointer
>>>>>>> to the signature of a class into the tag, and pull it out again
>>>>>>> when we
>>>>>>> get notified that the class gets unloaded.
>>>>>>>
>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning 
>>>>>>> of all
>>>>>>> classes needed (as in the current implementation) and no 
>>>>>>> searching of
>>>>>>> table needed (like in my previous attempts).
>>>>>>>
>>>>>>> Please review this new revision:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>> I'll have a look at this.
>>>>>>> (Notice that there still appears to be a performance bottleneck 
>>>>>>> with
>>>>>>> class-unloading when an actual debugger is attached. This 
>>>>>>> doesn't seem
>>>>>>> to be related to the classTrack.c implementation though, but looks
>>>>>>> like
>>>>>>> a consequence of getting all those class-unload notifications 
>>>>>>> over the
>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>>> buffers.)
>>>>>> At least this is only a one-shot hit when the classes are unloaded,
>>>>>> and the performance hit is based on the number of classes being
>>>>>> unloaded. The main issue is happening every GC, and is O(n) where 
>>>>>> n is
>>>>>> the number of loaded classes.
>>>>>>> I am not sure why jdb needs to enable class-unload listener 
>>>>>>> always. A
>>>>>>> simple hack disables it, and performance is brilliant, even when
>>>>>>> jdb is
>>>>>>> attached:
>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>> This is JDI, not jdb. It looks like it needs ClassUnload events 
>>>>>> so it
>>>>>> can maintain typesBySignature, which is used by public APIs like
>>>>>> allClasses(). So we have caching of loaded classes both in the debug
>>>>>> agent and in JDI.
>>>>>>
>>>>>> Chris
>>>>>>> But this is not in the scope of this bug.)
>>>>>>>
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Roman,
>>>>>>>>>
>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>
>>>>>>>>> Some comments are below.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>> 88 {
>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 93 return;
>>>>>>>>> 94     }
>>>>>>>>> Just a question:
>>>>>>>>>    Q1: Should the ObjectFree events be disabled for the jvmtiEnv that
>>>>>>>>>        does the class tracking if class tracking has not been
>>>>>>>>>        initialized?
>>>>>>>>>
>>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>>> I wonder whether a better name would be something like
>>>>>>>>> lastClassTag or highestClassTag.
>>>>>>>>>
>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>> 100
>>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>>> 105 }
>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 108 return;
>>>>>>>>>     109     }
>>>>>>>>>     It seems to me that something is wrong in the condition at L106
>>>>>>>>>     above. Should it be:
>>>>>>>>>       if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>
>>>>>>>>>     Otherwise, how can the second check ever work correctly, as the
>>>>>>>>>     return will always happen when (klass != NULL)?
>>>>>>>>>
>>>>>>>>> There are several places in this file with the indent:
>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 93 return;
>>>>>>>>>      94     }
>>>>>>>>>     ...
>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 155 return;
>>>>>>>>>     156     }
>>>>>>>>>     ...
>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>     163     }
>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 166 return; // Already added
>>>>>>>>>     167     }
>>>>>>>>>     ...
>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>> 282 {
>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>     286 }
>>>>>>>>>     ...
>>>>>>>>>     291 void
>>>>>>>>>     292 classTrack_reset(void)
>>>>>>>>>     293 {
>>>>>>>>> 294 int idx;
>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>> 296
>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>> 303 node = next;
>>>>>>>>> 304 }
>>>>>>>>> 305 }
>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>> 307
>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>> 310
>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>> 312
>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>> 315
>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>
>>>>>>>>> Could you please fix several comments below?
>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for
>>>>>>>>> class-unloads
>>>>>>>>>     The comma is not needed.
>>>>>>>>>     Would it be better to replace: klass tags => klass_tag's?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>>>>>>>>> consistent
>>>>>>>>>     Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>
>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>>>>> remembers it in deletedSignatureBag.
>>>>>>>>>     It would be better to use words like "store" or "record", and
>>>>>>>>>     "Find" should not start with a capital letter:
>>>>>>>>>     Invoke the callback when classes are freed, find and record the
>>>>>>>>>     signature in deletedSignatureBag.
>>>>>>>>>
>>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>>     A dot is missing at the end.
>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>     Conversely, the dot is not needed here, as the comment does not
>>>>>>>>>     start with a capital letter.
>>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>> The comment above could be better. Maybe something like:
>>>>>>>>   "At this point, we found the KlassNode matching the klass
>>>>>>>>   tag (and it is linked)."
>>>>>>>>
>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>> Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>> Hello all,
>>>>>>>>>>
>>>>>>>>>> Can I please get reviews of this change? In the meantime, we've
>>>>>>>>>> done
>>>>>>>>>> more testing and also field-/torture-testing by a customer 
>>>>>>>>>> who is
>>>>>>>>>> happy
>>>>>>>>>> now. :-)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>
>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after a
>>>>>>>>>>> disconnect: the setup of the trackingEnv and deletedSignatureBag
>>>>>>>>>>> moves to _activate(), to ensure those structures exist after a
>>>>>>>>>>> re-connect.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>
>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>>>>>
>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 72 /*
>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>>> 74 */
>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>> 76
>>>>>>>>>>>> 77 /*
>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>>>      80  */
>>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>
>>>>>>>>>>>>     The comments contradict each other.
>>>>>>>>>>>>     I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>     instead of deletedTagLock.
>>>>>>>>>>>>     Also, the comma at the end must be replaced with a dot.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 104 return;
>>>>>>>>>>>> 105 }
>>>>>>>>>>>>     106
>>>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>>>>     113     }
>>>>>>>>>>>> 114
>>>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 118 return;
>>>>>>>>>>>>     119     }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>     The code above can be simplified so that lines 101-105 are
>>>>>>>>>>>>     not needed anymore.
>>>>>>>>>>>>     It can be something like this:
>>>>>>>>>>>>
>>>>>>>>>>>> // Scan linked-list.
>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>> klass_ptr = &klass->next;
>>>>>>>>>>>> klass = *klass_ptr;
>>>>>>>>>>>>         }
>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> return;
>>>>>>>>>>>>         }
>>>>>>>>>>>>
>>>>>>>>>>>> It will take more time before I get a chance to look at the rest.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>> Here comes an update that resolves some races that happen 
>>>>>>>>>>>>> when
>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>>> lock on
>>>>>>>>>>>>> basically every operation, and also need to check whether 
>>>>>>>>>>>>> or not
>>>>>>>>>>>>> class-tracking is active and return an appropriate result 
>>>>>>>>>>>>> (e.g.
>>>>>>>>>>>>> an empty
>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a 
>>>>>>>>>>>>>> tag,
>>>>>>>>>>>>>> and we
>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a
>>>>>>>>>>>>>> table, with each entry being the head of a linked list of
>>>>>>>>>>>>>> KlassNode*. The table is indexed by tag % slot-count, and we
>>>>>>>>>>>>>> simply prepend the new KlassNode*. This is an O(1) operation.
>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the
>>>>>>>>>>>>>> signature of
>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. 
>>>>>>>>>>>>>> The
>>>>>>>>>>>>>> KlassNode*
>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is 
>>>>>>>>>>>>>> ~O(1)
>>>>>>>>>>>>>> operation
>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase 
>>>>>>>>>>>>>> which
>>>>>>>>>>>>>> hammered
>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see 
>>>>>>>>>>>>>> depths of
>>>>>>>>>>>>>> like 2-3,
>>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that
>>>>>>>>>>>>>> bag, and
>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>>>>>> leaking the
>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached
>>>>>>>>>>>>>> and/or
>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was
>>>>>>>>>>>>>> missing
>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>>>>>> listener gets
>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when
>>>>>>>>>>>>>> attaching a
>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be 
>>>>>>>>>>>>>> something
>>>>>>>>>>>>>> to improve
>>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks
>>>>>>>>>>>>>> really good.
>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the
>>>>>>>>>>>>>> class-unload
>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug
>>>>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off 
>>>>>>>>>>>>>>> reviewing for
>>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The current implementation does so by maintaining a 
>>>>>>>>>>>>>>>> table of
>>>>>>>>>>>>>>>> currently
>>>>>>>>>>>>>>>> prepared classes by building that table when classTrack is
>>>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. 
>>>>>>>>>>>>>>>> When
>>>>>>>>>>>>>>>> unloading
>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and 
>>>>>>>>>>>>>>>> compared
>>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new
>>>>>>>>>>>>>>>> table gets
>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned,
>>>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In addition to all that, this process is only activated 
>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>> there's an
>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of 
>>>>>>>>>>>>>>>>>> classTrack.c.
>>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>
>>>
>
>

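For readers following the quoted thread, the tag-indexed table that Roman
describes (slot = tag % slot-count, prepend on class prepare, unlink on
unload) can be sketched roughly as below. This is only an illustration,
not the code from any of the webrevs: SLOT_COUNT, track_add() and
track_remove() are hypothetical names, plain malloc/free stand in for the
agent's JVMTI allocation helpers, and no locking is shown. The lookup uses
the corrected "klass == NULL" test that Serguei asks for.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263            /* hypothetical table size */

typedef struct KlassNode {
    long long         klass_tag;  /* stands in for a JVMTI jlong tag */
    char             *signature;  /* owned copy of the class signature */
    struct KlassNode *next;       /* next node in the same slot */
} KlassNode;

static KlassNode *table[SLOT_COUNT];

/* Map a tag to its slot; the cast keeps the index non-negative. */
static size_t slot_of(long long tag)
{
    return (size_t)((unsigned long long)tag % SLOT_COUNT);
}

/* Register a prepared class: prepend to its slot, O(1). Returns 0 on success. */
int track_add(long long tag, const char *signature)
{
    size_t len = strlen(signature) + 1;
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return -1;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = tag;
    node->next = table[slot_of(tag)];
    table[slot_of(tag)] = node;
    return 0;
}

/*
 * Handle an unload: unlink the node for 'tag' and hand back its signature.
 * The caller owns (and must free) the returned string. Returns NULL if the
 * tag is unknown.
 */
char *track_remove(long long tag)
{
    KlassNode **klass_ptr = &table[slot_of(tag)];
    KlassNode *klass = *klass_ptr;

    /* Scan the slot's linked list. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {          /* klass not found - ignore */
        return NULL;
    }
    *klass_ptr = klass->next;     /* unlink */
    char *signature = klass->signature;
    free(klass);
    return signature;
}
```

The cost of track_remove() depends on the chain length in one slot, which
matches the "~O(1), depending on the depth of the table" remark in the
thread.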

From chris.plummer at oracle.com  Tue Mar 24 21:47:54 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Mar 2020 14:47:54 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <8e6f2f0c-ceee-2ef1-50b5-cea01bb31893@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
 <b542b056-6de2-1ad0-b54d-9029c80d5164@oracle.com>
 <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com>
 <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com>
 <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com>
 <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com>
 <8e6f2f0c-ceee-2ef1-50b5-cea01bb31893@oracle.com>
Message-ID: <f8ad7842-4619-61a5-084c-aa79a9923902@oracle.com>

On 3/24/20 2:46 PM, serguei.spitsyn at oracle.com wrote:
> On 3/24/20 14:39, Chris Plummer wrote:
>> On 3/24/20 1:45 PM, Roman Kennke wrote:
>>>>>> I assume JVMTI maintains separate tagging data for each agent so 
>>>>>> having
>>>>>> two agents doing tagging won't result in confusion. I didn't 
>>>>>> actually
>>>>>> find this in the spec. Would be nice to confirm that it is the case.
>>>>>> However, your implementation does seem to conflict with other 
>>>>>> uses of
>>>>>> tagging in the debug agent:
>>>>> The tagging data is per-jvmtiEnv. We create and use our own env 
>>>>> (private
>>>>> to class-tracking), so this wouldn't conflict with other uses of 
>>>>> tags.
>>>>> Could it be a problem that we have a single trackingEnv per JVM, 
>>>>> though?
>>>>> /me scratches head.
>>>> Ok. This is an area I'm not familiar with, but the spec does say:
>>>>
>>>> "Each call to GetEnv creates a new JVM TI connection and thus a new 
>>>> JVM
>>>> TI environment."
>>>>
>>>> So it looks like what you are doing should be ok. I still think you 
>>>> have
>>>> a bug where you are not deallocating signatures of classes that are
>>>> unloaded. If you think otherwise please point out where this is done.
>>> Signatures that make it out of processUnloading() are deallocated in
>>> eventHandler.c, in synthesizeUnload(), right after it has been used.
>>>
>>> http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527 
>>>
>> Ok. Good to know. Not the best of designs, but that's not your fault. 
>> I'll make another pass over the changes, but I think in general it 
>> looks good. I don't think I've seen another reviewer yet, so 
>> hopefully someone jumps in.
>
> As I understand, Roman already resolved my previous comments.
> So, I will do another pass for v6.
I think it's pretty much a rewrite since you last reviewed it.

Chris
>
> Thanks,
> Serguei
>
>>
>> Chris
>>> Pending signatures on debug-agent-disconnect are deallocated in
>>> classTrack.c, in the reset() routine.
>>>
>>> Thanks,
>>> Roman
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>>> What would cause classTrack_addPreparedClass() to be called for a 
>>>>>> Class
>>>>>> you've already seen? I don't understand the need for the "tag != 0l"
>>>>>> check.
>>>>> It's probably not needed, may be a left-over from previous 
>>>>> installments
>>>>> of this implementation. I will check it, and turn into an assert 
>>>>> or so.
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 3/20/20 12:52 PM, Chris Plummer wrote:
>>>>>>> On 3/20/20 8:30 AM, Roman Kennke wrote:
>>>>>>>> I believe I came up with a much simpler solution that also 
>>>>>>>> solves the
>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>
>>>>>>>> It turns out that we can take advantage of the fact that we can 
>>>>>>>> use
>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>>>>> explicitly
>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a 
>>>>>>>> pointer
>>>>>>>> to the signature of a class into the tag, and pull it out again
>>>>>>>> when we
>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>
>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning 
>>>>>>>> of all
>>>>>>>> classes needed (as in the current implementation) and no 
>>>>>>>> searching of
>>>>>>>> table needed (like in my previous attempts).
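For illustration, the pointer-as-tag idea in the paragraph above can be
sketched as follows. This is a rough sketch under assumptions, not the
webrev.06 code: tag_t stands in for JVMTI's jlong, signature_to_tag() and
tag_to_signature() are hypothetical helper names, and plain malloc/free
replace the agent's allocation routines. In the real agent the tag would
be attached to the class object via the tracking jvmtiEnv and handed back
in the ObjectFree callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int64_t tag_t;   /* stands in for JVMTI's 64-bit jlong tag */

/*
 * Copy the signature and encode the pointer to the copy as a tag.
 * Returns 0 on allocation failure; JVMTI treats a tag of 0 as "untagged",
 * so 0 is never a valid encoded pointer here.
 */
tag_t signature_to_tag(const char *signature)
{
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return (tag_t)(intptr_t)copy;
}

/*
 * Recover the signature from a tag, as an unload callback would.
 * The caller takes ownership of the string and must free it.
 */
char *tag_to_signature(tag_t tag)
{
    return (char *)(intptr_t)tag;
}
```

Because the tag itself carries the signature, no side table and no lookup
are needed at unload time, which is where the O(1) claim comes from.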
>>>>>>>>
>>>>>>>> Please review this new revision:
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>> I'll have a look at this.
>>>>>>>> (Notice that there still appears to be a performance bottleneck 
>>>>>>>> with
>>>>>>>> class-unloading when an actual debugger is attached. This 
>>>>>>>> doesn't seem
>>>>>>>> to be related to the classTrack.c implementation though, but looks
>>>>>>>> like
>>>>>>>> a consequence of getting all those class-unload notifications 
>>>>>>>> over the
>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up 
>>>>>>>> the
>>>>>>>> buffers.)
>>>>>>> At least this is only a one-shot hit when the classes are unloaded,
>>>>>>> and the performance hit is based on the number of classes being
>>>>>>> unloaded. The main issue is happening every GC, and is O(n) 
>>>>>>> where n is
>>>>>>> the number of loaded classes.
>>>>>>>> I am not sure why jdb needs to enable class-unload listener 
>>>>>>>> always. A
>>>>>>>> simple hack disables it, and performance is brilliant, even when
>>>>>>>> jdb is
>>>>>>>> attached:
>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>> This is JDI, not jdb. It looks like it needs ClassUnload events 
>>>>>>> so it
>>>>>>> can maintain typesBySignature, which is used by public APIs like
>>>>>>> allClasses(). So we have caching of loaded classes both in the 
>>>>>>> debug
>>>>>>> agent and in JDI.
>>>>>>>
>>>>>>> Chris
>>>>>>>> But this is not in the scope of this bug.)
>>>>>>>>
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Hi Roman,
>>>>>>>>>>
>>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>>
>>>>>>>>>> Some comments are below.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>      88 {
>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 93 return;
>>>>>>>>>>      94     }
>>>>>>>>>> Just a question:
>>>>>>>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv
>>>>>>>>>>       that does the class tracking if class tracking has not been
>>>>>>>>>>       initialized?
>>>>>>>>>>
>>>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>>>> I wonder whether a better name would be something like
>>>>>>>>>> lastClassTag or highestClassTag.
>>>>>>>>>>
>>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>>> 100
>>>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>>>> 105 }
>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 108 return;
>>>>>>>>>>     109     }
>>>>>>>>>>     It seems to me that something is wrong in the condition at L106
>>>>>>>>>>     above. Should it be:
>>>>>>>>>>       if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>
>>>>>>>>>>     Otherwise, how can the second check ever work correctly, as the
>>>>>>>>>>     return will always happen when (klass != NULL)?
>>>>>>>>>>
>>>>>>>>>> There are several places in this file with the indent:
>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 93 return;
>>>>>>>>>>      94     }
>>>>>>>>>>     ...
>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 155 return;
>>>>>>>>>>     156     }
>>>>>>>>>>     ...
>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>     163     }
>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>     167     }
>>>>>>>>>>     ...
>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>> 282 {
>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>     286 }
>>>>>>>>>>     ...
>>>>>>>>>>     291 void
>>>>>>>>>>     292 classTrack_reset(void)
>>>>>>>>>>     293 {
>>>>>>>>>> 294 int idx;
>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>> 296
>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>> 303 node = next;
>>>>>>>>>> 304 }
>>>>>>>>>> 305 }
>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>> 307
>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>> 310
>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>> 312
>>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>> 315
>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>
>>>>>>>>>> Could you please fix several comments below?
>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for
>>>>>>>>>> class-unloads
>>>>>>>>>>     The comma is not needed.
>>>>>>>>>>     Would it be better to replace: klass tags => klass_tag's?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>>>>>>>>>> consistent
>>>>>>>>>>     Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>
>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>>>>>> remembers it in deletedSignatureBag.
>>>>>>>>>>     It would be better to use words like "store" or "record", and
>>>>>>>>>>     "Find" should not start with a capital letter:
>>>>>>>>>>     Invoke the callback when classes are freed, find and record the
>>>>>>>>>>     signature in deletedSignatureBag.
>>>>>>>>>>
>>>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>>>     A dot is missing at the end.
>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>     Conversely, the dot is not needed here, as the comment does not
>>>>>>>>>>     start with a capital letter.
>>>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>> The comment above could be better. Maybe something like:
>>>>>>>>>   "At this point, we found the KlassNode matching the klass
>>>>>>>>>   tag (and it is linked)."
>>>>>>>>>
>>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>> Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>> Hello all,
>>>>>>>>>>>
>>>>>>>>>>> Can I please get reviews of this change? In the meantime, we've
>>>>>>>>>>> done
>>>>>>>>>>> more testing and also field-/torture-testing by a customer 
>>>>>>>>>>> who is
>>>>>>>>>>> happy
>>>>>>>>>>> now. :-)
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>
>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after a
>>>>>>>>>>>> disconnect: the setup of the trackingEnv and deletedSignatureBag
>>>>>>>>>>>> moves to _activate(), to ensure those structures exist after a
>>>>>>>>>>>> re-connect.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>
>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 72 /*
>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>>>> 74 */
>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>>> 76
>>>>>>>>>>>>> 77 /*
>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>>>>      80  */
>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>
>>>>>>>>>>>>>     The comments contradict each other.
>>>>>>>>>>>>>     I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>>     instead of deletedTagLock.
>>>>>>>>>>>>>     Also, the comma at the end must be replaced with a dot.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 101     // Tag not found? Ignore.
>>>>>>>>>>>>> 102     if (klass == NULL) {
>>>>>>>>>>>>> 103         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 104         return;
>>>>>>>>>>>>> 105     }
>>>>>>>>>>>>> 106
>>>>>>>>>>>>> 107     // Scan linked-list.
>>>>>>>>>>>>> 108     jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>> 109     while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>> 110         klass_ptr = &klass->next;
>>>>>>>>>>>>> 111         klass = *klass_ptr;
>>>>>>>>>>>>> 112         found_tag = klass->klass_tag;
>>>>>>>>>>>>> 113     }
>>>>>>>>>>>>> 114
>>>>>>>>>>>>> 115     // Tag not found? Ignore.
>>>>>>>>>>>>> 116     if (found_tag != tag) {
>>>>>>>>>>>>> 117         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 118         return;
>>>>>>>>>>>>> 119     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The code above can be simplified so that lines 101-105 are no
>>>>>>>>>>>>> longer needed. It can be something like this:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     // Scan linked-list.
>>>>>>>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>         klass_ptr = &klass->next;
>>>>>>>>>>>>>         klass = *klass_ptr;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>         return;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> It will take some more time before I get a chance to look at the
>>>>>>>>>>>>> rest.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>> Here comes an update that resolves some races that happen 
>>>>>>>>>>>>>> when
>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>>>> lock on
>>>>>>>>>>>>>> basically every operation, and also need to check whether 
>>>>>>>>>>>>>> or not
>>>>>>>>>>>>>> class-tracking is active and return an appropriate result 
>>>>>>>>>>>>>> (e.g.
>>>>>>>>>>>>>> an empty
>>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>>>>>>>> set up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>>>>>>>>>> indexed by tag % slot-count, and we simply prepend the new KlassNode*.
>>>>>>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>>>>>>> - When we get notified of a class being unloaded, we look up the
>>>>>>>>>>>>>>> signature of the reported tag in that table, and remember it in a bag.
>>>>>>>>>>>>>>> The KlassNode* is then unlinked from the table and deallocated. This is
>>>>>>>>>>>>>>> a ~O(1) operation too, depending on the depth of the table. In my
>>>>>>>>>>>>>>> testcase, which hammered the code with class loads and unloads, I
>>>>>>>>>>>>>>> usually see depths of 2-3, but not usually more. It should be ok.
>>>>>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that bag and
>>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>>> - I also added cleanup code in classTrack_reset() to avoid leaking the
>>>>>>>>>>>>>>> signatures, KlassNode* etc. when the debug agent gets detached and/or
>>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>>> - I also added locks around data-structure manipulation (was missing
>>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching
>>>>>>>>>>>>>>> jdb; I am not sure why jdb does that, though. This may be something to
>>>>>>>>>>>>>>> improve in the future?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>>>>>>>>> The bottleneck now is clearly the actual synthesizing of the
>>>>>>>>>>>>>>> class-unload events. I don't see how this can be helped when the debug
>>>>>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off 
>>>>>>>>>>>>>>>> reviewing for
>>>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>>>>>>>>>> prepared classes: it builds that table when classTrack is initialized,
>>>>>>>>>>>>>>>>> and then adds new classes whenever a class gets loaded. When unloading
>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table and compared with the
>>>>>>>>>>>>>>>>> old table, and whatever is in the old but not in the new table gets
>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The new implementation keeps a linked list of prepared classes, and also
>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>>>>>>>>>> prepared-classes list) and their signatures put in the list that gets
>>>>>>>>>>>>>>>>> returned.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption 
>>>>>>>>>>>>>>>>> here
>>>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon 
>>>>>>>>>>>>>>>>> unload,
>>>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In addition to all that, this process is only 
>>>>>>>>>>>>>>>>> activated when
>>>>>>>>>>>>>>>>> there's an
>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of 
>>>>>>>>>>>>>>>>>>> classTrack.c.
>>>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance 
>>>>>>>>>>>>>>>>>>> until an
>>>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s   with debug: 225s
>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s   with debug: 95s
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>
>>>>
>>
>>
>


From serguei.spitsyn at oracle.com  Tue Mar 24 21:50:25 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Mar 2020 14:50:25 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <f8ad7842-4619-61a5-084c-aa79a9923902@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <7766d2f0-e223-7129-9c37-474bbcd67d70@oracle.com>
 <b542b056-6de2-1ad0-b54d-9029c80d5164@oracle.com>
 <7667eb6e-8458-6763-3f27-f97f6f49bed8@redhat.com>
 <86692131-c9ff-3904-dce6-ac59149b2b88@oracle.com>
 <1de4cfc0-a719-efc6-8f9c-55c73c9043f7@redhat.com>
 <9e86af41-0c60-de4a-53f0-9627b7248209@oracle.com>
 <8e6f2f0c-ceee-2ef1-50b5-cea01bb31893@oracle.com>
 <f8ad7842-4619-61a5-084c-aa79a9923902@oracle.com>
Message-ID: <8c996072-9b3b-0b64-6aea-bf477dced13c@oracle.com>

On 3/24/20 14:47, Chris Plummer wrote:
> On 3/24/20 2:46 PM, serguei.spitsyn at oracle.com wrote:
>> On 3/24/20 14:39, Chris Plummer wrote:
>>> On 3/24/20 1:45 PM, Roman Kennke wrote:
>>>>>>> I assume JVMTI maintains separate tagging data for each agent so 
>>>>>>> having
>>>>>>> two agents doing tagging won't result in confusion. I didn't 
>>>>>>> actually
>>>>>>> find this in the spec. Would be nice to confirm that it is the 
>>>>>>> case.
>>>>>>> However, your implementation does seem to conflict with other 
>>>>>>> uses of
>>>>>>> tagging in the debug agent:
>>>>>> The tagging data is per-jvmtiEnv. We create and use our own env 
>>>>>> (private
>>>>>> to class-tracking), so this wouldn't conflict with other uses of 
>>>>>> tags.
>>>>>> Could it be a problem that we have a single trackingEnv per JVM, 
>>>>>> though?
>>>>>> /me scratches head.
>>>>> Ok. This is an area I'm not familiar with, but the spec does say:
>>>>>
>>>>> "Each call to GetEnv creates a new JVM TI connection and thus a 
>>>>> new JVM
>>>>> TI environment."
>>>>>
>>>>> So it looks like what you are doing should be ok. I still think 
>>>>> you have
>>>>> a bug where you are not deallocating signatures of classes that are
>>>>> unloaded. If you think otherwise please point out where this is done.
>>>> Signatures that make it out of processUnloading() are deallocated in
>>>> eventHandler.c, in synthesizeUnload(), right after it has been used.
>>>>
>>>> http://hg.openjdk.java.net/jdk/jdk/file/b9562cc25fc0/src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c#l527 
>>>>
>>> Ok. Good to know. Not the best of designs, but that's not your 
>>> fault. I'll make another pass over the changes, but I think in 
>>> general it looks good. I don't think I've seen another reviewer yet, 
>>> so hopefully someone jumps in.
>>
>> As I understand, Roman already resolved my previous comments.
>> So, I will do another pass for v6.
> I think it's pretty much a rewrite since you last reviewed it.

Yes, I'm expecting this, as some performance-related issues were discovered.

Thanks,
Serguei

>
> Chris
>>
>> Thanks,
>> Serguei
>>
>>>
>>> Chris
>>>> Pending signatures on debug-agent-disconnect are deallocated in
>>>> classTrack.c, in the reset() routine.
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>>> What would cause classTrack_addPreparedClass() to be called for 
>>>>>>> a Class
>>>>>>> you've already seen? I don't understand the need for the "tag != 
>>>>>>> 0l"
>>>>>>> check.
>>>>>> It's probably not needed; it may be a left-over from previous installments
>>>>>> of this implementation. I will check it, and turn it into an assert or so.
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/20/20 12:52 PM, Chris Plummer wrote:
>>>>>>>> On 3/20/20 8:30 AM, Roman Kennke wrote:
>>>>>>>>> I believe I came up with a much simpler solution that also 
>>>>>>>>> solves the
>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>
>>>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>>>>> to the signature of a class into the tag, and pull it out again when we
>>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>>
>>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>>> classes and signatures, and it also makes the story around 
>>>>>>>>> locking
>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no 
>>>>>>>>> scanning of all
>>>>>>>>> classes needed (as in the current implementation) and no 
>>>>>>>>> searching of
>>>>>>>>> table needed (like in my previous attempts).
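
A minimal sketch of the pointer-in-tag idea described above, assuming a
64-bit tag type like jlong. It is not the webrev code: tag_t,
signature_to_tag() and tag_to_signature() are hypothetical names, and the
JVMTI plumbing is left out so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JNI's jlong; a real agent would include <jni.h> instead. */
typedef int64_t tag_t;

/* Copy the signature and encode the address of the copy as a tag.
 * Returns 0 on allocation failure; 0 is also JVMTI's "untagged" value. */
static tag_t signature_to_tag(const char *signature)
{
    size_t len;
    char *copy;

    assert(signature != NULL);
    len = strlen(signature) + 1;
    copy = malloc(len);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, signature, len);
    return (tag_t)(intptr_t)copy;
}

/* Decode a tag back into the signature pointer; the caller frees it. */
static char *tag_to_signature(tag_t tag)
{
    return (char *)(intptr_t)tag;
}
```

In an agent, the value from signature_to_tag() would presumably be what is
passed to SetTag for the class, and the ObjectFree callback would call
tag_to_signature() on the tag it receives, so no lookup table is involved.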
>>>>>>>>>
>>>>>>>>> Please review this new revision:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>> I'll have a look at this.
>>>>>>>>> (Notice that there still appears to be a performance 
>>>>>>>>> bottleneck with
>>>>>>>>> class-unloading when an actual debugger is attached. This 
>>>>>>>>> doesn't seem
>>>>>>>>> to be related to the classTrack.c implementation though, but 
>>>>>>>>> looks
>>>>>>>>> like
>>>>>>>>> a consequence of getting all those class-unload notifications 
>>>>>>>>> over the
>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging 
>>>>>>>>> up the
>>>>>>>>> buffers.)
>>>>>>>> At least this is only a one-shot hit when the classes are 
>>>>>>>> unloaded,
>>>>>>>> and the performance hit is based on the number of classes being
>>>>>>>> unloaded. The main issue is happening every GC, and is O(n) 
>>>>>>>> where n is
>>>>>>>> the number of loaded classes.
>>>>>>>>> I am not sure why jdb needs to enable class-unload listener 
>>>>>>>>> always. A
>>>>>>>>> simple hack disables it, and performance is brilliant, even when
>>>>>>>>> jdb is
>>>>>>>>> attached:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch 
>>>>>>>>>
>>>>>>>> This is JDI, not jdb. It looks like it needs ClassUnload events 
>>>>>>>> so it
>>>>>>>> can maintain typesBySignature, which is used by public APIs like
>>>>>>>> allClasses(). So we have caching of loaded classes both in the 
>>>>>>>> debug
>>>>>>>> agent and in JDI.
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>> But this is not in the scope of this bug.)
>>>>>>>>>
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>>>
>>>>>>>>>>> Some comments are below.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>> 88 {
>>>>>>>>>>> 89     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>> 90     if (currentClassTag == -1) {
>>>>>>>>>>> 91         // Class tracking not initialized, nobody's interested
>>>>>>>>>>> 92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 93         return;
>>>>>>>>>>> 94     }
>>>>>>>>>>> Just a question:
>>>>>>>>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>>>>>>>       the class tracking if class tracking has not been initialized?
>>>>>>>>>>>
>>>>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>>>>> I wonder whether a name like lastClassTag or highestClassTag would be
>>>>>>>>>>> better.
>>>>>>>>>>>
>>>>>>>>>>>  99     KlassNode* klass = *klass_ptr;
>>>>>>>>>>> 100
>>>>>>>>>>> 102     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>> 103         klass_ptr = &klass->next;
>>>>>>>>>>> 104         klass = *klass_ptr;
>>>>>>>>>>> 105     }
>>>>>>>>>>> 106     if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>> 107         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 108         return;
>>>>>>>>>>> 109     }
>>>>>>>>>>> It seems to me that something is wrong in the condition at L106 above.
>>>>>>>>>>> Should it be:
>>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>
>>>>>>>>>>> Otherwise, how can the second check ever work correctly, given that the
>>>>>>>>>>> return will always happen when (klass != NULL)?
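
The corrected scan can be sketched as a stand-alone helper. After the loop
exits, klass is either NULL or carries the requested tag, so the
klass == NULL test alone decides "not found". The KlassNode below is a
cut-down assumption of the struct in classTrack.c, unlink_by_tag() is a
hypothetical name, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

typedef struct KlassNode {
    long long klass_tag;            /* stand-in for jlong */
    const char *signature;
    struct KlassNode *next;
} KlassNode;

/* Unlink and return the node carrying `tag`, or NULL if it is absent.
 * `head` is the address of the bucket's head pointer. */
static KlassNode *unlink_by_tag(KlassNode **head, long long tag)
{
    KlassNode **klass_ptr = head;
    KlassNode *klass = *klass_ptr;

    /* klass_ptr always points at the link that refers to klass. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    /* Here: klass == NULL, or klass->klass_tag == tag. */
    if (klass == NULL) {
        return NULL;
    }
    *klass_ptr = klass->next;       /* bypass the node in the chain */
    klass->next = NULL;
    return klass;
}
```

Keeping a pointer to the link, rather than to the previous node, means the
head of the list needs no special case when it is the node being removed.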
>>>>>>>>>>>
>>>>>>>>>>> There are several places in this file with the indent:
>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 93 return;
>>>>>>>>>>> 94 }
>>>>>>>>>>> ...
>>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 155 return;
>>>>>>>>>>> 156 }
>>>>>>>>>>> ...
>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>> 163 }
>>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>> 167 }
>>>>>>>>>>> ...
>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>> 282 {
>>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>> 286 }
>>>>>>>>>>> ...
>>>>>>>>>>> 291 void
>>>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>>>> 293 {
>>>>>>>>>>> 294 int idx;
>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>> 296
>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>>> 303 node = next;
>>>>>>>>>>> 304 }
>>>>>>>>>>> 305 }
>>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>>> 307
>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>> 310
>>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>>> 312
>>>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>>> 315
>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>
>>>>>>>>>>> Could you please fix several comments below?
>>>>>>>>>>>
>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>>>   The comma is not needed.
>>>>>>>>>>>   Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>
>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>>
>>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
>>>>>>>>>>>   It would be better to use words like "store" or "record", and "Find"
>>>>>>>>>>>   should not start with a capital letter:
>>>>>>>>>>>     Invoke the callback when classes are freed, find and record the
>>>>>>>>>>>     signature in deletedSignatureBag.
>>>>>>>>>>>
>>>>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>>>>   Missing period at the end.
>>>>>>>>>>>
>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>   Conversely, the period is not needed here, as the comment does not
>>>>>>>>>>>   start with a capital letter.
>>>>>>>>>>>
>>>>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>> The comment above can be better. Maybe something like:
>>>>>>>>>>   "At this point, we found the KlassNode matching the klass tag (and it
>>>>>>>>>>   is linked)."
>>>>>>>>>>
>>>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>>> Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> Can I please get reviews of this change? In the meantime, 
>>>>>>>>>>>> we've
>>>>>>>>>>>> done
>>>>>>>>>>>> more testing and also field-/torture-testing by a customer 
>>>>>>>>>>>> who is
>>>>>>>>>>>> happy
>>>>>>>>>>>> now. :-)
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>>
>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>>>>>>> disconnect,
>>>>>>>>>>>>> namely moving the setup of trackingEnv and deletedSignatureBag to
>>>>>>>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 72 /*
>>>>>>>>>>>>>> 73  * Lock to protect deletedSignatureBag
>>>>>>>>>>>>>> 74  */
>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>>>> 76
>>>>>>>>>>>>>> 77 /*
>>>>>>>>>>>>>> 78  * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>>>> 79  * deletedTagLock,
>>>>>>>>>>>>>> 80  */
>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The comments contradict each other.
>>>>>>>>>>>>>> I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>>> instead of deletedTagLock.
>>>>>>>>>>>>>> Also, the comma at the end must be replaced with a period.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 101     // Tag not found? Ignore.
>>>>>>>>>>>>>> 102     if (klass == NULL) {
>>>>>>>>>>>>>> 103         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 104         return;
>>>>>>>>>>>>>> 105     }
>>>>>>>>>>>>>> 106
>>>>>>>>>>>>>> 107     // Scan linked-list.
>>>>>>>>>>>>>> 108     jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>>> 109     while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>>> 110         klass_ptr = &klass->next;
>>>>>>>>>>>>>> 111         klass = *klass_ptr;
>>>>>>>>>>>>>> 112         found_tag = klass->klass_tag;
>>>>>>>>>>>>>> 113     }
>>>>>>>>>>>>>> 114
>>>>>>>>>>>>>> 115     // Tag not found? Ignore.
>>>>>>>>>>>>>> 116     if (found_tag != tag) {
>>>>>>>>>>>>>> 117         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 118         return;
>>>>>>>>>>>>>> 119     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The code above can be simplified so that lines 101-105 are no
>>>>>>>>>>>>>> longer needed. It can be something like this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     // Scan linked-list.
>>>>>>>>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>>         klass_ptr = &klass->next;
>>>>>>>>>>>>>>         klass = *klass_ptr;
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>         return;
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It will take some more time before I get a chance to look at the
>>>>>>>>>>>>>> rest.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>>> Here comes an update that resolves some races that 
>>>>>>>>>>>>>>> happen when
>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>>>>> lock on
>>>>>>>>>>>>>>> basically every operation, and also need to check 
>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>> class-tracking is active and return an appropriate 
>>>>>>>>>>>>>>> result (e.g.
>>>>>>>>>>>>>>> an empty
>>>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>>>>>>>>> set up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>>>>>>>>>>> indexed by tag % slot-count, and we simply prepend the new KlassNode*.
>>>>>>>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>>>>>>>> - When we get notified of a class being unloaded, we look up the
>>>>>>>>>>>>>>>> signature of the reported tag in that table, and remember it in a bag.
>>>>>>>>>>>>>>>> The KlassNode* is then unlinked from the table and deallocated. This is
>>>>>>>>>>>>>>>> a ~O(1) operation too, depending on the depth of the table. In my
>>>>>>>>>>>>>>>> testcase, which hammered the code with class loads and unloads, I
>>>>>>>>>>>>>>>> usually see depths of 2-3, but not usually more. It should be ok.
>>>>>>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that bag and
>>>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>>>> - I also added cleanup code in classTrack_reset() to avoid leaking the
>>>>>>>>>>>>>>>> signatures, KlassNode* etc. when the debug agent gets detached and/or
>>>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>>>> - I also added locks around data-structure manipulation (was missing
>>>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching
>>>>>>>>>>>>>>>> jdb; I am not sure why jdb does that, though. This may be something to
>>>>>>>>>>>>>>>> improve in the future?
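
The table described in the list above might look roughly like this. It is a
sketch under assumptions: the value of CT_SLOT_COUNT and the helper names
table_add() and bucket_depth() are made up for illustration and are not
taken from the webrev, and tags are assumed to be non-negative.

```c
#include <assert.h>
#include <stdlib.h>

#define CT_SLOT_COUNT 263           /* assumed value, not from the patch */

typedef struct KlassNode {
    long long klass_tag;            /* stand-in for jlong */
    char *signature;                /* not copied; owned by the caller */
    struct KlassNode *next;
} KlassNode;

/* Bucket heads; a static array starts out with every slot NULL. */
static KlassNode *table[CT_SLOT_COUNT];

/* Prepend a node for `tag` to its bucket: O(1) whatever the bucket depth. */
static KlassNode *table_add(long long tag, char *signature)
{
    KlassNode *node;
    size_t slot;

    assert(tag >= 0);
    node = malloc(sizeof(*node));
    if (node == NULL) {
        return NULL;
    }
    slot = (size_t)(tag % CT_SLOT_COUNT);
    node->klass_tag = tag;
    node->signature = signature;
    node->next = table[slot];
    table[slot] = node;
    return node;
}

/* Number of nodes chained in one bucket; an unload lookup walks at most
 * this many nodes. */
static size_t bucket_depth(size_t slot)
{
    size_t depth = 0;
    const KlassNode *node;

    for (node = table[slot]; node != NULL; node = node->next) {
        depth++;
    }
    return depth;
}
```

With sequentially assigned tags, consecutive classes land in consecutive
slots, so buckets only deepen once more than CT_SLOT_COUNT classes are live.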
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>>>>>>>>>> The bottleneck now is clearly the actual synthesizing of the
>>>>>>>>>>>>>>>> class-unload events. I don't see how this can be helped when the debug
>>>>>>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am 
>>>>>>>>>>>>>>>>> implementing
>>>>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off 
>>>>>>>>>>>>>>>>> reviewing for
>>>>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be 
>>>>>>>>>>>>>>>>>>> for a
>>>>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the 
>>>>>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a 
>>>>>>>>>>>>>>>>>> table of
>>>>>>>>>>>>>>>>>> currently
>>>>>>>>>>>>>>>>>> prepared classes by building that table when 
>>>>>>>>>>>>>>>>>> classTrack is
>>>>>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets 
>>>>>>>>>>>>>>>>>> loaded. When
>>>>>>>>>>>>>>>>>> unloading
>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and 
>>>>>>>>>>>>>>>>>> compared
>>>>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the 
>>>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>> table gets
>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is 
>>>>>>>>>>>>>>>>>> scanned,
>>>>>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>>>>>> prepared-classes-list) and their signatures are put in the list
>>>>>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption 
>>>>>>>>>>>>>>>>>> here
>>>>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon 
>>>>>>>>>>>>>>>>>> unload,
>>>>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently 
>>>>>>>>>>>>>>>>>> see
>>>>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In addition to all that, this process is only 
>>>>>>>>>>>>>>>>>> activated when
>>>>>>>>>>>>>>>>>> there's an
>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of 
>>>>>>>>>>>>>>>>>>>> classTrack.c.
>>>>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance 
>>>>>>>>>>>>>>>>>>>> until an
>>>>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>
>>>>>
>>>
>>>
>>
>
>


From suenaga at oss.nttdata.com  Tue Mar 24 23:47:28 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 25 Mar 2020 08:47:28 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <f45b2bc7-a0ca-d85b-5998-1e30e99d0d36@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <7aba941d-ea71-25e4-1f04-838ef9764a4f@oracle.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
 <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
 <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>
 <f45b2bc7-a0ca-d85b-5998-1e30e99d0d36@oracle.com>
Message-ID: <cb32ea21-a06a-824a-cd6f-0b731841f352@oss.nttdata.com>

Thanks Serguei!

I will push it when I get a second reviewer.


Yasumasa


On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
> Hi Yasumasa,
> 
> I'm okay with this update.
> My mach5 test run for this patch has passed.
> 
> Thanks,
> Serguei
> 
> 
> On 3/23/20 17:08, Yasumasa Suenaga wrote:
>> Hi Serguei,
>>
>> Thanks for your comment!
>> I uploaded new webrev:
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>>
>> Also I pushed it to submit repo:
>>
>>   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>>
>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>>> Hi Yasumasa,
>>>
>>> The mach5 tier5 testing looks good.
>>> The serviceability/sa/ClhsdbPstack.java test fails without the fix and passes with it.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>>> Hi Yasumasa,
>>>>
>>>> I looked at your changes.
>>>> It is hard to understand if this fully solves the issue.
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>
>>>> @@ -34,10 +34,11 @@
>>>>
>>>>     public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>>        Address libptr = dbg.findLibPtrByAddress(rip);
>>>>        Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>        DwarfParser dwarf = null;
>>>> +      boolean unsupportedDwarf = false;
>>>>
>>>>        if (libptr != null) { // Native frame
>>>>          try {
>>>>            dwarf = new DwarfParser(libptr);
>>>>            dwarf.processDwarf(rip);
>>>> ------------------------------------------------------------------------
>>>>
>>>> @@ -45,24 +46,33 @@
>>>>                   !dwarf.isBPOffsetAvailable())
>>>>                      ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>                      : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>                               .addOffsetTo(dwarf.getCFAOffset());
>>>>          } catch (DebuggerException e) {
>>>> -          // Bail out to Java frame case
>>>> +          if (dwarf != null) {
>>>> +            // DWARF processing should succeed when the frame is native
>>>> +            // but it might fail if CIE has language personality routine
>>>> +            // and/or LSDA.
>>>> +            dwarf = null;
>>>> +            unsupportedDwarf = true;
>>>> +          } else {
>>>> +            throw e;
>>>> +          }
>>>>          }
>>>>        }
>>>>
>>>>        return (cfa == null) ? null
>>>> -                           : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>> +                           : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>     }
>>>>
>>>> @@ -121,13 +131,25 @@
>>>>       }
>>>>
>>>>       return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>     }
>>>>
>>>> -   private DwarfParser getNextDwarf(Address nextPC) {
>>>> -     DwarfParser nextDwarf = null;
>>>> +   @Override
>>>> +   public CFrame sender(ThreadProxy thread) {
>>>> +     if (!possibleNext) {
>>>> +       return null;
>>>> +     }
>>>> +
>>>> +     ThreadContext context = thread.getContext();
>>>> +
>>>> +     Address nextPC = getNextPC(dwarf != null);
>>>> +     if (nextPC == null) {
>>>> +       return null;
>>>> +     }
>>>>
>>>> +     DwarfParser nextDwarf = null;
>>>> +     boolean unsupportedDwarf = false;
>>>>       if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>         nextDwarf = dwarf;
>>>>       } else {
>>>>         Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>         if (libptr != null) {
>>>> ------------------------------------------------------------------------
>>>>
>>>> @@ -138,33 +160,29 @@
>>>>           }
>>>>         }
>>>>       }
>>>>
>>>>       if (nextDwarf != null) {
>>>> +       try {
>>>>         nextDwarf.processDwarf(nextPC);
>>>> +       } catch (DebuggerException e) {
>>>> +         // DWARF processing should succeed when the frame is native
>>>> +         // but it might fail if CIE has language personality routine
>>>> +         // and/or LSDA.
>>>> +         nextDwarf = null;
>>>> +         unsupportedDwarf = true;
>>>>       }
>>>>
>>>> This fix looks like a hack.
>>>> Should we just propagate the DebuggerException instead of trying to maintain the unsupportedDwarf flag?
>>
>> DwarfParser::processDwarf throws DebuggerException if it cannot find DWARF information for the given PC.
>> The PC at this point belongs to the next frame, so the current frame (`this` object) is valid and should still be processed.
>>
>>
>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, FDE, LSDA, etc.) are used without any comments explaining them.
>>>> The code has to be generally readable without looking into the DWARF spec each time.
>>
>> I added comments for them in this webrev.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>>> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>>> Thanks Chris!
>>>>> I'm waiting for reviewers for this change.
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> The failure is due to JDK-8231634, so not something you need to worry about.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> I uploaded new webrev which includes reverting change for ProblemList:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>>
>>>>>>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>>> However, I think it is not caused by this change: ClhsdbJstackXcompStress.java tests `jhsdb jstack` without mixed mode, so it would not parse DWARF.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it:
>>>>>>>>
>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
>>>>>>>>> So please review it:
>>>>>>>>>
>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>>> Thank you so much, David!
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I missed a loop condition, so I fixed it and pushed to submit repo.
>>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>>
>>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :)
>>>>>>>>>>>
>>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> #
>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>> #
>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>>>>>>> #
>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result.
>>>>>>>>>>>>>>>> I will share it with you when I get the job ID.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7.
>>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>   A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>>>>>>>>>>>>>>   B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and this webrev ignores the personality routine and LSDA.
>>>>>>>>>>>>>>>>>>>>>> I also added bailout code for the case where DWARF processing fails due to these concerns.
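
As an illustration of concern A above, a range check on .eh_frame reads could look roughly like the following. This is only a sketch: EhCursor and eh_read_u32 are hypothetical names, not taken from the webrev, and the real parser reads more value kinds than a plain 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cursor over a copy of the .eh_frame bytes. */
typedef struct {
    const unsigned char *cur;  /* next unread byte */
    const unsigned char *end;  /* one past the last valid byte */
} EhCursor;

/* Reads a 32-bit value in host byte order. Returns 1 on success, or 0
 * (leaving the cursor untouched) if fewer than 4 bytes remain. */
static int eh_read_u32(EhCursor *c, uint32_t *out) {
    assert(c != NULL && out != NULL && c->cur <= c->end);
    if ((size_t)(c->end - c->cur) < sizeof(uint32_t)) {
        return 0;  /* would run past the section: let the caller bail out */
    }
    memcpy(out, c->cur, sizeof(uint32_t));
    c->cur += sizeof(uint32_t);
    return 1;
}
```

A caller would treat a 0 return as "DWARF data unusable" and fall back instead of dereferencing past the buffer.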
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
> 
> 

From serguei.spitsyn at oracle.com  Wed Mar 25 09:44:13 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Mar 2020 02:44:13 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
Message-ID: <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200325/e9f68280/attachment-0001.htm>

From rkennke at redhat.com  Wed Mar 25 13:00:45 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 25 Mar 2020 14:00:45 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
Message-ID: <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>

Hi Serguei,

> The fix looks pretty clean now.
> I also like the new name of the lock. :)

Thank you!

> Just one comment below.
> 
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
> 
> 110 if (tag != 0l) {
> 111 return; // Already added
>  112     }
> 
>  It is better to use a named constant or macro instead.
>  Also, it'd be nice to add a short comment about what this value is.

As I replied to Chris earlier, this whole block can be turned into an
assert. I also made a constant for the value 0, which should be pretty
much self-explanatory.

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
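
Roughly, the shape of that change is sketched below. It is a standalone illustration, not the code from the webrev: tag_t stands in for jlong, and the names NOT_TAGGED, is_tagged and tag_for_new_class are assumptions chosen for the example. The only JVMTI fact relied on is that an object without a tag reports a tag of zero.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t tag_t;          /* stand-in for JVMTI's jlong */
#define NOT_TAGGED ((tag_t)0)   /* JVMTI reports 0 for objects that have no tag */

static int is_tagged(tag_t tag) {
    return tag != NOT_TAGGED;
}

/* Returns the tag to store for a newly prepared class. The caller must only
 * pass classes that are still untagged; the assert documents that invariant
 * instead of silently returning early. */
static tag_t tag_for_new_class(tag_t current_tag, tag_t new_tag) {
    assert(!is_tagged(current_tag));
    return new_tag;
}
```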

> How do you test the fix?

I am using a manual test that is provided in this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1751985

"Script to compare performance of GC with and without debugger, when
many classes are loaded and classes are being unloaded":

https://bugzilla.redhat.com/attachment.cgi?id=1640688

I am also using this test and manually attach/detach jdb a couple of
times in a row to check that disconnecting and reconnecting works well
(this tended to deadlock or crash with an earlier version of the patch,
and is now looking good).

I am also running tier1 and tier2 tests locally, and as soon as we all
agree that the fix is reasonable, I will push it to the submit repo. I
am not sure if any of those tests actually exercise that code, though.
Let me know if you want me to run any specific tests.

Thank you,
Roman


> Thanks,
> Serguei
> 
> 
> On 3/20/20 08:30, Roman Kennke wrote:
>> I believe I came up with a much simpler solution that also solves the
>> problems of the existing one, and the ones I proposed earlier.
>>
>> It turns out that we can take advantage of the fact that we can use
>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>> to the signature of a class into the tag, and pull it out again when we
>> get notified that the class gets unloaded.
>>
>> This means we don't need an extra data-structure to keep track of
>> classes and signatures, and it also makes the story around locking
>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>> classes needed (as in the current implementation) and no searching of
>> table needed (like in my previous attempts).
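
The pointer-in-a-tag idea can be sketched in isolation as follows. This is an illustration under assumptions, not the patch itself: tag_t stands in for jlong, malloc stands in for the agent's allocator, and signature_to_tag / tag_to_signature are hypothetical helper names. In the agent the tag would be attached with the JVMTI SetTag function and handed back as the tag argument of the ObjectFree callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int64_t tag_t;  /* stand-in for JVMTI's jlong (64 bits, wide enough for a pointer) */

/* Copies the signature and returns the copy's address packed into a tag.
 * Returns 0 if the allocation fails. */
static tag_t signature_to_tag(const char *signature) {
    assert(signature != NULL);
    char *copy = malloc(strlen(signature) + 1);
    if (copy == NULL) {
        return 0;
    }
    strcpy(copy, signature);
    return (tag_t)(intptr_t)copy;
}

/* Recovers the signature from a tag in O(1): no table, no list scan.
 * The caller owns the returned string and must free it. */
static char *tag_to_signature(tag_t tag) {
    return (char *)(intptr_t)tag;
}
```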
>>
>> Please review this new revision:
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>
>> (Notice that there still appears to be a performance bottleneck with
>> class-unloading when an actual debugger is attached. This doesn't seem
>> to be related to the classTrack.c implementation though, but looks like
>> a consequence of getting all those class-unload notifications over the
>> wire. My testcase generates 1000s of them, and it's clogging up the
>> buffers.)
>>
>> I am not sure why jdb always needs to enable the class-unload listener. A
>> simple hack disables it, and performance is brilliant, even when jdb is
>> attached:
>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>
>> But this is not in the scope of this bug.
>>
>> Roman
>>
>>
>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>> Sorry, forgot to complete my comments at the end (see below).
>>>
>>>
>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>> Hi Roman,
>>>>
>>>> Thank you for the update and sorry for the latency in review.
>>>>
>>>> Some comments are below.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>
>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>   88 {
>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>> 90 if (currentClassTag == -1) {
>>>> 91 // Class tracking not initialized, nobody's interested
>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>> 93 return;
>>>>   94     }
>>>> Just a question:
>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>       the class tracking if class tracking has not been initialized?
>>>>
>>>> 70 static jlong currentClassTag;
>>>>  I wonder whether a name like lastClassTag or highestClassTag would be
>>>>  better.
>>>>
>>>> 99 KlassNode* klass = *klass_ptr;
>>>> 100
>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>> 103 klass_ptr = &klass->next;
>>>> 104 klass = *klass_ptr;
>>>> 105 }
>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>> 108 return;
>>>> 109 }
>>>>  It seems to me that something is wrong in the condition at L106 above.
>>>>  Should it be:
>>>>     if (klass == NULL || klass->klass_tag != tag)
>>>>
>>>>  Otherwise, how can the second check ever work correctly, as the return
>>>> will always happen when (klass != NULL)?
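
For reference, the scan-and-unlink pattern under discussion can be written as a small standalone function. This is a sketch only: the KlassNode layout and the name unlink_by_tag are assumptions for the example, not the webrev's code. Note that once the loop exits, a non-NULL node necessarily has the matching tag, so checking for NULL alone is sufficient.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout, modelled on the names used in the review. */
typedef struct KlassNode {
    long long klass_tag;
    const char *signature;
    struct KlassNode *next;
} KlassNode;

/* Unlinks and returns the node carrying the given tag, or NULL if absent. */
static KlassNode *unlink_by_tag(KlassNode **head, long long tag) {
    assert(head != NULL);
    KlassNode **klass_ptr = head;
    KlassNode *klass = *klass_ptr;
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {
        return NULL;  /* tag not found: ignore */
    }
    *klass_ptr = klass->next;  /* splice the node out of the list */
    klass->next = NULL;
    return klass;
}
```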
>>>>
>>>>
>>>> There are several places in this file with inconsistent indentation:
>>>> 90 if (currentClassTag == -1) {
>>>> 91 // Class tracking not initialized, nobody's interested
>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>> 93 return;
>>>>   94     }
>>>>  ...
>>>> 152 if (currentClassTag == -1) {
>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>> 155 return;
>>>>  156     }
>>>>  ...
>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>  163     }
>>>> 164 if (tag != 0l) {
>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>> 166 return; // Already added
>>>>  167     }
>>>>  ...
>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>> 282 {
>>>> 283 char* sig = (char*)signatureVoid;
>>>> 284 jvmtiDeallocate(sig);
>>>> 285 return JNI_TRUE;
>>>>  286 }
>>>>  ...
>>>>  291 void
>>>>  292 classTrack_reset(void)
>>>>  293 {
>>>> 294 int idx;
>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>> 296
>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>> 298 KlassNode* node = table[idx];
>>>> 299 while (node != NULL) {
>>>> 300 KlassNode* next = node->next;
>>>> 301 jvmtiDeallocate(node->signature);
>>>> 302 jvmtiDeallocate(node);
>>>> 303 node = next;
>>>> 304 }
>>>> 305 }
>>>> 306 jvmtiDeallocate(table);
>>>> 307
>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>> 310
>>>> 311 currentClassTag = -1;
>>>> 312
>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>> 314 trackingEnv = NULL;
>>>> 315
>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>
>>>> Could you please fix several comments below?
>>>>
>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>  The comma is not needed.
>>>>  Would it be better to replace: klass tags => klass_tag's ?
>>>>
>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>  Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>
>>>> 84 * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
>>>>  It would be better to use words like "store" or "record", and "Find"
>>>>  should not start with a capital letter:
>>>>  Invoke the callback when classes are freed, find and record the
>>>>  signature in deletedSignatureBag.
>>>>
>>>> 96 // Find deleted KlassNode
>>>> 133 // Class tracking not initialized, nobody's interested
>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>> 158 /* Check this is not a duplicate */
>>>>  A dot is missing at the end.
>>>>
>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>  Conversely, the dot is not needed here as the comment does not start
>>>>  with a capital letter.
>>>>
>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>> 112 // in klass, and the pointer to it in klass_node.
>>>  The comment above can be better. Maybe, something like:
>>>    "At this point, we found the KlassNode matching the klass tag (and it is
>>> linked)."
>>>
>>>> 113 // Remember the unloaded signature.
>>>  Better: Record the signature of the unloaded class and unlink it.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>> Thanks,
>>>> Serguei 
>>>>
>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>> Hello all,
>>>>>
>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>> more testing and also field-/torture-testing by a customer who is happy
>>>>> now. :-)
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>
>>>>>> Hi Serguei,
>>>>>>
>>>>>> Thanks for reviewing!
>>>>>>
>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>
>>>>>> Let me know what you think!
>>>>>> Roman
>>>>>>
>>>>>>> Hi Roman,
>>>>>>>
>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>
>>>>>>> I have a couple of quick comments.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>
>>>>>>> 72 /*
>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>> 74 */
>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>> 76
>>>>>>> 77 /*
>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be
>>>>>>> accessed under
>>>>>>> 79 * deletedTagLock,
>>>>>>> 80 */
>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>
>>>>>>>   The comments contradict each other.
>>>>>>>   I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>> instead of deletedTagLock.
>>>>>>>   Also, the comma at the end must be replaced with a dot.
>>>>>>>
>>>>>>>
>>>>>>> 101 // Tag not found? Ignore.
>>>>>>> 102 if (klass == NULL) {
>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>> 104 return;
>>>>>>> 105 }
>>>>>>>  106 
>>>>>>> 107 // Scan linked-list.
>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>> 111 klass = *klass_ptr;
>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>  113     }
>>>>>>> 114
>>>>>>> 115 // Tag not found? Ignore.
>>>>>>> 116 if (found_tag != tag) {
>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>> 118 return;
>>>>>>>  119     }
>>>>>>>
>>>>>>>
>>>>>>>  The code above can be simplified, so that the lines 101-105 are not
>>>>>>> needed anymore.
>>>>>>>  It can be something like this:
>>>>>>>
>>>>>>> // Scan linked-list.
>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>> klass_ptr = &klass->next;
>>>>>>> klass = *klass_ptr;
>>>>>>>      }
>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>>> return;
>>>>>>>      }
>>>>>>>
>>>>>>> It will take some time before I get a chance to look at the rest.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>>>>>> list) when we're not.
>>>>>>>>
>>>>>>>> Updated webrev:
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>
>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>>>> indexed by tag % slot-count, and we simply prepend the new KlassNode*.
>>>>>>>>> This is an O(1) operation.
>>>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation
>>>>>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>> - when processUnloads() gets called, we simply hand out that bag, and
>>>>>>>>> allocate a new one.
>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or
>>>>>>>>> re-attached (was missing before).
>>>>>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>>>>>> before).
>>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>>>>>> in the future?
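
The table scheme in the list above can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions, not the code from the webrev: SLOT_COUNT, the KlassNode layout, and the track_add/track_remove names are all hypothetical, and plain malloc/free stand in for the agent's allocation helpers. Locking is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263  /* hypothetical slot count */

/* Hypothetical node layout: one entry per prepared class. */
typedef struct KlassNode {
    long long tag;
    char *signature;
    struct KlassNode *next;
} KlassNode;

/* Each slot is the head of a singly linked list of nodes. */
static KlassNode *table[SLOT_COUNT];

static size_t slot_for(long long tag) {
    return (size_t)((unsigned long long)tag % SLOT_COUNT);
}

/* Register a class: prepend to its slot's list. Returns 1 on success. */
int track_add(long long tag, const char *signature) {
    assert(signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    size_t len = strlen(signature) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->tag = tag;
    size_t slot = slot_for(tag);
    node->next = table[slot];   /* prepend: no list traversal needed */
    table[slot] = node;
    return 1;
}

/*
 * Handle an unload: unlink the node for `tag` and hand its signature
 * to the caller (who must free it). Returns NULL if the tag is unknown.
 */
char *track_remove(long long tag) {
    KlassNode **link = &table[slot_for(tag)];
    while (*link != NULL && (*link)->tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return NULL;            /* tag not found - ignore */
    }
    KlassNode *node = *link;
    *link = node->next;         /* unlink without tracking a "previous" node */
    char *signature = node->signature;
    free(node);
    return signature;
}
```

Removal walks only the one slot's chain, so its cost depends on the chain depth rather than on the total number of tracked classes.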
>>>>>>>>>
>>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>>> The bottleneck now is clearly the actual synthesizing of the class-unload
>>>>>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>>>
>>>>>>>>> Updated webrev:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>
>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>>>>>
>>>>>>>>>> Thanks, Roman
>>>>>>>>>>
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>> Sure.
>>>>>>>>>>>
>>>>>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>
>>>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>>>>>>> and then add new classes whenever a class gets loaded. When unloading
>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>>>> complexity.
>>>>>>>>>>>
>>>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>>>>>>>>>
>>>>>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>
>>>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>>>>>> worth the effort).
>>>>>>>>>>>
>>>>>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>
>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
> 


From leonid.mesnik at oracle.com  Wed Mar 25 15:55:36 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 08:55:36 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
Message-ID: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>

Hi

Could you please review the following fix, which changes LingeredApp to 
prepend VM options to java/vm.test.opts when startApp is used, and 
provides startAppVmOpts to override the options completely.

The intention is to avoid issues like the one in this bug, where test/jtreg 
options were ignored by tests. I also fixed some tests where the intention 
was to append VM options rather than to override them.

webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/

bug: https://bugs.openjdk.java.net/browse/JDK-8240698

Leonid


From igor.ignatyev at oracle.com  Wed Mar 25 16:40:15 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 25 Mar 2020 09:40:15 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
Message-ID: <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>

Hi Leonid,

I have briefly looked at the patch, a few comments so far:

test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
 - at L#114, could you please call the static method using the class name (as opposed to using an instance)? Or was it meant to be theApp.runAppVmOpts(vmArgs)?

test/lib/jdk/test/lib/apps/LingeredApp.java:
- it seems that code indent of startApp(LingeredApp, String[]) isn't correct
- I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet)

Thanks,
-- Igor

> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
> 
> Hi
> 
> Could you please review the following fix, which changes LingeredApp to prepend VM options to java/vm.test.opts when startApp is used, and provides startAppVmOpts to override the options completely.
> 
> The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. I also fixed some tests where the intention was to append VM options rather than to override them.
> 
> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
> 
> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
> 
> Leonid
> 


From ioi.lam at oracle.com  Wed Mar 25 16:46:07 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 25 Mar 2020 09:46:07 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
Message-ID: <ae75c731-ce56-83c8-6323-566cfb5c2607@oracle.com>

Hi Leonid,

Thanks for fixing this.

If you just look at a test case, it's not very obvious what the 
difference is between

     LingeredApp.startApp(myApp, "-XX:Xyz=123");
     LingeredApp.startAppVmOpts(myApp, "-XX:Xyz=123");

How about renaming startAppVmOpts/runAppVmOpts -> 
startAppExactVmOpts/runAppExactVmOpts?

===

    public static void startApp(LingeredApp theApp, String... additonalVMOpts)
            throws IOException {
        String[] vmOpts = additonalVMOpts == null
                ? Utils.getTestJavaOpts()
                : Utils.appendTestJavaOpts(additonalVMOpts);
        startAppVmOpts(theApp, vmOpts);
    }

I think there's no need to check for additonalVMOpts == null. If the 
caller passes no arguments, additonalVMOpts will be an empty array (not 
null).

You will get a null for additonalVMOpts only if the caller explicitly 
passes in a null, like this:

       LingeredApp.startApp(theApp, null);

But this is not good programming style, and you will get a javac warning:

public class DotDotDot {
   public static void main(String args[]) {
     doit();
     doit(null);
   }
   static void doit(String ...args) {
     System.out.println(args);
   }
}

$ javac DotDotDot.java
DotDotDot.java:4: warning: non-varargs call of varargs method with inexact argument type for last parameter;
     doit(null);
          ^
   cast to String for a varargs call
   cast to String[] for a non-varargs call and to suppress this warning
1 warning

Thanks!
- Ioi


On 3/25/20 8:55 AM, Leonid Mesnik wrote:
> Hi
>
> Could you please review the following fix, which changes LingeredApp to 
> prepend VM options to java/vm.test.opts when startApp is used, and 
> provides startAppVmOpts to override the options completely.
>
> The intention is to avoid issues like the one in this bug, where test/jtreg 
> options were ignored by tests. I also fixed some tests where the intention 
> was to append VM options rather than to override them.
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>
> Leonid
>


From stefan.karlsson at oracle.com  Wed Mar 25 17:14:03 2020
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 25 Mar 2020 18:14:03 +0100
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
Message-ID: <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>

On 2020-03-25 17:40, Igor Ignatyev wrote:
> Hi Leonid,
>
> I have briefly looked at the patch, a few comments so far:
>
> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>   - at L#114, could you please call the static method using the class name (as opposed to using an instance)? Or was it meant to be theApp.runAppVmOpts(vmArgs)?
>
> test/lib/jdk/test/lib/apps/LingeredApp.java:
> - it seems that code indent of startApp(LingeredApp, String[]) isn't correct
> - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet)

I was going to say the same. Jtreg has the concept of "java options" and 
"vm options". We have had a fair share of bugs and wasted time when 
tests have been using the "vm options" part (VM_OPTIONS, 
test.vm.options, etc), and we've been moving away from using that way to 
pass options. I recently cleaned up some of this with:

8237111: LingeredApp should be started with getTestJavaOpts

Because of this, I would prefer if we used a name that doesn't include 
"VmOpts", because it's too similar to the other concept. Some suggestions:
  startAppJavaOptions
  startAppUsingJavaOptions
  startAppWithJavaOptions
  startAppExactJavaOptions
  startAppJvmOptions

Thanks,
StefanK

> Thanks,
> -- Igor
>
>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>>
>> Hi
>>
>> Could you please review the following fix, which changes LingeredApp to prepend VM options to java/vm.test.opts when startApp is used, and provides startAppVmOpts to override the options completely.
>>
>> The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. I also fixed some tests where the intention was to append VM options rather than to override them.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>
>> Leonid
>>


From leonid.mesnik at oracle.com  Wed Mar 25 17:52:18 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 10:52:18 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
Message-ID: <dca346a7-4f17-57ee-2e31-af4821c32cf6@oracle.com>

Igor, Stefan

Thank you for feedback, see my comments inline.

On 3/25/20 10:14 AM, Stefan Karlsson wrote:
> On 2020-03-25 17:40, Igor Ignatyev wrote:
>> Hi Leonid,
>>
>> I have briefly looked at the patch, a few comments so far:
>>
>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>   - at L#114, could you please call the static method using the class name 
>> (as opposed to using an instance)? Or was it meant to be 
>> theApp.runAppVmOpts(vmArgs)?
No, it is a plain bug. I first wanted to use a non-static method and 
forgot to change the call to use the class name.
>>
>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>> - it seems that code indent of startApp(LingeredApp, String[]) isn't 
>> correct
>> - I don't like startAppVmOpts name, but unfortunately don't have a 
>> better suggestion (yet)
>
> I was going to say the same. Jtreg has the concept of "java options" 
> and "vm options". We have had a fair share of bugs and wasted time 
> when tests have been using the "vm options" part (VM_OPTIONS, 
> test.vm.options, etc), and we've been moving away from using that way 
> to pass options. I recently cleaned up some of this with:
>
> 8237111: LingeredApp should be started with getTestJavaOpts
>
> Because of this, I would prefer if we used a name that doesn't include 
> "VmOpts", because it's too similar to the other concept. Some suggestions:
>  startAppJavaOptions
>  startAppUsingJavaOptions
>  startAppWithJavaOptions
>  startAppExactJavaOptions
>  startAppJvmOptions

I prefer 'startAppExactJvmOptions' (and the same for runApp...) to make it 
clear that this method doesn't use the default test options and that the 
whole combination should be prepared by the user.

And I would leave startApp(String... additionalJvmOpts) for cases where 
additional options are prepended to the standard set.

Let me know what you think about this.

Leonid

>
> Thanks,
> StefanK
>
>> Thanks,
>> -- Igor
>>
>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>> <leonid.mesnik at oracle.com> wrote:
>>>
>>> Hi
>>>
>>> Could you please review the following fix, which changes LingeredApp to 
>>> prepend VM options to java/vm.test.opts when startApp is used, and 
>>> provides startAppVmOpts to override the options completely.
>>>
>>> The intention is to avoid issues like the one in this bug, where test/jtreg 
>>> options were ignored by tests. I also fixed some tests where the 
>>> intention was to append VM options rather than to override them.
>>>
>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>
>>> Leonid
>>>
>

From chris.plummer at oracle.com  Wed Mar 25 18:07:48 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Mar 2020 11:07:48 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
Message-ID: <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>

Hi Roman,

Regarding the new assert:

    if (gdata && gdata->assertOn) {
        // Check this is not already tagged.
        jlong tag;
        error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
        if (error != JVMTI_ERROR_NONE) {
            EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
        }
        JDI_ASSERT(tag == NOT_TAGGED);
    }

I think you should remove the gdata check. gdata should never be NULL 
when you get to this code. If it is ever NULL then there's a bug, and 
the check will hide the bug.
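
As a rough standalone illustration of that concern (not the agent's code: GlobalData and both function names are hypothetical, and the plain C assert in the strict form is an assumption added only to make the failure visible), compare a guarded check with one that treats a NULL pointer as a bug:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the agent's global data structure. */
typedef struct {
    int assertOn;
} GlobalData;

/* Guarded form: a NULL gdata silently disables the check. */
int check_enabled_guarded(const GlobalData *gdata) {
    return gdata != NULL && gdata->assertOn;
}

/* Strict form: a NULL gdata is treated as a bug and fails immediately. */
int check_enabled_strict(const GlobalData *gdata) {
    assert(gdata != NULL);
    return gdata->assertOn != 0;
}
```

With the guarded form, a caller that passes NULL just sees the check skipped, so nothing points at the real defect.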

Regarding testing, after you do the submit repo testing let me know the 
jobID and I'll do additional testing on it.

thanks,

Chris

On 3/25/20 6:00 AM, Roman Kennke wrote:
> Hi Serguei,
>
>> The fix looks pretty clean now.
>> I also like new name of the lock.:)
> Thank you!
>
>> Just one comment below.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>
>> 110 if (tag != 0l) {
>> 111 return; // Already added
>>   112     }
>>
>>   It is better to use a named constant or macro instead.
>>   Also, it'd be nice to add a short comment about what this value is.
> As I replied to Chris earlier, this whole block can be turned into an
> assert. I also made a constant for the value 0, which should be pretty
> much self-explaining.
>
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>
>> How do you test the fix?
> I am using a manual test that is provided in this bug report:
> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>
> "Script to compare performance of GC with and without debugger, when
> many classes are loaded and classes are being unloaded":
>
> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>
> I am also using this test and manually attach/detach jdb a couple of
> times in a row to check that disconnecting and reconnecting works well
> (this tended to deadlock or crash with an earlier version of the patch,
> and is now looking good).
>
> I am also running tier1 and tier2 tests locally, and as soon as we all
> agree that the fix is reasonable, I will push it to the submit repo. I
> am not sure if any of those tests actually exercise that code, though.
> Let me know if you want me to run any specific tests.
>
> Thank you,
> Roman
>
>
>
>> Thanks,
>> Serguei
>>
>>
>> On 3/20/20 08:30, Roman Kennke wrote:
>>> I believe I came up with a much simpler solution that also solves the
>>> problems of the existing one, and the ones I proposed earlier.
>>>
>>> It turns out that we can take advantage of the fact that we can use
>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>> to the signature of a class into the tag, and pull it out again when we
>>> get notified that the class gets unloaded.
>>>
>>> This means we don't need an extra data-structure to keep track of
>>> classes and signatures, and it also makes the story around locking
>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>> classes needed (as in the current implementation) and no searching of
>>> table needed (like in my previous attempts).
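
The pointer-in-tag idea described above can be sketched in standalone C. This is a simplified model rather than the webrev's code: tag64 stands in for the 64-bit JVMTI tag, on_object_free imitates an unload callback, the fixed-size deleted array stands in for the bag of signatures, and all of these names are hypothetical. It assumes a platform where a data pointer fits in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t tag64;            /* stands in for a 64-bit JVMTI tag */
#define NOT_TAGGED ((tag64)0)     /* tag value of an untagged object */

/* Store the signature pointer itself in the tag. */
tag64 signature_to_tag(char *signature) {
    assert(signature != NULL);    /* NULL would collide with NOT_TAGGED */
    return (tag64)(intptr_t)signature;
}

/* Recover the signature pointer from the tag. */
char *tag_to_signature(tag64 tag) {
    return (char *)(intptr_t)tag;
}

/* Hypothetical stand-in for the bag of unloaded signatures. */
#define MAX_DELETED 16
char *deleted[MAX_DELETED];
size_t deleted_count;

/*
 * Imitates an unload notification: the tag alone is enough to find
 * the signature, so no table lookup or scan is required.
 */
void on_object_free(tag64 tag) {
    if (tag == NOT_TAGGED || deleted_count == MAX_DELETED) {
        return;
    }
    deleted[deleted_count++] = tag_to_signature(tag);
}
```

Because the tag carries the pointer, recording an unloaded signature is a constant-time operation with no auxiliary data structure to keep consistent.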
>>>
>>> Please review this new revision:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>
>>> (Notice that there still appears to be a performance bottleneck with
>>> class-unloading when an actual debugger is attached. This doesn't seem
>>> to be related to the classTrack.c implementation though, but looks like
>>> a consequence of getting all those class-unload notifications over the
>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>> buffers.)
>>>
>>> I am not sure why jdb needs to enable class-unload listener always. A
>>> simple hack disables it, and performance is brilliant, even when jdb is
>>> attached:
>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>
>>> But this is not in the scope of this bug.
>>>
>>> Roman
>>>
>>>
>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>
>>>>
>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Roman,
>>>>>
>>>>> Thank you for the update and sorry for the latency in review.
>>>>>
>>>>> Some comments are below.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>
>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>    88 {
>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>> 90 if (currentClassTag == -1) {
>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>> 93 return;
>>>>>    94     }
>>>>> Just a question:
>>>>>    Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>        the class tracking if class tracking has not been initialized?
>>>>>
>>>>> 70 static jlong currentClassTag;
>>>>> I'm wondering whether a better name would be something like
>>>>> lastClassTag or highestClassTag.
>>>>>
>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>> 100
>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>> 103 klass_ptr = &klass->next;
>>>>> 104 klass = *klass_ptr;
>>>>> 105 }
>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>> 108 return;
>>>>>   109     }
>>>>>   It seems to me that something is wrong in the condition at L106 above.
>>>>>   Should it be:
>>>>>      if (klass == NULL || klass->klass_tag != tag)
>>>>>
>>>>>   Otherwise, how can the second check ever work correctly as the return
>>>>> will always happen when (klass != NULL)?
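
The concern about L106 can be illustrated with a minimal standalone sketch. The Node type and find_node function are hypothetical stand-ins, not code from classTrack.c. After the scan loop the pointer is either NULL (tag not found) or points at the matching node, so testing for NULL is sufficient. With `!=` written by mistake, every hit is treated as a miss, and a real miss dereferences NULL on the right-hand side of `||`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the KlassNode discussed above. */
typedef struct Node {
    long long tag;
    struct Node *next;
} Node;

/*
 * Scan a singly linked list for `tag`.
 * Returns the matching node, or NULL when the tag is absent.
 */
Node *find_node(Node *head, long long tag) {
    Node *node = head;
    while (node != NULL && node->tag != tag) {
        node = node->next;
    }
    /* The loop exits with node == NULL (miss) or node->tag == tag (hit). */
    if (node == NULL) {
        return NULL;              /* tag not found - ignore */
    }
    assert(node->tag == tag);     /* documents the loop's exit condition */
    return node;
}
```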
>>>>>
>>>>>   
>>>>> There are several places in this file with incorrect indentation:
>>>>> 90 if (currentClassTag == -1) {
>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>> 93 return;
>>>>>    94     }
>>>>>   ...
>>>>> 152 if (currentClassTag == -1) {
>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>> 155 return;
>>>>>   156     }
>>>>>   ...
>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>   163     }
>>>>> 164 if (tag != 0l) {
>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>> 166 return; // Already added
>>>>>   167     }
>>>>>   ...
>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>> 282 {
>>>>> 283 char* sig = (char*)signatureVoid;
>>>>> 284 jvmtiDeallocate(sig);
>>>>> 285 return JNI_TRUE;
>>>>>   286 }
>>>>>   ...
>>>>>   291 void
>>>>>   292 classTrack_reset(void)
>>>>>   293 {
>>>>> 294 int idx;
>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>> 296
>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>> 298 KlassNode* node = table[idx];
>>>>> 299 while (node != NULL) {
>>>>> 300 KlassNode* next = node->next;
>>>>> 301 jvmtiDeallocate(node->signature);
>>>>> 302 jvmtiDeallocate(node);
>>>>> 303 node = next;
>>>>> 304 }
>>>>> 305 }
>>>>> 306 jvmtiDeallocate(table);
>>>>> 307
>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>> 310
>>>>> 311 currentClassTag = -1;
>>>>> 312
>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>> 314 trackingEnv = NULL;
>>>>> 315
>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>
>>>>> Could you please fix several comments below?
>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>   The comma is not needed.
>>>>>   Would it be better to replace: klass tags => klass_tag's ?
>>>>>
>>>>>
>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>>>>> consistent
>>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>
>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>> remembers it in deletedSignatureBag.
>>>>> It would be better to use words like "store" or "record", and "Finds"
>>>>> should not start with a capital letter:
>>>>> Invoke the callback when classes are freed, find and record the
>>>>> signature in deletedSignatureBag.
>>>>>
>>>>> 96 // Find deleted KlassNode
>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>> 158 /* Check this is not a duplicate */
>>>>> A dot is missing at the end.
>>>>>
>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>> Conversely, the dot is not needed here, as the comment does not start
>>>>> with a capital letter.
>>>>>
>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>   The comment above could be better. Maybe something like:
>>>>     "At this point, we found the KlassNode matching the klass tag (and it is
>>>> linked)."
>>>>
>>>>> 113 // Remember the unloaded signature.
>>>>   Better: Record the signature of the unloaded class and unlink it.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>> Hello all,
>>>>>>
>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>> more testing and also field-/torture-testing by a customer who is happy
>>>>>> now. :-)
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> Thanks for reviewing!
>>>>>>>
>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>
>>>>>>> Let me know what you think!
>>>>>>> Roman
>>>>>>>
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>
>>>>>>>> I have a couple of quick comments.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>
>>>>>>>> 72 /*
>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>> 74 */
>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>> 76
>>>>>>>> 77 /*
>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be
>>>>>>>> accessed under
>>>>>>>> 79 * deletedTagLock,
>>>>>>>> 80 */
>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>
>>>>>>>>    The comments contradict each other.
>>>>>>>>    I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>> instead of deletedTagLock.
>>>>>>>>    Also, the comma at the end must be replaced with a dot.
>>>>>>>>
>>>>>>>>
>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>> 102 if (klass == NULL) {
>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 104 return;
>>>>>>>> 105 }
>>>>>>>>   106
>>>>>>>> 107 // Scan linked-list.
>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>   113     }
>>>>>>>> 114
>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 118 return;
>>>>>>>>   119     }
>>>>>>>>
>>>>>>>>
>>>>>>>>   The code above can be simplified, so that lines 101-105 are not
>>>>>>>> needed anymore.
>>>>>>>>   It can be something like this:
>>>>>>>>
>>>>>>>> // Scan linked-list.
>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>> klass_ptr = &klass->next;
>>>>>>>> klass = *klass_ptr;
>>>>>>>>       }
>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>>>> return;
>>>>>>>>       }
>>>>>>>>
>>>>>>>> It will take more time when I get a chance to look at the rest.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>>>>>>> list) when we're not.
>>>>>>>>>
>>>>>>>>> Updated webrev:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>
>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>>>>>>>> indexed by tag % slot-count, and a new KlassNode* is simply prepended.
>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation
>>>>>>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>> - when processUnloads() gets called, we simply hand out that bag, and
>>>>>>>>>> allocate a new one.
>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or
>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>>>>>>> before).
>>>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>>>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>>>>>>> in the future?
>>>>>>>>>>
>>>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>>>> The bottleneck now is clearly actual synthesizing the class-unload
>>>>>>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>>>>
>>>>>>>>>> Updated webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>
>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>>>>>>
>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>
>>>>>>>>>>>   Hi Chris,
>>>>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>> Sure.
>>>>>>>>>>>>
>>>>>>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>
>>>>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When unloading
>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>>>>> complexity.
>>>>>>>>>>>>
>>>>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>>>>>>>>>>
>>>>>>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>
>>>>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>
>>>>>>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
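
The table-based tracking and the simplified bucket scan discussed in the quoted review can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions, not the code from the webrev: SLOT_COUNT, track(), untrack() and copy_string() are hypothetical names, malloc/free stand in for the agent's own allocator, and the locking the thread talks about is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_COUNT 263 /* hypothetical bucket count, not taken from the webrev */

typedef struct KlassNode {
    int64_t klass_tag;       /* tag the class was registered with */
    char *signature;         /* owned copy of the class signature */
    struct KlassNode *next;  /* next node in the same bucket */
} KlassNode;

static KlassNode *table[SLOT_COUNT];

/* Duplicate a string on the heap (plain malloc stands in for the agent allocator). */
static char *copy_string(const char *s) {
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL) abort();
    memcpy(copy, s, len);
    return copy;
}

/* Register a prepared class: prepend to the bucket chosen by tag % SLOT_COUNT. */
static void track(int64_t tag, const char *signature) {
    assert(signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) abort();
    node->klass_tag = tag;
    node->signature = copy_string(signature);
    KlassNode **slot = &table[(uint64_t)tag % SLOT_COUNT];
    node->next = *slot;
    *slot = node;
}

/*
 * Unlink the node carrying the given tag and hand back its signature.
 * Returns NULL if the tag is unknown; otherwise the caller frees the result.
 */
static char *untrack(int64_t tag) {
    KlassNode **klass_ptr = &table[(uint64_t)tag % SLOT_COUNT];
    KlassNode *klass = *klass_ptr;
    /* Scan the bucket; klass_ptr always points at the link that leads to klass. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {
        return NULL; /* klass not found - ignore */
    }
    *klass_ptr = klass->next; /* unlink without special-casing the bucket head */
    char *signature = klass->signature;
    free(klass);
    return signature;
}
```

Because the loop only exits with klass == NULL or with a matching tag, a single NULL test after the loop is enough in this sketch; the cost of untrack() is proportional to the bucket depth, which is what the "~O(1)" in the quoted description refers to.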


From rkennke at redhat.com  Wed Mar 25 18:37:23 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 25 Mar 2020 19:37:23 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
Message-ID: <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>

Hi Chris,

> Regarding the new assert:
> 
>  105     if (gdata && gdata->assertOn) {
>  106         // Check this is not already tagged.
>  107         jlong tag;
>  108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>  109         if (error != JVMTI_ERROR_NONE) {
>  110             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>  111         }
>  112         JDI_ASSERT(tag == NOT_TAGGED);
>  113     }
> 
> I think you should remove the gdata check. gdata should never be NULL
> when you get to this code. If it is ever NULL then there's a bug, and
> the check will hide the bug.

Ok, will remove this.

> Regarding testing, after you do the submit repo testing let me know the
> jobID and I'll do additional testing on it.

I did the submit repo earlier today, and it came back green:

mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762

Thanks,
Roman

> thanks,
> 
> Chris
> 
> On 3/25/20 6:00 AM, Roman Kennke wrote:
>> Hi Sergei,
>>
>>> The fix looks pretty clean now.
>>> I also like new name of the lock.:)
>> Thank you!
>>
>>> Just one comment below.
>>>
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>
>>>
>>> 110 if (tag != 0l) {
>>> 111 return; // Already added
>>> 112 }
>>>
>>>   It is better to use a named constant or macro instead.
>>>   Also, it'd be nice to add a short comment about what this value is.
>> As I replied to Chris earlier, this whole block can be turned into an
>> assert. I also made a constant for the value 0, which should be pretty
>> much self-explaining.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>
>>> How do you test the fix?
>> I am using a manual test that is provided in this bug report:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>
>> "Script to compare performance of GC with and without debugger, when
>> many classes are loaded and classes are being unloaded":
>>
>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>
>> I am also using this test and manually attach/detach jdb a couple of
>> times in a row to check that disconnecting and reconnecting works well
>> (this tended to deadlock or crash with an earlier version of the patch,
>> and is now looking good).
>>
>> I am also running tier1 and tier2 tests locally, and as soon as we all
>> agree that the fix is reasonable, I will push it to the submit repo. I
>> am not sure if any of those tests actually exercise that code, though.
>> Let me know if you want me to run any specific tests.
>>
>> Thank you,
>> Roman
>>
>>
>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>> I believe I came up with a much simpler solution that also solves the
>>>> problems of the existing one, and the ones I proposed earlier.
>>>>
>>>> It turns out that we can take advantage of the fact that we can use
>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>> to the signature of a class into the tag, and pull it out again when we
>>>> get notified that the class gets unloaded.
>>>>
>>>> This means we don't need an extra data-structure to keep track of
>>>> classes and signatures, and it also makes the story around locking
>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>> classes needed (as in the current implementation) and no searching of
>>>> table needed (like in my previous attempts).
>>>>
>>>> Please review this new revision:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>
>>>> (Notice that there still appears to be a performance bottleneck with
>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>> to be related to the classTrack.c implementation though, but looks like
>>>> a consequence of getting all those class-unload notifications over the
>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>> buffers.
>>>>
>>>> I am not sure why jdb needs to enable class-unload listener always. A
>>>> simple hack disables it, and performance is brilliant, even when jdb is
>>>> attached:
>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>
>>>> But this is not in the scope of this bug.)
>>>>
>>>> Roman
>>>>
>>>>
>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>
>>>>>
>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Roman,
>>>>>>
>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>
>>>>>> Some comments are below.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>
>>>>>>
>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>> 88 {
>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>> 90 if (currentClassTag == -1) {
>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>> 93 return;
>>>>>> 94 }
>>>>>> Just a question:
>>>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>>       the class tracking if class tracking has not been initialized?
>>>>>>
>>>>>> 70 static jlong currentClassTag;
>>>>>>   I wonder whether a better name would be something like
>>>>>>   lastClassTag or highestClassTag.
>>>>>>
>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>> 100
>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>> 103 klass_ptr = &klass->next;
>>>>>> 104 klass = *klass_ptr;
>>>>>> 105 }
>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>> 108 return;
>>>>>> 109 }
>>>>>>   It seems to me that something is wrong in the condition at L106 above.
>>>>>>   Should it be:
>>>>>>      if (klass == NULL || klass->klass_tag != tag)
>>>>>>
>>>>>>   Otherwise, how can the second check ever work correctly, as the return
>>>>>> will always happen when (klass != NULL)?
>>>>>>
>>>>>>   There are several places in this file with the indent:
>>>>>> 90 if (currentClassTag == -1) {
>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>> 93 return;
>>>>>> 94 }
>>>>>>   ...
>>>>>> 152 if (currentClassTag == -1) {
>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>> 155 return;
>>>>>> 156 }
>>>>>>   ...
>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>> 163 }
>>>>>> 164 if (tag != 0l) {
>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>> 166 return; // Already added
>>>>>> 167 }
>>>>>>   ...
>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>> 282 {
>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>> 284 jvmtiDeallocate(sig);
>>>>>> 285 return JNI_TRUE;
>>>>>> 286 }
>>>>>>   ...
>>>>>> 291 void
>>>>>> 292 classTrack_reset(void)
>>>>>> 293 {
>>>>>> 294 int idx;
>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>> 296
>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>> 298 KlassNode* node = table[idx];
>>>>>> 299 while (node != NULL) {
>>>>>> 300 KlassNode* next = node->next;
>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>> 302 jvmtiDeallocate(node);
>>>>>> 303 node = next;
>>>>>> 304 }
>>>>>> 305 }
>>>>>> 306 jvmtiDeallocate(table);
>>>>>> 307
>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>> 310
>>>>>> 311 currentClassTag = -1;
>>>>>> 312
>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>> 314 trackingEnv = NULL;
>>>>>> 315
>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>
>>>>>> Could you please fix several comments below?
>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>   The comma is not needed.
>>>>>>   Would it be better to replace: klass tags => klass_tag's ?
>>>>>>
>>>>>>
>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>
>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>> remembers it in deletedSignatureBag.
>>>>>>   It would be better to use words like "store" or "record", and
>>>>>>   "Find" should not start with a capital letter:
>>>>>>   Invoke the callback when classes are freed, find and record the
>>>>>>   signature in deletedSignatureBag.
>>>>>>
>>>>>> 96 // Find deleted KlassNode
>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>> 158 /* Check this is not a duplicate */
>>>>>>   Missing dot at the end.
>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>   Conversely, the dot is not needed here, as the comment does not
>>>>>>   start with a capital letter.
>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>   The comment above can be better. Maybe something like:
>>>>>   "At this point, we found the KlassNode matching the klass tag
>>>>>   (and it is linked)."
>>>>>
>>>>>> 113 // Remember the unloaded signature.
>>>>>   Better: Record the signature of the unloaded class and unlink it.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>> Hello all,
>>>>>>>
>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>> more testing and also field-/torture-testing by a customer who is
>>>>>>> happy
>>>>>>> now. :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> Thanks for reviewing!
>>>>>>>>
>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>> disconnect,
>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>
>>>>>>>> Let me know what you think!
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>> Hi Roman,
>>>>>>>>>
>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>
>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 72 /*
>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>> 74 */
>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>> 76
>>>>>>>>> 77 /*
>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be
>>>>>>>>> accessed under
>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>> 80 */
>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>
>>>>>>>>>   The comments contradict each other.
>>>>>>>>>   I guess the lock name at line 79 has to be
>>>>>>>>> deletedSignatureLock
>>>>>>>>> instead of deletedTagLock.
>>>>>>>>>   Also, the comma at the end must be replaced with a dot.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 104 return;
>>>>>>>>> 105 }
>>>>>>>>> 106
>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>> 113 }
>>>>>>>>> 114
>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 118 return;
>>>>>>>>> 119 }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   The code above can be simplified, so that lines 101-105
>>>>>>>>> are not
>>>>>>>>> needed anymore.
>>>>>>>>>   It can be something like this:
>>>>>>>>>
>>>>>>>>> // Scan linked-list.
>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>> klass_ptr = &klass->next;
>>>>>>>>> klass = *klass_ptr;
>>>>>>>>> }
>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>>>>> return;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> It will take more time when I get a chance to look at the rest.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>> lock on
>>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>>> class-tracking is active and return an appropriate result
>>>>>>>>>> (e.g. an empty
>>>>>>>>>> list) when we're not.
>>>>>>>>>>
>>>>>>>>>> Updated webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>
>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a
>>>>>>>>>>> tag, and we
>>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>>> - Prepared classes are kept in a data structure that is a
>>>>>>>>>>> table, with each entry being the head of a linked-list of
>>>>>>>>>>> KlassNode*. The table is indexed by tag % slot-count, and a
>>>>>>>>>>> new KlassNode* is simply prepended.
>>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>>> - When we get notified of unloading a class, we look up the
>>>>>>>>>>> signature of
>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The
>>>>>>>>>>> KlassNode*
>>>>>>>>>>> is then unlinked from the table and deallocated. This is
>>>>>>>>>>> ~O(1) operation
>>>>>>>>>>> too, depending on the depth of the table. In my testcase
>>>>>>>>>>> which hammered
>>>>>>>>>>> the code with class-loads and unloads, I usually see depths
>>>>>>>>>>> of like 2-3,
>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that
>>>>>>>>>>> bag, and
>>>>>>>>>>> allocate a new one.
>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>>> leaking the
>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached
>>>>>>>>>>> and/or
>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>> - I also added locks around data-structure-manipulation (was
>>>>>>>>>>> missing
>>>>>>>>>>> before).
>>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>>> listener gets
>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when
>>>>>>>>>>> attaching a
>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something
>>>>>>>>>>> to improve
>>>>>>>>>>> in the future?
>>>>>>>>>>>
>>>>>>>>>>> In my tests, the performance of class-tracking itself looks
>>>>>>>>>>> really good.
>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the
>>>>>>>>>>> class-unload
>>>>>>>>>>> events. I don't see how this can be helped when the debug
>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>
>>>>>>>>>>> Updated webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>
>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>> the even more
>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing
>>>>>>>>>>>> for now.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,Roman
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>> determine the
>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The current implementation does so by maintaining a table
>>>>>>>>>>>>> of currently
>>>>>>>>>>>>> prepared classes by building that table when classTrack is
>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When
>>>>>>>>>>>>> unloading
>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and
>>>>>>>>>>>>> compared with the
>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new
>>>>>>>>>>>>> table gets
>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned,
>>>>>>>>>>>>> and classes
>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>> That process is
>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>>> is that
>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>>> and build the
>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>> that it's
>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>
>>>>>>>>>>>>> In addition to all that, this process is only activated
>>>>>>>>>>>>> when there's an
>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c.
>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
> 
> 
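
The pointer-in-the-tag idea from the quoted webrev.06 message can be sketched in plain C as follows. This is a minimal illustration under assumptions, not the patch itself: signature_to_tag() and tag_to_signature() are hypothetical helper names, int64_t stands in for jlong, malloc/free stand in for the agent's allocator, and NOT_TAGGED is assumed to be 0 as the thread suggests. In a real agent the tag would presumably be attached with JVMTI SetTag and handed back through the ObjectFree callback; those calls are not shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NOT_TAGGED ((int64_t)0) /* assumed: 0 means "no tag set" */

/*
 * Copy the signature to the heap and encode the pointer as a 64-bit tag.
 * Returns NOT_TAGGED if the allocation fails.
 */
static int64_t signature_to_tag(const char *signature) {
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return NOT_TAGGED;
    }
    memcpy(copy, signature, len);
    /* Go through intptr_t so the pointer-to-integer conversion is well defined. */
    return (int64_t)(intptr_t)copy;
}

/*
 * Decode a tag back into the signature pointer, as an unload callback would.
 * The caller takes ownership of the string and must free it.
 */
static char *tag_to_signature(int64_t tag) {
    assert(tag != NOT_TAGGED);
    return (char *)(intptr_t)tag;
}
```

With this encoding there is no table to search on unload: the callback receives the tag, decodes it, and records the signature, which is why the quoted message describes the approach as O(1).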

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200325/e8194f5c/signature-0001.asc>

From leonid.mesnik at oracle.com  Wed Mar 25 19:01:30 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 12:01:30 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
Message-ID: <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>

Added Ioi, who also proposed new version of startAppVmOpts.

Please find new webrev: 
http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/

I renamed startAppVmOpts/runAppVmOpts to
startAppExactJvmOpts/runAppExactJvmOpts. This should make it very
clear that these methods don't use any of test.java.opts, test.vm.opts.

Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java
mentioned by Igor, and removed the null pointer check as Ioi suggested in
the startApp method.

+ public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
+ startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
+ }

Leonid

On 3/25/20 10:14 AM, Stefan Karlsson wrote:
> On 2020-03-25 17:40, Igor Ignatyev wrote:
>> Hi Leonid,
>>
>> I have briefly looked at the patch, a few comments so far:
>>
>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>   - at L#114, could you please call the static method using the class name
>> (as opposed to using an instance)? Or was it meant to be
>> theApp.runAppVmOpts(vmArgs)?
>>
>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>> - it seems that code indent of startApp(LingeredApp, String[]) isn't 
>> correct
>> - I don't like startAppVmOpts name, but unfortunately don't have a 
>> better suggestion (yet)
>
> I was going to say the same. Jtreg has the concept of "java options" 
> and "vm options". We have had a fair share of bugs and wasted time 
> when tests have been using the "vm options" part (VM_OPTIONS, 
> test.vm.options, etc), and we've been moving away from using that way 
> to pass options. I recently cleaned up some of this with:
>
> 8237111: LingeredApp should be started with getTestJavaOpts
>
> Because of this, I would prefer if we used a name that doesn't include 
> "VmOpts", because it's too alike the other concept. Some suggestions:
>  startAppJavaOptions
>  startAppUsingJavaOptions
>  startAppWithJavaOptions
>  startAppExactJavaOptions
>  startAppJvmOptions
>
> Thanks,
> StefanK
>
>> Thanks,
>> -- Igor
>>
>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>> <leonid.mesnik at oracle.com> wrote:
>>>
>>> Hi
>>>
>>> Could you please review the following fix, which changes LingeredApp to
>>> prepend vm options to java/vm.test.opts when startApp is used and
>>> provides startAppVmOpts to override options completely.
>>>
>>> The intention is to avoid issues like the one in this bug, where test/jtreg
>>> options were ignored by tests. Also, I fixed some tests where the
>>> intention was to append vm options rather than to override them.
>>>
>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>
>>> Leonid
>>>
>

From stefan.karlsson at oracle.com  Wed Mar 25 19:06:57 2020
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 25 Mar 2020 20:06:57 +0100
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
Message-ID: <375a722e-2397-450f-51d0-781b8bdd9ee8@oracle.com>

Thanks for changing the name. Sounds good to me. I leave the full review 
to others.

StefanK

On 2020-03-25 20:01, Leonid Mesnik wrote:
>
> Added Ioi, who also proposed new version of startAppVmOpts.
>
> Please find new webrev: 
> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>
> I renamed startAppVmOpts/runAppVmOpts to
> startAppExactJvmOpts/runAppExactJvmOpts. The new names should make it very
> clear that these methods don't use any of test.java.opts, test.vm.opts.
>
> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java,
> mentioned by Igor, and removed the null pointer check in the startApp
> method, as Ioi suggested.
>
> + public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
> +     startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
> + }
>
> Leonid
>
> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>> Hi Leonid,
>>>
>>> I have briefly looked at the patch, a few comments so far:
>>>
>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>   - at L#114, could you please call the static method using the class name
>>> (as opposed to using an instance)? or was it meant to be
>>> theApp.runAppVmOpts(vmArgs) ?
>>>
>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>> - it seems that code indent of startApp(LingeredApp, String[]) isn't 
>>> correct
>>> - I don't like startAppVmOpts name, but unfortunately don't have a 
>>> better suggestion (yet)
>>
>> I was going to say the same. Jtreg has the concept of "java options" 
>> and "vm options". We have had a fair share of bugs and wasted time 
>> when tests have been using the "vm options" part (VM_OPTIONS, 
>> test.vm.options, etc), and we've been moving away from using that way 
>> to pass options. I recently cleaned up some of this with:
>>
>> 8237111: LingeredApp should be started with getTestJavaOpts
>>
>> Because of this, I would prefer if we used a name that doesn't 
>> include "VmOpts", because it's too alike the other concept. Some 
>> suggestions:
>>  startAppJavaOptions
>>  startAppUsingJavaOptions
>>  startAppWithJavaOptions
>>  startAppExactJavaOptions
>>  startAppJvmOptions
>>
>> Thanks,
>> StefanK
>>
>>> Thanks,
>>> -- Igor
>>>
>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>>> <leonid.mesnik at oracle.com> wrote:
>>>>
>>>> Hi
>>>>
>>>> Could you please review the following fix, which changes LingeredApp to
>>>> prepend vm options to java/vm.test.opts when startApp is used and
>>>> provides startAppVmOpts to override options completely.
>>>>
>>>> The intention is to avoid issues like the one in this bug, where test/jtreg
>>>> options were ignored by tests. Also, I fixed some tests where the
>>>> intention was to append vm options rather than to override them.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>
>>>> Leonid
>>>>
>>


From magnus.ihse.bursie at oracle.com  Wed Mar 25 19:29:53 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Wed, 25 Mar 2020 20:29:53 +0100
Subject: RFR: JDK-8241618 Fix unchecked warning for jdk.hotspot.agent
Message-ID: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com>

With the recent fixes in JDK-8241310, JDK-8237746 and JDK-8241073, and 
the upcoming fixes to remove the deprecated nashorn and jdk.rmi, the JDK 
build is very close to producing no warnings when compiling the Java 
classes.

The one remaining sinner is jdk.hotspot.agent. Most of the warnings here 
are turned off, but unchecked and deprecation cannot be completely silenced.

Since the poor agent does not seem to receive much love nowadays, I took 
it upon myself to fix these warnings, so we can finally get a quiet build.

I started to address the unchecked warnings. Unfortunately, this was a 
much bigger task than I anticipated. I had to generify most of the 
module. On the plus side, the code is so much better now. And most of 
the changes were trivial, just tedious.
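
For readers less familiar with this kind of cleanup, here is a minimal
Java sketch of what generifying typically looks like. It is not taken
from the webrev: the class GenerifySketch and both methods are
hypothetical, and the annotation on the raw variant is there only so the
sketch compiles quietly.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; none of these names exist in jdk.hotspot.agent. */
class GenerifySketch {

    /**
     * Pre-generics style. Without the annotation, javac -Xlint:unchecked
     * reports an unchecked call on result.add(value).
     */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List collectRaw(String[] values) {
        List result = new ArrayList();
        for (String value : values) {
            result.add(value);
        }
        return result;
    }

    /** Generified version: the element type is checked by the compiler. */
    static List<String> collect(String[] values) {
        List<String> result = new ArrayList<>();
        for (String value : values) {
            result.add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = {"ConstMethod", "MethodData"};
        System.out.println(collectRaw(names));
        System.out.println(collect(names));
    }
}
```

Both methods behave the same at run time; the difference is only in what
the compiler can verify, which is why most such changes are mechanical.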

There are a few places where I'm not entirely happy with the current
solution, and that at least merits some discussion.

I have resorted to @SuppressWarnings in four classes: ciMethodData,
MethodData, TableModelComparator and VirtualBaseConstructor. All of them
have in common that they are doing slightly fishy things with classes in
collections. I'm not entirely sure they are bug-free, but this patch
leaves the behavior untouched. I made some efforts to sort out the logic,
but it turned out to be too hairy for me to fix, and it will probably
require more substantial changes to the workings of the code.

To make the code valid, I have moved ConstMethod to extend Metadata 
instead of VMObject. My understanding is that this is benign (and likely 
intended), but I really need for someone who knows the code to confirm 
this. I have also added a FIXME to signal this. I'll remove the FIXME as 
soon as I get confirmation that this is OK.
(The reason for this is the following piece of code from Metadata.java: 
metadataConstructor.addMapping("ConstMethod", ConstMethod.class))

In ObjectListPanel, there is some code that screams "dead" with this 
change. I added a FIXME to point this out:
    for (Iterator<Oop> iter = elements.iterator(); iter.hasNext(); ) {
      if (iter.next() instanceof Array) {
        // FIXME: Does not seem possible to happen
        hasArrays = true;
        return;
      }
It seems that if you start pulling this thread, even more dead code will 
unravel, so I'm not so eager to touch this in the current patch. But I 
can remove the FIXME if you want.

My first iteration of this patch tried to generify the IntervalTree and 
related class hierarchy. However, this turned out to be impossible due 
to some weird usage in AnnotatedMemoryPanel, where there seemed to be 
confusion as to whether the tree stored Annotations or Addresses. I'm 
not entirely convinced the code is correct, it certainly looked and 
smelled very fishy. However, I reverted these changes since I could not 
get them to work due to this, and it was not needed for the goal of just 
getting rid of the warning.

Finally, I have done no testing apart from verifying that it builds.
Please advise on suitable tests to run.

Bug: https://bugs.openjdk.java.net/browse/JDK-8241618
WebRev: 
http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.01

/Magnus

From chris.plummer at oracle.com  Wed Mar 25 19:36:34 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Mar 2020 12:36:34 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
 <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
Message-ID: <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200325/500e9622/attachment-0001.htm>

From igor.ignatyev at oracle.com  Wed Mar 25 19:46:15 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 25 Mar 2020 12:46:15 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
Message-ID: <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>

Hi Leonid,

This is not related to your patch (but is made somewhat more obvious by it): it seems that all (or at least almost all) of the tests which use LingeredApp should be run in "driver" mode, as they just orchestrate the execution of other JVMs, so running them w/ main (let alone main/othervm) just wastes time. test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for example, will now be executed w/ Xcomp, which will make it very slow for no reason. Since you already got your hands dirty w/ these tests, could you please file an RFE to sort this out and list all the affected tests there?

re: the patch, could you please update the ClhsdbJstack.java test not to be run w/ Xcomp and follow the same pattern you used in other tests (e.g. ClhsdbScanOops)? Other than that it looks fine to me. I, however, wouldn't be able to tell if all svc tests continue to do what they were supposed to, so I'd prefer for someone from the svc team to chime in.

Thanks,
-- Igor

> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
> 
> Added Ioi, who also proposed new version of startAppVmOpts.
> 
> Please find new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
> I renamed startAppVmOpts/runAppVmOpts to startAppExactJvmOpts/runAppExactJvmOpts. The new names should make it very clear that these methods don't use any of test.java.opts, test.vm.opts.
> 
> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java, mentioned by Igor, and removed the null pointer check in the startApp method, as Ioi suggested.
> 
> +    public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
> +        startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
> +    }
> 
> Leonid
> 
> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>> On 2020-03-25 17:40, Igor Ignatyev wrote: 
>>> Hi Leonid, 
>>> 
>>> I have briefly looked at the patch, a few comments so far: 
>>> 
>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: 
>>>   - at L#114, could you please call static method using class name (as the opposite of using instance)? or was it meant to be theApp.runAppVmOpts(vmArgs) ? 
>>> 
>>> test/lib/jdk/test/lib/apps/LingeredApp.java: 
>>> - it seems that code indent of startApp(LingeredApp, String[]) isn't correct 
>>> - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet) 
>> 
>> I was going to say the same. Jtreg has the concept of "java options" and "vm options". We have had a fair share of bugs and wasted time when tests have been using the "vm options" part (VM_OPTIONS, test.vm.options, etc), and we've been moving away from using that way to pass options. I recently cleaned up some of this with: 
>> 
>> 8237111: LingeredApp should be started with getTestJavaOpts 
>> 
>> Because of this, I would prefer if we used a name that doesn't include "VmOpts", because it's too alike the other concept. Some suggestions: 
>>  startAppJavaOptions 
>>  startAppUsingJavaOptions 
>>  startAppWithJavaOptions 
>>  startAppExactJavaOptions 
>>  startAppJvmOptions 
>> 
>> Thanks, 
>> StefanK 
>> 
>>> Thanks, 
>>> -- Igor 
>>> 
>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote: 
>>>> 
>>>> Hi 
>>>> 
>>>> Could you please review the following fix, which changes LingeredApp to prepend vm options to java/vm.test.opts when startApp is used and provides startAppVmOpts to override options completely. 
>>>> 
>>>> The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. Also, I fixed some tests where the intention was to append vm options rather than to override them. 
>>>> 
>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>> 
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>> 
>>>> Leonid 
>>>> 
>> 


From chris.plummer at oracle.com  Wed Mar 25 19:52:02 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Mar 2020 12:52:02 -0700
Subject: RFR: JDK-8241618 Fix unchecked warning for jdk.hotspot.agent
In-Reply-To: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com>
References: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com>
Message-ID: <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com>

Hi Magnus,

I haven't looked at the changes yet, other than to see that there are many
files touched, but after reading below (and only partly understanding,
since I don't know this area well), I was wondering if this issue
wouldn't be better served with multiple passes made to fix the warnings.
Start with a straightforward one where you are maybe only making one or
two types of changes, but that affect a large number of files and don't
cascade into other more complicated changes. This will get a lot of the
noise out of the way, and then we can focus on some of the harder issues
you bring up below.

As for testing, I think the following list will capture all of them, but 
can't say for sure:

open/test/hotspot/jtreg/serviceability/sa
open/test/hotspot/jtreg/resourcehogs/serviceability/sa
open/test/jdk/sun/tools/jhsdb
open/test/jdk/sun/tools/jstack
open/test/jdk/sun/tools/jmap
open/test/hotspot/jtreg/gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java
open/test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java 
open/test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java

Chris

On 3/25/20 12:29 PM, Magnus Ihse Bursie wrote:
> With the recent fixes in JDK-8241310, JDK-8237746 and JDK-8241073, and 
> the upcoming fixes to remove the deprecated nashorn and jdk.rmi, the 
> JDK build is very close to producing no warnings when compiling the 
> Java classes.
>
> The one remaining sinner is jdk.hotspot.agent. Most of the warnings 
> here are turned off, but unchecked and deprecation cannot be 
> completely silenced.
>
> Since the poor agent does not seem to receive much love nowadays, I 
> took it upon myself to fix these warnings, so we can finally get a 
> quiet build.
>
> I started to address the unchecked warnings. Unfortunately, this was a 
> much bigger task than I anticipated. I had to generify most of the 
> module. On the plus side, the code is so much better now. And most of 
> the changes were trivial, just tedious.
>
> There are a few places where I'm not entirely happy with the current
> solution, and that at least merits some discussion.
>
> I have resorted to @SuppressWarnings in four classes: ciMethodData,
> MethodData, TableModelComparator and VirtualBaseConstructor. All of
> them have in common that they are doing slightly fishy things with
> classes in collections. I'm not entirely sure they are bug-free, but
> this patch leaves the behavior untouched. I made some efforts to sort
> out the logic, but it turned out to be too hairy for me to fix, and it
> will probably require more substantial changes to the workings of the
> code.
>
> To make the code valid, I have moved ConstMethod to extend Metadata 
> instead of VMObject. My understanding is that this is benign (and 
> likely intended), but I really need for someone who knows the code to 
> confirm this. I have also added a FIXME to signal this. I'll remove 
> the FIXME as soon as I get confirmation that this is OK.
> (The reason for this is the following piece of code from 
> Metadata.java: metadataConstructor.addMapping("ConstMethod", 
> ConstMethod.class))
>
> In ObjectListPanel, there is some code that screams "dead" with this 
> change. I added a FIXME to point this out:
>     for (Iterator<Oop> iter = elements.iterator(); iter.hasNext(); ) {
>       if (iter.next() instanceof Array) {
>         // FIXME: Does not seem possible to happen
>         hasArrays = true;
>         return;
>       }
> It seems that if you start pulling this thread, even more dead code 
> will unravel, so I'm not so eager to touch this in the current patch. 
> But I can remove the FIXME if you want.
>
> My first iteration of this patch tried to generify the IntervalTree 
> and related class hierarchy. However, this turned out to be impossible 
> due to some weird usage in AnnotatedMemoryPanel, where there seemed to 
> be confusion as to whether the tree stored Annotations or Addresses. 
> I'm not entirely convinced the code is correct, it certainly looked 
> and smelled very fishy. However, I reverted these changes since I 
> could not get them to work due to this, and it was not needed for the 
> goal of just getting rid of the warning.
>
> Finally, I have done no testing apart from verifying that it builds.
> Please advise on suitable tests to run.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8241618
> WebRev: 
> http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.01
>
> /Magnus


From rkennke at redhat.com  Wed Mar 25 19:59:10 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 25 Mar 2020 20:59:10 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
 <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
 <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
Message-ID: <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>

Hi Chris,

Apparently we can get into classTrack_reset() before calling activate(),
and we're seeing a null deletedSignatureBag. A simple NULL-check around
the cleaning routine fixes the problem for me.

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/

Should I post another submit-repo job with that fix?

Thanks,
Roman


> Hi Roman,
> 
> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
> 
> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],  sp=0x00007fbb791f8af0,  free space=1022k
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
> C  [libjdwp.so+0x25700]  acceptThread+0x80
> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
> 
> 
> This happened during a test task run of open/test/jdk/:jdk_jdi. There
> doesn't seem to be anything magic on the command line that might be
> triggering. Pretty much I see it with all the various VM configs we test.
> 
> I'm also seeing crashes in the following tests, but not as often:
> 
> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
> 
> thanks,
> 
> Chris
> 
> 
> On 3/25/20 11:37 AM, Roman Kennke wrote:
>> Hi Chris,
>>
>>> Regarding the new assert:
>>>
>>>     if (gdata && gdata->assertOn) {
>>>         // Check this is not already tagged.
>>>         jlong tag;
>>>         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>         if (error != JVMTI_ERROR_NONE) {
>>>             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>         }
>>>         JDI_ASSERT(tag == NOT_TAGGED);
>>>     }
>>>
>>> I think you should remove the gdata check. gdata should never be NULL
>>> when you get to this code. If it is ever NULL then there's a bug, and
>>> the check will hide the bug.
>> Ok, will remove this.
>>
>>> Regarding testing, after you do the submit repo testing let me know the
>>> jobID and I'll do additional testing on it.
>> I did the submit repo earlier today, and it came back green:
>>
>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>
>> Thanks,
>> Roman
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>> Hi Sergei,
>>>>
>>>>> The fix looks pretty clean now.
>>>>> I also like new name of the lock.:)
>>>> Thank you!
>>>>
>>>>> Just one comment below.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>
>>>>>
>>>>>     if (tag != 0l) {
>>>>>         return; // Already added
>>>>>     }
>>>>>
>>>>>   It is better to use a named constant or macro instead.
>>>>>   Also, it'd be nice to add a short comment about what this value is.
>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>> assert. I also made a constant for the value 0, which should be pretty
>>>> much self-explaining.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>
>>>>> How do you test the fix?
>>>> I am using a manual test that is provided in this bug report:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>
>>>> "Script to compare performance of GC with and without debugger, when
>>>> many classes are loaded and classes are being unloaded":
>>>>
>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>
>>>> I am also using this test and manually attach/detach jdb a couple of
>>>> times in a row to check that disconnecting and reconnecting works well
>>>> (this tended to deadlock or crash with an earlier version of the patch,
>>>> and is now looking good).
>>>>
>>>> I am also running tier1 and tier2 tests locally, and as soon as we all
>>>> agree that the fix is reasonable, I will push it to the submit repo. I
>>>> am not sure if any of those tests actually exercise that code, though.
>>>> Let me know if you want me to run any specific tests.
>>>>
>>>> Thank you,
>>>> Roman
>>>>
>>>>
>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>> I believe I came up with a much simpler solution that also solves the
>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>
>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>> to the signature of a class into the tag, and pull it out again when we
>>>>>> get notified that the class gets unloaded.
>>>>>>
>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>> classes and signatures, and it also makes the story around locking
>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>>>> classes needed (as in the current implementation) and no searching of
>>>>>> table needed (like in my previous attempts).
>>>>>>
>>>>>> Please review this new revision:
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>
>>>>>> (Notice that there still appears to be a performance bottleneck with
>>>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>>>> to be related to the classTrack.c implementation though, but looks like
>>>>>> a consequence of getting all those class-unload notifications over the
>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>> buffers.)
>>>>>>
>>>>>> I am not sure why jdb needs to enable class-unload listener always. A
>>>>>> simple hack disables it, and performance is brilliant, even when jdb is
>>>>>> attached:
>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>
>>>>>> But this is not in the scope of this bug.
>>>>>>
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>
>>>>>>>
>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>
>>>>>>>> Some comments are below.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>
>>>>>>>>
>>>>>>>>   87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>   88 {
>>>>>>>>   89     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>   90     if (currentClassTag == -1) {
>>>>>>>>   91         // Class tracking not initialized, nobody's interested
>>>>>>>>   92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>   93         return;
>>>>>>>>   94     }
>>>>>>>> Just a question:
>>>>>>>>    Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>>>>        the class tracking if class tracking has not been initialized?
>>>>>>>>
>>>>>>>>   70 static jlong currentClassTag;
>>>>>>>> I'm thinking the name would be better as something like:
>>>>>>>> lastClassTag or highestClassTag.
>>>>>>>>
>>>>>>>>   99     KlassNode* klass = *klass_ptr;
>>>>>>>>  100
>>>>>>>>  102     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>  103         klass_ptr = &klass->next;
>>>>>>>>  104         klass = *klass_ptr;
>>>>>>>>  105     }
>>>>>>>>  106     if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>  107         debugMonitorExit(deletedSignatureLock);
>>>>>>>>  108         return;
>>>>>>>>  109     }
>>>>>>>>   It seems to me, something is wrong in the condition at L106 above.
>>>>>>>>   Should it be:
>>>>>>>>      if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>
>>>>>>>>   Otherwise, how can the second check ever work correctly, as the return
>>>>>>>>   will always happen when (klass != NULL)?
>>>>>>>>
>>>>>>>>   There are several places in this file with the indent:
>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 93 return;
>>>>>>>>    94     }
>>>>>>>>   ...
>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 155 return;
>>>>>>>>   156     }
>>>>>>>>   ...
>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>   163     }
>>>>>>>> 164 if (tag != 0l) {
>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>> 166 return; // Already added
>>>>>>>>   167     }
>>>>>>>>   ...
>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>> 282 {
>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>   286 }
>>>>>>>>   ...
>>>>>>>>   291 void
>>>>>>>>   292 classTrack_reset(void)
>>>>>>>>   293 {
>>>>>>>> 294 int idx;
>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>> 296
>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>> 299 while (node != NULL) {
>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>> 303 node = next;
>>>>>>>> 304 }
>>>>>>>> 305 }
>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>> 307
>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>> 310
>>>>>>>> 311 currentClassTag = -1;
>>>>>>>> 312
>>>>>>>> 313
>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>> 315
>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>
>>>>>>>> Could you, please, fix several comments below?
>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for
>>>>>>>> class-unloads
>>>>>>>> ??The comma is not needed.
>>>>>>>> ??Would it better to replace: klass tags => klass_tag's ?
>>>>>>>>
>>>>>>>>
>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>>>>>>>> consistent
>>>>>>>> ??Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>
>>>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>>>> remembers it in deletedSignatureBag.
>>>>>>>>   It would be better to use words like "store" or "record", and
>>>>>>>>   "Find" should not start with a capital letter:
>>>>>>>>   Invoke the callback when classes are freed, find and record the
>>>>>>>>   signature in deletedSignatureBag.
>>>>>>>>
>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>   A period is missing at the end of each of these comments.
>>>>>>>>
>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>   Conversely, the period is not needed here, as the comment does not
>>>>>>>>   start with a capital letter.
>>>>>>>>
>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>   The comment above can be better. Maybe something like:
>>>>>>>   "At this point, we found the KlassNode matching the klass tag
>>>>>>>   (and it is linked)."
>>>>>>>
>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>   Better: Record the signature of the unloaded class and unlink it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>>>> more testing and also field-/torture-testing by a customer who is
>>>>>>>>> happy
>>>>>>>>> now. :-)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi Serguei,
>>>>>>>>>>
>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>
>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>>>> disconnect, namely moving the setup of the trackingEnv and
>>>>>>>>>> deletedSignatureBag to _activate() to ensure we have those
>>>>>>>>>> structures after re-connect.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>
>>>>>>>>>> Let me know what you think!
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>>>>
>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 72 /*
>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>> 74 */
>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>> 76
>>>>>>>>>>> 77 /*
>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>> 80 */
>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>
>>>>>>>>>>>   The comments contradict each other.
>>>>>>>>>>>   I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>   instead of deletedTagLock.
>>>>>>>>>>>   Also, the comma at the end must be replaced with a period.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 104 return;
>>>>>>>>>>> 105 }
>>>>>>>>>>> 106
>>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>>> 113 }
>>>>>>>>>>> 114
>>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 118 return;
>>>>>>>>>>> 119 }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>   The code above can be simplified so that lines 101-105 are no
>>>>>>>>>>>   longer needed.
>>>>>>>>>>>   It can be something like this:
>>>>>>>>>>>
>>>>>>>>>>> // Scan linked-list.
>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>     klass_ptr = &klass->next;
>>>>>>>>>>>     klass = *klass_ptr;
>>>>>>>>>>> }
>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>     debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>     return;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> It will take more time before I get a chance to look at the rest.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>> lock on
>>>>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>>>>> class-tracking is active and return an appropriate result
>>>>>>>>>>>> (e.g. an empty
>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>
>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>>>>>> set up a listener to get notified when it is unloaded.
>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>>>>>>>> indexed by tag % slot-count, and we then simply prepend the new
>>>>>>>>>>>>> KlassNode*. This is an O(1) operation.
>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the
>>>>>>>>>>>>> signature of
>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The
>>>>>>>>>>>>> KlassNode*
>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is
>>>>>>>>>>>>> ~O(1) operation
>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase
>>>>>>>>>>>>> which hammered
>>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths
>>>>>>>>>>>>> of like 2-3,
>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that
>>>>>>>>>>>>> bag, and
>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>>>>> leaking the
>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached
>>>>>>>>>>>>> and/or
>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was
>>>>>>>>>>>>> missing
>>>>>>>>>>>>> before).
>>>>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>>>>> listener gets
>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when
>>>>>>>>>>>>> attaching a
>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something
>>>>>>>>>>>>> to improve
>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>
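For readers following the thread, the scheme in the list above can be
sketched roughly as below. This is only an illustration in Java; the
real code is C in classTrack.c. ClassTrackSketch, SLOT_COUNT and its
value, addPreparedClass, onObjectFree and the List standing in for the
bag are all assumed names, not taken from the webrev, and the single
lock is modelled with synchronized methods.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only; all names here are hypothetical, not taken from classTrack.c. */
final class ClassTrackSketch {
    static final int SLOT_COUNT = 263; // assumed table size

    /** One prepared class, chained within its table slot. */
    private static final class KlassNode {
        final long tag;
        final String signature;
        KlassNode next;

        KlassNode(long tag, String signature, KlassNode next) {
            this.tag = tag;
            this.signature = signature;
            this.next = next;
        }
    }

    private final KlassNode[] table = new KlassNode[SLOT_COUNT];
    private List<String> deletedSignatures = new ArrayList<>(); // stands in for the bag

    private static int slotOf(long tag) {
        return (int) Math.floorMod(tag, (long) SLOT_COUNT);
    }

    /** Class prepared: prepend a node to the slot's chain, O(1). */
    synchronized void addPreparedClass(long tag, String signature) {
        int slot = slotOf(tag);
        table[slot] = new KlassNode(tag, signature, table[slot]);
    }

    /** Unload notification: record the signature, then unlink the node. */
    synchronized void onObjectFree(long tag) {
        int slot = slotOf(tag);
        KlassNode prev = null;
        KlassNode node = table[slot];
        while (node != null && node.tag != tag) {
            prev = node;
            node = node.next;
        }
        if (node == null) {
            return; // tag not found - ignore
        }
        deletedSignatures.add(node.signature);
        if (prev == null) {
            table[slot] = node.next;
        } else {
            prev.next = node.next;
        }
    }

    /** Hand out the recorded signatures and start a fresh list. */
    synchronized List<String> processUnloads() {
        List<String> unloaded = deletedSignatures;
        deletedSignatures = new ArrayList<>();
        return unloaded;
    }
}
```

The unload path walks only one slot's chain, which is why its cost
depends on the chain depth rather than on the total class count.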
>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks
>>>>>>>>>>>>> really good.
>>>>>>>>>>>>> The bottleneck now is clearly the actual synthesizing of the
>>>>>>>>>>>>> class-unload events. I don't see how this can be helped when the
>>>>>>>>>>>>> debug agent asks for it?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing
>>>>>>>>>>>>>> for now.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The current implementation does so by maintaining a table of
>>>>>>>>>>>>>>> currently prepared classes: it builds that table when classTrack
>>>>>>>>>>>>>>> is initialized, and then adds new classes whenever a class gets
>>>>>>>>>>>>>>> loaded. When unloading occurs, that cache is rebuilt into a new
>>>>>>>>>>>>>>> table and compared with the old table, and whatever is in the old
>>>>>>>>>>>>>>> table but not in the new one gets returned. The problem is that
>>>>>>>>>>>>>>> when GCs happen frequently and/or many classes get
>>>>>>>>>>>>>>> loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The new implementation keeps a linked list of prepared classes,
>>>>>>>>>>>>>>> and also tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>> Whenever an unload/GC occurs, the list of prepared classes is
>>>>>>>>>>>>>>> scanned, and classes that are also in the deletedTagBag are
>>>>>>>>>>>>>>> unlinked (thus maintaining the prepared-classes list) and their
>>>>>>>>>>>>>>> signatures are put in the list that gets returned.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In addition to all that, this process is only activated
>>>>>>>>>>>>>>> when there's an
>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c.
>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
> 


From serguei.spitsyn at oracle.com  Wed Mar 25 20:01:23 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Mar 2020 13:01:23 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <202C5C21-BA69-4ACF-9421-A9B5D6704C8C@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
 <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
 <c00b118a-8619-3984-10d7-63134d7210a0@oss.nttdata.com>
 <FBCAD683-2A99-415D-8926-8AEF560EE55A@oracle.com>
 <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com>
 <202C5C21-BA69-4ACF-9421-A9B5D6704C8C@oracle.com>
Message-ID: <3fe390e9-39d1-c547-9480-fa1962cef0d8@oracle.com>

Hi Daniil,

On 3/24/20 10:00, Daniil Titov wrote:
> Hi Serguei,
>
>>     It looks like you removed the last call site of DebugServer.main.
> Yes. It is correct.
>
>>     Do we need to remove the DebugServer.java as well?
> I was considering this, but since it is a public class I think it needs to be deprecated first. I also think that it would be better to do it in a separate issue
> since a CSR for deprecation needs to be filed for that. If you agree, I will create a new issue for that.

I'm okay with separating this.

Thanks,
Serguei

>
> Thanks,
> Daniil
>
>
> On 3/23/20, 11:56 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
>
>      Hi Daniil,
>      
>      It looks pretty good in general.
>      
>      It looks like you removed the last call site of DebugServer.main.
>      Do we need to remove the DebugServer.java as well?
>      
>      Thanks,
>      Serguei
>      
>      
>      On 3/22/20 15:29, Daniil Titov wrote:
>      > Hi Yasumasa, Serguei and Alex,
>      >
>      > Please review a new version of the webrev that merges SADebugDTest.java  with changes  done in  [2].
>      >
>      > Also the CSR [3] and the help message for the debug server in SALauncher.java were updated to specify that the '--hostname'
>      > option can be a hostname or an IPv4/IPv6 address.
>      >
>      >   >  Ok, but I think it might be simpler with TestLibrary.
>      >   >   For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
>      >
>      > TestLibrary::getUnusedRandomPort() doesn't allow specifying which ports are reserved, and it uses a hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, the test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be used directly in test/hotspot/jtreg/serviceability/* tests (it doesn't compile).
>      >
>      > Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int... reservedPorts) from SADebugDTest.java to jdk.test.lib.Utils in /test/lib.
>      >
>      > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >
>      > [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/
>      > [2] https://bugs.openjdk.java.net/browse/JDK-8238268
>      > [3] https://bugs.openjdk.java.net/browse/JDK-8239831
>      >
>      > Thank you,
>      > Daniil
>      >
>      > On 3/13/20, 7:23 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >
>      >      Hi Daniil,
>      >
>      >      On 2020/03/14 7:05, Daniil Titov wrote:
>      >      > Hi Yasumasa, Serguei and Alex,
>      >      >
>      >      > Please review a new version of the webrev that includes the changes Yasumasa suggested.
>      >      >
>      >      >> Shutdown hook is already registered in c'tor of HotSpotAgent.
>      >      >>     It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
>      >      >
>      >      > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
>      >      > the shutdown hook for the remote server that is added in SALauncher. I changed it to use a lambda expression.
>      >      >
>      >      > public HotSpotAgent() {
>      >      >     // for non-server add shutdown hook to clean-up debugger in case
>      >      >     // of forced exit. For remote server, shutdown hook is added by
>      >      >     // DebugServer.
>      >      >     Runtime.getRuntime().addShutdownHook(new java.lang.Thread(
>      >      >     new Runnable() {
>      >      >         public void run() {
>      >      >             synchronized (HotSpotAgent.this) {
>      >      >                 if (!isServer) {
>      >      >                     detach();
>      >      >                 }
>      >      >             }
>      >      >         }
>      >      >     }));
>      >      > }
>      >
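As a rough illustration of the lambda form discussed here, a shutdown
hook for the server case might be registered along these lines. This is
only a sketch under assumptions: ShutdownHookSketch, newShutdownHook,
register and the thread name are made-up names, and shutdownAction
stands in for whatever the launcher uses to stop the server; none of it
is taken from the webrev.

```java
/** Illustration only; class, method and thread names are hypothetical. */
final class ShutdownHookSketch {
    /** Wraps the given action in a not-yet-started thread usable as a hook. */
    static Thread newShutdownHook(Runnable shutdownAction) {
        return new Thread(shutdownAction, "sa-debug-server-shutdown");
    }

    /** Registers the hook with the runtime and returns it so it can be removed. */
    static Thread register(Runnable shutdownAction) {
        Thread hook = newShutdownHook(shutdownAction);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

A caller could then write something like
ShutdownHookSketch.register(() -> agent.shutdownServer()); where agent
is whatever object owns the server, instead of spelling out an anonymous
Runnable.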
>      >      I missed it, thanks!
>      >
>      >
>      >      >>>     Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains
>      >      >>> `exclusiveAccess.dirs=.` to avoid concurrent execution
>      >      > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests.  Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays.
>      >
>      >      Ok, but I think it might be simpler with TestLibrary.
>      >      For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
>      >
>      >
>      >      Thanks,
>      >
>      >      Yasumasa
>      >
>      >
>      >      > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >
>      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/
>      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >
>      >      > Thank you,
>      >      > Daniil
>      >      >
>      >      > On 3/6/20, 6:15 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >      >
>      >      >      Hi Daniil,
>      >      >
>      >      >      On 2020/03/07 3:38, Daniil Titov wrote:
>      >      >      > Hi Yasumasa,
>      >      >      >
>      >      >      >   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
>      >      >      > I think that having a piece of code that invokes a method named "buildAttachArgs" with a copy of the argument map just for its side effect (it throws an exception if the parameters are incorrect) and ignores its return value might look confusing. Thus, I found it more appropriate to wrap it inside a method with a more relevant name.
>      >      >
>      >      >      Ok, but I would prefer to leave a comment on it.
>      >      >
>      >      >
>      >      >      >   > SADebugDTest
>      >      >      >   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
>      >      >      > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final.
>      >      >      > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array.
>      >      >
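To illustrate the point about captured locals: a lambda can only capture
local variables that are final or effectively final, so a flag that the
lambda needs to set has to live inside some holder object. The sketch
below shows the one-element array next to an AtomicBoolean as one
possible wrapper. LambdaCaptureSketch and both method names are made up
for this example and are not the test's actual code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustration only; names are hypothetical. */
final class LambdaCaptureSketch {
    /** One-element array: the reference is effectively final, the slot is mutable. */
    static boolean sawMarkerArray(List<String> lines, String marker) {
        final boolean[] found = new boolean[1];
        lines.forEach(line -> {
            if (line.contains(marker)) {
                found[0] = true;
            }
        });
        return found[0];
    }

    /** Same idea with a library wrapper instead of an array. */
    static boolean sawMarkerAtomic(List<String> lines, String marker) {
        AtomicBoolean found = new AtomicBoolean();
        lines.forEach(line -> {
            if (line.contains(marker)) {
                found.set(true);
            }
        });
        return found.get();
    }
}
```

Both variants behave the same here; the array is simply the shorter of
the two to write.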
>      >      >      Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
>      >      >      If you drop this error check, the test code becomes simpler.
>      >      >
>      >      >
>      >      >      > I will include your other suggestion in the new version of the webrev.
>      >      >
>      >      >      Sorry, I have one more comment:
>      >      >
>      >      >      >           - A shutdown hook is a very good idea. You can implement it more simply if you use a lambda expression.
>      >      >
>      >      >      Shutdown hook is already registered in c'tor of HotSpotAgent.
>      >      >      It works the same as shutdownServer(), so I think the shutdown hook at SALauncher is not needed.
>      >      >
>      >      >
>      >      >      Thanks,
>      >      >
>      >      >      Yasumasa
>      >      >
>      >      >
>      >      >      > Thanks!
>      >      >      > Daniil
>      >      >      >
>      >      >      > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >      >      >
>      >      >      >      Hi Daniil,
>      >      >      >
>      >      >      >
>      >      >      >      - SALauncher.java
>      >      >      >           - Is checkBasicOptions() needed? I think you can remove this method and embed it in the caller.
>      >      >      >           - I think registryPort should be checked with Integer.parseInt() like the others (rmiPort and pid) rather than with a regex.
>      >      >      >           - A shutdown hook is a very good idea. You can implement it more simply if you use a lambda expression.
>      >      >      >
>      >      >      >      - SADebugDTest.java
>      >      >      >           - Please add bug ID to @bug.
>      >      >      >           - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
>      >      >      >
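As an illustration of the Integer.parseInt() suggestion in the
SALauncher list above, a port option could be validated roughly like
this. It is a sketch only: PortOptionSketch and parsePort are made-up
names, and the 1..65535 range check is an assumption of this example,
not something stated in the thread.

```java
/** Illustration only; class and method names are hypothetical. */
final class PortOptionSketch {
    /** Parses a port option value, rejecting non-numeric and out-of-range input. */
    static int parsePort(String optionName, String value) {
        final int port;
        try {
            port = Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "Invalid " + optionName + ": " + value, e);
        }
        if (port < 1 || port > 65535) { // assumed valid range
            throw new IllegalArgumentException(
                    optionName + " out of range: " + port);
        }
        return port;
    }
}
```

Compared to a regex, this gives the numeric value and the error handling
in one step.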
>      >      >      >
>      >      >      >      Thanks,
>      >      >      >
>      >      >      >      Yasumasa
>      >      >      >
>      >      >      >
>      >      >      >      On 2020/03/06 10:15, Daniil Titov wrote:
>      >      >      >      > Hi Yasumasa, Serguei and Alex,
>      >      >      >      >
>      >      >      >      > Please review a new version of the fix [1] that addresses your comments. In addition to the RMI connector
>      >      >      >      > port option, the new version introduces two more options to specify the RMI registry port and the RMI connector host name. Currently, these
>      >      >      >      > last two settings can be specified using system properties, but system properties have the following disadvantages
>      >      >      >      > compared to command line options:
>      >      >      >      >     -  It's hard to know about them: they are not listed in the tool's help.
>      >      >      >      >     -  They have long names that are hard to remember.
>      >      >      >      >     -  It is easy to mistype them on the command line, and you will not get any warning about it.
>      >      >      >      >
>      >      >      >      > The CSR [2] was also updated and needs to be reviewed.
>      >      >      >      >
>      >      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      >      >      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >      >      >
>      >      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
>      >      >      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      >      >      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >      >      >
>      >      >      >      > Thank you,
>      >      >      >      > Daniil
>      >      >      >      >
>      >      >      >      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
>      >      >      >      >
>      >      >      >      >      Hi Daniil,
>      >      >      >      >
>      >      >      >      >         - SALauncher::buildAttachArgs not only builds arguments but also checks their consistency.
>      >      >      >      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
>      >      >      >      >
>      >      >      >      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
>      >      >      >      >           But you can use the same port number as the RMI registry (1099).
>      >      >      >      >           It is the same as the relation between jmxremote.port and jmxremote.rmi.port.
>      >      >      >      >
>      >      >      >      >
>      >      >      >      >      Thanks,
>      >      >      >      >
>      >      >      >      >      Yasumasa
>      >      >      >      >
>      >      >      >      >
>      >      >      >      >      On 2020/02/24 13:21, Daniil Titov wrote:
>      >      >      >      >      > Please review a change that adds a new command line option to the jhsdb tool for the debugd mode to specify an RMI connector port.
>      >      >      >      >      > Currently a random port is used, which prevents the debug server from being used behind a firewall or in a container.
>      >      >      >      >      >
>      >      >      >      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
>      >      >      >      >      >
>      >      >      >      >      > Man pages for jhsdb will be updated in a separate issue.
>      >      >      >      >      >
>      >      >      >      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
>      >      >      >      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
>      >      >      >      >      >
>      >      >      >      >      >         // delegate to the actual SA debug server.
>      >      >      >      >      >         DebugServer.main(newArgArray.toArray(new String[0]));
>      >      >      >      >      >
>      >      >      >      >      > However, sun.jvm.hotspot.DebugServer doesn't support named options, which makes it hard to add new options to the tool efficiently.
>      >      >      >      >      > I found it more suitable to start the Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher
>      >      >      >      >      > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as deprecated,
>      >      >      >      >      > but I would prefer to address it in a separate issue.
>      >      >      >      >      >
>      >      >      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
>      >      >      >      >      >                  container  and connecting  to it with the GUI debugger.
>      >      >      >      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
>      >      >      >      >      >
>      >      >      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
>      >      >      >      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
>      >      >      >      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>      >      >      >      >      >
>      >      >      >      >      > Thank you,
>      >      >      >      >      > Daniil
>      >      >      >      >      >
>      >      >      >      >      >
>      >      >      >      >
>      >      >      >      >
>      >      >      >      >
>      >      >      >
>      >      >      >
>      >      >      >
>      >      >
>      >      >
>      >      >
>      >
>      >
>      >
>      
>      
>
>


From ioi.lam at oracle.com  Wed Mar 25 20:11:52 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 25 Mar 2020 13:11:52 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
Message-ID: <29f2ce73-2f53-e34a-15c0-a6723437c4fb@oracle.com>

This new version looks good to me.

Thanks
- Ioi

On 3/25/20 12:01 PM, Leonid Mesnik wrote:
>
> Added Ioi, who also proposed new version of startAppVmOpts.
>
> Please find new webrev: 
> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>
> startAppVmOpts/runAppVmOpts are renamed to
> startAppExactJvmOpts/runAppExactJvmOpts. This should make it very
> clear that these methods don't use any of test.java.opts, test.vm.opts.
>
> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java
> mentioned by Igor, and removed the null pointer check as Ioi suggested in
> the startApp method.
>
> + public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
> +     startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
> + }
>
> Leonid
>
> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>> Hi Leonid,
>>>
>>> I have briefly looked at the patch, a few comments so far:
>>>
>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>   - at L#114, could you please call the static method using the class name
>>> (as opposed to using an instance)? Or was it meant to be
>>> theApp.runAppVmOpts(vmArgs)?
>>>
>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>> - it seems that code indent of startApp(LingeredApp, String[]) isn't 
>>> correct
>>> - I don't like startAppVmOpts name, but unfortunately don't have a 
>>> better suggestion (yet)
>>
>> I was going to say the same. Jtreg has the concept of "java options" 
>> and "vm options". We have had a fair share of bugs and wasted time 
>> when tests have been using the "vm options" part (VM_OPTIONS, 
>> test.vm.options, etc), and we've been moving away from using that way 
>> to pass options. I recently cleaned up some of this with:
>>
>> 8237111: LingeredApp should be started with getTestJavaOpts
>>
>> Because of this, I would prefer if we used a name that doesn't
>> include "VmOpts", because it's too similar to the other concept. Some
>> suggestions:
>>   startAppJavaOptions
>>   startAppUsingJavaOptions
>>   startAppWithJavaOptions
>>   startAppExactJavaOptions
>>   startAppJvmOptions
>>
>> Thanks,
>> StefanK
>>
>>> Thanks,
>>> -- Igor
>>>
>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>>> <leonid.mesnik at oracle.com> wrote:
>>>>
>>>> Hi
>>>>
>>>> Could you please review the following fix, which changes LingeredApp to
>>>> prepend vm options to java/vm.test.opts when startApp is used and
>>>> provides startAppVmOpts to override options completely.
>>>>
>>>> The intention is to avoid issues like the one in this bug, where test/jtreg
>>>> options were ignored by tests. Also, I fixed some tests where the
>>>> intention was to append vm options rather than to override them.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>
>>>> Leonid
>>>>
>>


From chris.plummer at oracle.com  Wed Mar 25 20:24:43 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Mar 2020 13:24:43 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
 <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
 <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
 <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>
Message-ID: <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>

Yes, please submit a new job. I'll start my testing once I see that the 
builds are done.

Chris

On 3/25/20 12:59 PM, Roman Kennke wrote:
> Hi Chris,
>
> Apparently we can get into classTrack_reset() before calling activate(),
> and we're seeing a null deletedSignatureBag. A simple NULL-check around
> the cleaning routine fixes the problem for me.
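
For illustration, a minimal C sketch of the kind of NULL check described above. It does not use the real libjdwp bag API: struct bag, deleted_signatures, tracking_activate() and tracking_reset() are hypothetical stand-ins for the bag, deletedSignatureBag, the activate step and classTrack_reset(), and the actual fix in webrev.08 may look different.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, minimal stand-in for the agent's bag of signatures. */
struct bag {
    char **items;
    size_t used;
    size_t capacity;
};

/* Stays NULL until tracking_activate() has run, as in the crash scenario. */
static struct bag *deleted_signatures = NULL;

/* Allocate the bag; returns 0 on success, -1 on allocation failure. */
static int tracking_activate(void)
{
    if (deleted_signatures != NULL) {
        return 0; /* already active */
    }
    struct bag *b = malloc(sizeof *b);
    if (b == NULL) {
        return -1;
    }
    b->used = 0;
    b->capacity = 8;
    b->items = calloc(b->capacity, sizeof *b->items);
    if (b->items == NULL) {
        free(b);
        return -1;
    }
    deleted_signatures = b;
    return 0;
}

/* Safe to call before activation: cleanup is skipped when no bag exists. */
static void tracking_reset(void)
{
    if (deleted_signatures != NULL) {
        for (size_t i = 0; i < deleted_signatures->used; i++) {
            free(deleted_signatures->items[i]);
        }
        free(deleted_signatures->items);
        free(deleted_signatures);
        deleted_signatures = NULL;
    }
    /* Postcondition: no bag is left behind, whichever path was taken. */
    assert(deleted_signatures == NULL);
}
```

With this guard, calling the reset routine before the activate routine is a no-op instead of a NULL dereference inside the bag enumeration.
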
>
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>
> Should I post another submit-repo job with that fix?
>
> Thanks,
> Roman
>
>
>> Hi Roman,
>>
>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>
>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],  sp=0x00007fbb791f8af0,  free space=1022k
>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>
>>
>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>> doesn't seem to be anything magic on the command line that might be
>> triggering it. I see it with pretty much all the various VM configs we test.
>>
>> I'm also seeing crashes in the following tests, but not as often:
>>
>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>
>> thanks,
>>
>> Chris
>>
>>
>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>> Hi Chris,
>>>
>>>> Regarding the new assert:
>>>>
>>>>   105     if (gdata && gdata->assertOn) {
>>>>   106         // Check this is not already tagged.
>>>>   107         jlong tag;
>>>>   108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>   109         if (error != JVMTI_ERROR_NONE) {
>>>>   110             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>   111         }
>>>>   112         JDI_ASSERT(tag == NOT_TAGGED);
>>>>   113     }
>>>>
>>>> I think you should remove the gdata check. gdata should never be NULL
>>>> when you get to this code. If it is ever NULL then there's a bug, and
>>>> the check will hide the bug.
>>> Ok, will remove this.
>>>
>>>> Regarding testing, after you do the submit repo testing let me know the
>>>> jobID and I'll do additional testing on it.
>>> I did the submit repo earlier today, and it came back green:
>>>
>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>
>>> Thanks,
>>> Roman
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>> Hi Sergei,
>>>>>
>>>>>> The fix looks pretty clean now.
>>>>>> I also like new name of the lock.:)
>>>>> Thank you!
>>>>>
>>>>>> Just one comment below.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>
>>>>>>
>>>>>> 110 if (tag != 0l) {
>>>>>> 111 return; // Already added
>>>>>> 112 }
>>>>>>
>>>>>>   It is better to use a named constant or macro instead.
>>>>>>   Also, it'd be nice to add a short comment about what this value is.
>>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>>> assert. I also made a constant for the value 0, which should be pretty
>>>>> much self-explanatory.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>
>>>>>> How do you test the fix?
>>>>> I am using a manual test that is provided in this bug report:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>
>>>>> "Script to compare performance of GC with and without debugger, when
>>>>> many classes are loaded and classes are being unloaded":
>>>>>
>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>
>>>>> I am also using this test and manually attach/detach jdb a couple of
>>>>> times in a row to check that disconnecting and reconnecting works well
>>>>> (this tended to deadlock or crash with an earlier version of the patch,
>>>>> and is now looking good).
>>>>>
>>>>> I am also running tier1 and tier2 tests locally, and as soon as we all
>>>>> agree that the fix is reasonable, I will push it to the submit repo. I
>>>>> am not sure if any of those tests actually exercise that code, though.
>>>>> Let me know if you want me to run any specific tests.
>>>>>
>>>>> Thank you,
>>>>> Roman
>>>>>
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>> I believe I came up with a much simpler solution that also solves the
>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>
>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>>> to the signature of a class into the tag, and pull it out again when we
>>>>>>> get notified that the class gets unloaded.
>>>>>>>
>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>>>>> classes needed (as in the current implementation) and no searching of
>>>>>>> table needed (like in my previous attempts).
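
To illustrate the pointer-as-tag idea above, here is a minimal C sketch. It deliberately avoids jvmti.h: class_tag is a stand-in for JVMTI's 64-bit jlong tag, and signature_to_tag() and tag_to_signature() are hypothetical helper names, not functions from classTrack.c. In the real agent the tag would be attached with SetTag and handed back by the ObjectFree callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JVMTI's jlong tag type (a signed 64-bit integer). */
typedef int64_t class_tag;

/* Zero means "no tag", matching the NOT_TAGGED constant discussed in the thread. */
#define NOT_TAGGED ((class_tag)0)

/* Copy the signature to the heap and encode the copy's address as a tag. */
static class_tag signature_to_tag(const char *signature)
{
    assert(signature != NULL);
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return NOT_TAGGED;
    }
    memcpy(copy, signature, len);
    return (class_tag)(intptr_t)copy;
}

/* Decode a tag back into the signature pointer; the caller frees it. */
static char *tag_to_signature(class_tag tag)
{
    return (char *)(intptr_t)tag;
}
```

Because the tag itself carries the address of the signature, the unload notification needs no lookup at all, which is where the O(1) claim comes from. Going through intptr_t keeps the casts well-defined on both 32-bit and 64-bit platforms.
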
>>>>>>>
>>>>>>> Please review this new revision:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>
>>>>>>> (Notice that there still appears to be a performance bottleneck with
>>>>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>>>>> to be related to the classTrack.c implementation though, but looks like
>>>>>>> a consequence of getting all those class-unload notifications over the
>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>>> buffers.
>>>>>>>
>>>>>>> I am not sure why jdb needs to enable class-unload listener always. A
>>>>>>> simple hack disables it, and performance is brilliant, even when jdb is
>>>>>>> attached:
>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>>
>>>>>>> But this is not in the scope of this bug.)
>>>>>>>
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Roman,
>>>>>>>>>
>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>
>>>>>>>>> Some comments are below.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>> 88 {
>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 93 return;
>>>>>>>>> 94 }
>>>>>>>>> Just a question:
>>>>>>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>>>>>       the class tracking if class tracking has not been initialized?
>>>>>>>>>
>>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>>>   I wonder whether a better name would be something like
>>>>>>>>>   lastClassTag or highestClassTag.
>>>>>>>>>
>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>> 100
>>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>>> 105 }
>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 108 return;
>>>>>>>>> 109 }
>>>>>>>>>   It seems to me that something is wrong in the condition at L106 above.
>>>>>>>>>   Should it be:
>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>
>>>>>>>>>   Otherwise, how can the second check ever work correctly, as the return
>>>>>>>>>   will always happen when (klass != NULL)?
>>>>>>>>>
>>>>>>>>>   There are several places in this file with the indent:
>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 93 return;
>>>>>>>>> 94 }
>>>>>>>>>   ...
>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 155 return;
>>>>>>>>> 156 }
>>>>>>>>>   ...
>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>> 163 }
>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>> 166 return; // Already added
>>>>>>>>> 167 }
>>>>>>>>>   ...
>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>> 282 {
>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>> 286 }
>>>>>>>>>   ...
>>>>>>>>> 291 void
>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>> 293 {
>>>>>>>>> 294 int idx;
>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>> 296
>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>> 303 node = next;
>>>>>>>>> 304 }
>>>>>>>>> 305 }
>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>> 307
>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>> 310
>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>> 312
>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>> 315
>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>
>>>>>>>>> Could you please fix several comments below?
>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>   The comma is not needed.
>>>>>>>>>   Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>
>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
>>>>>>>>>   It would be better to use words like "store" or "record", and "Find"
>>>>>>>>>   should not start with a capital letter:
>>>>>>>>>   Invoke the callback when classes are freed, find and record the
>>>>>>>>>   signature in deletedSignatureBag.
>>>>>>>>>
>>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>>   The dot at the end is missing.
>>>>>>>>>
>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>   Conversely, the dot is not needed here, as the comment does not
>>>>>>>>>   start with a capital letter.
>>>>>>>>>
>>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>   The comment above can be better. Maybe something like:
>>>>>>>>     "At this point, we found the KlassNode matching the klass tag
>>>>>>>>     (and it is linked)."
>>>>>>>>
>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>   Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>> Hello all,
>>>>>>>>>>
>>>>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>>>>> more testing and also field-/torture-testing by a customer who is
>>>>>>>>>> happy
>>>>>>>>>> now. :-)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>
>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>>>>> disconnect, namely moving the setup of the trackingEnv and
>>>>>>>>>>> deletedSignatureBag to _activate() to ensure we have those
>>>>>>>>>>> structures after re-connect.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>
>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>>>>
>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 72 /*
>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>>> 74 */
>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>> 76
>>>>>>>>>>>> 77 /*
>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>>> 80 */
>>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>
>>>>>>>>>>>>   The comments contradict each other.
>>>>>>>>>>>>   I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>   instead of deletedTagLock.
>>>>>>>>>>>>   Also, the comma at the end must be replaced with a dot.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 104 return;
>>>>>>>>>>>> 105 }
>>>>>>>>>>>> 106
>>>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>>>> 113 }
>>>>>>>>>>>> 114
>>>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 118 return;
>>>>>>>>>>>> 119 }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>   The code above can be simplified, so that lines 101-105 are not
>>>>>>>>>>>>   needed anymore.
>>>>>>>>>>>>   It can be something like this:
>>>>>>>>>>>>
>>>>>>>>>>>>     // Scan linked-list.
>>>>>>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>         klass_ptr = &klass->next;
>>>>>>>>>>>>         klass = *klass_ptr;
>>>>>>>>>>>>     }
>>>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>         return;
>>>>>>>>>>>>     }
>>>>>>>>>>>>
>>>>>>>>>>>> It will take more time before I get a chance to look at the rest.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>>> lock on
>>>>>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>>>>>> class-tracking is active and return an appropriate result
>>>>>>>>>>>>> (e.g. an empty
>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a
>>>>>>>>>>>>>> tag, and we
>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>>>>>>>>> indexed by tag % slot-count, and we then simply prepend the new
>>>>>>>>>>>>>> KlassNode*. This is an O(1) operation.
>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the
>>>>>>>>>>>>>> signature of
>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The
>>>>>>>>>>>>>> KlassNode*
>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is
>>>>>>>>>>>>>> ~O(1) operation
>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase
>>>>>>>>>>>>>> which hammered
>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths
>>>>>>>>>>>>>> of like 2-3,
>>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that
>>>>>>>>>>>>>> bag, and
>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>>>>>> leaking the
>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached
>>>>>>>>>>>>>> and/or
>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was
>>>>>>>>>>>>>> missing
>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>>>>>> listener gets
>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when
>>>>>>>>>>>>>> attaching a
>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something
>>>>>>>>>>>>>> to improve
>>>>>>>>>>>>>> in the future?
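
The table described in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the code from webrev.03: the value of CT_SLOT_COUNT, the helper names track_class() and untrack_class(), and the use of malloc/free instead of the agent's allocator are all hypothetical, and the locking the list mentions is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CT_SLOT_COUNT 263 /* hypothetical table size */

typedef struct KlassNode {
    int64_t klass_tag;      /* tag assigned when the class was prepared */
    char *signature;        /* heap copy of the class signature */
    struct KlassNode *next; /* next node in the same slot */
} KlassNode;

static KlassNode *table[CT_SLOT_COUNT];

/* Slot index is tag % slot-count, computed on the unsigned value. */
static size_t slot_for(int64_t tag)
{
    return (size_t)((uint64_t)tag % CT_SLOT_COUNT);
}

/* Class prepared: prepend a node to its slot, O(1). Returns 0 on success. */
static int track_class(int64_t tag, const char *signature)
{
    assert(signature != NULL);
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }
    size_t len = strlen(signature) + 1;
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return -1;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = tag;
    size_t slot = slot_for(tag);
    node->next = table[slot];
    table[slot] = node;
    return 0;
}

/*
 * Class unloaded: unlink the node for this tag and return its signature
 * (the caller frees it), or NULL if the tag is unknown.
 */
static char *untrack_class(int64_t tag)
{
    KlassNode **link = &table[slot_for(tag)];
    while (*link != NULL && (*link)->klass_tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return NULL; /* tag not found: ignore */
    }
    KlassNode *node = *link;
    char *signature = node->signature;
    *link = node->next;
    free(node);
    return signature;
}
```

The unload path costs one modulo plus a walk of a single slot's chain, so it stays close to O(1) as long as the chains are short, which matches the depths of 2-3 reported above.
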
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks
>>>>>>>>>>>>>> really good. The bottleneck now is clearly the actual
>>>>>>>>>>>>>> synthesizing of the class-unload events. I don't see how this
>>>>>>>>>>>>>> can be helped when the debug agent asks for it?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing
>>>>>>>>>>>>>>> for now.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The current implementation does so by maintaining a table
>>>>>>>>>>>>>>>> of currently
>>>>>>>>>>>>>>>> prepared classes by building that table when classTrack is
>>>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When
>>>>>>>>>>>>>>>> unloading
>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and
>>>>>>>>>>>>>>>> compared with the
>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new
>>>>>>>>>>>>>>>> table gets
>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned,
>>>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In addition to all that, this process is only activated
>>>>>>>>>>>>>>>> when there's an
>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c.
>>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>


From magnus.ihse.bursie at oracle.com  Wed Mar 25 20:45:16 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Wed, 25 Mar 2020 21:45:16 +0100
Subject: RFR: JDK-8241618 Fix unchecked warning for jdk.hotspot.agent
In-Reply-To: <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com>
References: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com>
 <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com>
Message-ID: <007988a3-50d6-6a54-7af6-90af623129b1@oracle.com>

On 2020-03-25 20:52, Chris Plummer wrote:
> Hi Magnus,
>
> I haven't looked at the changes yet, other than to see that there are many
> files touched, but after reading below (and only partly understanding
> since I don't know this area well), I was wondering if this issue
> wouldn't be better served with multiple passes made to fix the
> warnings. Start with a straightforward one where you are maybe only
> making one or two types of changes, but that affect a large number of
> files and don't cascade into other more complicated changes.
Unfortunately, many changes tend to cling together -- for instance,
class Foo has a List fooList of, say, Integer. If I change that to
List<Integer>, then the constructor also needs to change, and the
getFooList() method, and that in turn propagates to users of getFooList()
etc. I tried to do this piecewise, but for every line that I fixed I just
ended up getting more and more places that needed fixing.

On the other hand, the patch I present *is* indeed mostly trivial. Apart 
from the places I mentioned below, the fixes are straightforward. And I 
opted out of fixing the tricky ones by disabling the warnings. My 
intention is to file a follow-up bug for these @SuppressWarnings to be 
fixed properly. However, doing that is unfortunately beyond the scope of 
what I'm able to do, since I do not have enough domain knowledge. The
fixes in this patch are more or less "stupid" applications of adding
generics with the correct type. (Basically, what I've done is to locate
a problematic type, like fooList, check the type of the elements
inserted into and extracted from it, and recreate it as a generic of
that type. Boring, but not really difficult.)

I realize the webrev can look daunting. Perhaps start by looking at the 
patch file, that will quickly show what kind of changes this is about. 
Also, 1/3 of the patch is just about updating those darned copyright 
years. :-(

> This will get a lot of the noise out of the way, and then we can focus 
> on some of the harder issues you bring up below.
>
> As for testing, I think the following list will capture all of them, 
> but can't say for sure:
>
> open/test/hotspot/jtreg/serviceability/sa
> open/test/hotspot/jtreg/resourcehogs/serviceability/sa
> open/test/jdk/sun/tools/jhsdb
> open/test/jdk/sun/tools/jstack
> open/test/jdk/sun/tools/jmap
> open/test/hotspot/jtreg/gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 
>
> open/test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java 
> open/test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java
Thank you! I'll run these through our test system.

/Magnus
>
> Chris
>
> On 3/25/20 12:29 PM, Magnus Ihse Bursie wrote:
>> With the recent fixes in JDK-8241310, JDK-8237746 and JDK-8241073, 
>> and the upcoming fixes to remove the deprecated nashorn and jdk.rmi, 
>> the JDK build is very close to producing no warnings when compiling 
>> the Java classes.
>>
>> The one remaining sinner is jdk.hotspot.agent. Most of the warnings 
>> here are turned off, but unchecked and deprecation cannot be 
>> completely silenced.
>>
>> Since the poor agent does not seem to receive much love nowadays, I 
>> took it upon myself to fix these warnings, so we can finally get a 
>> quiet build.
>>
>> I started to address the unchecked warnings. Unfortunately, this was 
>> a much bigger task than I anticipated. I had to generify most of the 
>> module. On the plus side, the code is so much better now. And most of 
>> the changes were trivial, just tedious.
>>
>> There are a few places where I'm not entirely happy with the current
>> solution, and that at least merits some discussion.
>>
>> I have resorted to @SuppressWarnings in four classes: ciMethodData,
>> MethodData, TableModelComparator and VirtualBaseConstructor. What
>> they all have in common is that they are doing slightly fishy things
>> with classes in collections. I'm not entirely sure they are bug-free, but
>> this patch leaves the behavior untouched. I made some effort to sort
>> out the logic, but it turned out to be too hairy for me to fix, and
>> it will probably require more substantial changes to the workings of
>> the code.
>>
>> To make the code valid, I have moved ConstMethod to extend Metadata 
>> instead of VMObject. My understanding is that this is benign (and 
>> likely intended), but I really need for someone who knows the code to 
>> confirm this. I have also added a FIXME to signal this. I'll remove 
>> the FIXME as soon as I get confirmation that this is OK.
>> (The reason for this is the following piece of code from 
>> Metadata.java: metadataConstructor.addMapping("ConstMethod", 
>> ConstMethod.class))
>>
>> In ObjectListPanel, there is some code that screams "dead" with this 
>> change. I added a FIXME to point this out:
>>     for (Iterator<Oop> iter = elements.iterator(); iter.hasNext(); ) {
>>       if (iter.next() instanceof Array) {
>>         // FIXME: Does not seem possible to happen
>>         hasArrays = true;
>>         return;
>>       }
>> It seems that if you start pulling this thread, even more dead code 
>> will unravel, so I'm not so eager to touch this in the current patch. 
>> But I can remove the FIXME if you want.
>>
>> My first iteration of this patch tried to generify the IntervalTree 
>> and related class hierarchy. However, this turned out to be 
>> impossible due to some weird usage in AnnotatedMemoryPanel, where 
>> there seemed to be confusion as to whether the tree stored 
>> Annotations or Addresses. I'm not entirely convinced the code is 
>> correct, it certainly looked and smelled very fishy. However, I 
>> reverted these changes since I could not get them to work due to 
>> this, and it was not needed for the goal of just getting rid of the 
>> warning.
>>
>> Finally, I have done no testing apart from verifying that it builds. 
>> Please advise on suitable tests to run.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8241618
>> WebRev: 
>> http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.01
>>
>> /Magnus
>
>


From leonid.mesnik at oracle.com  Wed Mar 25 21:31:01 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 14:31:01 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
Message-ID: <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>

Igor, Stefan, Ioi

Thank you for your feedback.

Filed https://bugs.openjdk.java.net/browse/JDK-8241624 to change @run
main... to @run driver.

Test ClhsdbJstack.java is updated.

Still waiting for review from SVC team.

webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/

Leonid

On 3/25/20 12:46 PM, Igor Ignatyev wrote:
> Hi Leonid,
>
> not related to your patch (but yet somewhat made more obvious
> by it), it seems all (or at least almost all) the tests which
> use LingeredApp should be run in "driver" mode as they just
> orchestrate execution of other JVMs, so running them w/ main (let
> alone main/othervm) just wastes time,
> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for
> example, will now be executed w/ Xcomp which will make it very slow
> for no reason. since you already got your hands dirty w/ these tests,
> could you please file an RFE to sort this out and list all the
> affected tests there?
>
> re: the patch, could you please update ClhsdbJstack.java test not to
> be run w/ Xcomp and follow the same pattern you used in other tests
> (e.g. ClhsdbScanOops) ? other than that it looks fine to me, I however
> wouldn't be able to tell if all svc tests continue to do what they
> were supposed to, so I'd prefer for someone from svc team to chime in.
>
> Thanks,
> -- Igor
>
>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>>
>> Added Ioi, who also proposed new version of startAppVmOpts.
>>
>> Please find new webrev: 
>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>
>> Renamed startAppVmOpts/runAppVmOpts to
>> "startAppExactJvmOpts/runAppExactJvmOpts". It should make
>> very clear that this method doesn't use any of test.java.opts,
>> test.vm.opts.
>>
>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java
>> mentioned by Igor, and removed null pointer check as Ioi suggested in
>> startApp method.
>>
>> + public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
>> +     startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
>> + }
>>
>> Leonid
>>
>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>> Hi Leonid,
>>>>
>>>> I have briefly looked at the patch, a few comments so far:
>>>>
>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>   - at L#114, could you please call static method using class name
>>>> (as opposed to using an instance)? or was it meant to be
>>>> theApp.runAppVmOpts(vmArgs) ?
>>>>
>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>> - it seems that code indent of startApp(LingeredApp, String[]) 
>>>> isn't correct
>>>> - I don't like startAppVmOpts name, but unfortunately don't have a 
>>>> better suggestion (yet)
>>>
>>> I was going to say the same. Jtreg has the concept of "java options" 
>>> and "vm options". We have had a fair share of bugs and wasted time 
>>> when tests have been using the "vm options" part (VM_OPTIONS, 
>>> test.vm.options, etc), and we've been moving away from using that 
>>> way to pass options. I recently cleaned up some of this with:
>>>
>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>
>>> Because of this, I would prefer if we used a name that doesn't 
>>> include "VmOpts", because it's too alike the other concept. Some 
>>> suggestions:
>>>  startAppJavaOptions
>>>  startAppUsingJavaOptions
>>>  startAppWithJavaOptions
>>>  startAppExactJavaOptions
>>>  startAppJvmOptions
>>>
>>> Thanks,
>>> StefanK
>>>
>>>> Thanks,
>>>> -- Igor
>>>>
>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>>>> <leonid.mesnik at oracle.com> wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> Could you please review the following fix, which changes LingeredApp
>>>>> to prepend vm options to java/vm.test.opts when startApp is used and
>>>>> provides startAppVmOpts to override options completely.
>>>>>
>>>>> The intention is to avoid issues like the one in this bug, where
>>>>> test/jtreg options were ignored by tests. Also I fixed some tests
>>>>> where intention was to append vm options rather than to override them.
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>
>>>>> Leonid
>>>>>
>>>
>

From igor.ignatyev at oracle.com  Wed Mar 25 21:58:50 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 25 Mar 2020 14:58:50 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
 <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
Message-ID: <EE88E7B4-6A4D-46E1-BBB8-F4DAF2E027C6@oracle.com>

> Test ClhsdbJstack.java is updated.
> 
Now you have reduced the coverage provided by this test. I actually meant that you should create a separate jtreg test description in this test, pass "Xcomp" or "true" (or anything) as an argument to ClhsdbJstack, and use the value of this argument to decide whether -Xcomp should be added to LingeredApp.startApp.

Thanks,
-- Igor


> On Mar 25, 2020, at 2:31 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
> 
> Igor, Stefan, Ioi
> 
> Thank you for your feedback.
> 
> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 to change @run main... to @run driver.
> 
> Test ClhsdbJstack.java is updated.
> 
> Still waiting for review from SVC team.
> 
> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
> Leonid
> 
> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>> Hi Leonid,
>> 
>> not related to your patch (but yet somewhat made more obvious by it), it seems all (or at least almost all) the tests which use LingeredApp should be run in "driver" mode as they just orchestrate execution of other JVMs, so running them w/ main (let alone main/othervm) just wastes time, test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for example, will now be executed w/ Xcomp which will make it very slow for no reason. since you already got your hands dirty w/ these tests, could you please file an RFE to sort this out and list all the affected tests there?
>>
>> re: the patch, could you please update ClhsdbJstack.java test not to be run w/ Xcomp and follow the same pattern you used in other tests (e.g. ClhsdbScanOops) ? other than that it looks fine to me, I however wouldn't be able to tell if all svc tests continue to do what they were supposed to, so I'd prefer for someone from svc team to chime in.
>> 
>> Thanks,
>> -- Igor
>> 
>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>>> 
>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>> 
>>> Please find new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>> Renamed startAppVmOpts/runAppVmOpts to "startAppExactJvmOpts/runAppExactJvmOpts". It should make very clear that this method doesn't use any of test.java.opts, test.vm.opts.
>>>
>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java mentioned by Igor, and removed null pointer check as Ioi suggested in startApp method.
>>> 
>>> +    public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
>>> +        startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
>>> +    }
>>> 
>>> Leonid
>>> 
>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>> On 2020-03-25 17:40, Igor Ignatyev wrote: 
>>>>> Hi Leonid, 
>>>>> 
>>>>> I have briefly looked at the patch, a few comments so far: 
>>>>> 
>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: 
>>>>>   - at L#114, could you please call static method using class name (as opposed to using an instance)? or was it meant to be theApp.runAppVmOpts(vmArgs) ?
>>>>> 
>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: 
>>>>> - it seems that code indent of startApp(LingeredApp, String[]) isn't correct 
>>>>> - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet) 
>>>> 
>>>> I was going to say the same. Jtreg has the concept of "java options" and "vm options". We have had a fair share of bugs and wasted time when tests have been using the "vm options" part (VM_OPTIONS, test.vm.options, etc), and we've been moving away from using that way to pass options. I recently cleaned up some of this with: 
>>>> 
>>>> 8237111: LingeredApp should be started with getTestJavaOpts 
>>>> 
>>>> Because of this, I would prefer if we used a name that doesn't include "VmOpts", because it's too alike the other concept. Some suggestions: 
>>>>  startAppJavaOptions 
>>>>  startAppUsingJavaOptions 
>>>>  startAppWithJavaOptions 
>>>>  startAppExactJavaOptions 
>>>>  startAppJvmOptions 
>>>> 
>>>> Thanks, 
>>>> StefanK 
>>>> 
>>>>> Thanks, 
>>>>> -- Igor 
>>>>> 
>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>>>>>> 
>>>>>> Hi 
>>>>>> 
>>>>>> Could you please review the following fix, which changes LingeredApp to prepend vm options to java/vm.test.opts when startApp is used and provides startAppVmOpts to override options completely.
>>>>>>
>>>>>> The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. Also I fixed some tests where intention was to append vm options rather than to override them.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>> 
>>>>>> Leonid 
>>>>>> 
>>>> 
>> 


From rkennke at redhat.com  Wed Mar 25 22:22:31 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 25 Mar 2020 23:22:31 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
 <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
 <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
 <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>
 <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
Message-ID: <e67eefb0-fa41-1a22-e794-6c98fe255713@redhat.com>

The new job finished, its ID is:

 mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289

Thank you,
Roman


> Yes, please submit a new job. I'll start my testing once I see that the
> builds are done.
> 
> Chris
> 
> On 3/25/20 12:59 PM, Roman Kennke wrote:
>> Hi Chris,
>>
>> Apparently we can get into classTrack_reset() before calling activate(),
>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>> the cleaning routine fixes the problem for me.
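
A minimal sketch of the kind of guard described above, assuming nothing about the real libjdwp bag API: `sketch_bag`, `sketch_bag_with` and `sketch_reset` are hypothetical stand-ins, not the agent's functions. It only illustrates that a reset which may run before activation has to tolerate a bag that was never created.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the agent's bag of deleted signatures. */
struct sketch_bag {
    char **items;
    size_t count;
};

/* Builds a bag holding `count` placeholder signature buffers. */
static struct sketch_bag *sketch_bag_with(size_t count)
{
    struct sketch_bag *bag = malloc(sizeof *bag);
    if (bag == NULL) {
        return NULL;
    }
    bag->items = calloc(count, sizeof *bag->items);
    if (bag->items == NULL) {
        free(bag);
        return NULL;
    }
    bag->count = count;
    for (size_t i = 0; i < count; i++) {
        bag->items[i] = malloc(16);      /* placeholder signature storage */
    }
    return bag;
}

/* Releases all signatures and the bag itself. Safe to call when the bag
 * was never created (reset before activate): it then does nothing.
 * Returns the number of signature slots released. */
static size_t sketch_reset(struct sketch_bag **bag_ptr)
{
    assert(bag_ptr != NULL);
    size_t released = 0;
    struct sketch_bag *bag = *bag_ptr;
    if (bag != NULL) {                   /* the NULL check that avoids the crash */
        for (size_t i = 0; i < bag->count; i++) {
            free(bag->items[i]);
            released++;
        }
        free(bag->items);
        free(bag);
        *bag_ptr = NULL;                 /* a second reset is then harmless */
    }
    return released;
}
```

Clearing the pointer after cleanup also makes repeated resets safe, which matters if disconnect and reconnect can happen several times in a row.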
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>
>> Should I post another submit-repo job with that fix?
>>
>> Thanks,
>> Roman
>>
>>
>>> Hi Roman,
>>>
>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>
>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],  sp=0x00007fbb791f8af0,  free space=1022k
>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>
>>>
>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>> doesn't seem to be anything magic on the command line that might be
>>> triggering. Pretty much I see it with all the various VM configs we
>>> test.
>>>
>>> I'm also seeing crashes in the following tests, but not as often:
>>>
>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>
>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>
>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>
>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>>
>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>> Hi Chris,
>>>>
>>>>> Regarding the new assert:
>>>>>
>>>>>     if (gdata && gdata->assertOn) {
>>>>>         // Check this is not already tagged.
>>>>>         jlong tag;
>>>>>         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>         if (error != JVMTI_ERROR_NONE) {
>>>>>             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>         }
>>>>>         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>     }
>>>>>
>>>>> I think you should remove the gdata check. gdata should never be NULL
>>>>> when you get to this code. If it is ever NULL then there's a bug, and
>>>>> the check will hide the bug.
>>>> Ok, will remove this.
>>>>
>>>>> Regarding testing, after you do the submit repo testing let me know
>>>>> the
>>>>> jobID and I'll do additional testing on it.
>>>> I did the submit repo earlier today, and it came back green:
>>>>
>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>> Hi Sergei,
>>>>>>
>>>>>>> The fix looks pretty clean now.
>>>>>>> I also like new name of the lock.:)
>>>>>> Thank you!
>>>>>>
>>>>>>> Just one comment below.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     if (tag != 0l) {
>>>>>>>         return; // Already added
>>>>>>>     }
>>>>>>>
>>>>>>>   It is better to use a named constant or macro instead.
>>>>>>>   Also, it'd be nice to add a short comment about what this value is.
>>>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>>>> assert. I also made a constant for the value 0, which should be
>>>>>> pretty
>>>>>> much self-explaining.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>
>>>>>>> How do you test the fix?
>>>>>> I am using a manual test that is provided in this bug report:
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>
>>>>>> "Script to compare performance of GC with and without debugger, when
>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>
>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>
>>>>>> I am also using this test and manually attach/detach jdb a couple of
>>>>>> times in a row to check that disconnecting and reconnecting works
>>>>>> well
>>>>>> (this tended to deadlock or crash with an earlier version of the
>>>>>> patch,
>>>>>> and is now looking good).
>>>>>>
>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we
>>>>>> all
>>>>>> agree that the fix is reasonable, I will push it to the submit
>>>>>> repo. I
>>>>>> am not sure if any of those tests actually exercise that code,
>>>>>> though.
>>>>>> Let me know if you want me to run any specific tests.
>>>>>>
>>>>>> Thank you,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>> I believe I came up with a much simpler solution that also
>>>>>>>> solves the
>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>
>>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>>>>> explicitly
>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a
>>>>>>>> pointer
>>>>>>>> to the signature of a class into the tag, and pull it out again
>>>>>>>> when we
>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>
>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning
>>>>>>>> of all
>>>>>>>> classes needed (as in the current implementation) and no
>>>>>>>> searching of
>>>>>>>> table needed (like in my previous attempts).
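
The tag-as-pointer idea can be sketched in isolation as follows. This is an illustration under stated assumptions, not the code from the webrev: `sketch_jlong` stands in for JNI's `jlong`, and the two helper functions are hypothetical names. In the agent, the encoded value would be handed to JVMTI `SetTag` and would come back as the `tag` argument of the `ObjectFree` callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int64_t sketch_jlong;            /* stand-in for JNI's 64-bit jlong */

#define SKETCH_NOT_TAGGED ((sketch_jlong)0)

/* Copies the signature to the heap and encodes the copy's address as a tag.
 * Returns SKETCH_NOT_TAGGED if the allocation fails. */
static sketch_jlong sketch_tag_from_signature(const char *signature)
{
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return SKETCH_NOT_TAGGED;
    }
    memcpy(copy, signature, len);
    return (sketch_jlong)(intptr_t)copy;
}

/* Decodes a tag back into the signature pointer. The caller owns the
 * returned string and must free it; no table lookup is involved. */
static char *sketch_signature_from_tag(sketch_jlong tag)
{
    return (char *)(intptr_t)tag;
}
```

Because the tag itself carries the address, recovering the signature on unload is a single cast, which is where the O(1) behavior comes from.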
>>>>>>>>
>>>>>>>> Please review this new revision:
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>>
>>>>>>>> (Notice that there still appears to be a performance bottleneck
>>>>>>>> with
>>>>>>>> class-unloading when an actual debugger is attached. This
>>>>>>>> doesn't seem
>>>>>>>> to be related to the classTrack.c implementation though, but
>>>>>>>> looks like
>>>>>>>> a consequence of getting all those class-unload notifications
>>>>>>>> over the
>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>>>> buffers.)
>>>>>>>>
>>>>>>>> I am not sure why jdb needs to enable class-unload listener
>>>>>>>> always. A
>>>>>>>> simple hack disables it, and performance is brilliant, even when
>>>>>>>> jdb is
>>>>>>>> attached:
>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>>>
>>>>>>>> But this is not in the scope of this bug.
>>>>>>>>
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Hi Roman,
>>>>>>>>>>
>>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>>
>>>>>>>>>> Some comments are below.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>  88 {
>>>>>>>>>>  89     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>  90     if (currentClassTag == -1) {
>>>>>>>>>>  91         // Class tracking not initialized, nobody's interested
>>>>>>>>>>  92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>  93         return;
>>>>>>>>>>  94     }
>>>>>>>>>> Just a question:
>>>>>>>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that
>>>>>>>>>>       does the class tracking if class tracking has not been
>>>>>>>>>>       initialized?
>>>>>>>>>>
>>>>>>>>>>  70 static jlong currentClassTag;
>>>>>>>>>>   I'm wondering whether a better name would be something like
>>>>>>>>>>   lastClassTag or highestClassTag.
>>>>>>>>>>
>>>>>>>>>>  99     KlassNode* klass = *klass_ptr;
>>>>>>>>>> 100
>>>>>>>>>> 102     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>> 103         klass_ptr = &klass->next;
>>>>>>>>>> 104         klass = *klass_ptr;
>>>>>>>>>> 105     }
>>>>>>>>>> 106     if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>> 107         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 108         return;
>>>>>>>>>> 109     }
>>>>>>>>>>   It seems to me, something is wrong in the condition at L106 above.
>>>>>>>>>>   Should it be:
>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>
>>>>>>>>>>   Otherwise, how can the second check ever work correctly as the
>>>>>>>>>>   return will always happen when (klass != NULL)?
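
For illustration, a self-contained sketch of the scan-and-unlink pattern with the corrected "not found" test. `SketchNode` and `sketch_unlink` are hypothetical names for this sketch and the locking is left out; it is not the classTrack.c code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SketchNode {
    int64_t tag;
    struct SketchNode *next;
} SketchNode;

/* Unlinks and returns the node carrying `tag`, or NULL if no node matches. */
static SketchNode *sketch_unlink(SketchNode **head, int64_t tag)
{
    assert(head != NULL);
    SketchNode **link = head;            /* address of the pointer that refers to `node` */
    SketchNode *node = *link;
    while (node != NULL && node->tag != tag) {
        link = &node->next;
        node = *link;
    }
    /* The loop ends either at the matching node or at NULL, so a NULL test
     * is enough here. Writing `node != NULL || ...` instead would return
     * early on every hit and dereference NULL on every miss. */
    if (node == NULL) {
        return NULL;
    }
    *link = node->next;                  /* splice the node out of the chain */
    node->next = NULL;
    return node;
}
```

Keeping a pointer to the link rather than to the previous node means the head of the list needs no special case.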
>>>>>>>>>>
>>>>>>>>>>   There are several places in this file with the wrong indent:
>>>>>>>>>>  90     if (currentClassTag == -1) {
>>>>>>>>>>  91         // Class tracking not initialized, nobody's interested
>>>>>>>>>>  92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>  93         return;
>>>>>>>>>>  94     }
>>>>>>>>>>   ...
>>>>>>>>>> 152     if (currentClassTag == -1) {
>>>>>>>>>> 153         // Class tracking not initialized yet, nobody's interested
>>>>>>>>>> 154         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 155         return;
>>>>>>>>>> 156     }
>>>>>>>>>>   ...
>>>>>>>>>> 161     if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>> 162         EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>> 163     }
>>>>>>>>>> 164     if (tag != 0l) {
>>>>>>>>>> 165         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>> 166         return; // Already added
>>>>>>>>>> 167     }
>>>>>>>>>>   ...
>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>> 282 {
>>>>>>>>>> 283     char* sig = (char*)signatureVoid;
>>>>>>>>>> 284     jvmtiDeallocate(sig);
>>>>>>>>>> 285     return JNI_TRUE;
>>>>>>>>>> 286 }
>>>>>>>>>>   ...
>>>>>>>>>> 291 void
>>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>>> 293 {
>>>>>>>>>> 294     int idx;
>>>>>>>>>> 295     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>> 296
>>>>>>>>>> 297     for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>> 298         KlassNode* node = table[idx];
>>>>>>>>>> 299         while (node != NULL) {
>>>>>>>>>> 300             KlassNode* next = node->next;
>>>>>>>>>> 301             jvmtiDeallocate(node->signature);
>>>>>>>>>> 302             jvmtiDeallocate(node);
>>>>>>>>>> 303             node = next;
>>>>>>>>>> 304         }
>>>>>>>>>> 305     }
>>>>>>>>>> 306     jvmtiDeallocate(table);
>>>>>>>>>> 307
>>>>>>>>>> 308     bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>> 309     bagDestroyBag(deletedSignatureBag);
>>>>>>>>>> 310
>>>>>>>>>> 311     currentClassTag = -1;
>>>>>>>>>> 312
>>>>>>>>>> 313     (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>> 314     trackingEnv = NULL;
>>>>>>>>>> 315
>>>>>>>>>> 316     debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>
>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>  63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>>   The comma is not needed.
>>>>>>>>>>   Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>>>
>>>>>>>>>>  73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>>   Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>
>>>>>>>>>>  84 * Callback when classes are freed, Finds the signature and
>>>>>>>>>>     remembers it in deletedSignatureBag.
>>>>>>>>>>   It would be better to use words like "store" or "record", and
>>>>>>>>>>   "Find" should not start with a capital letter:
>>>>>>>>>>   Invoke the callback when classes are freed, find and record the
>>>>>>>>>>   signature in deletedSignatureBag.
>>>>>>>>>>
>>>>>>>>>>  96 // Find deleted KlassNode
>>>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>>>   Missed dot at the end.
>>>>>>>>>>
>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>   Conversely, the dot is not needed here, as the comment does not
>>>>>>>>>>   start with a capital letter.
>>>>>>>>>>
>>>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>   The comment above can be better. Maybe, something like:
>>>>>>>>>     "At this point, we found the KlassNode matching the klass
>>>>>>>>>     tag (and it is linked)."
>>>>>>>>>
>>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>>   Better: Record the signature of the unloaded class and
>>>>>>>>>   unlink it.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>> Hello all,
>>>>>>>>>>>
>>>>>>>>>>> Can I please get reviews of this change? In the meantime,
>>>>>>>>>>> we've done
>>>>>>>>>>> more testing and also field-/torture-testing by a customer
>>>>>>>>>>> who is
>>>>>>>>>>> happy
>>>>>>>>>>> now. :-)
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>
>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>>>>>> disconnect,
>>>>>>>>>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>>>>>>>>>> _activate() to ensure have those structures after re-connect.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>
>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  72 /*
>>>>>>>>>>>>>  73  * Lock to protect deletedSignatureBag
>>>>>>>>>>>>>  74  */
>>>>>>>>>>>>>  75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>>>  76
>>>>>>>>>>>>>  77 /*
>>>>>>>>>>>>>  78  * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>>>  79  * deletedTagLock,
>>>>>>>>>>>>>  80  */
>>>>>>>>>>>>>  81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>
>>>>>>>>>>>>>   The comments contradict each other.
>>>>>>>>>>>>>   I guess, the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>>   instead of deletedTagLock.
>>>>>>>>>>>>>   Also, comma at the end must be replaced with dot.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 101     // Tag not found? Ignore.
>>>>>>>>>>>>> 102     if (klass == NULL) {
>>>>>>>>>>>>> 103         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 104         return;
>>>>>>>>>>>>> 105     }
>>>>>>>>>>>>> 106
>>>>>>>>>>>>> 107     // Scan linked-list.
>>>>>>>>>>>>> 108     jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>> 109     while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>> 110         klass_ptr = &klass->next;
>>>>>>>>>>>>> 111         klass = *klass_ptr;
>>>>>>>>>>>>> 112         found_tag = klass->klass_tag;
>>>>>>>>>>>>> 113     }
>>>>>>>>>>>>> 114
>>>>>>>>>>>>> 115     // Tag not found? Ignore.
>>>>>>>>>>>>> 116     if (found_tag != tag) {
>>>>>>>>>>>>> 117         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 118         return;
>>>>>>>>>>>>> 119     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>   The code above can be simplified, so that the lines 101-105 are not
>>>>>>>>>>>>>   needed anymore.
>>>>>>>>>>>>>   It can be something like this:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     // Scan linked-list.
>>>>>>>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>         klass_ptr = &klass->next;
>>>>>>>>>>>>>         klass = *klass_ptr;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>         return;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> It will take more time when I get a chance to look at the
>>>>>>>>>>>>> rest.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>> Here comes an update that resolves some races that happen
>>>>>>>>>>>>>> when
>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>>>> lock on
>>>>>>>>>>>>>> basically every operation, and also need to check whether
>>>>>>>>>>>>>> or not
>>>>>>>>>>>>>> class-tracking is active and return an appropriate result
>>>>>>>>>>>>>> (e.g. an empty
>>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a
>>>>>>>>>>>>>>> tag, and we
>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>>>>>>>>>> indexed by tag % slot-count, and we simply prepend the new KlassNode*.
>>>>>>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the
>>>>>>>>>>>>>>> signature of
>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag.
>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>> KlassNode*
>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is
>>>>>>>>>>>>>>> ~O(1) operation
>>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase
>>>>>>>>>>>>>>> which hammered
>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths
>>>>>>>>>>>>>>> of like 2-3,
>>>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out that
>>>>>>>>>>>>>>> bag, and
>>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>>>>>>> leaking the
>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached
>>>>>>>>>>>>>>> and/or
>>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was
>>>>>>>>>>>>>>> missing
>>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>>>>>>> listener gets
>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when
>>>>>>>>>>>>>>> attaching a
>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be
>>>>>>>>>>>>>>> something
>>>>>>>>>>>>>>> to improve
>>>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>>>>>>>>> The bottleneck now is clearly the actual synthesizing of the class-unload
>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug agent asks for
>>>>>>>>>>>>>>> it?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing
>>>>>>>>>>>>>>>> for now.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a table
>>>>>>>>>>>>>>>>> of currently
>>>>>>>>>>>>>>>>> prepared classes by building that table when classTrack is
>>>>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded.
>>>>>>>>>>>>>>>>> When
>>>>>>>>>>>>>>>>> unloading
>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and
>>>>>>>>>>>>>>>>> compared with the
>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new
>>>>>>>>>>>>>>>>> table gets
>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned,
>>>>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated
>>>>>>>>>>>>>>>>> when there's an
>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of
>>>>>>>>>>>>>>>>>>> classTrack.c.
>>>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>
> 
> 

From magnus.ihse.bursie at oracle.com  Wed Mar 25 22:34:18 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Wed, 25 Mar 2020 23:34:18 +0100
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
Message-ID: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>

Hi everyone,

As a follow-up to the ongoing review for JDK-8241618, I have also looked 
at fixing the deprecation warnings in jdk.hotspot.agent. These fall in 
three broad categories:

* Deprecation of the boxing type constructors (e.g. "new Integer(42)").

* Deprecation of java.util.Observer and Observable.

* The rest (mostly Class.newInstance(), and a small number of other odd 
deprecations)

The first category is trivial to fix. The last category needs some 
special discussion. But the overwhelming majority of deprecation 
warnings come from the use of Observer and Observable. This really 
dwarfs anything else, and needs to be handled first, otherwise it's hard 
to even spot the other issues.

My analysis of the situation is that the deprecation of Observer and 
Observable seems a bit harsh, from the PoV of jdk.hotspot.agent. Sure, 
it might be limited, but I think it does exactly what is needed here. So 
the migration suggested in Observable (java.beans or 
java.util.concurrent) seems overkill. If there are genuine threading 
issues at play here, this assumption might be wrong, and then maybe 
going the j.u.c. route is correct.

But if that's not the case, the main goal should be to stay with the current 
implementation. One way to do this is to sprinkle the code with 
@SuppressWarnings. But I think a better way would be to just implement 
our own Observer and Observable. After all, the classes are trivial.

I've made a mock-up of this solution, where I just copied the 
java.util.Observer and Observable, and removed the deprecation 
annotations. The only thing needed for the rest of the code is to make 
sure we import these; I've done this for three arbitrarily selected 
classes just to show what the change would typically look like. Here's 
the mock-up:

http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
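
For illustration only, here is a minimal sketch of what such local copies 
could look like. It is not the code from the webrev above: the holder 
class ObserverSketch and the Counter subject are hypothetical names, and 
the behavior is a simplified version of java.util.Observable (for 
instance, observers are notified in registration order here, which the 
JDK class leaves unspecified).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder class; the nested names mirror java.util.Observer and
// java.util.Observable, but these are local, non-deprecated, simplified copies.
class ObserverSketch {

    interface Observer {
        void update(Observable o, Object arg);
    }

    static class Observable {
        private final List<Observer> observers = new ArrayList<>();
        private boolean changed = false;

        public synchronized void addObserver(Observer o) {
            if (o == null) {
                throw new NullPointerException();
            }
            if (!observers.contains(o)) {
                observers.add(o);
            }
        }

        public synchronized void deleteObserver(Observer o) {
            observers.remove(o);
        }

        protected synchronized void setChanged() {
            changed = true;
        }

        public synchronized boolean hasChanged() {
            return changed;
        }

        public void notifyObservers(Object arg) {
            Observer[] snapshot;
            synchronized (this) {
                if (!changed) {
                    return;
                }
                // Copy the list so observers run outside the lock.
                snapshot = observers.toArray(new Observer[0]);
                changed = false;
            }
            for (Observer o : snapshot) {
                o.update(this, arg);
            }
        }
    }

    // Hypothetical subject, only here to show how client code would use the copies.
    static class Counter extends Observable {
        private int value;

        void increment() {
            value++;
            setChanged();
            notifyObservers(value);
        }
    }
}
```

With copies along these lines, existing client classes would only need to 
change their imports; calls such as addObserver() and update() would stay 
as they are.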

Let me know what you think.

/Magnus

From leonid.mesnik at oracle.com  Wed Mar 25 22:36:04 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 25 Mar 2020 15:36:04 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <EE88E7B4-6A4D-46E1-BBB8-F4DAF2E027C6@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
 <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
 <EE88E7B4-6A4D-46E1-BBB8-F4DAF2E027C6@oracle.com>
Message-ID: <63ad8e28-fe30-53cb-93b3-85a4c09f919d@oracle.com>


On 3/25/20 2:58 PM, Igor Ignatyev wrote:
>>
>> Test ClhsdbJstack.java is updated.
>>
> now you reduced coverage provided by this test, I actually meant to 
> create a separate jtreg test description in this test and pass "Xcomp" 
> or "true" (or anything) as an argument to ClhsdbJstack, and use the 
> value of this argument to decide if -Xcomp should be added 
> to LingeredApp.startApp or not.

It seems I misinterpreted your words. Do you mean to change it to this? 
Basically the same as my original but faster and better prepared for 
"@run driver".

http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java.udiff.html
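
A rough sketch of the argument-driven approach described above, for 
illustration only. It is not the code in the webrev: XcompArgSketch and 
extraJvmOpts are hypothetical names, and the sketch only computes the 
extra option list rather than starting a LingeredApp.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the webrev.
class XcompArgSketch {

    // Decide, from the test's first argument, whether the debuggee gets -Xcomp.
    static List<String> extraJvmOpts(String[] args) {
        boolean withXcomp = args.length > 0 && Boolean.parseBoolean(args[0]);
        List<String> opts = new ArrayList<>();
        if (withXcomp) {
            opts.add("-Xcomp");
        }
        return opts;
    }

    public static void main(String[] args) {
        // In a real test the result would be passed on to LingeredApp.startApp(...).
        System.out.println("extra JVM options: " + extraJvmOpts(args));
    }
}
```

Under this assumption, the test file would carry two jtreg descriptions, 
one passing "true" and one passing "false" as the argument, so the test 
driver itself never runs with -Xcomp.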

Leonid

>
> Thanks,
> -- Igor
>
>
>> On Mar 25, 2020, at 2:31 PM, Leonid Mesnik <leonid.mesnik at oracle.com 
>> <mailto:leonid.mesnik at oracle.com>> wrote:
>>
>> Igor, Stefan, Ioi
>>
>> Thank you for your feedback.
>>
>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change @run 
>> main... to @run driver.
>>
>> Test ClhsdbJstack.java is updated.
>>
>> Still waiting for review from SVC team.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
>>
>> Leonid
>>
>> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>>> Hi Leonid,
>>>
>>> not related to your patch (but yet somewhat made more 
>>> obvious by it), it seems all (or at least almost all) the tests 
>>> which use LingeredApp should be run in "driver" mode as they just 
>>> orchestrate execution of other JVMs, so running them w/ main (let 
>>> alone main/othervm) just wastes time, 
>>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for 
>>> example, will now be executed w/ Xcomp which will make it very slow for 
>>> no reason. since you already got your hands dirty w/ these tests, 
>>> could you please file an RFE to sort this out and list all the 
>>> affected tests there?
>>>
>>> re: the patch, could you please update ClhsdbJstack.java test not to 
>>> be run w/ Xcomp and follow the same pattern you used in other tests 
>>> (e.g. ClhsdbScanOops) ? other than that it looks fine to me, I 
>>> however wouldn't be able to tell if all svc tests continue to do 
>>> what they were supposed to, so I'd prefer for someone from svc team 
>>> to chime in.
>>>
>>> Thanks,
>>> -- Igor
>>>
>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik 
>>>> <leonid.mesnik at oracle.com <mailto:leonid.mesnik at oracle.com>> wrote:
>>>>
>>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>>>
>>>> Please find new webrev: 
>>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>>>
>>>> Renamed startAppVmOpts/runAppVmOpts to 
>>>> startAppExactJvmOpts/runAppExactJvmOpts. It should make 
>>>> very clear that these methods don't use any of test.java.opts, 
>>>> test.vm.opts.
>>>>
>>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java 
>>>> mentioned by Igor, and removed null pointer check as Ioi suggested 
>>>> in startApp method.
>>>>
>>>> + public static void startApp(LingeredApp theApp, String... 
>>>> additionalJvmOpts) throws IOException {
>>>> + startAppExactJvmOpts(theApp, 
>>>> Utils.appendTestJavaOpts(additionalJvmOpts));
>>>> + }
>>>>
>>>> Leonid
>>>>
>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>>>> Hi Leonid,
>>>>>>
>>>>>> I have briefly looked at the patch, a few comments so far:
>>>>>>
>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>>>   - at L#114, could you please call static method using class 
>>>>>> name (as the opposite of using instance)? or was it meant to be 
>>>>>> theApp.runAppVmOpts(vmArgs) ?
>>>>>>
>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) 
>>>>>> isn't correct
>>>>>> - I don't like startAppVmOpts name, but unfortunately don't have 
>>>>>> a better suggestion (yet)
>>>>>
>>>>> I was going to say the same. Jtreg has the concept of "java 
>>>>> options" and "vm options". We have had a fair share of bugs and 
>>>>> wasted time when tests have been using the "vm options" part 
>>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away 
>>>>> from using that way to pass options. I recently cleaned up some of 
>>>>> this with:
>>>>>
>>>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>>>
>>>>> Because of this, I would prefer if we used a name that doesn't 
>>>>> include "VmOpts", because it's too alike the other concept. Some 
>>>>> suggestions:
>>>>>  startAppJavaOptions
>>>>>  startAppUsingJavaOptions
>>>>>  startAppWithJavaOptions
>>>>>  startAppExactJavaOptions
>>>>>  startAppJvmOptions
>>>>>
>>>>> Thanks,
>>>>> StefanK
>>>>>
>>>>>> Thanks,
>>>>>> -- Igor
>>>>>>
>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>>>>>> <leonid.mesnik at oracle.com> wrote:
>>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> Could you please review the following fix, which changes LingeredApp 
>>>>>>> to prepend vm options to java/vm.test.opts when startApp is used 
>>>>>>> and provides startAppVmOpts to override options completely.
>>>>>>>
>>>>>>> The intention is to avoid issues like the one in this bug, where 
>>>>>>> test/jtreg options were ignored by tests. Also I fixed some 
>>>>>>> tests where the intention was to append vm options rather than to 
>>>>>>> override them.
>>>>>>>
>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>
>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>
>>>>>>> Leonid
>>>>>>>
>>>>>
>>>
>

From igor.ignatyev at oracle.com  Wed Mar 25 22:42:41 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 25 Mar 2020 15:42:41 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <63ad8e28-fe30-53cb-93b3-85a4c09f919d@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
 <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
 <EE88E7B4-6A4D-46E1-BBB8-F4DAF2E027C6@oracle.com>
 <63ad8e28-fe30-53cb-93b3-85a4c09f919d@oracle.com>
Message-ID: <4F12C34D-4D61-4DAA-962C-2759EA83A6F9@oracle.com>


> On Mar 25, 2020, at 3:36 PM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
> 
> 
> 
> On 3/25/20 2:58 PM, Igor Ignatyev wrote:
>>> Test ClhsdbJstack.java is updated.
>>> 
>> now you reduced coverage provided by this test, I actually meant to create a separate jtreg test description in this test and pass "Xcomp" or "true" (or anything) as an argument to ClhsdbJstack, and use the value of this argument to decide if -Xcomp should be added to LingeredApp.startApp or not.
> It seems I misinterpreted your words. Do you mean to change it to this? Basically the same as my original but faster and better prepared for "@run driver".
> 
> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java.udiff.html <http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java.udiff.html>
> 
yeap.
> 
> Leonid
> 
>> 
>> Thanks,
>> -- Igor
>> 
>> 
>>> On Mar 25, 2020, at 2:31 PM, Leonid Mesnik <leonid.mesnik at oracle.com <mailto:leonid.mesnik at oracle.com>> wrote:
>>> 
>>> Igor, Stefan, Ioi
>>> 
>>> Thank you for your feedback.
>>> 
>>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 <https://bugs.openjdk.java.net/browse/JDK-8241624> To change @run main... to @run driver. 
>>> 
>>> Test ClhsdbJstack.java is updated.
>>> 
>>> Still waiting for review from SVC team.
>>> 
>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/ <http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/>
>>> Leonid
>>> 
>>> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>>>> Hi Leonid,
>>>> 
>>>> not related to your patch (but yet somewhat made more obvious by it), it seems all (or at least almost all) the tests which use LingeredApp should be run in "driver" mode as they just orchestrate execution of other JVMs, so running them w/ main (let alone main/othervm) just wastes time, test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for example, will now be executed w/ Xcomp which will make it very slow for no reason. since you already got your hands dirty w/ these tests, could you please file an RFE to sort this out and list all the affected tests there?
>>>> 
>>>> re: the patch, could you please update ClhsdbJstack.java test not to be run w/ Xcomp and follow the same pattern you used in other tests (e.g. ClhsdbScanOops) ? other than that it looks fine to me, I however wouldn't be able to tell if all svc tests continue to do what they were supposed to, so I'd prefer for someone from svc team to chime in.
>>>> 
>>>> Thanks,
>>>> -- Igor
>>>> 
>>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik <leonid.mesnik at oracle.com <mailto:leonid.mesnik at oracle.com>> wrote:
>>>>> 
>>>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>>>> 
>>>>> Please find new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/ <http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/>
>>>>> Renamed startAppVmOpts/runAppVmOpts to startAppExactJvmOpts/runAppExactJvmOpts. It should make very clear that these methods don't use any of test.java.opts, test.vm.opts.
>>>>> 
>>>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java mentioned by Igor, and removed null pointer check as Ioi suggested in startApp method. 
>>>>> 
>>>>> +    public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
>>>>> +        startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
>>>>> +    }
>>>>> 
>>>>> Leonid
>>>>> 
>>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote: 
>>>>>>> Hi Leonid, 
>>>>>>> 
>>>>>>> I have briefly looked at the patch, a few comments so far: 
>>>>>>> 
>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java: 
>>>>>>>   - at L#114, could you please call static method using class name (as the opposite of using instance)? or was it meant to be theApp.runAppVmOpts(vmArgs) ? 
>>>>>>> 
>>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java: 
>>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) isn't correct 
>>>>>>> - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet) 
>>>>>> 
>>>>>> I was going to say the same. Jtreg has the concept of "java options" and "vm options". We have had a fair share of bugs and wasted time when tests have been using the "vm options" part (VM_OPTIONS, test.vm.options, etc), and we've been moving away from using that way to pass options. I recently cleaned up some of this with: 
>>>>>> 
>>>>>> 8237111: LingeredApp should be started with getTestJavaOpts 
>>>>>> 
>>>>>> Because of this, I would prefer if we used a name that doesn't include "VmOpts", because it's too alike the other concept. Some suggestions: 
>>>>>>  startAppJavaOptions 
>>>>>>  startAppUsingJavaOptions 
>>>>>>  startAppWithJavaOptions 
>>>>>>  startAppExactJavaOptions 
>>>>>>  startAppJvmOptions 
>>>>>> 
>>>>>> Thanks, 
>>>>>> StefanK 
>>>>>> 
>>>>>>> Thanks, 
>>>>>>> -- Igor 
>>>>>>> 
>>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik <leonid.mesnik at oracle.com> <mailto:leonid.mesnik at oracle.com> wrote: 
>>>>>>>> 
>>>>>>>> Hi 
>>>>>>>> 
>>>>>>>> Could you please review the following fix, which changes LingeredApp to prepend vm options to java/vm.test.opts when startApp is used and provides startAppVmOpts to override options completely. 
>>>>>>>> 
>>>>>>>> The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. Also I fixed some tests where the intention was to append vm options rather than to override them. 
>>>>>>>> 
>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/ <http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/> 
>>>>>>>> 
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698 <https://bugs.openjdk.java.net/browse/JDK-8240698> 
>>>>>>>> 
>>>>>>>> Leonid 
>>>>>>>> 
>>>>>> 
>>>> 
>> 

From chris.plummer at oracle.com  Thu Mar 26 00:55:57 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Mar 2020 17:55:57 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <e67eefb0-fa41-1a22-e794-6c98fe255713@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
 <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
 <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
 <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>
 <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
 <e67eefb0-fa41-1a22-e794-6c98fe255713@redhat.com>
Message-ID: <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com>

Hi Roman,

It passed all my testing. I think before you push Serguei has a question 
regarding an issue you brought up a while back. You mentioned that you 
weren't getting some events, and suddenly started seeing them. We were 
discussing it today and it was unclear if this was an issue you were 
seeing before your changes, and your changes resolved it, or it was 
initially caused by an earlier version of your changes, and you later 
fixed it. We just want to better understand what this issue was and how 
it was fixed.

thanks,

Chris

On 3/25/20 3:22 PM, Roman Kennke wrote:
> The new job finished, its ID is:
>
>   mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>
> Thank you,
> Roman
>
>
>> Yes, please submit a new job. I'll start my testing once I see that the
>> builds are done.
>>
>> Chris
>>
>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>> Hi Chris,
>>>
>>> Apparently we can get into classTrack_reset() before calling activate(),
>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>>> the cleaning routine fixes the problem for me.
>>>
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>
>>> Should I post another submit-repo job with that fix?
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>>> Hi Roman,
>>>>
>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>
>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],
>>>> sp=0x00007fbb791f8af0,  free space=1022k
>>>> Native frames: (J=compiled Java code, A=aot compiled Java code,
>>>> j=interpreted, Vv=VM code, C=native code)
>>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>>
>>>>
>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>>> doesn't seem to be anything magic on the command line that might be
>>>> triggering. Pretty much I see it with all the various VM configs we
>>>> test.
>>>>
>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>
>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>>
>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>>
>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>>
>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>>
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>>
>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>> Hi Chris,
>>>>>
>>>>>> Regarding the new assert:
>>>>>>
>>>>>>     if (gdata && gdata->assertOn) {
>>>>>>         // Check this is not already tagged.
>>>>>>         jlong tag;
>>>>>>         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>>         if (error != JVMTI_ERROR_NONE) {
>>>>>>             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>         }
>>>>>>         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>     }
>>>>>>
>>>>>> I think you should remove the gdata check. gdata should never be NULL
>>>>>> when you get to this code. If it is ever NULL then there's a bug, and
>>>>>> the check will hide the bug.
>>>>> Ok, will remove this.
>>>>>
>>>>>> Regarding testing, after you do the submit repo testing let me know
>>>>>> the
>>>>>> jobID and I'll do additional testing on it.
>>>>> I did the submit repo earlier today, and it came back green:
>>>>>
>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>> Hi Sergei,
>>>>>>>
>>>>>>>> The fix looks pretty clean now.
>>>>>>>> I also like new name of the lock.:)
>>>>>>> Thank you!
>>>>>>>
>>>>>>>> Just one comment below.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>     if (tag != 0l) {
>>>>>>>>         return; // Already added
>>>>>>>>     }
>>>>>>>>
>>>>>>>>    It is better to use a named constant or macro instead.
>>>>>>>>    Also, it'd be nice to add a short comment about what this value is.
>>>>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>>>>> assert. I also made a constant for the value 0, which should be
>>>>>>> pretty
>>>>>>> much self-explaining.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>>
>>>>>>>> How do you test the fix?
>>>>>>> I am using a manual test that is provided in this bug report:
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>
>>>>>>> "Script to compare performance of GC with and without debugger, when
>>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>>
>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>>
>>>>>>> I am also using this test and manually attach/detach jdb a couple of
>>>>>>> times in a row to check that disconnecting and reconnecting works
>>>>>>> well
>>>>>>> (this tended to deadlock or crash with an earlier version of the
>>>>>>> patch,
>>>>>>> and is now looking good).
>>>>>>>
>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we
>>>>>>> all
>>>>>>> agree that the fix is reasonable, I will push it to the submit
>>>>>>> repo. I
>>>>>>> am not sure if any of those tests actually exercise that code,
>>>>>>> though.
>>>>>>> Let me know if you want me to run any specific tests.
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>>> I believe I came up with a much simpler solution that also
>>>>>>>>> solves the
>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>
>>>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>>>>>> explicitly
>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a
>>>>>>>>> pointer
>>>>>>>>> to the signature of a class into the tag, and pull it out again
>>>>>>>>> when we
>>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>>
>>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning
>>>>>>>>> of all
>>>>>>>>> classes needed (as in the current implementation) and no
>>>>>>>>> searching of
>>>>>>>>> table needed (like in my previous attempts).
>>>>>>>>>
>>>>>>>>> Please review this new revision:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>>>
>>>>>>>>> (Notice that there still appears to be a performance bottleneck
>>>>>>>>> with
>>>>>>>>> class-unloading when an actual debugger is attached. This
>>>>>>>>> doesn't seem
>>>>>>>>> to be related to the classTrack.c implementation though, but
>>>>>>>>> looks like
>>>>>>>>> a consequence of getting all those class-unload notifications
>>>>>>>>> over the
>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>>>>> buffers.)
>>>>>>>>>
>>>>>>>>> I am not sure why jdb needs to enable class-unload listener
>>>>>>>>> always. A
>>>>>>>>> simple hack disables it, and performance is brilliant, even when
>>>>>>>>> jdb is
>>>>>>>>> attached:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>>>>
>>>>>>>>> But this is not in the scope of this bug.
>>>>>>>>>
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>>>
>>>>>>>>>>> Some comments are below.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>> 88 {
>>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 93 return;
>>>>>>>>>>> 94 }
>>>>>>>>>>> Just a question:
>>>>>>>>>>> Q1: Should the ObjectFree events be disabled for the jvmtiEnv
>>>>>>>>>>> that does the class tracking if class tracking has not been
>>>>>>>>>>> initialized?
>>>>>>>>>>>
>>>>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>>>>> I wonder whether a better name would be something like
>>>>>>>>>>> lastClassTag or highestClassTag.
>>>>>>>>>>>
>>>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>>>> 100
>>>>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>>>>> 105 }
>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 108 return;
>>>>>>>>>>> 109 }
>>>>>>>>>>> It seems to me that something is wrong in the condition at L106
>>>>>>>>>>> above. Should it be:
>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>
>>>>>>>>>>> Otherwise, how can the second check ever work correctly, as the
>>>>>>>>>>> return will always happen when (klass != NULL)?
>>>>>>>>>>>
>>>>>>>>>>> There are several places in this file with the indent:
>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 93 return;
>>>>>>>>>>> 94 }
>>>>>>>>>>> ...
>>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 155 return;
>>>>>>>>>>> 156 }
>>>>>>>>>>> ...
>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>> 163 }
>>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>> 167 }
>>>>>>>>>>> ...
>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>> 282 {
>>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>> 286 }
>>>>>>>>>>> ...
>>>>>>>>>>> 291 void
>>>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>>>> 293 {
>>>>>>>>>>> 294 int idx;
>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>> 296
>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>>> 303 node = next;
>>>>>>>>>>> 304 }
>>>>>>>>>>> 305 }
>>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>>> 307
>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>> 310
>>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>>> 312
>>>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>>> 315
>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>
>>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for
>>>>>>>>>>> class-unloads
>>>>>>>>>>> The comma is not needed.
>>>>>>>>>>> Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>>>>>>>>>>> consistent
>>>>>>>>>>> Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>>
>>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>>>>>>> remembers it in deletedSignatureBag.
>>>>>>>>>>> It would be better to use words like "store" or "record", and
>>>>>>>>>>> "Find" should not start with a capital letter:
>>>>>>>>>>> Invoke the callback when classes are freed, find and record the
>>>>>>>>>>> signature in deletedSignatureBag.
>>>>>>>>>>>
>>>>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>>>> These are missing a period at the end.
>>>>>>>>>>>
>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>> Conversely, the period is not needed here, as the comment does
>>>>>>>>>>> not start with a capital letter.
>>>>>>>>>>>
>>>>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>> The comment above can be better. Maybe something like:
>>>>>>>>>> "At this point, we found the KlassNode matching the klass tag
>>>>>>>>>> (and it is linked)."
>>>>>>>>>>
>>>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>>> Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> Can I please get reviews of this change? In the meantime,
>>>>>>>>>>>> we've done
>>>>>>>>>>>> more testing and also field-/torture-testing by a customer
>>>>>>>>>>>> who is
>>>>>>>>>>>> happy
>>>>>>>>>>>> now. :-)
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>>
>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>>>>>>> disconnect,
>>>>>>>>>>>>> namely moving the setup of the trackingEnv and deletedSignatureBag to
>>>>>>>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 72 /*
>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>>>>> 74 */
>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>>>> 76
>>>>>>>>>>>>>> 77 /*
>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>>>>> 80 */
>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The comments contradict each other.
>>>>>>>>>>>>>> I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>>> instead of deletedTagLock.
>>>>>>>>>>>>>> Also, the comma at the end must be replaced with a period.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 104 return;
>>>>>>>>>>>>>> 105 }
>>>>>>>>>>>>>> 106
>>>>>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>>>>>> 113 }
>>>>>>>>>>>>>> 114
>>>>>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 118 return;
>>>>>>>>>>>>>> 119 }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The code above can be simplified, so that lines 101-105 are no
>>>>>>>>>>>>>> longer needed.
>>>>>>>>>>>>>> It can be something like this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> // Scan linked-list.
>>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>>     klass_ptr = &klass->next;
>>>>>>>>>>>>>>     klass = *klass_ptr;
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>>     debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>     return;
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It will take more time before I get a chance to look at the
>>>>>>>>>>>>>> rest.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen
>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>>>>> lock on
>>>>>>>>>>>>>>> basically every operation, and also need to check whether
>>>>>>>>>>>>>>> or not
>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result
>>>>>>>>>>>>>>> (e.g. an empty
>>>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a
>>>>>>>>>>>>>>>> tag, and we
>>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table,
>>>>>>>>>>>>>>>> with each entry being the head of a linked list of KlassNode*.
>>>>>>>>>>>>>>>> The table is indexed by tag % slot-count, and we then simply
>>>>>>>>>>>>>>>> prepend the new KlassNode*. This is an O(1) operation.
>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the
>>>>>>>>>>>>>>>> signature of
>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag.
>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>> KlassNode*
>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is a
>>>>>>>>>>>>>>>> ~O(1) operation too, depending on the depth of the table. In my
>>>>>>>>>>>>>>>> testcase, which hammered the code with class-loads and unloads,
>>>>>>>>>>>>>>>> I usually see depths of 2-3, rarely more. It should be ok.
>>>>>>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that
>>>>>>>>>>>>>>>> bag, and
>>>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>>>>>>>> leaking the
>>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets detached
>>>>>>>>>>>>>>>> and/or
>>>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was
>>>>>>>>>>>>>>>> missing
>>>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>>>>>>>> listener gets
>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when
>>>>>>>>>>>>>>>> attaching a
>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be
>>>>>>>>>>>>>>>> something
>>>>>>>>>>>>>>>> to improve
>>>>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks
>>>>>>>>>>>>>>>> really good.
>>>>>>>>>>>>>>>> The bottleneck now is clearly the actual synthesizing of the
>>>>>>>>>>>>>>>> class-unload events. I don't see how this can be helped when
>>>>>>>>>>>>>>>> the debug agent asks for it?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing
>>>>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing
>>>>>>>>>>>>>>>>> for now.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a
>>>>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a table
>>>>>>>>>>>>>>>>>> of currently prepared classes: it builds that table when
>>>>>>>>>>>>>>>>>> classTrack is initialized, and then adds new classes
>>>>>>>>>>>>>>>>>> whenever a class gets loaded. When unloading occurs, that
>>>>>>>>>>>>>>>>>> cache is rebuilt into a new table and compared with the old
>>>>>>>>>>>>>>>>>> table, and whatever is in the old table but not in the new
>>>>>>>>>>>>>>>>>> one gets returned. The problem is that when GCs happen
>>>>>>>>>>>>>>>>>> frequently and/or many classes get loaded and unloaded, this
>>>>>>>>>>>>>>>>>> amounts to O(classCount*gcCount) complexity.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned,
>>>>>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here
>>>>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload,
>>>>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see
>>>>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated
>>>>>>>>>>>>>>>>>> when there's an
>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of
>>>>>>>>>>>>>>>>>>>> classTrack.c.
>>>>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an
>>>>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>
>>


From rkennke at redhat.com  Thu Mar 26 08:44:38 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 26 Mar 2020 09:44:38 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
 <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
 <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
 <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>
 <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
 <e67eefb0-fa41-1a22-e794-6c98fe255713@redhat.com>
 <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com>
Message-ID: <f65dcdbb-465f-d1f5-2ecf-a293fa58b624@redhat.com>

That was in the previous implementation: I got a condition wrong in the
table lookup (as noted by Serguei), and this prevented any
class-unload-events from getting out. I have fixed this, but found other
problems in that implementation (deadlocks and a crash).
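
For illustration, the lookup in that earlier table-based revision can be
sketched roughly as follows. This is a minimal, self-contained sketch and
not the code from the webrev: KlassNode, klass_tag, signature, next and
CT_SLOT_COUNT follow the names quoted in the review, while slot_of(),
add_node(), unlink_by_tag(), lookup_demo() and the bucket count of 8 are
assumptions made up for the example. The point is the test after the scan:
the node is missing exactly when klass == NULL, so the early return has to
check ==, not !=.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CT_SLOT_COUNT 8 /* assumed bucket count, for illustration only */

typedef struct KlassNode {
    int64_t klass_tag;       /* stands in for JVMTI's jlong */
    const char *signature;   /* not owned by the node in this sketch */
    struct KlassNode *next;
} KlassNode;

/* Hypothetical helper: map a tag to its bucket index. */
static size_t slot_of(int64_t tag) {
    return (size_t)((uint64_t)tag % CT_SLOT_COUNT);
}

/* Prepend a node to the head of its bucket: O(1). */
static void add_node(KlassNode **table, int64_t tag, const char *signature) {
    KlassNode *node = malloc(sizeof *node);
    assert(node != NULL);
    node->klass_tag = tag;
    node->signature = signature;
    node->next = table[slot_of(tag)];
    table[slot_of(tag)] = node;
}

/* Unlink the node carrying `tag`; return its signature, or NULL if absent. */
static const char *unlink_by_tag(KlassNode **table, int64_t tag) {
    KlassNode **klass_ptr = &table[slot_of(tag)];
    KlassNode *klass = *klass_ptr;
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) { /* klass not found - ignore */
        return NULL;
    }
    const char *signature = klass->signature;
    *klass_ptr = klass->next; /* unlink from the bucket's chain */
    free(klass);
    return signature;
}

/* Small usage example; returns 1 when every lookup behaves as expected. */
static int lookup_demo(void) {
    static const char sig_a[] = "LA;";
    static const char sig_b[] = "LB;";
    KlassNode *table[CT_SLOT_COUNT] = {0};
    add_node(table, 1, sig_a);
    add_node(table, 9, sig_b); /* 9 % 8 == 1: same bucket as tag 1 */
    return unlink_by_tag(table, 1) == sig_a
        && unlink_by_tag(table, 1) == NULL
        && unlink_by_tag(table, 9) == sig_b
        && unlink_by_tag(table, 42) == NULL;
}
```

With the inverted test (klass != NULL), every successful lookup would
return early, which matches the symptom of no unload events getting out.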

The current implementation has none of these problems: we don't need
table lookups, we simply pass the signatures through, and locking is
much simpler. In particular, we don't need a lock around the JVMTI
call (SetTag), which was the cause of the deadlock.
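
The tag-as-pointer idea can be sketched without JVMTI at all. The
following is only an illustration under stated assumptions, not the code
in the webrev: tag_t stands in for jlong, on_object_free() stands in for
an ObjectFree callback, the fixed-size deleted[] array stands in for
deletedSignatureBag, and signature_to_tag(), tag_demo() and MAX_DELETED
are hypothetical names. In the real agent the tag would be handed to
SetTag and come back as the callback's tag argument.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int64_t tag_t;          /* stands in for JVMTI's jlong */
#define NOT_TAGGED ((tag_t)0)   /* an untagged object reports tag 0 */
#define MAX_DELETED 16          /* assumed capacity of the stand-in bag */

/* The pointer must fit into the 64-bit tag for the round trip to work. */
_Static_assert(sizeof(intptr_t) <= sizeof(tag_t), "pointer must fit in tag");

static char *deleted[MAX_DELETED]; /* stand-in for deletedSignatureBag */
static int deleted_count = 0;

/* Copy the signature and encode the address of the copy as the tag. */
static tag_t signature_to_tag(const char *signature) {
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    assert(copy != NULL);
    memcpy(copy, signature, len);
    return (tag_t)(intptr_t)copy;
}

/* Stand-in for the unload callback: the tag *is* the signature pointer,
 * so no table lookup is needed to recover it. */
static void on_object_free(tag_t tag) {
    assert(tag != NOT_TAGGED);
    assert(deleted_count < MAX_DELETED);
    deleted[deleted_count++] = (char *)(intptr_t)tag;
}

/* Small usage example; returns 1 when the signature survives the trip. */
static int tag_demo(void) {
    tag_t tag = signature_to_tag("Lcom/example/Foo;");
    on_object_free(tag);
    int ok = deleted_count == 1
          && strcmp(deleted[0], "Lcom/example/Foo;") == 0;
    free(deleted[0]); /* the consumer of the bag owns the strings */
    deleted_count = 0;
    return ok;
}
```

Because a valid heap pointer is never 0, a tag built this way cannot be
confused with the untagged value.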

Does that answer your questions?

Thanks,
Roman

> Hi Roman,
> 
> It passed all my testing. I think before you push Serguei has a question
> regarding an issue you brought up a while back. You mentioned that you
> weren't getting some events, and suddenly started seeing them. We were
> discussing it today and it was unclear if this was an issue you were
> seeing before your changes, and your changes resolved it, or it was
> initially caused by an earlier version of your changes, and you later
> fixed it. We just want to better understand what this issue was and how
> it was fixed.
> 
> thanks,
> 
> Chris
> 
> On 3/25/20 3:22 PM, Roman Kennke wrote:
>> The new job finished, its ID is:
>>
>>   mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>>
>> Thank you,
>> Roman
>>
>>
>>> Yes, please submit a new job. I'll start my testing once I see that the
>>> builds are done.
>>>
>>> Chris
>>>
>>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>>> Hi Chris,
>>>>
>>>> Apparently we can get into classTrack_reset() before calling
>>>> activate(),
>>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>>>> the cleaning routine fixes the problem for me.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>>
>>>> Should I post another submit-repo job with that fix?
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>
>>>>> Hi Roman,
>>>>>
>>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>>
>>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],  sp=0x00007fbb791f8af0,  free space=1022k
>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>>>
>>>>>
>>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>>>> doesn't seem to be anything magic on the command line that might be
>>>>> triggering it. I see it with pretty much all the various VM configs
>>>>> we test.
>>>>>
>>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>>
>>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>>> Regarding the new assert:
>>>>>>>
>>>>>>>     if (gdata && gdata->assertOn) {
>>>>>>>         // Check this is not already tagged.
>>>>>>>         jlong tag;
>>>>>>>         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>>>         if (error != JVMTI_ERROR_NONE) {
>>>>>>>             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>         }
>>>>>>>         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>>     }
>>>>>>>
>>>>>>> I think you should remove the gdata check. gdata should never be
>>>>>>> NULL
>>>>>>> when you get to this code. If it is ever NULL then there's a bug,
>>>>>>> and
>>>>>>> the check will hide the bug.
>>>>>> Ok, will remove this.
>>>>>>
>>>>>>> Regarding testing, after you do the submit repo testing let me know
>>>>>>> the
>>>>>>> jobID and I'll do additional testing on it.
>>>>>> I did the submit repo earlier today, and it came back green:
>>>>>>
>>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>>> The fix looks pretty clean now.
>>>>>>>>> I also like the new name of the lock. :)
>>>>>>>> Thank you!
>>>>>>>>
>>>>>>>>> Just one comment below.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 110 if (tag != 0l) {
>>>>>>>>> 111 return; // Already added
>>>>>>>>> 112 }
>>>>>>>>>
>>>>>>>>> It is better to use a named constant or macro instead.
>>>>>>>>> Also, it'd be nice to add a short comment about what this value is.
>>>>>>>> As I replied to Chris earlier, this whole block can be turned
>>>>>>>> into an
>>>>>>>> assert. I also made a constant for the value 0, which should be
>>>>>>>> pretty
>>>>>>>> much self-explaining.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>>>
>>>>>>>>> How do you test the fix?
>>>>>>>> I am using a manual test that is provided in this bug report:
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>
>>>>>>>> "Script to compare performance of GC with and without debugger,
>>>>>>>> when
>>>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>>>
>>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>>>
>>>>>>>> I am also using this test and manually attach/detach jdb a
>>>>>>>> couple of
>>>>>>>> times in a row to check that disconnecting and reconnecting works
>>>>>>>> well
>>>>>>>> (this tended to deadlock or crash with an earlier version of the
>>>>>>>> patch,
>>>>>>>> and is now looking good).
>>>>>>>>
>>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we
>>>>>>>> all
>>>>>>>> agree that the fix is reasonable, I will push it to the submit
>>>>>>>> repo. I
>>>>>>>> am not sure if any of those tests actually exercise that code,
>>>>>>>> though.
>>>>>>>> Let me know if you want me to run any specific tests.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>>>> I believe I came up with a much simpler solution that also
>>>>>>>>>> solves the
>>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>>
>>>>>>>>>> It turns out that we can take advantage of the fact that we
>>>>>>>>>> can use
>>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>>>>>>> explicitly
>>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a
>>>>>>>>>> pointer
>>>>>>>>>> to the signature of a class into the tag, and pull it out again
>>>>>>>>>> when we
>>>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>>>
>>>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>>>> classes and signatures, and it also makes the story around
>>>>>>>>>> locking
>>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning
>>>>>>>>>> of all
>>>>>>>>>> classes needed (as in the current implementation) and no
>>>>>>>>>> searching of
>>>>>>>>>> table needed (like in my previous attempts).
>>>>>>>>>>
>>>>>>>>>> Please review this new revision:
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>>>>
>>>>>>>>>> (Notice that there still appears to be a performance bottleneck
>>>>>>>>>> with
>>>>>>>>>> class-unloading when an actual debugger is attached. This
>>>>>>>>>> doesn't seem
>>>>>>>>>> to be related to the classTrack.c implementation though, but
>>>>>>>>>> looks like
>>>>>>>>>> a consequence of getting all those class-unload notifications
>>>>>>>>>> over the
>>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging
>>>>>>>>>> up the
>>>>>>>>>> buffers.)
>>>>>>>>>>
>>>>>>>>>> I am not sure why jdb needs to enable class-unload listener
>>>>>>>>>> always. A
>>>>>>>>>> simple hack disables it, and performance is brilliant, even when
>>>>>>>>>> jdb is
>>>>>>>>>> attached:
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> But this is not in the scope of this bug.)
>>>>>>>>>>
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>>>>
>>>>>>>>>>>> Some comments are below.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>>> 88 {
>>>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 93 return;
>>>>>>>>>>>> 94 }
>>>>>>>>>>>> Just a question:
>>>>>>>>>>>> Q1: Should the ObjectFree events be disabled for the jvmtiEnv
>>>>>>>>>>>> that does the class tracking if class tracking has not been
>>>>>>>>>>>> initialized?
>>>>>>>>>>>>
>>>>>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>>>>>> I wonder whether a better name would be something like
>>>>>>>>>>>> lastClassTag or highestClassTag.
>>>>>>>>>>>>
>>>>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>>>>> 100
>>>>>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>>>>>> 105 }
>>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 108 return;
>>>>>>>>>>>> 109 }
>>>>>>>>>>>> It seems to me that something is wrong in the condition at L106
>>>>>>>>>>>> above. Should it be:
>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>>
>>>>>>>>>>>> Otherwise, how can the second check ever work correctly, as the
>>>>>>>>>>>> return will always happen when (klass != NULL)?
>>>>>>>>>>>>
>>>>>>>>>>>> There are several places in this file with the indent:
>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 93 return;
>>>>>>>>>>>> 94 }
>>>>>>>>>>>> ...
>>>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 155 return;
>>>>>>>>>>>> 156 }
>>>>>>>>>>>> ...
>>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>>> 163 }
>>>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>>> 167 }
>>>>>>>>>>>> ...
>>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>>> 282 {
>>>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>>> 286 }
>>>>>>>>>>>> ...
>>>>>>>>>>>> 291 void
>>>>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>>>>> 293 {
>>>>>>>>>>>> 294 int idx;
>>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>> 296
>>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>>>> 303 node = next;
>>>>>>>>>>>> 304 }
>>>>>>>>>>>> 305 }
>>>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>>>> 307
>>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>>> 310
>>>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>>>> 312
>>>>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>>>> 315
>>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>
>>>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>>>
>>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>>>>     The comma is not needed.
>>>>>>>>>>>>     Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>>
>>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>>>>     Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>>>
>>>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>>>>>>>> remembers it in deletedSignatureBag.
>>>>>>>>>>>>     Would be better to use words like "store" or "record", "Find" should
>>>>>>>>>>>>     not start from capital letter:
>>>>>>>>>>>>       Invoke the callback when classes are freed, find and record the
>>>>>>>>>>>>       signature in deletedSignatureBag.
>>>>>>>>>>>>
>>>>>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>>>>>     Missed dot at the end.
>>>>>>>>>>>>
>>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>     In opposite, dot is not needed as the comment does not start from a
>>>>>>>>>>>>     capital letter.
>>>>>>>>>>>>
>>>>>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>>>     The comment above can be better. Maybe, something like:
>>>>>>>>>>>       "At this point, we found the KlassNode matching the klass tag
>>>>>>>>>>>       (and it is linked)."
>>>>>>>>>>>
>>>>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>>>>     Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>>>>>>>> more testing and also field-/torture-testing by a customer who is happy
>>>>>>>>>>>>> now. :-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>>>>>>>>>>>> namely moving the setup of the trackingEnv and deletedSignatureBag to
>>>>>>>>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 72 /*
>>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>>>>>> 74 */
>>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>>>>> 76
>>>>>>>>>>>>>>> 77 /*
>>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>>>>>> 80 */
>>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     The comments contradict each other.
>>>>>>>>>>>>>>>     I guess, the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>>>>     instead of deletedTagLock.
>>>>>>>>>>>>>>>     Also, comma at the end must be replaced with dot.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>> 104 return;
>>>>>>>>>>>>>>> 105 }
>>>>>>>>>>>>>>> 106
>>>>>>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>>>>>>> 113 }
>>>>>>>>>>>>>>> 114
>>>>>>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>> 118 return;
>>>>>>>>>>>>>>> 119 }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     The code above can be simplified, so that the lines 101-105 are not
>>>>>>>>>>>>>>>     needed anymore.
>>>>>>>>>>>>>>>     It can be something like this:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     // Scan linked-list.
>>>>>>>>>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>>>         klass_ptr = &klass->next;
>>>>>>>>>>>>>>>         klass = *klass_ptr;
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>         return;
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It will take more time when I get a chance to look at the
>>>>>>>>>>>>>>> rest.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>>>>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>>>>>>>>>> set up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table, with
>>>>>>>>>>>>>>>>> each entry being the head of a linked list of KlassNode*. The table is
>>>>>>>>>>>>>>>>> indexed by tag % slot-count, and we then simply prepend the new
>>>>>>>>>>>>>>>>> KlassNode*. This is an O(1) operation.
>>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is a ~O(1)
>>>>>>>>>>>>>>>>> operation too, depending on the depth of the table. In my testcase, which
>>>>>>>>>>>>>>>>> hammered the code with class-loads and unloads, I usually see depths of
>>>>>>>>>>>>>>>>> like 2-3, but not usually more. It should be ok.
>>>>>>>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that bag, and
>>>>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>>>>> - I also added cleanup code in classTrack_reset() to avoid leaking the
>>>>>>>>>>>>>>>>> signatures and KlassNode* etc. when the debug agent gets detached and/or
>>>>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>>>>> - I also added locks around data-structure manipulation (was missing
>>>>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>>>>>
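To make the table scheme in the list just above easier to picture, here is a minimal sketch of the bookkeeping. It is written in Java purely for illustration (the agent code under review is C), and every name in it (ClassTrackSketch, SLOT_COUNT, addPreparedClass, objectFree) as well as the slot count and the locally generated tags are assumptions, not taken from the webrev. It models only the data structure, not the JVMTI tagging or callbacks.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; all names and the slot count are assumptions.
final class ClassTrackSketch {
    static final int SLOT_COUNT = 263; // hypothetical table size

    // One tracked class: its tag, its signature, and the next node in the bucket.
    static final class KlassNode {
        final long tag;
        final String signature;
        KlassNode next;

        KlassNode(long tag, String signature, KlassNode next) {
            this.tag = tag;
            this.signature = signature;
            this.next = next;
        }
    }

    private final KlassNode[] table = new KlassNode[SLOT_COUNT];
    private List<String> deletedSignatures = new ArrayList<>();
    private long nextTag = 1;

    // The table is indexed by tag % slot-count.
    private static int slotOf(long tag) {
        return (int) Math.floorMod(tag, (long) SLOT_COUNT);
    }

    // Class prepared: assign a tag and prepend to the bucket chain, O(1).
    synchronized long addPreparedClass(String signature) {
        long tag = nextTag++;
        int slot = slotOf(tag);
        table[slot] = new KlassNode(tag, signature, table[slot]);
        return tag;
    }

    // Class unloaded: find the node by tag, record its signature, unlink it.
    synchronized void objectFree(long tag) {
        int slot = slotOf(tag);
        KlassNode prev = null;
        KlassNode node = table[slot];
        while (node != null && node.tag != tag) {
            prev = node;
            node = node.next;
        }
        if (node == null) {
            return; // tag not found - ignore
        }
        deletedSignatures.add(node.signature);
        if (prev == null) {
            table[slot] = node.next;
        } else {
            prev.next = node.next;
        }
    }

    // Hand out the collected signatures and start a fresh collection.
    synchronized List<String> processUnloads() {
        List<String> result = deletedSignatures;
        deletedSignatures = new ArrayList<>();
        return result;
    }

    public static void main(String[] args) {
        ClassTrackSketch tracker = new ClassTrackSketch();
        long a = tracker.addPreparedClass("LFoo;");
        tracker.addPreparedClass("LBar;");
        tracker.objectFree(a);
        System.out.println(tracker.processUnloads());
    }
}
```

The cost of objectFree() depends only on the length of one bucket chain, which is why the description calls it ~O(1) rather than strictly O(1).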
>>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>>>>>>>>>>> The bottleneck now is clearly the actual synthesizing of the class-unload
>>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>>>>>>>>>>>> prepared classes, building that table when classTrack is initialized
>>>>>>>>>>>>>>>>>>> and then adding new classes whenever a class gets loaded. When unloading
>>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table and compared with the
>>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>>>>>>>>>>>> prepared-classes-list) and their signatures put in the list that gets
>>>>>>>>>>>>>>>>>>> returned.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> E.g. with the testcase provided here:
>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>
>>>
> 
> 


From daniil.x.titov at oracle.com  Thu Mar 26 13:56:22 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 26 Mar 2020 06:56:22 -0700
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI
 connector port
In-Reply-To: <3fe390e9-39d1-c547-9480-fa1962cef0d8@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
 <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>
 <D28BF049-D293-4F65-93EC-BCAE4F09B413@oracle.com>
 <a3b5ac77-b6ee-4927-cf99-b586c6bbeae6@oss.nttdata.com>
 <41392FC2-AC78-4E94-878B-907DDFB3E968@oracle.com>
 <c858b94c-0091-8d29-eb0b-145782984d86@oss.nttdata.com>
 <78EA1B33-1B0D-4969-AFC5-E73D8F82E6FA@oracle.com>
 <c00b118a-8619-3984-10d7-63134d7210a0@oss.nttdata.com>
 <FBCAD683-2A99-415D-8926-8AEF560EE55A@oracle.com>
 <9c888cc4-f9f8-8f64-f90b-a949007bc1dc@oracle.com>
 <202C5C21-BA69-4ACF-9421-A9B5D6704C8C@oracle.com>
 <3fe390e9-39d1-c547-9480-fa1962cef0d8@oracle.com>
Message-ID: <BBD463F2-DB3B-4E99-90B2-E46900956E54@oracle.com>

Hi Yasumasa and Serguei,

Thank you for reviewing this change.

Best regards,
--Daniil

On 3/25/20, 1:01 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:

    Hi Daniil,
    
    On 3/24/20 10:00, Daniil Titov wrote:
    > Hi Serguei,
    >
    >>     It looks like you removed the last call site of DebugServer.main.
    > Yes. It is correct.
    >
    >>     Do we need to remove the DebugServer.java as well?
    > I was considering this but since it is a public class I think it needs to be deprecated first. I also think that it would be better to do it in a separate issue
    > since a CSR for deprecation needs to be filed for that. If you agree I will create a new issue for that.
    
    I'm okay to separate this.
    
    Thanks,
    Serguei
    
    >
    > Thanks,
    > Daniil
    >
    >
    > On 3/23/20, 11:56 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
    >
    >      Hi Daniil,
    >      
    >      It looks pretty good in general.
    >      
    >      It looks like you removed the last call site of DebugServer.main.
    >      Do we need to remove the DebugServer.java as well?
    >      
    >      Thanks,
    >      Serguei
    >      
    >      
    >      On 3/22/20 15:29, Daniil Titov wrote:
    >      > Hi Yasumasa, Serguei and Alex,
    >      >
    >      > Please review a new version of the webrev that merges SADebugDTest.java with changes done in [2].
    >      >
    >      > Also the CSR [3] and the help message for debug server in SALauncher.java were updated to specify that the '--hostname'
    >      > option could be a hostname or an IPv4/IPv6 address.
    >      >
    >      >   >  Ok, but I think it might be more simply with TestLibrary.
    >      >   >   For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
    >      >
    >      > TestLibrary::getUnusedRandomPort() doesn't allow specifying which ports are reserved, and it uses a hardcoded port range [FIXED_PORT_MIN, FIXED_PORT_MAX] as reserved ports. Besides, the test/jdk/java/rmi/testlibrary/TestLibrary.java class cannot be directly used in test/hotspot/jtreg/serviceability/* tests (it doesn't compile).
    >      >
    >      > Nevertheless, to simplify the test itself I moved findUnreservedFreePort(int... reservedPorts) from SADebugDTest.java to jdk.test.lib.Utils in /test/lib.
    >      >
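For readers who want to picture such a helper, the following is only a guess at its shape, not the code that was moved into jdk.test.lib.Utils. It assumes the approach of asking the OS for an ephemeral port and retrying while the result is in the reserved list; the class name FreePortSketch and the retry limit are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical sketch; not the actual jdk.test.lib.Utils implementation.
final class FreePortSketch {
    private static final int MAX_ATTEMPTS = 20; // assumed retry limit

    // Returns a currently free port that is not one of reservedPorts.
    static int findUnreservedFreePort(int... reservedPorts) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            int port;
            // Port 0 asks the OS to pick any free ephemeral port.
            try (ServerSocket socket = new ServerSocket(0)) {
                port = socket.getLocalPort();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (!contains(reservedPorts, port)) {
                return port;
            }
        }
        throw new IllegalStateException("no unreserved free port found");
    }

    private static boolean contains(int[] ports, int candidate) {
        for (int p : ports) {
            if (p == candidate) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 1099 is the default RMI registry port mentioned in this thread.
        System.out.println(findUnreservedFreePort(1099));
    }
}
```

The socket is closed before the port is returned, so another process can still grab the port in between. That is one reason a retry mechanism in the test remains useful.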
    >      > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >
    >      > [1] http://cr.openjdk.java.net/~dtitov/8196751/webrev.04/
    >      > [2] https://bugs.openjdk.java.net/browse/JDK-8238268
    >      > [3] https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >
    >      > Thank you,
    >      > Daniil
    >      >
    >      > On 3/13/20, 7:23 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >
    >      >      Hi Daniil,
    >      >
    >      >      On 2020/03/14 7:05, Daniil Titov wrote:
    >      >      > Hi Yasumasa, Serguei and Alex,
    >      >      >
    >      >      > Please review a new version of the webrev that includes the changes Yasumasa suggested.
    >      >      >
    >      >      >> Shutdown hook is already registered in c'tor of HotSpotAgent.
    >      >      >>     It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
    >      >      >
    >      >      > The shutdown hook registered in the HotSpotAgent c'tor only works for non-servers, so we still need
    >      >      > the shutdown hook for the remote server being added in SALauncher. I changed it to use the lambda expression.
    >      >      >
    >      >      >     public HotSpotAgent() {
    >      >      >         // for non-server add shutdown hook to clean-up debugger in case
    >      >      >         // of forced exit. For remote server, shutdown hook is added by
    >      >      >         // DebugServer.
    >      >      >         Runtime.getRuntime().addShutdownHook(new java.lang.Thread(
    >      >      >         new Runnable() {
    >      >      >             public void run() {
    >      >      >                 synchronized (HotSpotAgent.this) {
    >      >      >                     if (!isServer) {
    >      >      >                         detach();
    >      >      >                     }
    >      >      >                 }
    >      >      >             }
    >      >      >         }));
    >      >      >     }
    >      >
    >      >      I missed it, thanks!
    >      >
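The lambda form of a server-side shutdown hook discussed above might look roughly like the sketch below. DebugServerStub is a hypothetical stand-in for whatever object SALauncher actually shuts down, and the thread name is invented; only Runtime.addShutdownHook and the Thread constructor are real JDK API here.

```java
// Hypothetical sketch; DebugServerStub stands in for the real debug server.
final class ShutdownHookSketch {
    static final class DebugServerStub {
        private boolean running = true;

        synchronized void shutdown() {
            running = false;
        }

        synchronized boolean isRunning() {
            return running;
        }
    }

    // Registers a JVM shutdown hook that stops the server. The hook thread is
    // returned so that callers can remove it again if they need to.
    static Thread registerShutdownHook(DebugServerStub server) {
        Thread hook = new Thread(() -> server.shutdown(), "debugd-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        DebugServerStub server = new DebugServerStub();
        registerShutdownHook(server);
        System.out.println("server running: " + server.isRunning());
        // When the JVM exits, the hook runs and calls server.shutdown().
    }
}
```

Compared with the anonymous Runnable in the quoted HotSpotAgent constructor, the lambda removes the boilerplate without changing when the hook runs.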
    >      >
    >      >      >>>     Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains
    >      >      >>> `exclusiveAccess.dirs=.` to avoid concurrent execution
    >      >      > As I understand exclusiveAccess.dirs prevents only the tests located in this directory from being run simultaneously and other tests could still run in parallel with one of these tests.  Thus I would prefer to have the retry mechanism in place. I simplified the code using the class variables instead of local arrays.
    >      >
    >      >      Ok, but I think it might be more simply with TestLibrary.
    >      >      For example, can you use TestLibrary::getUnusedRandomPort ? It is used in test/jdk/java/rmi/testlibrary/RMID.java .
    >      >
    >      >
    >      >      Thanks,
    >      >
    >      >      Yasumasa
    >      >
    >      >
    >      >      > Testing: Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >      >
    >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.03/
    >      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >      > [3] Bug: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >      >
    >      >      > Thank you,
    >      >      > Daniil
    >      >      >
    >      >      > On 3/6/20, 6:15 PM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >      >
    >      >      >      Hi Daniil,
    >      >      >
    >      >      >      On 2020/03/07 3:38, Daniil Titov wrote:
    >      >      >      > Hi Yasumasa,
    >      >      >      >
    >      >      >      >   -> checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
    >      >      >      > I think that having a piece of code that invokes  a method  named "buildAttachArgs" with a copy of the argument map  just for its side-effect ( it throws an exception if parameters are incorrect)  and ignores its return might look confusing. Thus, I found it more appropriate to wrap it inside a method with more relevant name .
    >      >      >
    >      >      >      Ok, but I would prefer that you leave a comment on it.
    >      >      >
    >      >      >
    >      >      >      >   > SADebugDTest
    >      >      >      >   >  - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
    >      >      >      > We cannot use primitives there since these local variables are captured in lambda expression and are required to be final.
    >      >      >      > The other option is to use some other wrapper for them but I don't see any obvious benefits in it comparing to the array.
    >      >      >
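The point about captured locals can be shown with a small sketch. A lambda may only capture local variables that are final or effectively final, so a plain boolean local cannot be assigned inside it; a one-element array or a wrapper such as AtomicBoolean both work around that. The class and method names below are invented for the example, and portInUse is borrowed from the discussion only as a variable name.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only; not code from SADebugDTest.
final class LambdaCaptureSketch {
    // Option 1: a one-element array. The array reference is effectively
    // final, while its single element stays mutable.
    static boolean viaArray() {
        final boolean[] portInUse = new boolean[1];
        Runnable check = () -> portInUse[0] = true;
        check.run();
        return portInUse[0];
    }

    // Option 2: a wrapper object with the same effect.
    static boolean viaAtomic() {
        AtomicBoolean portInUse = new AtomicBoolean(false);
        Runnable check = () -> portInUse.set(true);
        check.run();
        return portInUse.get();
    }

    public static void main(String[] args) {
        System.out.println(viaArray() + " " + viaAtomic());
    }
}
```

AtomicBoolean additionally gives visibility guarantees across threads, which the plain array does not.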
    >      >      >      Hmm... I think port check (already in use) is not needed because test/hotspot/jtreg/serviceability/sa/sadebugd/TEST.properties contains `exclusiveAccess.dirs=.` to avoid concurrent execution.
    >      >      >      If you drop this error check, the test code would be simpler.
    >      >      >
    >      >      >
    >      >      >      > I will include your other suggestion in the new version of the webrev.
    >      >      >
    >      >      >      Sorry, I have one more comment:
    >      >      >
    >      >      >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
    >      >      >
    >      >      >      Shutdown hook is already registered in c'tor of HotSpotAgent.
    >      >      >      It works same as shutdownServer(), so I think shutdown hook at SALauncher is not needed.
    >      >      >
    >      >      >
    >      >      >      Thanks,
    >      >      >
    >      >      >      Yasumasa
    >      >      >
    >      >      >
    >      >      >      > Thanks!
    >      >      >      > Daniil
    >      >      >      >
    >      >      >      > On 3/6/20, 12:30 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >      >      >
    >      >      >      >      Hi Daniil,
    >      >      >      >
    >      >      >      >
    >      >      >      >      - SALauncher.java
    >      >      >      >           - checkBasicOptions() is needed? I think you can remove this method and embed it in caller.
    >      >      >      >           - I think registryPort should be checked with Integer.parseInt() like others (rmiPort and pid) rather than regex.
    >      >      >      >           - Shutdown hook is very good idea. You can implement more simply if you use lambda expression.
    >      >      >      >
    >      >      >      >      - SADebugDTest.java
    >      >      >      >           - Please add bug ID to @bug.
    >      >      >      >           - Why do you declare portInUse and testResult as array? Their length is 1, so I think you don't need to use array.
    >      >      >      >
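As an illustration of the Integer.parseInt() suggestion in the list above, a port argument could be validated along the following lines. This is a sketch under assumptions: the class and method names are hypothetical, and the 1-65535 range check is an addition for the example rather than something the review asked for.

```java
// Hypothetical sketch; not the actual SALauncher code.
final class PortArgSketch {
    // Parses a port option value, rejecting non-numeric or out-of-range input.
    static int parsePort(String optionName, String value) {
        final int port;
        try {
            port = Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "Invalid " + optionName + ": " + value, e);
        }
        if (port < 1 || port > 65535) { // assumed valid range
            throw new IllegalArgumentException(
                    optionName + " out of range: " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        System.out.println(parsePort("registryPort", "1099"));
    }
}
```

Unlike a digits-only regex, parseInt also rejects values that are numeric but too large to be an int.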
    >      >      >      >
    >      >      >      >      Thanks,
    >      >      >      >
    >      >      >      >      Yasumasa
    >      >      >      >
    >      >      >      >
    >      >      >      >      On 2020/03/06 10:15, Daniil Titov wrote:
    >      >      >      >      > Hi Yasumasa, Serguei and Alex,
    >      >      >      >      >
    >      >      >      >      > Please review a new version of the fix [1] that addresses your comments. The new version in addition to RMI connector
    >      >      >      >      > port option introduces two more options to specify RMI registry port and RMI connector host name. Currently, these
    >      >      >      >      > last two settings could be specified using the system properties but the system properties have the following disadvantages
    >      >      >      >      > compared to the command line options:
    >      >      >      >      >     - It's hard to know about them: they are not listed in the tool's help.
    >      >      >      >      >     - They have long names that are hard to remember.
    >      >      >      >      >     - It is easy to mistype them in the command line and you will not get any warning about it.
    >      >      >      >      >
    >      >      >      >      > The CSR [2] was also updated and needs to be reviewed.
    >      >      >      >      >
    >      >      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >      >      >      >      > container  and connecting  to it with the GUI debugger.  Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >      >      >      >
    >      >      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.02/
    >      >      >      >      > [2] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >      >      >      > [3] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >      >      >      >
    >      >      >      >      > Thank you,
    >      >      >      >      > Daniil
    >      >      >      >      >
    >      >      >      >      > On 2/24/20, 5:45 AM, "Yasumasa Suenaga" <suenaga at oss.nttdata.com> wrote:
    >      >      >      >      >
    >      >      >      >      >      Hi Daniil,
    >      >      >      >      >
    >      >      >      >      >         - SALauncher::buildAttachArgs is not only to build arguments but also to check consistency of arguments.
    >      >      >      >      >           Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would be simpler.
    >      >      >      >      >
    >      >      >      >      >         - SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
    >      >      >      >      >           But you can use same port number as RMI registry (1099).
    >      >      >      >      >           It is same as relation between jmxremote.port and jmxremote.rmi.port.
    >      >      >      >      >
    >      >      >      >      >
    >      >      >      >      >      Thanks,
    >      >      >      >      >
    >      >      >      >      >      Yasumasa
    >      >      >      >      >
    >      >      >      >      >
    >      >      >      >      >      On 2020/02/24 13:21, Daniil Titov wrote:
    >      >      >      >      >      > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port.
    >      >      >      >      >      > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container.
    >      >      >      >      >      >
    >      >      >      >      >      > New CSR [3] was created for this change and it needs to be reviewed as well.
    >      >      >      >      >      >
    >      >      >      >      >      > Man pages for jhsdb will be updated in a separate issue.
    >      >      >      >      >      >
    >      >      >      >      >      > The current implementation (sun.jvm.hotspot.SALauncher)  parses the command line options passed to jhsdb tool,
    >      >      >      >      >      > converts them to the ones for the debug server and then delegates the call  to sun.jvm.hotspot.DebugServer.main().
    >      >      >      >      >      >
    >      >      >      >      >      >         // delegate to the actual SA debug server.
    >      >      >      >      >      >         DebugServer.main(newArgArray.toArray(new String[0]));
    >      >      >      >      >      >
    >      >      >      >      >      > However, sun.jvm.hotspot.DebugServer doesn't support named options, which makes it hard to add new options to the tool efficiently.
    >      >      >      >      >      > I found it more suitable to start the Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher
    >      >      >      >      >      > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as deprecated,
    >      >      >      >      >      > but I would prefer to address that in a separate issue.
    >      >      >      >      >      >
    >      >      >      >      >      > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
    >      >      >      >      >      >                  container  and connecting  to it with the GUI debugger.
    >      >      >      >      >      >                 Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
    >      >      >      >      >      >
    >      >      >      >      >      > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
    >      >      >      >      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
    >      >      >      >      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
    >      >      >      >      >      >
    >      >      >      >      >      > Thank you,
    >      >      >      >      >      > Daniil
    >      >      >      >      >      >
    >      >      >      >      >      >
    >      >      >      >      >
    >      >      >      >      >
    >      >      >      >      >
    >      >      >      >
    >      >      >      >
    >      >      >      >
    >      >      >
    >      >      >
    >      >      >
    >      >
    >      >
    >      >
    >      
    >      
    >
    >
    
    
From chris.plummer at oracle.com  Thu Mar 26 14:59:09 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 26 Mar 2020 07:59:09 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <f65dcdbb-465f-d1f5-2ecf-a293fa58b624@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
 <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
 <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
 <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>
 <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
 <e67eefb0-fa41-1a22-e794-6c98fe255713@redhat.com>
 <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com>
 <f65dcdbb-465f-d1f5-2ecf-a293fa58b624@redhat.com>
Message-ID: <966eb7f4-ff8f-50ba-dabf-c1c29b1999ef@oracle.com>

Hi Roman,

Yes. Thank you.

Chris

On 3/26/20 1:44 AM, Roman Kennke wrote:
> That was in the previous implementation: I got a condition wrong in the
> table lookup (as noted by Serguei), and this prevented any
> class-unload-events from getting out. I have fixed this, but found other
> problems in that implementation (deadlocks and a crash).
>
>   The current implementation has none of these problems: we don't need
> table-lookups - we simply pass-through the signatures, and locking is
> much simpler and in particular we don't need a lock around the JVMTI
> call (SetTag) which was the cause of the deadlock.
>
> Does that answer your questions?
>
> Thanks,
> Roman
>
>> Hi Roman,
>>
>> It passed all my testing. I think before you push Serguei has a question
>> regarding an issue you brought up a while back. You mentioned that you
>> weren't getting some events, and suddenly started seeing them. We were
>> discussing it today and it was unclear if this was an issue you were
>> seeing before your changes, and your changes resolved it, or it was
>> initially caused by an earlier version of your changes, and you later
>> fixed it. We just want to better understand what this issue was and how
>> it was fixed.
>>
>> thanks,
>>
>> Chris
>>
>> On 3/25/20 3:22 PM, Roman Kennke wrote:
>>> The new job finished, its ID is:
>>>
>>>    mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>>>
>>> Thank you,
>>> Roman
>>>
>>>
>>>> Yes, please submit a new job. I'll start my testing once I see that the
>>>> builds are done.
>>>>
>>>> Chris
>>>>
>>>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Apparently we can get into classTrack_reset() before calling
>>>>> activate(),
>>>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>>>>> the cleaning routine fixes the problem for me.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>>>
>>>>> Should I post another submit-repo job with that fix?
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>
>>>>>> Hi Roman,
>>>>>>
>>>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>>>
>>>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000], sp=0x00007fbb791f8af0,  free space=1022k
>>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>>>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>>>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>>>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>>>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>>>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>>>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>>>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>>>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>>>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>>>>
>>>>>>
>>>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>>>>> doesn't seem to be anything magic on the command line that might be
>>>>>> triggering. Pretty much I see it with all the various VM configs we
>>>>>> test.
>>>>>>
>>>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>>>
>>>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>>>>
>>>>>>
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>>>>
>>>>>>
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>>>>
>>>>>>
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>>>>
>>>>>>
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>>> Regarding the new assert:
>>>>>>>>
>>>>>>>>     105     if (gdata && gdata->assertOn) {
>>>>>>>>     106         // Check this is not already tagged.
>>>>>>>>     107         jlong tag;
>>>>>>>>     108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>>>>     109         if (error != JVMTI_ERROR_NONE) {
>>>>>>>>     110             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>     111         }
>>>>>>>>     112         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>>>     113     }
>>>>>>>>
>>>>>>>> I think you should remove the gdata check. gdata should never be NULL
>>>>>>>> when you get to this code. If it is ever NULL then there's a bug, and
>>>>>>>> the check will hide the bug.
>>>>>>> Ok, will remove this.
>>>>>>>
>>>>>>>> Regarding testing, after you do the submit repo testing let me know
>>>>>>>> the
>>>>>>>> jobID and I'll do additional testing on it.
>>>>>>> I did the submit repo earlier today, and it came back green:
>>>>>>>
>>>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>>>> Hi Sergei,
>>>>>>>>>
>>>>>>>>>> The fix looks pretty clean now.
>>>>>>>>>> I also like new name of the lock.:)
>>>>>>>>> Thank you!
>>>>>>>>>
>>>>>>>>>> Just one comment below.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 110 if (tag != 0l) {
>>>>>>>>>> 111 return; // Already added
>>>>>>>>>>      112     }
>>>>>>>>>>
>>>>>>>>>>      It is better to use a named constant or macro instead.
>>>>>>>>>>      Also, it'd be nice to add a short comment about what this value is.
>>>>>>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>>>>>>> assert. I also made a constant for the value 0, which should be pretty
>>>>>>>>> much self-explanatory.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>>>>
>>>>>>>>>> How do you test the fix?
>>>>>>>>> I am using a manual test that is provided in this bug report:
>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>
>>>>>>>>> "Script to compare performance of GC with and without debugger,
>>>>>>>>> when
>>>>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>>>>
>>>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>>>>
>>>>>>>>> I am also using this test and manually attach/detach jdb a
>>>>>>>>> couple of
>>>>>>>>> times in a row to check that disconnecting and reconnecting works
>>>>>>>>> well
>>>>>>>>> (this tended to deadlock or crash with an earlier version of the
>>>>>>>>> patch,
>>>>>>>>> and is now looking good).
>>>>>>>>>
>>>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we
>>>>>>>>> all
>>>>>>>>> agree that the fix is reasonable, I will push it to the submit
>>>>>>>>> repo. I
>>>>>>>>> am not sure if any of those tests actually exercise that code,
>>>>>>>>> though.
>>>>>>>>> Let me know if you want me to run any specific tests.
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>>>>> I believe I came up with a much simpler solution that also
>>>>>>>>>>> solves the
>>>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>>>
>>>>>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>>>>>>> to the signature of a class into the tag, and pull it out again when we
>>>>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>>>>
>>>>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>>>>> classes and signatures, and it also makes the story around
>>>>>>>>>>> locking
>>>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning
>>>>>>>>>>> of all
>>>>>>>>>>> classes needed (as in the current implementation) and no
>>>>>>>>>>> searching of
>>>>>>>>>>> table needed (like in my previous attempts).
>>>>>>>>>>>
>>>>>>>>>>> Please review this new revision:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>>>>>
>>>>>>>>>>> (Notice that there still appears to be a performance bottleneck
>>>>>>>>>>> with
>>>>>>>>>>> class-unloading when an actual debugger is attached. This
>>>>>>>>>>> doesn't seem
>>>>>>>>>>> to be related to the classTrack.c implementation though, but
>>>>>>>>>>> looks like
>>>>>>>>>>> a consequence of getting all those class-unload notifications
>>>>>>>>>>> over the
>>>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging
>>>>>>>>>>> up the
>>>>>>>>>>> buffers.)
>>>>>>>>>>>
>>>>>>>>>>> I am not sure why jdb needs to enable class-unload listener
>>>>>>>>>>> always. A
>>>>>>>>>>> simple hack disables it, and performance is brilliant, even when
>>>>>>>>>>> jdb is
>>>>>>>>>>> attached:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> But this is not in the scope of this bug.
>>>>>>>>>>>
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Some comments are below.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>>>>       88 {
>>>>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 93 return;
>>>>>>>>>>>>>       94     }
>>>>>>>>>>>>> Just a question:
>>>>>>>>>>>>>       Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>>>>>>>>>>>>>           the class tracking if class tracking has not been initialized?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 70 static jlong currentClassTag;
>>>>>>>>>>>>>       I'm wondering whether a better name would be something like
>>>>>>>>>>>>>       lastClassTag or highestClassTag.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>>>>>> 100
>>>>>>>>>>>>> 102 while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>> 103 klass_ptr = &klass->next;
>>>>>>>>>>>>> 104 klass = *klass_ptr;
>>>>>>>>>>>>> 105 }
>>>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 108 return;
>>>>>>>>>>>>>      109     }
>>>>>>>>>>>>>      It seems to me, something is wrong in the condition at L106 above.
>>>>>>>>>>>>>      Should it be:
>>>>>>>>>>>>>        if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>>>
>>>>>>>>>>>>>      Otherwise, how can the second check ever work correctly, as the return
>>>>>>>>>>>>>      will always happen when (klass != NULL)?
>>>>>>>>>>>>>
>>>>>>>>>>>>>      There are several places in this file with the indent:
>>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 93 return;
>>>>>>>>>>>>>       94     }
>>>>>>>>>>>>>      ...
>>>>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 155 return;
>>>>>>>>>>>>>      156     }
>>>>>>>>>>>>>      ...
>>>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>>>>      163     }
>>>>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>>>>      167     }
>>>>>>>>>>>>>      ...
>>>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>>>> 282 {
>>>>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>>>>      286 }
>>>>>>>>>>>>>      ...
>>>>>>>>>>>>>      291 void
>>>>>>>>>>>>>      292 classTrack_reset(void)
>>>>>>>>>>>>>      293 {
>>>>>>>>>>>>> 294 int idx;
>>>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>> 296
>>>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>>>>> 303 node = next;
>>>>>>>>>>>>> 304 }
>>>>>>>>>>>>> 305 }
>>>>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>>>>> 307
>>>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>>>> 310
>>>>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>>>>> 312
>>>>>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>>>>> 315
>>>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>>>>>      The comma is not needed.
>>>>>>>>>>>>>      Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>>>>>      Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>>>>>>>>> remembers it in deletedSignatureBag.
>>>>>>>>>>>>>      It would be better to use words like "store" or "record", and
>>>>>>>>>>>>>      "Find" should not start with a capital letter:
>>>>>>>>>>>>>      Invoke the callback when classes are freed, find and record the
>>>>>>>>>>>>>      signature in deletedSignatureBag.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 96 // Find deleted KlassNode
>>>>>>>>>>>>> 133 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>> 158 /* Check this is not a duplicate */
>>>>>>>>>>>>>      A dot is missing at the end.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>      Conversely, the dot is not needed here as the comment does not start
>>>>>>>>>>>>>      with a capital letter.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>>>>      The comment above can be better. Maybe, something like:
>>>>>>>>>>>>        "At this point, we found the KlassNode matching the klass tag
>>>>>>>>>>>>        (and it is linked)."
>>>>>>>>>>>>
>>>>>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>>>>>      Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can I please get reviews of this change? In the meantime,
>>>>>>>>>>>>>> we've done
>>>>>>>>>>>>>> more testing and also field-/torture-testing by a customer
>>>>>>>>>>>>>> who is
>>>>>>>>>>>>>> happy
>>>>>>>>>>>>>> now. :-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>>>>>>>>>>>>> namely moving the setup of the trackingEnv and deletedSignatureBag to
>>>>>>>>>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 72 /*
>>>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>>>>>>> 74 */
>>>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>>>>>> 76
>>>>>>>>>>>>>>>> 77 /*
>>>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>>>>>>>       80  */
>>>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>       The comments contradict each other.
>>>>>>>>>>>>>>>>       I guess, the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>>>>>       instead of deletedTagLock.
>>>>>>>>>>>>>>>>       Also, the comma at the end must be replaced with a dot.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>> 104 return;
>>>>>>>>>>>>>>>> 105 }
>>>>>>>>>>>>>>>>      106
>>>>>>>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>>>>>>>>      113     }
>>>>>>>>>>>>>>>> 114
>>>>>>>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>> 118 return;
>>>>>>>>>>>>>>>>      119     }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>      The code above can be simplified, so that the lines 101-105 are not
>>>>>>>>>>>>>>>>      needed anymore.
>>>>>>>>>>>>>>>>      It can be something like this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> // Scan linked-list.
>>>>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>>>> klass_ptr = &klass->next;
>>>>>>>>>>>>>>>> klass = *klass_ptr;
>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not
>>>>>>>>>>>>>>>> found - ignore.
>>>>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>> return;
>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It will take more time when I get a chance to look at the
>>>>>>>>>>>>>>>> rest.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen
>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>>>>>>> lock on
>>>>>>>>>>>>>>>>> basically every operation, and also need to check whether
>>>>>>>>>>>>>>>>> or not
>>>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result
>>>>>>>>>>>>>>>>> (e.g. an empty
>>>>>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a
>>>>>>>>>>>>>>>>>> tag, and we
>>>>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a table, with
>>>>>>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>>>>>>>>>>>>>>>> indexed by tag % slot-count, and we then simply prepend the new
>>>>>>>>>>>>>>>>>> KlassNode*. This is an O(1) operation.
>>>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> signature of
>>>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag.
>>>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>>>> KlassNode*
>>>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is
>>>>>>>>>>>>>>>>>> ~O(1) operation
>>>>>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase
>>>>>>>>>>>>>>>>>> which hammered
>>>>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see
>>>>>>>>>>>>>>>>>> depths
>>>>>>>>>>>>>>>>>> of like 2-3,
>>>>>>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out
>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>> bag, and
>>>>>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>>>>>>>>>> leaking the
>>>>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets
>>>>>>>>>>>>>>>>>> detached
>>>>>>>>>>>>>>>>>> and/or
>>>>>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation
>>>>>>>>>>>>>>>>>> (was
>>>>>>>>>>>>>>>>>> missing
>>>>>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>>>>>>>>>> listener gets
>>>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right
>>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>>> attaching a
>>>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be
>>>>>>>>>>>>>>>>>> something
>>>>>>>>>>>>>>>>>> to improve
>>>>>>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>>>>>>>>>>>> The bottleneck now is clearly the actual synthesizing of the class-unload
>>>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am
>>>>>>>>>>>>>>>>>>> implementing
>>>>>>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing
>>>>>>>>>>>>>>>>>>> for now.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>      Hi Chris,
>>>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be
>>>>>>>>>>>>>>>>>>>>> for a
>>>>>>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the
>>>>>>>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a
>>>>>>>>>>>>>>>>>>>> table
>>>>>>>>>>>>>>>>>>>> of currently
>>>>>>>>>>>>>>>>>>>> prepared classes by building that table when
>>>>>>>>>>>>>>>>>>>> classTrack is
>>>>>>>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded.
>>>>>>>>>>>>>>>>>>>> When
>>>>>>>>>>>>>>>>>>>> unloading
>>>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and
>>>>>>>>>>>>>>>>>>>> compared with the
>>>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the
>>>>>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>>>> table gets
>>>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is
>>>>>>>>>>>>>>>>>>>> scanned,
>>>>>>>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption
>>>>>>>>>>>>>>>>>>>> here
>>>>>>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon
>>>>>>>>>>>>>>>>>>>> unload,
>>>>>>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently
>>>>>>>>>>>>>>>>>>>> see
>>>>>>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated
>>>>>>>>>>>>>>>>>>>> when there's an
>>>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of
>>>>>>>>>>>>>>>>>>>>>> classTrack.c.
>>>>>>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance
>>>>>>>>>>>>>>>>>>>>>> until an
>>>>>>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I am getting these numbers:
>>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>>
>>


From kevin.walls at oracle.com  Thu Mar 26 17:40:51 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Thu, 26 Mar 2020 17:40:51 +0000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <cb32ea21-a06a-824a-cd6f-0b731841f352@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <d0a1f49e-7f1d-0732-38d0-2b7a75966e74@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
 <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
 <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>
 <f45b2bc7-a0ca-d85b-5998-1e30e99d0d36@oracle.com>
 <cb32ea21-a06a-824a-cd6f-0b731841f352@oss.nttdata.com>
Message-ID: <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>

Hi Yasumasa,

Oops, didn't catch this - I had also done some manual testing, and testing 
in mach5, but clearly not enough.

Generally I think this looks good.

"lastFrame" can mean last as in final, or last as in previous. "last" is 
one of those annoying English words. Here it means final: if we get an 
Exception during processDwarf, use this to flag that we should return 
null from sender(). "finalFrame" would be clearer to me; anything else 
probably gets more verbose than you wanted.
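
For illustration, the role of such a flag can be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not the code from the webrev: `SketchFrame`, `UnwindInfo`, `UnwindException` and `walk` are hypothetical stand-ins (for the SA frame class, `DwarfParser` and `DebuggerException`), and program counters are plain `long` values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the exception thrown when unwind data is unusable.
class UnwindException extends RuntimeException {
    UnwindException(String message) { super(message); }
}

// Hypothetical stand-in for a DWARF parser: throws if it cannot handle pc.
interface UnwindInfo {
    void process(long pc);
}

class SketchFrame {
    final long pc;
    // true means "final": the frame itself is valid, but we must not unwind past it.
    final boolean finalFrame;

    SketchFrame(long pc, boolean finalFrame) {
        this.pc = pc;
        this.finalFrame = finalFrame;
    }

    // Returns the caller's frame, or null when the walk has to stop here.
    SketchFrame sender(long nextPc, UnwindInfo info) {
        if (finalFrame) {
            return null;
        }
        boolean nextIsFinal = false;
        try {
            info.process(nextPc);
        } catch (UnwindException e) {
            // The next frame can still be reported, but nothing beyond it.
            nextIsFinal = true;
        }
        return new SketchFrame(nextPc, nextIsFinal);
    }

    // Walks the given (assumed) chain of return addresses and records visited PCs.
    static List<Long> walk(long[] pcs, UnwindInfo info) {
        List<Long> visited = new ArrayList<>();
        if (pcs.length == 0) {
            return visited;
        }
        SketchFrame frame = new SketchFrame(pcs[0], false);
        int next = 1;
        while (frame != null) {
            visited.add(frame.pc);
            frame = (next < pcs.length) ? frame.sender(pcs[next++], info) : null;
        }
        return visited;
    }
}

class FinalFrameDemo {
    public static void main(String[] args) {
        // Assumption for the demo: unwind data for pc 0x30 is unsupported.
        UnwindInfo info = pc -> {
            if (pc == 0x30L) {
                throw new UnwindException("unsupported unwind data");
            }
        };
        System.out.println(SketchFrame.walk(new long[] {0x10L, 0x20L, 0x30L, 0x40L}, info));
    }
}
```

In this sketch the frame at 0x30 is still reported, and the walk stops there instead of propagating the exception, which is the "final" reading of the name.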

Yes, I like having the limit on the while loop in process_dwarf(); I am 
always worried about how sane the information we are parsing through is.
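
As a rough illustration of that kind of guard, here is a sketch in Java of a scan loop that never reads past its buffer. It is not the process_dwarf() code (which is C++ in libsaproc): the class and method names are hypothetical, and the record layout is a deliberate simplification (a 4-byte little-endian length followed by that many payload bytes, with a zero length as terminator).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class BoundedRecordScan {
    // Counts well-formed records. The loop stops at a zero-length terminator,
    // at the end of the buffer, or when a record claims more bytes than remain.
    static int countRecords(byte[] section) {
        ByteBuffer buf = ByteBuffer.wrap(section).order(ByteOrder.LITTLE_ENDIAN);
        int count = 0;
        while (buf.remaining() >= 4) {               // enough bytes for a length field
            long length = buf.getInt() & 0xFFFFFFFFL; // read the length as unsigned
            if (length == 0) {
                break;                                // terminator
            }
            if (length > buf.remaining()) {
                break;                                // record would overrun the buffer
            }
            buf.position(buf.position() + (int) length);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] wellFormed = {2, 0, 0, 0, 0x11, 0x22, 0, 0, 0, 0};
        byte[] truncated = {2, 0, 0, 0, 0x11, 0x22, 9, 0, 0, 0, 0x33};
        System.out.println(countRecords(wellFormed));
        System.out.println(countRecords(truncated));
    }
}
```

The point of the design is that every exit condition depends only on the buffer bounds, so corrupt or unexpected data ends the scan rather than driving the reader out of range.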

Thanks!
Kevin


On 24/03/2020 23:47, Yasumasa Suenaga wrote:
> Thanks Serguei!
>
> I will push it when I get a second reviewer.
>
>
> Yasumasa
>
>
> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> I'm okay with this update.
>> My mach5 test run for this patch has passed.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 3/23/20 17:08, Yasumasa Suenaga wrote:
>>> Hi Serguei,
>>>
>>> Thanks for your comment!
>>> I uploaded a new webrev:
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>>>
>>> Also I pushed it to the submit repo:
>>>
>>>   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>>>
>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>>>> Hi Yasumasa,
>>>>
>>>> The mach5 tier5 testing looks good.
>>>> The serviceability/sa/ClhsdbPstack.java test fails without the fix 
>>>> and passes with it.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> I looked at your changes.
>>>>> It is hard to understand if this fully solves the issue.
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>>>
>>>>>
>>>>> @@ -34,10 +34,11 @@
>>>>>
>>>>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>>>>         Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>         DwarfParser dwarf = null;
>>>>> + boolean unsupportedDwarf = false;
>>>>>
>>>>>         if (libptr != null) { // Native frame
>>>>>           try {
>>>>>             dwarf = new DwarfParser(libptr);
>>>>>             dwarf.processDwarf(rip);
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> @@ -45,24 +46,33 @@
>>>>>                    !dwarf.isBPOffsetAvailable())
>>>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister()).addOffsetTo(dwarf.getCFAOffset());
>>>>>           } catch (DebuggerException e) {
>>>>> - // Bail out to Java frame case
>>>>> + if (dwarf != null) {
>>>>> + // DWARF processing should succeed when the frame is native
>>>>> + // but it might fail if CIE has language personality routine
>>>>> + // and/or LSDA.
>>>>> + dwarf = null;
>>>>> + unsupportedDwarf = true;
>>>>> + } else {
>>>>> + throw e;
>>>>> + }
>>>>>           }
>>>>>         }
>>>>>
>>>>>         return (cfa == null) ? null
>>>>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>>      }
>>>>>
>>>>> @@ -121,13 +131,25 @@
>>>>>        }
>>>>>
>>>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>>      }
>>>>>
>>>>> - private DwarfParser getNextDwarf(Address nextPC) {
>>>>> - DwarfParser nextDwarf = null;
>>>>> + @Override
>>>>> + public CFrame sender(ThreadProxy thread) {
>>>>> + if (!possibleNext) {
>>>>> + return null;
>>>>> + }
>>>>> +
>>>>> + ThreadContext context = thread.getContext();
>>>>> +
>>>>> + Address nextPC = getNextPC(dwarf != null);
>>>>> + if (nextPC == null) {
>>>>> + return null;
>>>>> + }
>>>>>
>>>>> + DwarfParser nextDwarf = null;
>>>>> + boolean unsupportedDwarf = false;
>>>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>>          nextDwarf = dwarf;
>>>>>        } else {
>>>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>          if (libptr != null) {
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> @@ -138,33 +160,29 @@
>>>>>            }
>>>>>          }
>>>>>        }
>>>>>
>>>>>        if (nextDwarf != null) {
>>>>> + try {
>>>>>          nextDwarf.processDwarf(nextPC);
>>>>> + } catch (DebuggerException e) {
>>>>> + // DWARF processing should succeed when the frame is native
>>>>> + // but it might fail if CIE has language personality routine
>>>>> + // and/or LSDA.
>>>>> + nextDwarf = null;
>>>>> + unsupportedDwarf = true;
>>>>>        }
>>>>>
>>>>> This fix looks like a hack.
>>>>> Should we just propagate the Debugging exception instead of trying 
>>>>> to maintain unsupportedDwarf flag?
>>>
>>> DwarfParser::processDwarf would throw DebuggerException if it cannot 
>>> find DWARF which relates to PC.
>>> PC at this point is for next frame. So current frame (`this` object) 
>>> is valid, and it should be processed.
>>>
>>>
>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, 
>>>>> IDE,LSDA, etc.) are used without any comments explaining them.
>>>>> The code has to be generally readable without looking into the 
>>>>> DWARF spec each time.
>>>
>>> I added comments for them in this webrev.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>>> I'm submitting mach5 jobs to make sure the issue has been resolved 
>>>>> with your fix.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>>>> Thanks Chris!
>>>>>> I'm waiting for reviewers for this change.
>>>>>>
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> The failure is due to JDK-8231634, so not something you need to 
>>>>>>> worry about.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> I uploaded new webrev which includes reverting change for 
>>>>>>>> ProblemList:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>>>
>>>>>>>> I tested it on submit repo 
>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>>>> However, I think it is not caused by this change because 
>>>>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack` without mixed 
>>>>>>>> mode, so it would not parse DWARF.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> The test has been problem listed so please add undoing this to 
>>>>>>>>> your webrev. Here's the diff that problem listed it:
>>>>>>>>>
>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt 
>>>>>>>>> b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>>>   serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>>>   serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>>>   serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>>>>   serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>>>>   serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>>>   serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> This webrev has passed submit repo 
>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and 
>>>>>>>>>> additional tests.
>>>>>>>>>> So please review it:
>>>>>>>>>>
>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>>>> Thank you so much, David!
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I missed a loop condition, so I fixed it and pushed to 
>>>>>>>>>>>>>> the submit repo.
>>>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes 
>>>>>>>>>>>>> before I go to bed :)
>>>>>>>>>>>>
>>>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then 
>>>>>>>>>>>>>>>>>> go and run additional internal tests (and even more 
>>>>>>>>>>>>>>>>>> builds) using that job.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've pushed the change to the submit repo, but I've not 
>>>>>>>>>>>>>>>>> yet received the result.
>>>>>>>>>>>>>>>>> I will share it with you when I get the job ID.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to 
>>>>>>>>>>>>>>>> complete before submitting the additional tests.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when 
>>>>>>>>>>>>>>>>>>> DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7.
>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the 
>>>>>>>>>>>>>>>>>>>>>> code, but I'm putting the patch through our 
>>>>>>>>>>>>>>>>>>>>>> internal testing.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to 
>>>>>>>>>>>>>>>>>>>>> always crash now.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs 
>>>>>>>>>>>>>>>>>>>> of the test in linux-x64. I don't see a pattern as 
>>>>>>>>>>>>>>>>>>>> to where it fails versus passes.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for 
>>>>>>>>>>>>>>>>>>>>>>> unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently 
>>>>>>>>>>>>>>>>>>>>>>> since then.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>    A: lack of a range check on the buffer (.eh_frame 
>>>>>>>>>>>>>>>>>>>>>>>       section data)
>>>>>>>>>>>>>>>>>>>>>>>    B: the language personality routine and Language 
>>>>>>>>>>>>>>>>>>>>>>>       Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and 
>>>>>>>>>>>>>>>>>>>>>>> ignore the personality routine and LSDA in this webrev.
>>>>>>>>>>>>>>>>>>>>>>> I also added bailout code for the case where DWARF 
>>>>>>>>>>>>>>>>>>>>>>> processing fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), 
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle 
>>>>>>>>>>>>>>>>>>>>>>> Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>
>>


From serguei.spitsyn at oracle.com  Thu Mar 26 17:53:50 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Mar 2020 10:53:50 -0700
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <c0013500-ebc6-0eab-1705-feb99e58e73e@oracle.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
 <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
 <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>
 <f45b2bc7-a0ca-d85b-5998-1e30e99d0d36@oracle.com>
 <cb32ea21-a06a-824a-cd6f-0b731841f352@oss.nttdata.com>
 <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>
Message-ID: <e75f398d-b3b5-9253-681d-01d45414a2b5@oracle.com>

Hi Kevin,

Nice catch with the name "lastFrame".
I was also confused when I reviewed this but did not come up with 
something better.

Thanks,
Serguei

On 3/26/20 10:40, Kevin Walls wrote:
> Hi Yasumasa,
>
> Oops, didn't catch this - I had also done some manual testing, and 
> testing in mach5, but clearly not enough.
>
> Generally I think this looks good.
>
> "lastFrame" can mean last as in final, or last as in previous. "last" 
> is one of those annoying English words. Here it means final: if we 
> get an Exception during processDwarf, use this to flag that we should 
> return null from sender(). "finalFrame" would be clearer to me; 
> anything else probably gets more verbose than you wanted.
>
> Yes, I like having the limit on the while loop in process_dwarf(); I am 
> always worried about how sane the information we are parsing through is.
>
> Thanks!
> Kevin
>
>
> On 24/03/2020 23:47, Yasumasa Suenaga wrote:
>> Thanks Serguei!
>>
>> I will push it when I get a second reviewer.
>>
>>
>> Yasumasa
>>
>>
>> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
>>> Hi Yasumasa,
>>>
>>> I'm okay with this update.
>>> My mach5 test run for this patch has passed.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 3/23/20 17:08, Yasumasa Suenaga wrote:
>>>> Hi Serguei,
>>>>
>>>> Thanks for your comment!
>>>> I uploaded a new webrev:
>>>>
>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>>>>
>>>> Also I pushed it to the submit repo:
>>>>
>>>>   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>>>>
>>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> The mach5 tier5 testing looks good.
>>>>> The serviceability/sa/ClhsdbPstack.java test fails without the fix 
>>>>> and passes with it.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> I looked at your changes.
>>>>>> It is hard to understand if this fully solves the issue.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>>>>
>>>>>>
>>>>>> @@ -34,10 +34,11 @@
>>>>>>
>>>>>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>>>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>>>>>         Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>         DwarfParser dwarf = null;
>>>>>> + boolean unsupportedDwarf = false;
>>>>>>
>>>>>>         if (libptr != null) { // Native frame
>>>>>>           try {
>>>>>>             dwarf = new DwarfParser(libptr);
>>>>>>             dwarf.processDwarf(rip);
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> @@ -45,24 +46,33 @@
>>>>>>                    !dwarf.isBPOffsetAvailable())
>>>>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister()).addOffsetTo(dwarf.getCFAOffset());
>>>>>>           } catch (DebuggerException e) {
>>>>>> - // Bail out to Java frame case
>>>>>> + if (dwarf != null) {
>>>>>> + // DWARF processing should succeed when the frame is native
>>>>>> + // but it might fail if CIE has language personality routine
>>>>>> + // and/or LSDA.
>>>>>> + dwarf = null;
>>>>>> + unsupportedDwarf = true;
>>>>>> + } else {
>>>>>> + throw e;
>>>>>> + }
>>>>>>           }
>>>>>>         }
>>>>>>
>>>>>>         return (cfa == null) ? null
>>>>>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>>>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>>>      }
>>>>>>
>>>>>> @@ -121,13 +131,25 @@
>>>>>>        }
>>>>>>
>>>>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>>>      }
>>>>>>
>>>>>> - private DwarfParser getNextDwarf(Address nextPC) {
>>>>>> - DwarfParser nextDwarf = null;
>>>>>> + @Override
>>>>>> + public CFrame sender(ThreadProxy thread) {
>>>>>> + if (!possibleNext) {
>>>>>> + return null;
>>>>>> + }
>>>>>> +
>>>>>> + ThreadContext context = thread.getContext();
>>>>>> +
>>>>>> + Address nextPC = getNextPC(dwarf != null);
>>>>>> + if (nextPC == null) {
>>>>>> + return null;
>>>>>> + }
>>>>>>
>>>>>> + DwarfParser nextDwarf = null;
>>>>>> + boolean unsupportedDwarf = false;
>>>>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>>>          nextDwarf = dwarf;
>>>>>>        } else {
>>>>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>          if (libptr != null) {
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> @@ -138,33 +160,29 @@
>>>>>>            }
>>>>>>          }
>>>>>>        }
>>>>>>
>>>>>>        if (nextDwarf != null) {
>>>>>> + try {
>>>>>>          nextDwarf.processDwarf(nextPC);
>>>>>> + } catch (DebuggerException e) {
>>>>>> + // DWARF processing should succeed when the frame is native
>>>>>> + // but it might fail if CIE has language personality routine
>>>>>> + // and/or LSDA.
>>>>>> + nextDwarf = null;
>>>>>> + unsupportedDwarf = true;
>>>>>>        }
>>>>>>
>>>>>> This fix looks like a hack.
>>>>>> Should we just propagate the Debugging exception instead of 
>>>>>> trying to maintain unsupportedDwarf flag?
>>>>
>>>> DwarfParser::processDwarf would throw DebuggerException if it 
>>>> cannot find DWARF which relates to PC.
>>>> PC at this point is for next frame. So current frame (`this` 
>>>> object) is valid, and it should be processed.
>>>>
>>>>
>>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, 
>>>>>> IDE,LSDA, etc.) are used without any comments explaining them.
>>>>>> The code has to be generally readable without looking into the 
>>>>>> DWARF spec each time.
>>>>
>>>> I added comments for them in this webrev.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>>> I'm submitting mach5 jobs to make sure the issue has been 
>>>>>> resolved with your fix.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>>>>> Thanks Chris!
>>>>>>> I'm waiting for reviewers for this change.
>>>>>>>
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> The failure is due to JDK-8231634, so not something you need to 
>>>>>>>> worry about.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> I uploaded new webrev which includes reverting change for 
>>>>>>>>> ProblemList:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>>>>
>>>>>>>>> I tested it on submit repo 
>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>>>>> However, I think it is not caused by this change because 
>>>>>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack` without mixed 
>>>>>>>>> mode, so it would not parse DWARF.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> The test has been problem listed so please add undoing this 
>>>>>>>>>> to your webrev. Here's the diff that problem listed it:
>>>>>>>>>>
>>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt 
>>>>>>>>>> b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>>>>   serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>>>>   serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>>>>   serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>>>>>   serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>>>>>   serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>>>>   serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> This webrev has passed submit repo 
>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and 
>>>>>>>>>>> additional tests.
>>>>>>>>>>> So please review it:
>>>>>>>>>>>
>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Thank you so much, David!
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to 
>>>>>>>>>>>>>>> submit repo.
>>>>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes 
>>>>>>>>>>>>>> before I go to bed :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can 
>>>>>>>>>>>>>>>>>>> then go and run additional internal tests (and even 
>>>>>>>>>>>>>>>>>>> more builds) using that job.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not 
>>>>>>>>>>>>>>>>>> yet received the result.
>>>>>>>>>>>>>>>>>> I will share you when I get job ID.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to 
>>>>>>>>>>>>>>>>> complete before submitting the additional tests.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame 
>>>>>>>>>>>>>>>>>>>> when DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the 
>>>>>>>>>>>>>>>>>>>>>>> code, but I'm putting the patch through our 
>>>>>>>>>>>>>>>>>>>>>>> internal testing.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to 
>>>>>>>>>>>>>>>>>>>>>> always crash now.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 
>>>>>>>>>>>>>>>>>>>>> runs of the test in linux-x64. I don't see a 
>>>>>>>>>>>>>>>>>>>>> pattern as to where it fails versus passes.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for 
>>>>>>>>>>>>>>>>>>>>>>>> unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently 
>>>>>>>>>>>>>>>>>>>>>>>> since then.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two 
>>>>>>>>>>>>>>>>>>>>>>>> concerns:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>    A: lack of a range check on the buffer (.eh_frame 
>>>>>>>>>>>>>>>>>>>>>>>>    section data)
>>>>>>>>>>>>>>>>>>>>>>>>    B: the language personality routine and Language 
>>>>>>>>>>>>>>>>>>>>>>>>    Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, 
>>>>>>>>>>>>>>>>>>>>>>>> and ignore the personality routine and LSDA in this 
>>>>>>>>>>>>>>>>>>>>>>>> webrev.
>>>>>>>>>>>>>>>>>>>>>>>> I also added bailout code for when DWARF processing 
>>>>>>>>>>>>>>>>>>>>>>>> fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), 
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle 
>>>>>>>>>>>>>>>>>>>>>>>> Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>>>
>


From chris.plummer at oracle.com  Thu Mar 26 20:27:00 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 26 Mar 2020 13:27:00 -0700
Subject: RFR(XS) 8241696: ProblemList
 gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java due to JDK-8241293
Message-ID: <ebacf12d-0a16-1798-45eb-19b8c86d70bb@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8241696

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -85,7 +85,7 @@
 gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
 gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
 gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
-gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639 solaris-all
+gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639,8241293 solaris-all,macosx-x64

thanks,

Chris


From christian.tornqvist at oracle.com  Thu Mar 26 20:41:55 2020
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Thu, 26 Mar 2020 13:41:55 -0700
Subject: RFR(XS) 8241696: ProblemList
 gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java due to JDK-8241293
In-Reply-To: <ebacf12d-0a16-1798-45eb-19b8c86d70bb@oracle.com>
References: <ebacf12d-0a16-1798-45eb-19b8c86d70bb@oracle.com>
Message-ID: <5FAF6E40-AAC8-42D6-9678-13250BDCDF29@oracle.com>

Hi Chris,

Looks good, thanks for fixing this.

Thanks,
Christian

> On Mar 26, 2020, at 1:27 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
> 
> Hello,
> 
> Please review the following:
> 
> https://bugs.openjdk.java.net/browse/JDK-8241696
> 
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -85,7 +85,7 @@
>  gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
>  gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
>  gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
> -gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639 solaris-all
> +gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639,8241293 solaris-all,macosx-x64
> 
> thanks,
> 
> Chris
> 


From daniel.daugherty at oracle.com  Thu Mar 26 20:55:43 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 26 Mar 2020 16:55:43 -0400
Subject: RFR(XS) 8241696: ProblemList
 gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java due to JDK-8241293
In-Reply-To: <5FAF6E40-AAC8-42D6-9678-13250BDCDF29@oracle.com>
References: <ebacf12d-0a16-1798-45eb-19b8c86d70bb@oracle.com>
 <5FAF6E40-AAC8-42D6-9678-13250BDCDF29@oracle.com>
Message-ID: <4a41868c-0fb9-046f-8911-b1b602de8004@oracle.com>

Thumbs up. This is a trivial review, but you didn't qualify it as such
so now you have a second review.

Dan


On 3/26/20 4:41 PM, Christian Tornqvist wrote:
> Hi Chris,
>
> Looks good, thanks for fixing this.
>
> Thanks,
> Christian
>
>> On Mar 26, 2020, at 1:27 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
>>
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8241696
>>
>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt
>> +++ b/test/hotspot/jtreg/ProblemList.txt
>> @@ -85,7 +85,7 @@
>>   gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
>>   gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
>>   gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
>> -gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639 solaris-all
>> +gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639,8241293 solaris-all,macosx-x64
>>
>> thanks,
>>
>> Chris
>>


From leonid.mesnik at oracle.com  Thu Mar 26 21:39:15 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Thu, 26 Mar 2020 14:39:15 -0700
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads starting
 synchronization
In-Reply-To: <B5860361-2A64-41A9-B436-171A84536CC3@oracle.com>
References: <B5860361-2A64-41A9-B436-171A84536CC3@oracle.com>
Message-ID: <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com>

Replying with correct summary.

Leonid

On 3/23/20 8:55 PM, Leonid Mesnik wrote:
> Hi
>
> Could you please review the following fix, which updates ThreadsRunner to use AtomicInteger/spinOnWait instead of Wicket to synchronize the start of the stress test threads.
>
> In the failing tests, the threads started earlier allocated all memory before Lock.unlock was called in the threads started last. So a thread might get an OOME while trying to release the lock and/or get into an inconsistent state.
>
> The bug was introduced by https://bugs.openjdk.java.net/browse/JDK-8241123
> The Atomic works fine for stress test finishing sync. I just didn't expect that tests might OOME while releasing the start lock.
> Verified that tests now don't fail with -Xcomp -server -XX:-TieredCompilation -XX:-UseCompressedOops.
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/
> bug: https://bugs.openjdk.java.net/browse/JDK-8241456
>
> Leonid

From chris.plummer at oracle.com  Thu Mar 26 22:15:11 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 26 Mar 2020 15:15:11 -0700
Subject: RFR(XS) 8241696: ProblemList
 gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java due to JDK-8241293
In-Reply-To: <4a41868c-0fb9-046f-8911-b1b602de8004@oracle.com>
References: <ebacf12d-0a16-1798-45eb-19b8c86d70bb@oracle.com>
 <5FAF6E40-AAC8-42D6-9678-13250BDCDF29@oracle.com>
 <4a41868c-0fb9-046f-8911-b1b602de8004@oracle.com>
Message-ID: <e701129a-ce9f-7770-4200-9a58a719f20b@oracle.com>

Thanks!

On 3/26/20 1:55 PM, Daniel D. Daugherty wrote:
> Thumbs up. This is a trivial review, but you didn't qualify it as such
> so now you have a second review.
>
> Dan
>
>
> On 3/26/20 4:41 PM, Christian Tornqvist wrote:
>> Hi Chris,
>>
>> Looks good, thanks for fixing this.
>>
>> Thanks,
>> Christian
>>
>>> On Mar 26, 2020, at 1:27 PM, Chris Plummer 
>>> <chris.plummer at oracle.com> wrote:
>>>
>>> Hello,
>>>
>>> Please review the following:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8241696
>>>
>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>> @@ -85,7 +85,7 @@
>>>  gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
>>>  gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
>>>  gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
>>> -gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639 solaris-all
>>> +gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java 8193639,8241293 solaris-all,macosx-x64
>>>
>>> thanks,
>>>
>>> Chris
>>>
>


From david.holmes at oracle.com  Thu Mar 26 23:06:53 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 27 Mar 2020 09:06:53 +1000
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads
 starting synchronization
In-Reply-To: <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com>
References: <B5860361-2A64-41A9-B436-171A84536CC3@oracle.com>
 <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com>
Message-ID: <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com>

Hi Leonid,

On 27/03/2020 7:39 am, Leonid Mesnik wrote:
> Replying with correct summary.
> 
> Leonid
> 
> On 3/23/20 8:55 PM, Leonid Mesnik wrote:
>> Hi
>>
>> Could you please review following fix which update ThreadsRunner to 
>> use AtomicInteger/spinOnWait instead of Wicket to synchronize starting 
>> of stress test threads.
>>
>> Failing tests allocated all memory by earlier started threads before 
>> Lock.unlock is called in the latest threads. So thread might get an 
>> OOME exception while trying to release lock and/or get into 
>> inconsistent state.

You have a bug in Wicket:

+        try {
+            lock.lock();
...
+        } finally {
+            lock.unlock();

The lock() has to go outside the try block. That is why you were getting 
IllegalMonitorStateExceptions when the lock() threw OOME.
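
For illustration, a minimal sketch of the idiom described above. The class 
and method names are hypothetical and are not taken from the webrev, and a 
java.util.concurrent.locks.ReentrantLock is assumed as the lock type:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class; not the Wicket code from the webrev.
class LockIdiomSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int count;

    // lock() is called before the try block. If lock() itself throws
    // (for example an OutOfMemoryError), the finally clause is never
    // entered, so unlock() is not called on a lock this thread does
    // not hold.
    void increment() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock();
        }
    }

    int get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }

    // Shows the symptom of the broken shape: unlock() on a lock that
    // the current thread does not hold throws
    // IllegalMonitorStateException.
    static boolean unlockWithoutHoldingThrows() {
        ReentrantLock l = new ReentrantLock();
        try {
            l.unlock();
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        }
    }
}
```

With lock() inside the try block, a failure in lock() still runs the 
finally clause, and the resulting IllegalMonitorStateException from 
unlock() replaces the original error.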

But the OOME itself is still a problem as it means you can't use any 
proper synchronizer. I don't like seeing the spin-loops but in this code 
you may have no choice if memory may already be exhausted.

David
-----


>>
>> The bug was introduced by 
>> https://bugs.openjdk.java.net/browse/JDK-8241123 
>> <https://bugs.openjdk.java.net/browse/JDK-8241123>
>> The Atomic works fine for stress test finishing sync. I just didn't 
>> expect that tests might OOME while releasing start lock.
>> Verified that tests now don't fail with -Xcomp -server 
>> -XX:-TieredCompilation -XX:-UseCompressedOops.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/ 
>> <http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8241456 
>> <https://bugs.openjdk.java.net/browse/JDK-8241456>
>>
>> Leonid

From leonid.mesnik at oracle.com  Thu Mar 26 23:16:39 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Thu, 26 Mar 2020 16:16:39 -0700
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads
 starting synchronization
In-Reply-To: <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com>
References: <B5860361-2A64-41A9-B436-171A84536CC3@oracle.com>
 <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com>
 <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com>
Message-ID: <e8ebe186-3bc6-faca-5423-8b6df829828e@oracle.com>


On 3/26/20 4:06 PM, David Holmes wrote:
> Hi Leonid,
>
> On 27/03/2020 7:39 am, Leonid Mesnik wrote:
>> Replying with correct summary.
>>
>> Leonid
>>
>> On 3/23/20 8:55 PM, Leonid Mesnik wrote:
>>> Hi
>>>
>>> Could you please review following fix which update ThreadsRunner to 
>>> use AtomicInteger/spinOnWait instead of Wicket to synchronize 
>>> starting of stress test threads.
>>>
>>> Failing tests allocated all memory by earlier started threads before 
>>> Lock.unlock is called in the latest threads. So thread might get an 
>>> OOME exception while trying to release lock and/or get into 
>>> inconsistent state.
>
> You have a bug in Wicket:
>
> +        try {
> +            lock.lock();
> ...
> +        } finally {
> +            lock.unlock();
>
> The lock() has to go outside the try block. That is why you were 
> getting IllegalMonitorStateExceptions when the lock() threw OOME.
Thanks for the explanation. But anyway, as I understand it, locks use memory 
and might be left inconsistent if an OOME happens.
>
> But the OOME itself is still a problem as it means you can't use any 
> proper synchronizer. I don't like seeing the spin-loops but in this 
> code you may have no choice if memory may already be exhausted.

It should be a really short spin-loop; the test only starts threads during 
this loop and doesn't do anything more. Also, it is done only once for the 
whole stress test. The goal is to start the threads completely before the 
heap is exhausted.
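
A rough sketch of this kind of start synchronization, for illustration 
only. The names below are hypothetical and not taken from the webrev, and 
Thread.onSpinWait() is assumed to be what "spinOnWait" refers to:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical start gate; not the ThreadsRunner code from the webrev.
class StartGateSketch {
    // Number of worker threads that have not yet reached the gate.
    private final AtomicInteger notStarted;

    StartGateSketch(int threads) {
        notStarted = new AtomicInteger(threads);
    }

    // Called once by each worker before it begins its stress workload:
    // announce arrival, then spin until all workers have arrived.
    // Nothing is allocated on this path.
    void arriveAndAwait() {
        notStarted.decrementAndGet();
        while (notStarted.get() > 0) {
            Thread.onSpinWait();
        }
    }

    // Starts the given number of workers and returns how many of them
    // got past the gate.
    static int runWorkers(int threads) {
        StartGateSketch gate = new StartGateSketch(threads);
        AtomicInteger passed = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                gate.arriveAndAwait();
                passed.incrementAndGet();
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return passed.get();
    }
}
```

Because the gate only touches a preallocated AtomicInteger, a worker can 
pass it even if other workers have already begun to exhaust the heap.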

Leonid

>
> David
> -----
>
>
>>>
>>> The bug was introduced by 
>>> https://bugs.openjdk.java.net/browse/JDK-8241123 
>>> <https://bugs.openjdk.java.net/browse/JDK-8241123>
>>> The Atomic works fine for stress test finishing sync. I just didn't 
>>> expect that tests might OOME while releasing start lock.
>>> Verified that tests now don't fail with -Xcomp -server 
>>> -XX:-TieredCompilation -XX:-UseCompressedOops.
>>>
>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/ 
>>> <http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241456 
>>> <https://bugs.openjdk.java.net/browse/JDK-8241456>
>>>
>>> Leonid

From david.holmes at oracle.com  Thu Mar 26 23:29:18 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 27 Mar 2020 09:29:18 +1000
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads
 starting synchronization
In-Reply-To: <e8ebe186-3bc6-faca-5423-8b6df829828e@oracle.com>
References: <B5860361-2A64-41A9-B436-171A84536CC3@oracle.com>
 <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com>
 <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com>
 <e8ebe186-3bc6-faca-5423-8b6df829828e@oracle.com>
Message-ID: <12701240-fd7d-560d-8974-ff0be9cafa7e@oracle.com>

On 27/03/2020 9:16 am, Leonid Mesnik wrote:
> 
> On 3/26/20 4:06 PM, David Holmes wrote:
>> Hi Leonid,
>>
>> On 27/03/2020 7:39 am, Leonid Mesnik wrote:
>>> Replying with correct summary.
>>>
>>> Leonid
>>>
>>> On 3/23/20 8:55 PM, Leonid Mesnik wrote:
>>>> Hi
>>>>
>>>> Could you please review following fix which update ThreadsRunner to 
>>>> use AtomicInteger/spinOnWait instead of Wicket to synchronize 
>>>> starting of stress test threads.
>>>>
>>>> Failing tests allocated all memory by earlier started threads before 
>>>> Lock.unlock is called in the latest threads. So thread might get an 
>>>> OOME exception while trying to release lock and/or get into 
>>>> inconsistent state.
>>
>> You have a bug in Wicket:
>>
>> +        try {
>> +            lock.lock();
>> ...
>> +        } finally {
>> +            lock.unlock();
>>
>> The lock() has to go outside the try block. That is why you were 
>> getting IllegalMonitorStateExceptions when the lock() threw OOME.
> Thanks for explanation. But anyway, as I understand locks use memory and 
> might be inconsistent if OOME happened.

They use memory and so lock() can throw OOME, but they are never 
inconsistent.

>>
>> But the OOME itself is still a problem as it means you can't use any 
>> proper synchronizer. I don't like seeing the spin-loops but in this 
>> code you may have no choice if memory may already be exhausted.
> 
> It should be really short spin-loop, test only start thread during this 
> loop and don't do anything more. Also, it is done only once for all 
> stress test. The goal is to start thread completely before heap is 
> exhausted.

Okay. I'm somewhat dubious about making these changes in mainline now 
just to support loom. I don't see why we need to care about pinning 
threads in this kind of situation.

David

> Leonid
> 
>>
>> David
>> -----
>>
>>
>>>>
>>>> The bug was introduced by 
>>>> https://bugs.openjdk.java.net/browse/JDK-8241123 
>>>> <https://bugs.openjdk.java.net/browse/JDK-8241123>
>>>> The Atomic works fine for stress test finishing sync. I just didn't 
>>>> expect that tests might OOME while releasing start lock.
>>>> Verified that tests now don't fail with -Xcomp -server 
>>>> -XX:-TieredCompilation -XX:-UseCompressedOops.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/ 
>>>> <http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241456 
>>>> <https://bugs.openjdk.java.net/browse/JDK-8241456>
>>>>
>>>> Leonid

From david.holmes at oracle.com  Thu Mar 26 23:36:35 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 27 Mar 2020 09:36:35 +1000
Subject: RFR: 8241585: Remove unused _recursion_counter facility from
 PerfTraceTime
In-Reply-To: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com>
References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com>
Message-ID: <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>

Hi Claes,

Adding serviceability as they are the consumers of this IIUC.

On 27/03/2020 3:40 am, Claes Redestad wrote:
> Hi,
> 
> PerfTraceTime::_recursion_counter is unused, and removing it
> gets rid of some branchy (but well-predicted) code in paths that is
> somewhat startup sensitive.

Okay.

> http://cr.openjdk.java.net/~redestad/8241585/open.00/
> 
> Also added some trace logging to determine the number of perf
> data counters of each type along with a tune-up to exactly match
> the defaults.

Okay so can you change the bug synopsis and description to cover this 
more general cleanup and tuneup please.

I'm never very clear on the uses of these PerfCounters. It seems SUN_NS 
is unused after this change. The references to jvmstat seem no longer 
correct - these are read via jstat ?

> Testing: tier1+2

I think serviceability testing is mainly in tier3.

Thanks,
David
-----

> 
> Thanks!
> 
> /Claes

From leonid.mesnik at oracle.com  Thu Mar 26 23:41:36 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Thu, 26 Mar 2020 16:41:36 -0700
Subject: RFR: 8241456: ThreadRunner shouldn't use Wicket for threads
 starting synchronization
In-Reply-To: <12701240-fd7d-560d-8974-ff0be9cafa7e@oracle.com>
References: <B5860361-2A64-41A9-B436-171A84536CC3@oracle.com>
 <412d8c29-7742-a138-dc74-8f07def5eeae@oracle.com>
 <9502df2b-07d1-b1d2-5e66-fce0eb4ac9d7@oracle.com>
 <e8ebe186-3bc6-faca-5423-8b6df829828e@oracle.com>
 <12701240-fd7d-560d-8974-ff0be9cafa7e@oracle.com>
Message-ID: <70175e02-2c50-50e7-0646-4fb82be6c768@oracle.com>


On 3/26/20 4:29 PM, David Holmes wrote:
> On 27/03/2020 9:16 am, Leonid Mesnik wrote:
>>
>> On 3/26/20 4:06 PM, David Holmes wrote:
>>> Hi Leonid,
>>>
>>> On 27/03/2020 7:39 am, Leonid Mesnik wrote:
>>>> Replying with correct summary.
>>>>
>>>> Leonid
>>>>
>>>> On 3/23/20 8:55 PM, Leonid Mesnik wrote:
>>>>> Hi
>>>>>
>>>>> Could you please review following fix which update ThreadsRunner 
>>>>> to use AtomicInteger/spinOnWait instead of Wicket to synchronize 
>>>>> starting of stress test threads.
>>>>>
>>>>> Failing tests allocated all memory by earlier started threads 
>>>>> before Lock.unlock is called in the latest threads. So thread 
>>>>> might get an OOME exception while trying to release lock and/or 
>>>>> get into inconsistent state.
>>>
>>> You have a bug in Wicket:
>>>
>>> +        try {
>>> +            lock.lock();
>>> ...
>>> +        } finally {
>>> +            lock.unlock();
>>>
>>> The lock() has to go outside the try block. That is why you were 
>>> getting IllegalMonitorStateExceptions when the lock() threw OOME.
>> Thanks for explanation. But anyway, as I understand locks use memory 
>> and might be inconsistent if OOME happened.
>
> They use memory and so lock() can throw OOME, but they are never 
> inconsistent.
OK, I will move lock.lock() outside of try {}. Thanks for the explanation.
>
>>>
>>> But the OOME itself is still a problem as it means you can't use any 
>>> proper synchronizer. I don't like seeing the spin-loops but in this 
>>> code you may have no choice if memory may already be exhausted.
>>
>> It should be really short spin-loop, test only start thread during 
>> this loop and don't do anything more. Also, it is done only once for 
>> all stress test. The goal is to start thread completely before heap 
>> is exhausted.
>
> Okay. I'm somewhat dubious about making these changes in mainline now 
> just to support loom. I don't see why we need to care about pinning 
> threads in this kind of situation.

The idea is to add some nsk/share stress tests for virtual threads. 
Basically, these are the same tests as the existing ones (gc, sysdict) but 
running in virtual threads. These tests are going to be executed 
after loom is integrated, and I want to keep the difference between 
mainline and loom as small as possible.

Leonid

>
> David
>
>> Leonid
>>
>>>
>>> David
>>> -----
>>>
>>>
>>>>>
>>>>> The bug was introduced by 
>>>>> https://bugs.openjdk.java.net/browse/JDK-8241123 
>>>>> <https://bugs.openjdk.java.net/browse/JDK-8241123>
>>>>> The Atomic works fine for stress test finishing sync. I just 
>>>>> didn't expect that tests might OOME while releasing start lock.
>>>>> Verified that tests now don't fail with -Xcomp -server 
>>>>> -XX:-TieredCompilation -XX:-UseCompressedOops.
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/ 
>>>>> <http://cr.openjdk.java.net/~lmesnik/8241456/webrev.00/>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8241456 
>>>>> <https://bugs.openjdk.java.net/browse/JDK-8241456>
>>>>>
>>>>> Leonid

From chris.plummer at oracle.com  Thu Mar 26 23:46:12 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 26 Mar 2020 16:46:12 -0700
Subject: RFR: 8241585: Remove unused _recursion_counter facility from
 PerfTraceTime
In-Reply-To: <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com>
 <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
Message-ID: <f2825e82-1c36-feab-4e63-17c982886888@oracle.com>

On 3/26/20 4:36 PM, David Holmes wrote:
> Hi Claes,
>
> Adding serviceability as they are the consumers of this IIUC.
>
> On 27/03/2020 3:40 am, Claes Redestad wrote:
>> Hi,
>>
>> PerfTraceTime::_recursion_counter is unused, and removing it
>> gets rid of some branchy (but well-predicted) code in paths that is
>> somewhat startup sensitive.
>
> Okay.
>
>> http://cr.openjdk.java.net/~redestad/8241585/open.00/
>>
>> Also added some trace logging to determine the number of perf
>> data counters of each type along with a tune-up to exactly match
>> the defaults.
>
> Okay so can you change the bug synopsis and description to cover this 
> more general cleanup and tuneup please.
>
> I'm never very clear on the uses of these PerfCounters. It seems 
> SUN_NS is unused after this change. The references to jvmstat seem no 
> longer correct - these are read via jstat ?
jstat uses jvmstat.

Chris
>
>> Testing: tier1+2
>
> I think serviceability testing is mainly in tier3.
>
> Thanks,
> David
> -----
>
>>
>> Thanks!
>>
>> /Claes


From david.holmes at oracle.com  Thu Mar 26 23:49:16 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 27 Mar 2020 09:49:16 +1000
Subject: RFR: 8241585: Remove unused _recursion_counter facility from
 PerfTraceTime
In-Reply-To: <f2825e82-1c36-feab-4e63-17c982886888@oracle.com>
References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com>
 <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
 <f2825e82-1c36-feab-4e63-17c982886888@oracle.com>
Message-ID: <3a4ac731-c79c-8094-3dd6-b6f94a8f6df4@oracle.com>

On 27/03/2020 9:46 am, Chris Plummer wrote:
> On 3/26/20 4:36 PM, David Holmes wrote:
>> Hi Claes,
>>
>> Adding serviceability as they are the consumers of this IIUC.
>>
>> On 27/03/2020 3:40 am, Claes Redestad wrote:
>>> Hi,
>>>
>>> PerfTraceTime::_recursion_counter is unused, and removing it
>>> gets rid of some branchy (but well-predicted) code in paths that is
>>> somewhat startup sensitive.
>>
>> Okay.
>>
>>> http://cr.openjdk.java.net/~redestad/8241585/open.00/
>>>
>>> Also added some trace logging to determine the number of perf
>>> data counters of each type along with a tune-up to exactly match
>>> the defaults.
>>
>> Okay so can you change the bug synopsis and description to cover this 
>> more general cleanup and tuneup please.
>>
>> I'm never very clear on the uses of these PerfCounters. It seems 
>> SUN_NS is unused after this change. The references to jvmstat seem no 
>> longer correct - these are read via jstat ?
> jstat uses jvmstat.

Thanks Chris, I was grepping C++ code not realizing jvmstat is a Java API.

David

> Chris
>>
>>> Testing: tier1+2
>>
>> I think serviceability testing is mainly in tier3.
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>> Thanks!
>>>
>>> /Claes
> 

From mandy.chung at oracle.com  Thu Mar 26 23:57:39 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Thu, 26 Mar 2020 16:57:39 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
Message-ID: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>

Please review the implementation of JEP 371: Hidden Classes. The main 
changes are in core-libs and hotspot runtime area. Small changes are 
made in javac, VM compiler (intrinsification of Class::isHiddenClass), 
JFR, JDI, and jcmd. CSR [1] has been reviewed and is in the finalized 
state (see specdiff and javadoc below for reference).

Webrev:
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03

A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
point of view, a hidden class is a normal class except for the following:

- A hidden class has no initiating class loader and is not registered in
any dictionary.
- A hidden class has a name containing an illegal character:
`Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
returns "Lp/Foo.0x1234;".
- A hidden class is not modifiable, i.e. it cannot be redefined or
retransformed. JVM TI IsModifiableClass returns false on a hidden class.
- Final fields in a hidden class are truly "final". The value of a final
field cannot be overridden via reflection. setAccessible(true) can still be
called on reflected objects representing final fields in a hidden class,
and the access check will be suppressed, but they only have read access
(i.e. you can do Field::getXXX but not setXXX).
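
As a minimal sketch of the naming behaviour listed above (not part of the webrev), the following defines a hidden class from hand-assembled class bytes and prints its name. It assumes a JDK where the API has shipped (JDK 15 or later), where the query method is spelled `Class::isHidden` rather than `isHiddenClass` as in this webrev. The class name `Payload`, the helper names, and the hand-built class file are purely illustrative. `toJvmtiSignature` only mirrors the `p.Foo/0x1234` to `Lp/Foo.0x1234;` mapping quoted above; it does not call JVM TI.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;

final class HiddenNameDemo {

    /** Assembles a minimal class file: an empty class extending Object. */
    static byte[] minimalClassBytes(String internalName) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version
            out.writeShort(5);                                   // constant pool count (#1..#4)
            out.writeByte(7); out.writeShort(2);                 // #1 Class -> #2
            out.writeByte(1); out.writeUTF(internalName);        // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);                 // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8: super class
            out.writeShort(0x0020);                              // access flags: ACC_SUPER
            out.writeShort(1);                                   // this_class
            out.writeShort(3);                                   // super_class
            out.writeShort(0);                                   // interfaces
            out.writeShort(0);                                   // fields
            out.writeShort(0);                                   // methods
            out.writeShort(0);                                   // attributes
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines a hidden class in the same package as this (lookup) class. */
    static Class<?> defineHidden() {
        String pkg = HiddenNameDemo.class.getPackageName().replace('.', '/');
        String name = pkg.isEmpty() ? "Payload" : pkg + "/Payload";
        try {
            return MethodHandles.lookup()
                    .defineHiddenClass(minimalClassBytes(name), false)
                    .lookupClass();
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Illustrative only: maps p.Foo/0x1234 to Lp/Foo.0x1234; as described above. */
    static String toJvmtiSignature(String hiddenName) {
        int slash = hiddenName.lastIndexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("not a hidden class name: " + hiddenName);
        }
        String internal = hiddenName.substring(0, slash).replace('.', '/');
        return "L" + internal + "." + hiddenName.substring(slash + 1) + ";";
    }

    public static void main(String[] args) {
        Class<?> hidden = defineHidden();
        System.out.println(hidden.isHidden());
        System.out.println(hidden.getName());
        System.out.println(toJvmtiSignature(hidden.getName()));
    }
}
```

The suffix after the `/` is chosen by the JVM, so the printed name differs from run to run.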

Brief summary of this patch:

1. A new Lookup::defineHiddenClass method is the API to create a hidden
   class.
2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG
   options that can be specified when creating a hidden class.
3. A new Class::isHiddenClass method tests if a class is a hidden class.
4. Field::setXXX methods will throw IAE on a final field of a hidden class
   regardless of the value of the accessible flag.
5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass
   and defineHiddenClass to create a class from the given bytes.
6. ClassLoaderData implementation is not changed. There is one primary CLD
   that holds the classes strongly referenced by its defining loader. There
   can be zero or more additional CLDs - one per weak class.
7. Nest host determination is updated per revised JVMS 5.4.4. Access control
   check no longer throws LinkageError but instead it will throw IAE with
   a clear message if a class fails to resolve/validate the nest host
   declared in NestHost/NestMembers attribute.
8. JFR, jcmd, JDI are updated to support hidden classes.
9. Update javac LambdaToMethod as lambda proxy starts using nestmates
   and generate a bridge method to desugar a method reference to a
   protected method in its supertype in a different package.
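
Items 1 and 2 can be illustrated with a short sketch, again under assumptions: it uses the API as it shipped in JDK 15, and the class name `Payload`, the helper names, and the hand-assembled class file are invented for the example. It defines one hidden class without options and one with NESTMATE and STRONG, then compares their nest hosts.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup.ClassOption;

final class HiddenOptionsDemo {

    /** Assembles a minimal class file: an empty class extending Object. */
    static byte[] emptyClassBytes(String internalName) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version
            out.writeShort(5);                                   // constant pool count (#1..#4)
            out.writeByte(7); out.writeShort(2);                 // #1 Class -> #2
            out.writeByte(1); out.writeUTF(internalName);        // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);                 // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8: super class
            out.writeShort(0x0020);                              // access flags: ACC_SUPER
            out.writeShort(1);                                   // this_class
            out.writeShort(3);                                   // super_class
            for (int i = 0; i < 4; i++) {
                out.writeShort(0);   // no interfaces, fields, methods, attributes
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines a hidden class next to this class, with the given options. */
    static Class<?> define(ClassOption... options) {
        String pkg = HiddenOptionsDemo.class.getPackageName().replace('.', '/');
        String name = pkg.isEmpty() ? "Payload" : pkg + "/Payload";
        try {
            return MethodHandles.lookup()
                    .defineHiddenClass(emptyClassBytes(name), false, options)
                    .lookupClass();
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> plain = define();
        Class<?> mate = define(ClassOption.NESTMATE, ClassOption.STRONG);
        // The same bytes may be defined more than once; each call yields a new class.
        System.out.println(plain != mate);
        // Without NESTMATE the hidden class is its own nest host.
        System.out.println(plain.getNestHost() == plain);
        // With NESTMATE it joins the nest of the lookup class.
        System.out.println(mate.getNestHost() == HiddenOptionsDemo.class.getNestHost());
    }
}
```

STRONG only changes when the class can be unloaded, so the sketch does not try to observe it.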

This patch also updates StringConcatFactory, LambdaMetaFactory, and
LambdaForms to use hidden classes. The webrev includes changes in nashorn
to use hidden classes, and I will update the webrev if JEP 372 removes it
any time soon.

We uncovered a bug in Lookup::defineClass: the spec says it throws
LinkageError and intends to have the newly created class linked. However,
the implementation in 14 does not link the class. A separate CSR [2]
proposes to update the implementation to match the spec. This patch fixes
the implementation.

The spec update on JVM TI, JDI and Instrumentation will be done as
a separate RFE [3]. This patch includes new tests for JVM TI and
java.instrument that validate how the existing APIs work for hidden
classes.

javadoc/specdiff
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/

JVMS 5.4.4 change:
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf

CSR:
https://bugs.openjdk.java.net/browse/JDK-8238359

Thanks
Mandy
[1] https://bugs.openjdk.java.net/browse/JDK-8238359
[2] https://bugs.openjdk.java.net/browse/JDK-8240338
[3] https://bugs.openjdk.java.net/browse/JDK-8230502

From suenaga at oss.nttdata.com  Fri Mar 27 00:07:15 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 27 Mar 2020 09:07:15 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <e75f398d-b3b5-9253-681d-01d45414a2b5@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <30617fc2-5d01-4471-712a-6c3a5089af11@oracle.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
 <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
 <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>
 <f45b2bc7-a0ca-d85b-5998-1e30e99d0d36@oracle.com>
 <cb32ea21-a06a-824a-cd6f-0b731841f352@oss.nttdata.com>
 <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>
 <e75f398d-b3b5-9253-681d-01d45414a2b5@oracle.com>
Message-ID: <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com>

Thanks Kevin and Serguei! and sorry for my English...

I uploaded new webrev:

   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.05/

Diff from webrev.04 is here:

   http://hg.openjdk.java.net/jdk/submit/rev/d5f400d70e94


Thanks,

Yasumasa


On 2020/03/27 2:53, serguei.spitsyn at oracle.com wrote:
> Hi Kevin,
> 
> Nice catch with the name "lastFrame".
> I was also confused when I reviewed this but did not come up with something better.
> 
> Thanks,
> Serguei
> 
> On 3/26/20 10:40, Kevin Walls wrote:
>> Hi Yasumasa,
>>
>> Oops, didn't catch this - I also had done some manual testing and in mach5 but clearly not enough.
>>
>> Generally I think this looks good.
>>
>> "lastFrame" can mean last as in final, or last as in previous. "last" is one of those annoying English words. Here it means final: if we get an Exception during processDwarf, use this to flag that we should return null from sender(). "finalFrame" would be clearer to me, anything else probably gets more verbose than you wanted.
>>
>> Yes I like having the limit on the while loop in process_dwarf(), always worried how sane the information is that we are parsing through.
>>
>> Thanks!
>> Kevin
>>
>>
>> On 24/03/2020 23:47, Yasumasa Suenaga wrote:
>>> Thanks Serguei!
>>>
>>> I will push it when I get second reviewer.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
>>>> Hi Yasumasa,
>>>>
>>>> I'm okay with this update.
>>>> My mach5 test run for this patch has passed.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 3/23/20 17:08, Yasumasa Suenaga wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> Thanks for your comment!
>>>>> I uploaded new webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>>>>>
>>>>> Also I pushed it to submit repo:
>>>>>
>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>>>>>
>>>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> The mach5 tier5 testing looks good.
>>>>>> The serviceability/sa/ClhsdbPstack.java test fails without the fix and passes with it.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> I looked at your changes.
>>>>>>> It is hard to understand if this fully solves the issue.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>
>>>>>>> @@ -34,10 +34,11 @@
>>>>>>>
>>>>>>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>>>>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>>>>>>         Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>         DwarfParser dwarf = null;
>>>>>>> +       boolean unsupportedDwarf = false;
>>>>>>>
>>>>>>>         if (libptr != null) { // Native frame
>>>>>>>           try {
>>>>>>>             dwarf = new DwarfParser(libptr);
>>>>>>>             dwarf.processDwarf(rip);
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> @@ -45,24 +46,33 @@
>>>>>>>                    !dwarf.isBPOffsetAvailable())
>>>>>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>                         .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>           } catch (DebuggerException e) {
>>>>>>> -           // Bail out to Java frame case
>>>>>>> +           if (dwarf != null) {
>>>>>>> +             // DWARF processing should succeed when the frame is native
>>>>>>> +             // but it might fail if CIE has language personality routine
>>>>>>> +             // and/or LSDA.
>>>>>>> +             dwarf = null;
>>>>>>> +             unsupportedDwarf = true;
>>>>>>> +           } else {
>>>>>>> +             throw e;
>>>>>>> +           }
>>>>>>>           }
>>>>>>>         }
>>>>>>>
>>>>>>>         return (cfa == null) ? null
>>>>>>> -                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>>>>> +                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>>>>      }
>>>>>>>
>>>>>>> @@ -121,13 +131,25 @@
>>>>>>>        }
>>>>>>>
>>>>>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>>>>      }
>>>>>>>
>>>>>>> -    private DwarfParser getNextDwarf(Address nextPC) {
>>>>>>> -      DwarfParser nextDwarf = null;
>>>>>>> +    @Override
>>>>>>> +    public CFrame sender(ThreadProxy thread) {
>>>>>>> +      if (!possibleNext) {
>>>>>>> +        return null;
>>>>>>> +      }
>>>>>>> +
>>>>>>> +      ThreadContext context = thread.getContext();
>>>>>>> +
>>>>>>> +      Address nextPC = getNextPC(dwarf != null);
>>>>>>> +      if (nextPC == null) {
>>>>>>> +        return null;
>>>>>>> +      }
>>>>>>>
>>>>>>> +      DwarfParser nextDwarf = null;
>>>>>>> +      boolean unsupportedDwarf = false;
>>>>>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>>>>          nextDwarf = dwarf;
>>>>>>>        } else {
>>>>>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>          if (libptr != null) {
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> @@ -138,33 +160,29 @@
>>>>>>>            }
>>>>>>>          }
>>>>>>>        }
>>>>>>>
>>>>>>>        if (nextDwarf != null) {
>>>>>>> +        try {
>>>>>>>          nextDwarf.processDwarf(nextPC);
>>>>>>> +        } catch (DebuggerException e) {
>>>>>>> +          // DWARF processing should succeed when the frame is native
>>>>>>> +          // but it might fail if CIE has language personality routine
>>>>>>> +          // and/or LSDA.
>>>>>>> +          nextDwarf = null;
>>>>>>> +          unsupportedDwarf = true;
>>>>>>>        }
>>>>>>>
>>>>>>> This fix looks like a hack.
>>>>>>> Should we just propagate the Debugging exception instead of trying to maintain unsupportedDwarf flag?
>>>>>
>>>>> DwarfParser::processDwarf would throw DebuggerException if it cannot find DWARF which relates to PC.
>>>>> PC at this point is for next frame. So current frame (`this` object) is valid, and it should be processed.
>>>>>
>>>>>
>>>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, FDE, LSDA, etc.) are used without any comments explaining them.
>>>>>>> The code has to be generally readable without looking into the DWARF spec each time.
>>>>>
>>>>> I added comments for them in this webrev.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>>> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>>>>>> Thanks Chris!
>>>>>>>> I'm waiting for reviewers for this change.
>>>>>>>>
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> The failure is due to JDK-8231634, so not something you need to worry about.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> I uploaded new webrev which includes reverting change for ProblemList:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>>>>>
>>>>>>>>>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>>>>>> However I think it is not caused by this change: ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed mode, so it would not parse DWARF.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>
>>>>>>>>>>> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it:
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>>>>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>>>>>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>>>>>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>>>>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
>>>>>>>>>>>> So please review it:
>>>>>>>>>>>>
>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Thank you so much, David!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit repo.
>>>>>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result.
>>>>>>>>>>>>>>>>>>> I will share you when I get job ID.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>>>>>>>>>>>>>>>>>    B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>>>>>>>>>>>>>>>>>>>>>>> Also I added bailout code for the case where DWARF processing fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>
> 
> 

From claes.redestad at oracle.com  Fri Mar 27 00:11:34 2020
From: claes.redestad at oracle.com (Claes Redestad)
Date: Fri, 27 Mar 2020 01:11:34 +0100
Subject: RFR: 8241585: Remove unused _recursion_counter facility from
 PerfTraceTime
In-Reply-To: <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com>
 <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
Message-ID: <c8d35386-a132-9439-7df4-f875201b609f@oracle.com>


On 2020-03-27 00:36, David Holmes wrote:
>>
> 
> Okay so can you change the bug synopsis and description to cover this 
> more general cleanup and tuneup please.

I filed an addendum RFE and will add this RFE bug id to the single
changeset push:
https://bugs.openjdk.java.net/browse/JDK-8241705

> 
> I'm never very clear on the uses of these PerfCounters. It seems SUN_NS 
> is unused after this change. The references to jvmstat seem no longer 
> correct - these are read via jstat ?

The general confusion about PerfData/-Counters and what they're for is
why I'm trying to untangle this. Generally I think we should pull the
plug on it, but the perfdata shared file is tangled up with
functionality to detect running JVMs used by jcmd etc, so it might
take a few iterations to get there.

/Claes

From serguei.spitsyn at oracle.com  Fri Mar 27 00:15:19 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Mar 2020 17:15:19 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <f65dcdbb-465f-d1f5-2ecf-a293fa58b624@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
 <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
 <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
 <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>
 <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
 <e67eefb0-fa41-1a22-e794-6c98fe255713@redhat.com>
 <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com>
 <f65dcdbb-465f-d1f5-2ecf-a293fa58b624@redhat.com>
Message-ID: <99649661-67cf-9583-673a-8c53d038aed9@oracle.com>

Hi Roman,

Yes. Thank you for the explanation.

Thanks,
Serguei

On 3/26/20 01:44, Roman Kennke wrote:
> That was in the previous implementation: I got a condition wrong in the
> table lookup (as noted by Serguei), and this prevented any
> class-unload-events from getting out. I have fixed this, but found other
> problems in that implementation (deadlocks and a crash).
>
>   The current implementation has none of these problems: we don't need
> table-lookups - we simply pass-through the signatures, and locking is
> much simpler and in particular we don't need a lock around the JVMTI
> call (SetTag) which was the cause of the deadlock.
>
> Does that answer your questions?
>
> Thanks,
> Roman
>
>> Hi Roman,
>>
>> It passed all my testing. I think before you push Serguei has a question
>> regarding an issue you brought up a while back. You mentioned that you
>> weren't getting some events, and suddenly started seeing them. We were
>> discussing it today and it was unclear if this was an issue you were
>> seeing before your changes, and your changes resolved it, or it was
>> initially caused by an earlier version of your changes, and you later
>> fixed it. We just want to better understand what this issue was and how
>> it was fixed.
>>
>> thanks,
>>
>> Chris
>>
>> On 3/25/20 3:22 PM, Roman Kennke wrote:
>>> The new job finished, its ID is:
>>>
>>>    mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>>>
>>> Thank you,
>>> Roman
>>>
>>>
>>>> Yes, please submit a new job. I'll start my testing once I see that the
>>>> builds are done.
>>>>
>>>> Chris
>>>>
>>>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Apparently we can get into classTrack_reset() before calling
>>>>> activate(),
>>>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>>>>> the cleaning routine fixes the problem for me.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>>>
>>>>> Should I post another submit-repo job with that fix?
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>
>>>>>> Hi Roman,
>>>>>>
>>>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>>>
>>>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],
>>>>>> sp=0x00007fbb791f8af0,  free space=1022k
>>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code,
>>>>>> j=interpreted, Vv=VM code, C=native code)
>>>>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>>>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>>>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>>>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>>>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>>>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>>>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>>>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>>>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>>>>
>>>>>>
>>>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>>>>> doesn't seem to be anything magic on the command line that might be
>>>>>> triggering. Pretty much I see it with all the various VM configs we
>>>>>> test.
>>>>>>
>>>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>>>
>>>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>>>>
>>>>>>
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>>>>
>>>>>>
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>>>>
>>>>>>
>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>>>>
>>>>>>
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>>> Regarding the new assert:
>>>>>>>>
>>>>>>>>     105     if (gdata && gdata->assertOn) {
>>>>>>>>     106         // Check this is not already tagged.
>>>>>>>>     107         jlong tag;
>>>>>>>>     108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>>>>     109         if (error != JVMTI_ERROR_NONE) {
>>>>>>>>     110             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>     111         }
>>>>>>>>     112         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>>>     113     }
>>>>>>>>
>>>>>>>> I think you should remove the gdata check. gdata should never be
>>>>>>>> NULL
>>>>>>>> when you get to this code. If it is ever NULL then there's a bug,
>>>>>>>> and
>>>>>>>> the check will hide the bug.
>>>>>>> Ok, will remove this.
>>>>>>>
>>>>>>>> Regarding testing, after you do the submit repo testing let me know
>>>>>>>> the
>>>>>>>> jobID and I'll do additional testing on it.
>>>>>>> I did the submit repo earlier today, and it came back green:
>>>>>>>
>>>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>>>> Hi Serguei,
>>>>>>>>>
>>>>>>>>>> The fix looks pretty clean now.
>>>>>>>>>> I also like the new name of the lock. :)
>>>>>>>>> Thank you!
>>>>>>>>>
>>>>>>>>>> Just one comment below.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     110     if (tag != 0l) {
>>>>>>>>>>     111         return; // Already added
>>>>>>>>>>     112     }
>>>>>>>>>>
>>>>>>>>>>     It is better to use a named constant or macro instead.
>>>>>>>>>>     Also, it'd be nice to add a short comment about what this value is.
>>>>>>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>>>>>>> assert. I also made a constant for the value 0, which should be pretty
>>>>>>>>> much self-explanatory.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>>>>
>>>>>>>>>> How do you test the fix?
>>>>>>>>> I am using a manual test that is provided in this bug report:
>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>
>>>>>>>>> "Script to compare performance of GC with and without debugger, when
>>>>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>>>>
>>>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>>>>
>>>>>>>>> I am also using this test and manually attach/detach jdb a couple of
>>>>>>>>> times in a row to check that disconnecting and reconnecting works well
>>>>>>>>> (this tended to deadlock or crash with an earlier version of the patch,
>>>>>>>>> and is now looking good).
>>>>>>>>>
>>>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we all
>>>>>>>>> agree that the fix is reasonable, I will push it to the submit repo. I
>>>>>>>>> am not sure if any of those tests actually exercise that code, though.
>>>>>>>>> Let me know if you want me to run any specific tests.
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>>>>> I believe I came up with a much simpler solution that also solves the
>>>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>>>
>>>>>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>>>>>>> to the signature of a class into the tag, and pull it out again when we
>>>>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>>>>
>>>>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>>>>>>>>> classes needed (as in the current implementation) and no searching of
>>>>>>>>>>> a table needed (like in my previous attempts).
>>>>>>>>>>>
>>>>>>>>>>> Please review this new revision:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>>>>>
>>>>>>>>>>> (Notice that there still appears to be a performance bottleneck with
>>>>>>>>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>>>>>>>>> to be related to the classTrack.c implementation though, but looks like
>>>>>>>>>>> a consequence of getting all those class-unload notifications over the
>>>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>>>>>>> buffers.
>>>>>>>>>>>
>>>>>>>>>>> I am not sure why jdb always needs to enable the class-unload listener.
>>>>>>>>>>> A simple hack disables it, and performance is brilliant, even when jdb
>>>>>>>>>>> is attached:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>>>>>>
>>>>>>>>>>> But this is not in the scope of this bug.)
>>>>>>>>>>>
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Some comments are below.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>      87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>>>>      88 {
>>>>>>>>>>>>>      89     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>>      90     if (currentClassTag == -1) {
>>>>>>>>>>>>>      91         // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>      92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>      93         return;
>>>>>>>>>>>>>      94     }
>>>>>>>>>>>>> Just a question:
>>>>>>>>>>>>>      Q1: Should the ObjectFree events be disabled for the jvmtiEnv that
>>>>>>>>>>>>>          does the class tracking if class tracking has not been initialized?
>>>>>>>>>>>>>
>>>>>>>>>>>>>      70 static jlong currentClassTag;
>>>>>>>>>>>>>      I wonder whether a better name would be something like lastClassTag
>>>>>>>>>>>>>      or highestClassTag.
>>>>>>>>>>>>>
>>>>>>>>>>>>>      99     KlassNode* klass = *klass_ptr;
>>>>>>>>>>>>>     100
>>>>>>>>>>>>>     102     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>     103         klass_ptr = &klass->next;
>>>>>>>>>>>>>     104         klass = *klass_ptr;
>>>>>>>>>>>>>     105     }
>>>>>>>>>>>>>     106     if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>     107         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>     108         return;
>>>>>>>>>>>>>     109     }
>>>>>>>>>>>>>     It seems to me, something is wrong in the condition at L106 above.
>>>>>>>>>>>>>     Should it be:
>>>>>>>>>>>>>         if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>>>
>>>>>>>>>>>>>     Otherwise, how can the second check ever work correctly, as the return
>>>>>>>>>>>>>     will always happen when (klass != NULL)?
>>>>>>>>>>>>>
>>>>>>>>>>>>>     There are several places in this file with the indent:
>>>>>>>>>>>>>      90     if (currentClassTag == -1) {
>>>>>>>>>>>>>      91         // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>      92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>      93         return;
>>>>>>>>>>>>>      94     }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>     152     if (currentClassTag == -1) {
>>>>>>>>>>>>>     153         // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>>     154         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>     155         return;
>>>>>>>>>>>>>     156     }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>     161     if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>>>>     162         EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>>>>     163     }
>>>>>>>>>>>>>     164     if (tag != 0l) {
>>>>>>>>>>>>>     165         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>     166         return; // Already added
>>>>>>>>>>>>>     167     }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>     281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>>>>     282 {
>>>>>>>>>>>>>     283     char* sig = (char*)signatureVoid;
>>>>>>>>>>>>>     284     jvmtiDeallocate(sig);
>>>>>>>>>>>>>     285     return JNI_TRUE;
>>>>>>>>>>>>>     286 }
>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>     291 void
>>>>>>>>>>>>>     292 classTrack_reset(void)
>>>>>>>>>>>>>     293 {
>>>>>>>>>>>>>     294     int idx;
>>>>>>>>>>>>>     295     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>>     296
>>>>>>>>>>>>>     297     for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>>>>     298         KlassNode* node = table[idx];
>>>>>>>>>>>>>     299         while (node != NULL) {
>>>>>>>>>>>>>     300             KlassNode* next = node->next;
>>>>>>>>>>>>>     301             jvmtiDeallocate(node->signature);
>>>>>>>>>>>>>     302             jvmtiDeallocate(node);
>>>>>>>>>>>>>     303             node = next;
>>>>>>>>>>>>>     304         }
>>>>>>>>>>>>>     305     }
>>>>>>>>>>>>>     306     jvmtiDeallocate(table);
>>>>>>>>>>>>>     307
>>>>>>>>>>>>>     308     bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>>>>>     309     bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>>>>     310
>>>>>>>>>>>>>     311     currentClassTag = -1;
>>>>>>>>>>>>>     312
>>>>>>>>>>>>>     313     (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>>>>>     314     trackingEnv = NULL;
>>>>>>>>>>>>>     315
>>>>>>>>>>>>>     316     debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>>>>      63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>>>>>     The comma is not needed.
>>>>>>>>>>>>>     Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>>>
>>>>>>>>>>>>>      73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>>>>>     Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>>>>
>>>>>>>>>>>>>      84 * Callback when classes are freed, Finds the signature and
>>>>>>>>>>>>>         remembers it in deletedSignatureBag.
>>>>>>>>>>>>>     It would be better to use words like "store" or "record", and "Find"
>>>>>>>>>>>>>     should not start with a capital letter:
>>>>>>>>>>>>>     Invoke the callback when classes are freed, find and record the
>>>>>>>>>>>>>     signature in deletedSignatureBag.
>>>>>>>>>>>>>
>>>>>>>>>>>>>      96 // Find deleted KlassNode
>>>>>>>>>>>>>     133 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>     153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>>     158 /* Check this is not a duplicate */
>>>>>>>>>>>>>     The dot at the end is missing.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>     Conversely, the dot is not needed here, as the comment does not start
>>>>>>>>>>>>>     with a capital letter.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>>>>>     112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>>>>     The comment above can be better. Maybe, something like:
>>>>>>>>>>>>       "At this point, we found the KlassNode matching the klass tag
>>>>>>>>>>>>       (and it is linked)."
>>>>>>>>>>>>
>>>>>>>>>>>>>     113 // Remember the unloaded signature.
>>>>>>>>>>>>     Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>>>>>>>>> more testing and also field-/torture-testing by a customer who is happy
>>>>>>>>>>>>>> now. :-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>>>>>>>>>>>>> namely moving the setup of the trackingEnv and deletedSignatureBag to
>>>>>>>>>>>>>>> _activate() to ensure we have those structures after re-connect.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you for taking care of this scalability issue!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     72 /*
>>>>>>>>>>>>>>>>     73  * Lock to protect deletedSignatureBag
>>>>>>>>>>>>>>>>     74  */
>>>>>>>>>>>>>>>>     75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>>>>>>     76
>>>>>>>>>>>>>>>>     77 /*
>>>>>>>>>>>>>>>>     78  * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>>>>>>     79  * deletedTagLock,
>>>>>>>>>>>>>>>>     80  */
>>>>>>>>>>>>>>>>     81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     The comments contradict each other.
>>>>>>>>>>>>>>>>     I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>>>>>     instead of deletedTagLock.
>>>>>>>>>>>>>>>>     Also, the comma at the end must be replaced with a dot.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     101     // Tag not found? Ignore.
>>>>>>>>>>>>>>>>     102     if (klass == NULL) {
>>>>>>>>>>>>>>>>     103         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>>     104         return;
>>>>>>>>>>>>>>>>     105     }
>>>>>>>>>>>>>>>>     106
>>>>>>>>>>>>>>>>     107     // Scan linked-list.
>>>>>>>>>>>>>>>>     108     jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>>>>>     109     while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>>>>>     110         klass_ptr = &klass->next;
>>>>>>>>>>>>>>>>     111         klass = *klass_ptr;
>>>>>>>>>>>>>>>>     112         found_tag = klass->klass_tag;
>>>>>>>>>>>>>>>>     113     }
>>>>>>>>>>>>>>>>     114
>>>>>>>>>>>>>>>>     115     // Tag not found? Ignore.
>>>>>>>>>>>>>>>>     116     if (found_tag != tag) {
>>>>>>>>>>>>>>>>     117         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>>     118         return;
>>>>>>>>>>>>>>>>     119     }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     The code above can be simplified, so that lines 101-105 are not
>>>>>>>>>>>>>>>>     needed anymore.
>>>>>>>>>>>>>>>>     It can be something like this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         // Scan linked-list.
>>>>>>>>>>>>>>>>         while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>>>>             klass_ptr = &klass->next;
>>>>>>>>>>>>>>>>             klass = *klass_ptr;
>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>         if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>>>>             debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>>             return;
>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It will be some time before I get a chance to look at the rest.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>>>>>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>>>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>>>>>>>>>>>>> set up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a table, with
>>>>>>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>>>>>>>>>>>>>>>> indexed by tag % slot-count, and we simply prepend the new KlassNode*.
>>>>>>>>>>>>>>>>>> This is an O(1) operation.
>>>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is a ~O(1) operation
>>>>>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>>>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that bag, and
>>>>>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>>>>>>>>>>>>>>> signatures and KlassNode* etc when the debug agent gets detached and/or
>>>>>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>>>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>>>>>>>>>>>>> The bottleneck now is clearly the actual synthesizing of the class-unload
>>>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>>>>>>>>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded. When unloading
>>>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list that gets
>>>>>>>>>>>>>>>>>>>> returned.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s   with debug: 225s
>>>>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s   with debug: 95s
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>>
>>


From serguei.spitsyn at oracle.com  Fri Mar 27 00:22:43 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Mar 2020 17:22:43 -0700
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <99649661-67cf-9583-673a-8c53d038aed9@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>
 <14de19af-7146-969e-2a21-80ed40d175fb@oracle.com>
 <13f27bc5-8f69-7b11-830c-fb08eaa80528@redhat.com>
 <6de01312-5b91-39d6-e3dd-fbb245024017@redhat.com>
 <3abcd869-d204-c5ae-5811-c90c584e8046@oracle.com>
 <0b48d887-614c-93fd-0e55-8abd74e3bce0@oracle.com>
 <05ead05f-c8df-dba1-2d76-896829ec4249@redhat.com>
 <f322e0c2-a87e-7c45-5b02-f5380e50246a@oracle.com>
 <359442d4-d6a1-f123-dad1-ac712f06eb0e@redhat.com>
 <d5dbee0b-9f6e-173f-15c4-f88bd0e9b619@oracle.com>
 <ef85c1cc-a22b-5b72-caa2-30020f7f487a@redhat.com>
 <7c613970-1788-5bd5-06f9-5c754843a193@oracle.com>
 <70717ba0-bcd1-05d4-3f25-ad1dd30309c1@redhat.com>
 <80c64ca7-4f52-a7ba-0e35-9fa6417ce545@oracle.com>
 <e67eefb0-fa41-1a22-e794-6c98fe255713@redhat.com>
 <89d41371-4394-e506-b1d1-0a810c72b6e3@oracle.com>
 <f65dcdbb-465f-d1f5-2ecf-a293fa58b624@redhat.com>
 <99649661-67cf-9583-673a-8c53d038aed9@oracle.com>
Message-ID: <33703a4c-3fcc-33ee-4774-e158f5980218@oracle.com>

Hi Roman,

I'm okay with fix.

Thanks,
Serguei


On 3/26/20 17:15, serguei.spitsyn at oracle.com wrote:
> Hi Roman,
>
> Yes. Thank you for the explanation.
>
> Thanks,
> Serguei
>
> On 3/26/20 01:44, Roman Kennke wrote:
>> That was in the previous implementation: I got a condition wrong in the
>> table lookup (as noted by Serguei), and this prevented any
>> class-unload-events from getting out. I have fixed this, but found other
>> problems in that implementation (deadlocks and a crash).
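
The lookup condition mentioned above can be sketched as follows. This is a minimal, self-contained illustration and not the code from any webrev: the KlassNode layout is reduced to three fields, int64_t stands in for jlong, and unlink_by_tag is a hypothetical helper name. After the scan loop, klass is either NULL (tag absent) or the matching node, so the not-found test has to be on klass == NULL. Testing klass != NULL instead returns early on every hit and dereferences NULL on every miss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the KlassNode discussed in the review. */
typedef struct KlassNode {
    int64_t klass_tag;          /* stand-in for the jlong tag */
    const char *signature;      /* class signature, e.g. "LFoo;" */
    struct KlassNode *next;     /* next node in the same bucket */
} KlassNode;

/*
 * Hypothetical helper: scan one bucket's list for tag. On a hit, unlink
 * the node and return it; return NULL when the tag is not in the list.
 */
static KlassNode *unlink_by_tag(KlassNode **klass_ptr, int64_t tag)
{
    KlassNode *klass = *klass_ptr;

    /* Scan linked-list; klass_ptr always points at the link to klass. */
    while (klass != NULL && klass->klass_tag != tag) {
        klass_ptr = &klass->next;
        klass = *klass_ptr;
    }
    if (klass == NULL) {
        /* End of list reached: klass not found - ignore. */
        return NULL;
    }
    /* Found: splice the node out of the list. */
    *klass_ptr = klass->next;
    klass->next = NULL;
    return klass;
}
```

Keeping a pointer to the link rather than to the previous node means the head of the list needs no special case when it is the node being removed.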
>>
>> The current implementation has none of these problems: we don't need
>> table-lookups - we simply pass-through the signatures, and locking is
>> much simpler and in particular we don't need a lock around the JVMTI
>> call (SetTag) which was the cause of the deadlock.
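
A minimal sketch of the pass-through idea, assuming no real JVMTI environment: int64_t stands in for jlong, and the helper names tag_from_signature, signature_from_tag and on_object_free are hypothetical, not taken from classTrack.c. In the agent, the tag would be attached to the class with SetTag and handed back by the ObjectFree callback; here the two ends are simply called directly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* A zero tag means "not tagged", matching the NOT_TAGGED constant in the review. */
#define NOT_TAGGED ((int64_t)0)

/* Copy the signature and encode the address of the copy as a 64-bit tag. */
static int64_t tag_from_signature(const char *signature)
{
    size_t len = strlen(signature) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return NOT_TAGGED;
    }
    memcpy(copy, signature, len);
    return (int64_t)(intptr_t)copy;
}

/* Decode a tag back into the signature pointer it carries. */
static char *signature_from_tag(int64_t tag)
{
    return (char *)(intptr_t)tag;
}

/*
 * Hypothetical stand-in for the unload notification: the tag itself is
 * the signature, so no table lookup is needed. Ownership of the returned
 * string passes to the caller, who must free() it.
 */
static char *on_object_free(int64_t tag)
{
    if (tag == NOT_TAGGED) {
        return NULL;
    }
    return signature_from_tag(tag);
}
```

Because the signature travels inside the tag, the unload path touches no shared table, which is why the locking becomes simpler.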
>>
>> Does that answer your questions?
>>
>> Thanks,
>> Roman
>>
>>> Hi Roman,
>>>
>>> It passed all my testing. I think before you push, Serguei has a question
>>> regarding an issue you brought up a while back. You mentioned that you
>>> weren't getting some events, and suddenly started seeing them. We were
>>> discussing it today and it was unclear if this was an issue you were
>>> seeing before your changes, and your changes resolved it, or it was
>>> initially caused by an earlier version of your changes, and you later
>>> fixed it. We just want to better understand what this issue was and how
>>> it was fixed.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/25/20 3:22 PM, Roman Kennke wrote:
>>>> The new job finished, its ID is:
>>>>
>>>>    mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>>>>
>>>> Thank you,
>>>> Roman
>>>>
>>>>
>>>>> Yes, please submit a new job. I'll start my testing once I see that the
>>>>> builds are done.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Apparently we can get into classTrack_reset() before calling activate(),
>>>>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>>>>>> the cleaning routine fixes the problem for me.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>>>>
>>>>>> Should I post another submit-repo job with that fix?
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>> Hi Roman,
>>>>>>>
>>>>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>>>>
>>>>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],  sp=0x00007fbb791f8af0,  free space=1022k
>>>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code,
>>>>>>> j=interpreted, Vv=VM code, C=native code)
>>>>>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>>>>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>>>>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>>>>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>>>>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>>>>>> V  [libjvm.so+0xf4b5a7]  JvmtiAgentThread::call_start_function()+0x1c7
>>>>>>> V  [libjvm.so+0x15215c6]  JavaThread::thread_main_inner()+0x226
>>>>>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>>>>>> V  [libjvm.so+0x1250ade]  thread_native_entry(Thread*)+0x10e
>>>>>>>
>>>>>>>
>>>>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>>>>>> doesn't seem to be anything magic on the command line that might be
>>>>>>> triggering. Pretty much I see it with all the various VM configs we
>>>>>>> test.
>>>>>>>
>>>>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>>>>
>>>>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>>
>>>>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>>> Regarding the new assert:
>>>>>>>>>
>>>>>>>>>     105     if (gdata && gdata->assertOn) {
>>>>>>>>>     106         // Check this is not already tagged.
>>>>>>>>>     107         jlong tag;
>>>>>>>>>     108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env, klass, &tag);
>>>>>>>>>     109         if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>     110             EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>     111         }
>>>>>>>>>     112         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>>>>     113     }
>>>>>>>>>
>>>>>>>>> I think you should remove the gdata check. gdata should never be NULL
>>>>>>>>> when you get to this code. If it is ever NULL then there's a bug, and
>>>>>>>>> the check will hide the bug.
>>>>>>>> Ok, will remove this.
>>>>>>>>
>>>>>>>>> Regarding testing, after you do the submit repo testing let me know the
>>>>>>>>> jobID and I'll do additional testing on it.
>>>>>>>> I did the submit repo earlier today, and it came back green:
>>>>>>>>
>>>>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>>>>> Hi Serguei,
>>>>>>>>>>
>>>>>>>>>>> The fix looks pretty clean now.
>>>>>>>>>>> I also like the new name of the lock. :)
>>>>>>>>>> Thank you!
>>>>>>>>>>
>>>>>>>>>>> Just one comment below.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>     110     if (tag != 0l) {
>>>>>>>>>>>     111         return; // Already added
>>>>>>>>>>>     112     }
>>>>>>>>>>>
>>>>>>>>>>>     It is better to use a named constant or macro instead.
>>>>>>>>>>>     Also, it'd be nice to add a short comment about what this value is.
>>>>>>>>>> As I replied to Chris earlier, this whole block can be turned into an
>>>>>>>>>> assert. I also made a constant for the value 0, which should be pretty
>>>>>>>>>> much self-explanatory.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>>>>>
>>>>>>>>>>> How do you test the fix?
>>>>>>>>>> I am using a manual test that is provided in this bug report:
>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>
>>>>>>>>>> "Script to compare performance of GC with and without debugger, when
>>>>>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>>>>>
>>>>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>>>>>
>>>>>>>>>> I am also using this test and manually attaching/detaching jdb a couple
>>>>>>>>>> of times in a row to check that disconnecting and reconnecting works
>>>>>>>>>> well (this tended to deadlock or crash with an earlier version of the
>>>>>>>>>> patch, and is now looking good).
>>>>>>>>>>
>>>>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we all
>>>>>>>>>> agree that the fix is reasonable, I will push it to the submit repo. I
>>>>>>>>>> am not sure if any of those tests actually exercise that code, though.
>>>>>>>>>> Let me know if you want me to run any specific tests.
>>>>>>>>>>
>>>>>>>>>> Thank you,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>>>>>> I believe I came up with a much simpler solution that also solves the
>>>>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>>>>
>>>>>>>>>>>> It turns out that we can take advantage of the fact that we can use
>>>>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is explicitly
>>>>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a pointer
>>>>>>>>>>>> to the signature of a class into the tag, and pull it out again when we
>>>>>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>>>>>
>>>>>>>>>>>> This means we don't need an extra data structure to keep track of
>>>>>>>>>>>> classes and signatures, and it also makes the story around locking
>>>>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning of all
>>>>>>>>>>>> classes needed (as in the current implementation) and no searching of
>>>>>>>>>>>> a table needed (like in my previous attempts).
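
A minimal sketch of the tag-as-pointer idea described above, not the code from the webrev. It assumes the tag is a 64-bit integer (as JVMTI's jlong is) and uses a stand-in type so that it compiles without the JVMTI headers. The names class_tag, signature_to_tag and tag_to_signature are hypothetical; NOT_TAGGED mirrors the constant quoted earlier in this thread, with the value 0 that the thread says it stands for.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JVMTI's jlong tag type (assumption: 64-bit signed integer). */
typedef int64_t class_tag;

/* Tag value meaning "no tag set"; 0 as discussed in the thread. */
#define NOT_TAGGED ((class_tag)0)

/*
 * Copy the class signature to the heap and encode the pointer as a tag.
 * Returns NOT_TAGGED if the allocation fails.
 */
static class_tag signature_to_tag(const char *signature)
{
    size_t len;
    char *copy;

    assert(signature != NULL);
    len = strlen(signature) + 1;
    copy = malloc(len);
    if (copy == NULL) {
        return NOT_TAGGED;
    }
    memcpy(copy, signature, len);
    return (class_tag)(intptr_t)copy;
}

/*
 * Decode the tag back into the signature pointer, e.g. inside an
 * object-free callback. The caller takes ownership and must free() it.
 */
static char *tag_to_signature(class_tag tag)
{
    return (char *)(intptr_t)tag;
}
```

With this encoding the unload callback needs no lookup at all: the tag it receives already is the signature, which is where the O(1) claim comes from. In a real agent the allocation and the tag assignment would go through the JVMTI environment rather than malloc().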
>>>>>>>>>>>>
>>>>>>>>>>>> Please review this new revision:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>>>>>>
>>>>>>>>>>>> (Notice that there still appears to be a performance bottleneck with
>>>>>>>>>>>> class-unloading when an actual debugger is attached. This doesn't seem
>>>>>>>>>>>> to be related to the classTrack.c implementation though, but looks like
>>>>>>>>>>>> a consequence of getting all those class-unload notifications over the
>>>>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging up the
>>>>>>>>>>>> buffers.)
>>>>>>>>>>>>
>>>>>>>>>>>> I am not sure why jdb always needs to enable the class-unload listener.
>>>>>>>>>>>> A simple hack disables it, and performance is brilliant, even when jdb
>>>>>>>>>>>> is attached:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>>>>>>>
>>>>>>>>>>>> But this is not in the scope of this bug.
>>>>>>>>>>>>
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for the update and sorry for the latency in 
>>>>>>>>>>>>>> review.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some comments are below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>>>>>   88 {
>>>>>>>>>>>>>>   89     debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>>>   90     if (currentClassTag == -1) {
>>>>>>>>>>>>>>   91         // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>>   92         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>   93         return;
>>>>>>>>>>>>>>   94     }
>>>>>>>>>>>>>> Just a question:
>>>>>>>>>>>>>>   Q1: Should the ObjectFree events be disabled for the jvmtiEnv that
>>>>>>>>>>>>>>       does the class tracking if class tracking has not been
>>>>>>>>>>>>>>       initialized?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   70 static jlong currentClassTag;
>>>>>>>>>>>>>> I wonder whether a name like lastClassTag or highestClassTag would be
>>>>>>>>>>>>>> better.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   99     KlassNode* klass = *klass_ptr;
>>>>>>>>>>>>>>  100
>>>>>>>>>>>>>>  102     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>>  103         klass_ptr = &klass->next;
>>>>>>>>>>>>>>  104         klass = *klass_ptr;
>>>>>>>>>>>>>>  105     }
>>>>>>>>>>>>>>  106     if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>>  107         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>  108         return;
>>>>>>>>>>>>>>  109     }
>>>>>>>>>>>>>> It seems to me that something is wrong in the condition at L106 above.
>>>>>>>>>>>>>> Should it be:
>>>>>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Otherwise, how can the second check ever work correctly, as the return
>>>>>>>>>>>>>> will always happen when (klass != NULL)?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There are several places in this file with the indent:
>>>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 93 return;
>>>>>>>>>>>>>> 94     }
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 155 return;
>>>>>>>>>>>>>> 156     }
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>>>>>>>>>>>>>> 163     }
>>>>>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>>>>> 167     }
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>>>>> 282 {
>>>>>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>>>>> 286 }
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> 291 void
>>>>>>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>>>>>>> 293 {
>>>>>>>>>>>>>> 294 int idx;
>>>>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>>> 296
>>>>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>>>>>> 303 node = next;
>>>>>>>>>>>>>> 304 }
>>>>>>>>>>>>>> 305 }
>>>>>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>>>>>> 307
>>>>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>>>>> 310
>>>>>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>>>>>> 312
>>>>>>>>>>>>>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>>>>>> 315
>>>>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>>>>>   63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>>>>>>>>>>>>>> The comma is not needed.
>>>>>>>>>>>>>> Would it be better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   73 * Lock to keep table, currentClassTag and deletedSignatureBag consistent
>>>>>>>>>>>>>> Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   84 * Callback when classes are freed, Finds the signature and remembers it in deletedSignatureBag.
>>>>>>>>>>>>>> It would be better to use words like "store" or "record", and "Find"
>>>>>>>>>>>>>> should not start with a capital letter:
>>>>>>>>>>>>>> Invoke the callback when classes are freed, find and record the
>>>>>>>>>>>>>> signature in deletedSignatureBag.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   96 // Find deleted KlassNode
>>>>>>>>>>>>>>  133 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>>  153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>>>>  158 /* Check this is not a duplicate */
>>>>>>>>>>>>>> Missed dot at the end.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  106 if (klass != NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>> Conversely, the dot is not needed here as the comment does not start
>>>>>>>>>>>>>> with a capital letter.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  111 // At this point we have the KlassNode corresponding to the tag
>>>>>>>>>>>>>>  112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>>>>> The comment above can be better. Maybe something like:
>>>>>>>>>>>>>   "At this point, we found the KlassNode matching the klass tag (and it
>>>>>>>>>>>>>   is linked)."
>>>>>>>>>>>>>
>>>>>>>>>>>>>>  113 // Remember the unloaded signature.
>>>>>>>>>>>>> Better: Record the signature of the unloaded class and unlink it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can I please get reviews of this change? In the meantime, we've done
>>>>>>>>>>>>>>> more testing and also field-/torture-testing by a customer who is
>>>>>>>>>>>>>>> happy now. :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>>>>>>>>>> disconnect, namely moving the setup of the trackingEnv and
>>>>>>>>>>>>>>>> deletedSignatureBag to _activate() to ensure we have those
>>>>>>>>>>>>>>>> structures after re-connect.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   72 /*
>>>>>>>>>>>>>>>>>   73  * Lock to protect deletedSignatureBag
>>>>>>>>>>>>>>>>>   74  */
>>>>>>>>>>>>>>>>>   75 static jrawMonitorID deletedSignatureLock;
>>>>>>>>>>>>>>>>>   76
>>>>>>>>>>>>>>>>>   77 /*
>>>>>>>>>>>>>>>>>   78  * A bag containing all the deleted classes' signatures. Must be accessed under
>>>>>>>>>>>>>>>>>   79  * deletedTagLock,
>>>>>>>>>>>>>>>>>   80  */
>>>>>>>>>>>>>>>>>   81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The comments contradict each other.
>>>>>>>>>>>>>>>>> I guess the lock name at line 79 has to be deletedSignatureLock
>>>>>>>>>>>>>>>>> instead of deletedTagLock.
>>>>>>>>>>>>>>>>> Also, the comma at the end must be replaced with a dot.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  101     // Tag not found? Ignore.
>>>>>>>>>>>>>>>>>  102     if (klass == NULL) {
>>>>>>>>>>>>>>>>>  103         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>>>  104         return;
>>>>>>>>>>>>>>>>>  105     }
>>>>>>>>>>>>>>>>>  106
>>>>>>>>>>>>>>>>>  107     // Scan linked-list.
>>>>>>>>>>>>>>>>>  108     jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>>>>>>  109     while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>>>>>>  110         klass_ptr = &klass->next;
>>>>>>>>>>>>>>>>>  111         klass = *klass_ptr;
>>>>>>>>>>>>>>>>>  112         found_tag = klass->klass_tag;
>>>>>>>>>>>>>>>>>  113     }
>>>>>>>>>>>>>>>>>  114
>>>>>>>>>>>>>>>>>  115     // Tag not found? Ignore.
>>>>>>>>>>>>>>>>>  116     if (found_tag != tag) {
>>>>>>>>>>>>>>>>>  117         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>>>  118         return;
>>>>>>>>>>>>>>>>>  119     }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The code above can be simplified, so that lines 101-105 are not
>>>>>>>>>>>>>>>>> needed anymore.
>>>>>>>>>>>>>>>>> It can be something like this:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     // Scan linked-list.
>>>>>>>>>>>>>>>>>     while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>>>>>         klass_ptr = &klass->next;
>>>>>>>>>>>>>>>>>         klass = *klass_ptr;
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>     if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>>>>>>>>>>>>>>         debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>>>         return;
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It will take more time before I get a chance to look at the rest.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen when
>>>>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>>>>>>>>>>>>>> basically every operation, and also need to check whether or not
>>>>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result (e.g. an
>>>>>>>>>>>>>>>>>> empty list) when it is not.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and
>>>>>>>>>>>>>>>>>>>   we set up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>>>>>> - Prepared classes are kept in a data structure that is a table,
>>>>>>>>>>>>>>>>>>>   with each entry being the head of a linked list of KlassNode*.
>>>>>>>>>>>>>>>>>>>   The table is indexed by tag % slot-count, and we then simply
>>>>>>>>>>>>>>>>>>>   prepend the new KlassNode*. This is an O(1) operation.
>>>>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up the
>>>>>>>>>>>>>>>>>>>   signature of the reported tag in that table, and remember it in
>>>>>>>>>>>>>>>>>>>   a bag. The KlassNode* is then unlinked from the table and
>>>>>>>>>>>>>>>>>>>   deallocated. This is an ~O(1) operation too, depending on the
>>>>>>>>>>>>>>>>>>>   depth of the table. In my testcase, which hammered the code with
>>>>>>>>>>>>>>>>>>>   class-loads and unloads, I usually see depths of like 2-3, but
>>>>>>>>>>>>>>>>>>>   not usually more. It should be ok.
>>>>>>>>>>>>>>>>>>> - When processUnloads() gets called, we simply hand out that bag,
>>>>>>>>>>>>>>>>>>>   and allocate a new one.
>>>>>>>>>>>>>>>>>>> - I also added cleanup code in classTrack_reset() to avoid leaking
>>>>>>>>>>>>>>>>>>>   the signatures and KlassNode* etc. when the debug agent gets
>>>>>>>>>>>>>>>>>>>   detached and/or re-attached (was missing before).
>>>>>>>>>>>>>>>>>>> - I also added locks around data-structure manipulation (was
>>>>>>>>>>>>>>>>>>>   missing before).
>>>>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual listener
>>>>>>>>>>>>>>>>>>>   gets registered on EI_GC_FINISH. This seems to happen right when
>>>>>>>>>>>>>>>>>>>   attaching a jdb, not sure why jdb does that though. This may be
>>>>>>>>>>>>>>>>>>>   something to improve in the future?
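
The table described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from webrev.03: the KlassNode fields follow the names quoted elsewhere in this thread (klass_tag, signature, next), the value of CT_SLOT_COUNT is made up, track_add and track_remove are hypothetical helper names, and the locking and the JVMTI allocator are left out to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CT_SLOT_COUNT 263  /* hypothetical slot count, chosen for illustration */

typedef struct KlassNode {
    int64_t klass_tag;        /* tag assigned when the class was prepared */
    char *signature;          /* heap-allocated copy of the class signature */
    struct KlassNode *next;   /* next node in the same slot */
} KlassNode;

/* Each slot is the head of a singly linked list; static storage starts as NULL. */
static KlassNode *table[CT_SLOT_COUNT];

static size_t slot_for(int64_t tag)
{
    /* Go through an unsigned type so a negative tag cannot index out of range. */
    return (size_t)((uint64_t)tag % CT_SLOT_COUNT);
}

/* Register a prepared class: prepend to its slot, O(1). Returns 1 on success. */
static int track_add(int64_t tag, const char *signature)
{
    size_t len = strlen(signature) + 1;
    KlassNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    node->signature = malloc(len);
    if (node->signature == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->signature, signature, len);
    node->klass_tag = tag;
    node->next = table[slot_for(tag)];
    table[slot_for(tag)] = node;
    return 1;
}

/*
 * Handle an unload notification: find the node for the tag, unlink it, free
 * the node and return its signature (caller frees), or NULL if the tag is
 * unknown. Cost is proportional to the length of one slot's chain.
 */
static char *track_remove(int64_t tag)
{
    KlassNode **link = &table[slot_for(tag)];
    while (*link != NULL && (*link)->klass_tag != tag) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return NULL;
    }
    KlassNode *node = *link;
    char *signature = node->signature;
    *link = node->next;
    free(node);
    return signature;
}
```

Walking the chain through a pointer to the link (rather than a pointer to the node) means removing the head of a slot and removing an interior node are the same operation. In the agent, both functions would run under the lock mentioned in the list, and the returned signature would be stored in the bag instead of being handed to the caller.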
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself looks really
>>>>>>>>>>>>>>>>>>> good. The bottleneck now is clearly the actual synthesizing of the
>>>>>>>>>>>>>>>>>>> class-unload events. I don't see how this can be helped when the
>>>>>>>>>>>>>>>>>>> debug agent asks for it?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am implementing the
>>>>>>>>>>>>>>>>>>>> even more efficient ~O(1) class tracking. Please hold off
>>>>>>>>>>>>>>>>>>>> reviewing for now.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks, Roman
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be for a few
>>>>>>>>>>>>>>>>>>>>>> days. In the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>>>>>>> implementation in classTrack.c so it's easier to look through
>>>>>>>>>>>>>>>>>>>>>> the changes.
>>>>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to determine
>>>>>>>>>>>>>>>>>>>>> the signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>>>>>>>> happened, so that we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a table of
>>>>>>>>>>>>>>>>>>>>> currently prepared classes, building that table when classTrack
>>>>>>>>>>>>>>>>>>>>> is initialized, and then adding new classes whenever a class
>>>>>>>>>>>>>>>>>>>>> gets loaded. When unloading occurs, that cache is rebuilt into a
>>>>>>>>>>>>>>>>>>>>> new table and compared with the old table, and whatever is in
>>>>>>>>>>>>>>>>>>>>> the old but not in the new table gets returned. The problem is
>>>>>>>>>>>>>>>>>>>>> that when GCs happen frequently and/or many classes get
>>>>>>>>>>>>>>>>>>>>> loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared classes,
>>>>>>>>>>>>>>>>>>>>> and also tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>>>>>>>> Whenever an unload/GC occurs, the list of prepared classes is
>>>>>>>>>>>>>>>>>>>>> scanned, and classes that are also in the deletedTagBag are
>>>>>>>>>>>>>>>>>>>>> unlinked (thus maintaining the prepared-classes-list) and their
>>>>>>>>>>>>>>>>>>>>> signatures put in the list that gets returned.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine whether
>>>>>>>>>>>>>>>>>>>>> or not a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>>>>>>>> That process is therefore still O(unloadedClassCount). The
>>>>>>>>>>>>>>>>>>>>> assumption here is that unloadedClassCount << classCount. In my
>>>>>>>>>>>>>>>>>>>>> experiments this seems to be true, and also reasonable to
>>>>>>>>>>>>>>>>>>>>> expect.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1)
>>>>>>>>>>>>>>>>>>>>> but it would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>>>>>>>> (hash)table that maps tags -> KlassNode*, unlink them directly
>>>>>>>>>>>>>>>>>>>>> upon unload, and build the unloaded-signatures list there, but I
>>>>>>>>>>>>>>>>>>>>> don't currently see that it's worth the effort).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated when
>>>>>>>>>>>>>>>>>>>>> there's an actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It
>>>>>>>>>>>>>>>>>>>>>>> avoids throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>>>>>>>> track of loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> E.g. with the testcase provided here:
>>>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>>>
>>>
>


From david.holmes at oracle.com  Fri Mar 27 02:18:28 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 27 Mar 2020 12:18:28 +1000
Subject: RFR: 8241585: Remove unused _recursion_counter facility from
 PerfTraceTime
In-Reply-To: <c8d35386-a132-9439-7df4-f875201b609f@oracle.com>
References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com>
 <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
 <c8d35386-a132-9439-7df4-f875201b609f@oracle.com>
Message-ID: <880cc6fc-de02-0891-9f66-46bb6185b04f@oracle.com>

Hi Claes,

On 27/03/2020 10:11 am, Claes Redestad wrote:
> 
> 
> On 2020-03-27 00:36, David Holmes wrote:
>>>
>>
>> Okay so can you change the bug synopsis and description to cover this 
>> more general cleanup and tuneup please.
> 
> I filed an addendum RFE and will add this RFE bug id to the single
> changeset push:
> https://bugs.openjdk.java.net/browse/JDK-8241705

That works too :) Thanks.

>>
>> I'm never very clear on the uses of these PerfCounters. It seems 
>> SUN_NS is unused after this change. The references to jvmstat seem no 
>> longer correct - these are read via jstat ?
> 
> The general confusion about PerfData/-Counters and what they're for is
> why I'm trying to untangle this. Generally I think we should pull the
> plug on it, but the perfdata shared file is tangled up with
> functionality to detect running JVMs used by jcmd etc, so it might
> take a few iterations to get there.

Yeah they confuse me. Which makes it hard to see what impact your 
changes may have.

Hopefully serviceability folk are more familiar with how things hook 
together.

Thanks,
David

> /Claes

From suenaga at oss.nttdata.com  Fri Mar 27 02:49:38 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 27 Mar 2020 11:49:38 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <3477b92f-8f83-2704-ca63-4d59d4c3f3a4@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
 <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
 <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>
 <f45b2bc7-a0ca-d85b-5998-1e30e99d0d36@oracle.com>
 <cb32ea21-a06a-824a-cd6f-0b731841f352@oss.nttdata.com>
 <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>
 <e75f398d-b3b5-9253-681d-01d45414a2b5@oracle.com>
 <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com>
Message-ID: <676255e6-e5b2-dfcb-7dc9-4dd8646032ec@oss.nttdata.com>

All tests on the submit repo have passed. (mach5-one-ysuenaga-JDK-8240956-3-20200327-0003-9753265)

Yasumasa

On 2020/03/27 9:07, Yasumasa Suenaga wrote:
> Thanks Kevin and Serguei! and sorry for my English...
> 
> I uploaded new webrev:
> 
>    http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.05/
> 
> Diff from webrev.04 is here:
> 
>    http://hg.openjdk.java.net/jdk/submit/rev/d5f400d70e94
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2020/03/27 2:53, serguei.spitsyn at oracle.com wrote:
>> Hi Kevin,
>>
>> Nice catch with the name "lastFrame".
>> I was also confused when I reviewed this but did not come up with something better.
>>
>> Thanks,
>> Serguei
>>
>> On 3/26/20 10:40, Kevin Walls wrote:
>>> Hi Yasumasa,
>>>
>>> Oops, didn't catch this - I also had done some manual testing and in mach5 but clearly not enough.
>>>
>>> Generally I think this looks good.
>>>
>>> "lastFrame" can mean last as in final, or last as in previous. "last" is one of those annoying English words. Here it means final, if we get an Exception during processDwarf, use this to flag that we should return null from sender(). "finalFrame" would be clearer to me, anything else probably gets more verbose than you wanted.
>>>
>>> Yes I like having the limit on the while loop in process_dwarf(), always worried how sane the information is that we are parsing through.
>>>
>>> Thanks!
>>> Kevin
>>>
>>>
>>> On 24/03/2020 23:47, Yasumasa Suenaga wrote:
>>>> Thanks Serguei!
>>>>
>>>> I will push it when I get second reviewer.
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> I'm okay with this update.
>>>>> My mach5 test run for this patch has passed.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 3/23/20 17:08, Yasumasa Suenaga wrote:
>>>>>> Hi Serguei,
>>>>>>
>>>>>> Thanks for your comment!
>>>>>> I uploaded new webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>>>>>>
>>>>>> Also I pushed it to submit repo:
>>>>>>
>>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>>>>>>
>>>>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> The mach5 tier5 testing looks good.
>>>>>>> The serviceability/sa/ClhsdbPstack.java test fails without the fix and passes with it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> I looked at your changes.
>>>>>>>> It is hard to understand if this fully solves the issue.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>
>>>>>>>> @@ -34,10 +34,11 @@
>>>>>>>>
>>>>>>>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>>>>>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>>>>>>>         Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>         DwarfParser dwarf = null;
>>>>>>>> +       boolean unsupportedDwarf = false;
>>>>>>>>
>>>>>>>>         if (libptr != null) { // Native frame
>>>>>>>>           try {
>>>>>>>>             dwarf = new DwarfParser(libptr);
>>>>>>>>             dwarf.processDwarf(rip);
>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> @@ -45,24 +46,33 @@
>>>>>>>>                    !dwarf.isBPOffsetAvailable())
>>>>>>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>                                .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>           } catch (DebuggerException e) {
>>>>>>>> -           // Bail out to Java frame case
>>>>>>>> +           if (dwarf != null) {
>>>>>>>> +             // DWARF processing should succeed when the frame is native
>>>>>>>> +             // but it might fail if CIE has language personality routine
>>>>>>>> +             // and/or LSDA.
>>>>>>>> +             dwarf = null;
>>>>>>>> +             unsupportedDwarf = true;
>>>>>>>> +           } else {
>>>>>>>> +             throw e;
>>>>>>>> +           }
>>>>>>>>           }
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         return (cfa == null) ? null
>>>>>>>> -                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>>>>>> +                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>>>>>      }
>>>>>>>>
>>>>>>>> @@ -121,13 +131,25 @@
>>>>>>>>        }
>>>>>>>>
>>>>>>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>>>>>      }
>>>>>>>>
>>>>>>>> -    private DwarfParser getNextDwarf(Address nextPC) {
>>>>>>>> -      DwarfParser nextDwarf = null;
>>>>>>>> +    @Override
>>>>>>>> +    public CFrame sender(ThreadProxy thread) {
>>>>>>>> +      if (!possibleNext) {
>>>>>>>> +        return null;
>>>>>>>> +      }
>>>>>>>> +
>>>>>>>> +      ThreadContext context = thread.getContext();
>>>>>>>> +
>>>>>>>> +      Address nextPC = getNextPC(dwarf != null);
>>>>>>>> +      if (nextPC == null) {
>>>>>>>> +        return null;
>>>>>>>> +      }
>>>>>>>>
>>>>>>>> +      DwarfParser nextDwarf = null;
>>>>>>>> +      boolean unsupportedDwarf = false;
>>>>>>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>>>>>          nextDwarf = dwarf;
>>>>>>>>        } else {
>>>>>>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>          if (libptr != null) {
>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> @@ -138,33 +160,29 @@
>>>>>>>>            }
>>>>>>>>          }
>>>>>>>>        }
>>>>>>>>
>>>>>>>>        if (nextDwarf != null) {
>>>>>>>> +        try {
>>>>>>>>          nextDwarf.processDwarf(nextPC);
>>>>>>>> +        } catch (DebuggerException e) {
>>>>>>>> +          // DWARF processing should succeed when the frame is native
>>>>>>>> +          // but it might fail if CIE has language personality routine
>>>>>>>> +          // and/or LSDA.
>>>>>>>> +          nextDwarf = null;
>>>>>>>> +          unsupportedDwarf = true;
>>>>>>>>        }
>>>>>>>>
>>>>>>>> This fix looks like a hack.
>>>>>>>> Should we just propagate the Debugging exception instead of trying to maintain unsupportedDwarf flag?
>>>>>>
>>>>>> DwarfParser::processDwarf would throw DebuggerException if it cannot find DWARF which relates to PC.
>>>>>> PC at this point is for next frame. So current frame (`this` object) is valid, and it should be processed.
>>>>>>
>>>>>>
>>>>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, FDE, LSDA, etc.) are used without any comments explaining them.
>>>>>>>> The code has to be generally readable without looking into the DWARF spec each time.
>>>>>>
>>>>>> I added comments for them in this webrev.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>>> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>>>>>>> Thanks Chris!
>>>>>>>>> I'm waiting for reviewers for this change.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> The failure is due to JDK-8231634, so not something you need to worry about.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>> I uploaded new webrev which includes reverting change for ProblemList:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>>>>>>
>>>>>>>>>>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>>>>>>> However, I think it is not caused by this change: ClhsdbJstackXcompStress.java tests `jhsdb jstack` without mixed mode, so it would not parse DWARF.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it:
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>>>>>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>>>>>>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>>>>>>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>>>>>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
>>>>>>>>>>>>> So please review it:
>>>>>>>>>>>>>
>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Thank you so much, David!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I missed a loop condition, so I fixed it and pushed to submit repo.
>>>>>>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result.
>>>>>>>>>>>>>>>>>>>> I will share it with you when I get the job ID.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced a DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently since then.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause and found two concerns:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>>>>>>>>>>>>>>>>>>    B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and this webrev ignores the personality routine and LSDA.
>>>>>>>>>>>>>>>>>>>>>>>>>> I also added bailout code for the case where DWARF processing fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>
>>

From kevin.walls at oracle.com  Fri Mar 27 07:42:16 2020
From: kevin.walls at oracle.com (Kevin Walls)
Date: Fri, 27 Mar 2020 07:42:16 +0000
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <676255e6-e5b2-dfcb-7dc9-4dd8646032ec@oss.nttdata.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <a13433b4-e3f8-d280-b83f-cde27f7282cd@oracle.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
 <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
 <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>
 <f45b2bc7-a0ca-d85b-5998-1e30e99d0d36@oracle.com>
 <cb32ea21-a06a-824a-cd6f-0b731841f352@oss.nttdata.com>
 <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>
 <e75f398d-b3b5-9253-681d-01d45414a2b5@oracle.com>
 <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com>
 <676255e6-e5b2-dfcb-7dc9-4dd8646032ec@oss.nttdata.com>
Message-ID: <bf05fc4a-d5e0-43e0-d921-509b35093d87@oracle.com>

Great, thanks Yasumasa.  Don't worry, the language is not just you -
it's often unclear in other places. 8-)  Sorry, maybe I should have said
you didn't need to resubmit the webrev for that, but a retest is nice.
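
On the naming point, the pattern behind the flag can be sketched roughly as
below. This is only an illustration under made-up names: Frame,
UnwindException, parseUnwindInfo and makeFrame are hypothetical stand-ins,
not the real LinuxAMD64CFrame, DebuggerException or DwarfParser code. The
idea is that a frame whose unwind data cannot be parsed is still reported,
but it is marked as the final frame so that sender() returns null.

```java
import java.util.ArrayList;
import java.util.List;

public class FinalFrameSketch {
    /** Hypothetical stand-in for the exception thrown on unsupported unwind data. */
    static class UnwindException extends RuntimeException {
        UnwindException(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for a native frame; not the real SA class. */
    static final class Frame {
        final int pc;
        final boolean finalFrame; // true: this frame is valid, but the walk stops here

        Frame(int pc, boolean finalFrame) {
            this.pc = pc;
            this.finalFrame = finalFrame;
        }

        /** Returns the caller's frame, or null when this is the final frame. */
        Frame sender() {
            if (finalFrame) {
                return null;
            }
            return makeFrame(pc + 1); // toy model: the caller's PC is simply pc + 1
        }
    }

    /** Pretend parser: PCs 3 and above carry unwind data we cannot handle. */
    static void parseUnwindInfo(int pc) {
        if (pc >= 3) {
            throw new UnwindException("unsupported unwind data at pc " + pc);
        }
    }

    /** Builds a frame and records, rather than propagates, a parse failure. */
    static Frame makeFrame(int pc) {
        boolean finalFrame = false;
        try {
            parseUnwindInfo(pc);
        } catch (UnwindException e) {
            finalFrame = true;
        }
        return new Frame(pc, finalFrame);
    }

    /** Walks from the top frame until sender() returns null. */
    static List<Integer> walk(int topPc) {
        List<Integer> pcs = new ArrayList<>();
        for (Frame f = makeFrame(topPc); f != null; f = f.sender()) {
            pcs.add(f.pc);
        }
        return pcs;
    }

    public static void main(String[] args) {
        System.out.println(walk(0)); // prints [0, 1, 2, 3]
    }
}
```

In this toy model the frame at pc 3 still shows up in the walk even though
its unwind data could not be parsed, which is why "final" reads more
clearly than "last" for the flag.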

Thanks
Kevin


On 27/03/2020 02:49, Yasumasa Suenaga wrote:
> All tests on submit repo have passed.
> (mach5-one-ysuenaga-JDK-8240956-3-20200327-0003-9753265)
>
> Yasumasa
>
> On 2020/03/27 9:07, Yasumasa Suenaga wrote:
>> Thanks Kevin and Serguei! and sorry for my English...
>>
>> I uploaded new webrev:
>>
>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.05/
>>
>> Diff from webrev.04 is here:
>>
>>    http://hg.openjdk.java.net/jdk/submit/rev/d5f400d70e94
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/27 2:53, serguei.spitsyn at oracle.com wrote:
>>> Hi Kevin,
>>>
>>> Nice catch with the name "lastFrame".
>>> I was also confused when I reviewed this but did not come up with
>>> something better.
>>>
>>> Thanks,
>>> Serguei
>>>
>>> On 3/26/20 10:40, Kevin Walls wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Oops, didn't catch this - I also had done some manual testing and 
>>>> in mach5 but clearly not enough.
>>>>
>>>> Generally I think this looks good.
>>>>
>>>> "lastFrame" can mean last as in final, or last as in previous. 
>>>> "last" is one of those annoying English words. Here it means final, 
>>>> if we get an Exception during processDwarf, use this to flag that 
>>>> we should return null from sender().  "finalFrame" would be clearer
>>>> to me, anything else probably gets more verbose than you wanted.
>>>>
>>>> Yes I like having the limit on the while loop in process_dwarf(), 
>>>> always worried how sane the information is that we are parsing 
>>>> through.
>>>>
>>>> Thanks!
>>>> Kevin
>>>>
>>>>
>>>> On 24/03/2020 23:47, Yasumasa Suenaga wrote:
>>>>> Thanks Serguei!
>>>>>
>>>>> I will push it when I get second reviewer.
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> I'm okay with this update.
>>>>>> My mach5 test run for this patch passed.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 3/23/20 17:08, Yasumasa Suenaga wrote:
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> Thanks for your comment!
>>>>>>> I uploaded new webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>>>>>>>
>>>>>>> Also I pushed it to submit repo:
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>>>>>>>
>>>>>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> The mach5 tier5 testing looks good.
>>>>>>>> serviceability/sa/ClhsdbPstack.java fails without the fix
>>>>>>>> and passes with it.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> I looked at your changes.
>>>>>>>>> It is hard to understand if this fully solves the issue.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> @@ -34,10 +34,11 @@
>>>>>>>>>
>>>>>>>>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>>>>>>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>>>>>>>>         Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>         DwarfParser dwarf = null;
>>>>>>>>> +       boolean unsupportedDwarf = false;
>>>>>>>>>
>>>>>>>>>         if (libptr != null) { // Native frame
>>>>>>>>>           try {
>>>>>>>>>             dwarf = new DwarfParser(libptr);
>>>>>>>>>             dwarf.processDwarf(rip);
>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> @@ -45,24 +46,33 @@
>>>>>>>>>                    !dwarf.isBPOffsetAvailable())
>>>>>>>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>                                .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>           } catch (DebuggerException e) {
>>>>>>>>> -           // Bail out to Java frame case
>>>>>>>>> +           if (dwarf != null) {
>>>>>>>>> +             // DWARF processing should succeed when the frame is native
>>>>>>>>> +             // but it might fail if CIE has language personality routine
>>>>>>>>> +             // and/or LSDA.
>>>>>>>>> +             dwarf = null;
>>>>>>>>> +             unsupportedDwarf = true;
>>>>>>>>> +           } else {
>>>>>>>>> +             throw e;
>>>>>>>>> +           }
>>>>>>>>>           }
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>>>         return (cfa == null) ? null
>>>>>>>>> -                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>>>>>>> +                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> @@ -121,13 +131,25 @@
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    private DwarfParser getNextDwarf(Address nextPC) {
>>>>>>>>> -      DwarfParser nextDwarf = null;
>>>>>>>>> +    @Override
>>>>>>>>> +    public CFrame sender(ThreadProxy thread) {
>>>>>>>>> +      if (!possibleNext) {
>>>>>>>>> +        return null;
>>>>>>>>> +      }
>>>>>>>>> +
>>>>>>>>> +      ThreadContext context = thread.getContext();
>>>>>>>>> +
>>>>>>>>> +      Address nextPC = getNextPC(dwarf != null);
>>>>>>>>> +      if (nextPC == null) {
>>>>>>>>> +        return null;
>>>>>>>>> +      }
>>>>>>>>>
>>>>>>>>> +      DwarfParser nextDwarf = null;
>>>>>>>>> +      boolean unsupportedDwarf = false;
>>>>>>>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>>>>>>          nextDwarf = dwarf;
>>>>>>>>>        } else {
>>>>>>>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>          if (libptr != null) {
>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> @@ -138,33 +160,29 @@
>>>>>>>>>            }
>>>>>>>>>          }
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        if (nextDwarf != null) {
>>>>>>>>> +        try {
>>>>>>>>>          nextDwarf.processDwarf(nextPC);
>>>>>>>>> +        } catch (DebuggerException e) {
>>>>>>>>> +          // DWARF processing should succeed when the frame is native
>>>>>>>>> +          // but it might fail if CIE has language personality routine
>>>>>>>>> +          // and/or LSDA.
>>>>>>>>> +          nextDwarf = null;
>>>>>>>>> +          unsupportedDwarf = true;
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>> This fix looks like a hack.
>>>>>>>>> Should we just propagate the Debugging exception instead of 
>>>>>>>>> trying to maintain unsupportedDwarf flag?
>>>>>>>
>>>>>>> DwarfParser::processDwarf would throw DebuggerException if it 
>>>>>>> cannot find DWARF which relates to PC.
>>>>>>> PC at this point is for next frame. So current frame (`this` 
>>>>>>> object) is valid, and it should be processed.
>>>>>>>
>>>>>>>
>>>>>>>>> Also, I don't like that DWARF-specific abbreviations (like 
>>>>>>>>> CIE, FDE, LSDA, etc.) are used without any comments explaining
>>>>>>>>> them.
>>>>>>>>> The code has to be generally readable without looking into the 
>>>>>>>>> DWARF spec each time.
>>>>>>>
>>>>>>> I added comments for them in this webrev.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>>> I'm submitting mach5 jobs to make sure the issue has been 
>>>>>>>>> resolved with your fix.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>>>>>>>> Thanks Chris!
>>>>>>>>>> I'm waiting for reviewers for this change.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>
>>>>>>>>>>> The failure is due to JDK-8231634, so not something you need 
>>>>>>>>>>> to worry about.
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>
>>>>>>>>>>>> I uploaded new webrev which includes reverting change for 
>>>>>>>>>>>> ProblemList:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>>>>>>>
>>>>>>>>>>>> I tested it on submit repo 
>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>>>>>>>> However, I think it is not caused by this change:
>>>>>>>>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack` without
>>>>>>>>>>>> mixed mode, so it would not parse DWARF.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>
>>>>>>>>>>>>> The test has been problem listed so please add undoing 
>>>>>>>>>>>>> this to your webrev. Here's the diff that problem listed it:
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>>>>>>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>>>>>>>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>>>>>>>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>>>>>>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This webrev has passed submit repo 
>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) 
>>>>>>>>>>>>>> and additional tests.
>>>>>>>>>>>>>> So please review it:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Thank you so much, David!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I missed a loop condition, so I fixed it and pushed to
>>>>>>>>>>>>>>>>>> submit repo.
>>>>>>>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/ 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Test job resubmitted. Will advise results if it 
>>>>>>>>>>>>>>>>> completes before I go to bed :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can 
>>>>>>>>>>>>>>>>>>>>>> then go and run additional internal tests (and 
>>>>>>>>>>>>>>>>>>>>>> even more builds) using that job.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've 
>>>>>>>>>>>>>>>>>>>>> not yet received the result.
>>>>>>>>>>>>>>>>>>>>> I will share it with you when I get the job ID.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds 
>>>>>>>>>>>>>>>>>>>> to complete before submitting the additional tests.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame 
>>>>>>>>>>>>>>>>>>>>>>> when DWARF has language personality routine or 
>>>>>>>>>>>>>>>>>>>>>>> LSDA.
>>>>>>>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ 
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 
>>>>>>>>>>>>>>>>>>>>>>> 7.7 .
>>>>>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652 
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about 
>>>>>>>>>>>>>>>>>>>>>>>>>> the code, but I'm putting the patch through 
>>>>>>>>>>>>>>>>>>>>>>>>>> our internal testing.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java 
>>>>>>>>>>>>>>>>>>>>>>>>> Runtime Environment:
>>>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, 
>>>>>>>>>>>>>>>>>>>>>>>>> pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment 
>>>>>>>>>>>>>>>>>>>>>>>>> (15.0) (fastdebug build 
>>>>>>>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev) 
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM 
>>>>>>>>>>>>>>>>>>>>>>>>> (fastdebug 
>>>>>>>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, 
>>>>>>>>>>>>>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, 
>>>>>>>>>>>>>>>>>>>>>>>>> g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] 
>>>>>>>>>>>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to 
>>>>>>>>>>>>>>>>>>>>>>>>> always crash now.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 
>>>>>>>>>>>>>>>>>>>>>>>> runs of the test in linux-x64. I don't see a 
>>>>>>>>>>>>>>>>>>>>>>>> pattern as to where it fails versus passes.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>    JBS: 
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240956 
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>    webrev: 
>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ 
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA 
>>>>>>>>>>>>>>>>>>>>>>>>>>> for unwinding native frames in jstack mixed 
>>>>>>>>>>>>>>>>>>>>>>>>>>> mode.
>>>>>>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently 
>>>>>>>>>>>>>>>>>>>>>>>>>>> after that.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause of this and found 
>>>>>>>>>>>>>>>>>>>>>>>>>>> two concerns:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section 
>>>>>>>>>>>>>>>>>>>>>>>>>>> data) range check
>>>>>>>>>>>>>>>>>>>>>>>>>>>    B: Language personality routine and 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Language Specific Data Area (LSDA) are not 
>>>>>>>>>>>>>>>>>>>>>>>>>>> considered
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, 
>>>>>>>>>>>>>>>>>>>>>>>>>>> and ignore the personality routine and LSDA in 
>>>>>>>>>>>>>>>>>>>>>>>>>>> this webrev.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Also I added bailout code for the case where DWARF 
>>>>>>>>>>>>>>>>>>>>>>>>>>> processing fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit 
>>>>>>>>>>>>>>>>>>>>>>>>>>> repo 
>>>>>>>>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), 
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>>


From suenaga at oss.nttdata.com  Fri Mar 27 07:54:08 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 27 Mar 2020 16:54:08 +0900
Subject: RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
In-Reply-To: <bf05fc4a-d5e0-43e0-d921-509b35093d87@oracle.com>
References: <0df41255-25e9-9eb2-8c35-4d80c3a3600b@oss.nttdata.com>
 <ff79a3c5-7416-6ee1-73de-0ef1d91a7480@oracle.com>
 <b9694754-b669-0ea3-2ba9-a432b9d8dcfb@oss.nttdata.com>
 <6ce14ba3-e9e7-f88a-e813-102fda0cd2f3@oss.nttdata.com>
 <c00aeece-863a-dfb3-f8da-2d3d5ae25330@oracle.com>
 <319e3627-2269-0014-4fb9-7f8f4014787f@oss.nttdata.com>
 <f0488e84-9758-ea16-82a6-4ce1f424a523@oracle.com>
 <4f940048-06e1-bda1-6f75-225001572232@oss.nttdata.com>
 <908005c7-c3ad-2f92-5853-2ae0895689b4@oracle.com>
 <2114ef12-b790-ee2d-4242-d114cc47151c@oracle.com>
 <9c9950af-5e36-90b5-d22f-a3e00d45fbcc@oss.nttdata.com>
 <f45b2bc7-a0ca-d85b-5998-1e30e99d0d36@oracle.com>
 <cb32ea21-a06a-824a-cd6f-0b731841f352@oss.nttdata.com>
 <0851349f-ee17-4b0e-f186-a378cedd6913@oracle.com>
 <e75f398d-b3b5-9253-681d-01d45414a2b5@oracle.com>
 <48d5c2a6-777b-b17d-db05-1b71b712fb4e@oss.nttdata.com>
 <676255e6-e5b2-dfcb-7dc9-4dd8646032ec@oss.nttdata.com>
 <bf05fc4a-d5e0-43e0-d921-509b35093d87@oracle.com>
Message-ID: <e127bf0d-37dc-eb19-2174-3b103bf6872e@oss.nttdata.com>

Thanks Kevin! I will push it.

Yasumasa

On 2020/03/27 16:42, Kevin Walls wrote:
> Great, thanks Yasumasa. Don't worry, the language is not just you - it's often unclear in other places. 8-) Sorry maybe I should have said you didn't need to resubmit the webrev for that, but a retest is nice.
> 
> Thanks
> Kevin
> 
> 
> On 27/03/2020 02:49, Yasumasa Suenaga wrote:
>> All tests on the submit repo have passed. (mach5-one-ysuenaga-JDK-8240956-3-20200327-0003-9753265)
>>
>> Yasumasa
>>
>> On 2020/03/27 9:07, Yasumasa Suenaga wrote:
>>> Thanks Kevin and Serguei! and sorry for my English...
>>>
>>> I uploaded new webrev:
>>>
>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.05/
>>>
>>> Diff from webrev.04 is here:
>>>
>>>    http://hg.openjdk.java.net/jdk/submit/rev/d5f400d70e94
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/27 2:53, serguei.spitsyn at oracle.com wrote:
>>>> Hi Kevin,
>>>>
>>>> Nice catch with the name "lastFrame".
>>>> I was also confused when I reviewed this but did not come up with something better.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>> On 3/26/20 10:40, Kevin Walls wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> Oops, didn't catch this - I also had done some manual testing and in mach5 but clearly not enough.
>>>>>
>>>>> Generally I think this looks good.
>>>>>
>>>>> "lastFrame" can mean last as in final, or last as in previous. "last" is one of those annoying English words. Here it means final, if we get an Exception during processDwarf, use this to flag that we should return null from sender(). "finalFrame" would be clearer to me, anything else probably gets more verbose than you wanted.
>>>>>
>>>>> Yes I like having the limit on the while loop in process_dwarf(), always worried how sane the information is that we are parsing through.
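
The two safeguards discussed in this thread, a range check on the .eh_frame buffer and an upper limit on the parsing loop, can be sketched roughly as below. This is only an illustration in Java, not the C++ code in libsaproc: the class BoundedSectionReader, its methods, and the maxEntries parameter are all hypothetical names, and the input is treated as a plain sequence of unsigned LEB128 values rather than real CIE/FDE records.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the libsaproc DwarfParser.
final class BoundedSectionReader {
    private final byte[] buf; // stand-in for the .eh_frame section data
    private int pos;

    BoundedSectionReader(byte[] buf) {
        this.buf = buf;
    }

    // Range check before every read, so corrupt input raises an exception
    // instead of reading past the end of the buffer.
    int readByte() {
        if (pos >= buf.length) {
            throw new IllegalStateException("read past end of section at offset " + pos);
        }
        return buf[pos++] & 0xff;
    }

    // Decode one unsigned LEB128 value (the variable-length integer
    // encoding used by DWARF). Encodings longer than 10 bytes are rejected.
    long readUleb128() {
        long result = 0;
        int shift = 0;
        while (true) {
            if (shift >= 64) {
                throw new IllegalStateException("ULEB128 value is longer than 10 bytes");
            }
            int b = readByte();
            result |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Read values until the buffer is exhausted, but never more than
    // maxEntries, so the loop is bounded even if the data is not sane.
    List<Long> readAll(int maxEntries) {
        List<Long> values = new ArrayList<>();
        while (pos < buf.length) {
            if (values.size() >= maxEntries) {
                throw new IllegalStateException("more than " + maxEntries + " entries");
            }
            values.add(readUleb128());
        }
        return values;
    }
}
```

For example, the bytes 0xE5 0x8E 0x26 decode to 624485, while a lone 0x80 byte (continuation bit set, nothing following) is rejected by the range check rather than read past the buffer.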
>>>>>
>>>>> Thanks!
>>>>> Kevin
>>>>>
>>>>>
>>>>> On 24/03/2020 23:47, Yasumasa Suenaga wrote:
>>>>>> Thanks Serguei!
>>>>>>
>>>>>> I will push it when I get a second reviewer.
>>>>>>
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/03/25 1:39, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> I'm okay with this update.
>>>>>>> My mach5 test run for this patch has passed.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 3/23/20 17:08, Yasumasa Suenaga wrote:
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> Thanks for your comment!
>>>>>>>> I uploaded new webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>>>>>>>>
>>>>>>>> Also I pushed it to submit repo:
>>>>>>>>
>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>>>>>>>>
>>>>>>>> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> The mach5 tier5 testing looks good.
>>>>>>>>> The serviceability/sa/ClhsdbPstack.java test fails without the fix and does not fail with it.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> I looked at your changes.
>>>>>>>>>> It is hard to understand if this fully solves the issue.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>>>>
>>>>>>>>>> @@ -34,10 +34,11 @@
>>>>>>>>>>      public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, Address rip, ThreadContext context) {
>>>>>>>>>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>>>>>>>>>         Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>>>>>>         DwarfParser dwarf = null;
>>>>>>>>>> +       boolean unsupportedDwarf = false;
>>>>>>>>>>         if (libptr != null) { // Native frame
>>>>>>>>>>           try {
>>>>>>>>>>             dwarf = new DwarfParser(libptr);
>>>>>>>>>>             dwarf.processDwarf(rip);
>>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> @@ -45,24 +46,33 @@
>>>>>>>>>>                    !dwarf.isBPOffsetAvailable())
>>>>>>>>>>                       ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>>>                       : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>>>>                                .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>>>           } catch (DebuggerException e) {
>>>>>>>>>> -           // Bail out to Java frame case
>>>>>>>>>> +           if (dwarf != null) {
>>>>>>>>>> +             // DWARF processing should succeed when the frame is native
>>>>>>>>>> +             // but it might fail if CIE has language personality routine
>>>>>>>>>> +             // and/or LSDA.
>>>>>>>>>> +             dwarf = null;
>>>>>>>>>> +             unsupportedDwarf = true;
>>>>>>>>>> +           } else {
>>>>>>>>>> +             throw e;
>>>>>>>>>> +           }
>>>>>>>>>>           }
>>>>>>>>>>         }
>>>>>>>>>>         return (cfa == null) ? null
>>>>>>>>>> -                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>>>>>>>>> +                            : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>> @@ -121,13 +131,25 @@
>>>>>>>>>>        }
>>>>>>>>>>        return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>>>>>>>>      }
>>>>>>>>>> -    private DwarfParser getNextDwarf(Address nextPC) {
>>>>>>>>>> -      DwarfParser nextDwarf = null;
>>>>>>>>>> +    @Override
>>>>>>>>>> +    public CFrame sender(ThreadProxy thread) {
>>>>>>>>>> +      if (!possibleNext) {
>>>>>>>>>> +        return null;
>>>>>>>>>> +      }
>>>>>>>>>> +
>>>>>>>>>> +      ThreadContext context = thread.getContext();
>>>>>>>>>> +
>>>>>>>>>> +      Address nextPC = getNextPC(dwarf != null);
>>>>>>>>>> +      if (nextPC == null) {
>>>>>>>>>> +        return null;
>>>>>>>>>> +      }
>>>>>>>>>> +      DwarfParser nextDwarf = null;
>>>>>>>>>> +      boolean unsupportedDwarf = false;
>>>>>>>>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>>>>>>>>          nextDwarf = dwarf;
>>>>>>>>>>        } else {
>>>>>>>>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>>>          if (libptr != null) {
>>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> @@ -138,33 +160,29 @@
>>>>>>>>>>            }
>>>>>>>>>>          }
>>>>>>>>>>        }
>>>>>>>>>>        if (nextDwarf != null) {
>>>>>>>>>> +        try {
>>>>>>>>>>          nextDwarf.processDwarf(nextPC);
>>>>>>>>>> +        } catch (DebuggerException e) {
>>>>>>>>>> +          // DWARF processing should succeed when the frame is native
>>>>>>>>>> +          // but it might fail if CIE has language personality routine
>>>>>>>>>> +          // and/or LSDA.
>>>>>>>>>> +          nextDwarf = null;
>>>>>>>>>> +          unsupportedDwarf = true;
>>>>>>>>>>        }
>>>>>>>>>>
>>>>>>>>>> This fix looks like a hack.
>>>>>>>>>> Should we just propagate the Debugging exception instead of trying to maintain unsupportedDwarf flag?
>>>>>>>>
>>>>>>>> DwarfParser::processDwarf would throw DebuggerException if it cannot find DWARF which relates to PC.
>>>>>>>> PC at this point is for next frame. So current frame (`this` object) is valid, and it should be processed.
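
The behaviour described here, where a frame whose unwind information cannot be processed is still reported but becomes the last one in the walk, can be sketched as follows. This is a simplified model under stated assumptions, not the SA's LinuxAMD64CFrame: FrameWalkSketch, Frame, the callerOf map (pc to caller pc) and the unparsable set are hypothetical stand-ins for the debugger state and the DWARF parser, and IllegalStateException stands in for DebuggerException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the frame walk; not the SA implementation.
final class FrameWalkSketch {
    static final class Frame {
        final long pc;
        // True when unwind info for this pc could not be processed:
        // the frame itself is valid, but its caller cannot be found.
        final boolean finalFrame;

        Frame(long pc, boolean finalFrame) {
            this.pc = pc;
            this.finalFrame = finalFrame;
        }

        // Returns the caller's frame, or null when the walk has to stop.
        Frame sender(Map<Long, Long> callerOf, Set<Long> unparsable) {
            if (finalFrame) {
                return null;
            }
            Long nextPc = callerOf.get(pc);
            if (nextPc == null) {
                return null; // outermost frame
            }
            boolean failed = false;
            try {
                processUnwindInfo(nextPc, unparsable);
            } catch (IllegalStateException e) {
                // Keep the next frame, but remember that we cannot go past it.
                failed = true;
            }
            return new Frame(nextPc, failed);
        }
    }

    // Stand-in for DWARF processing: fails for pcs listed as unparsable.
    static void processUnwindInfo(long pc, Set<Long> unparsable) {
        if (unparsable.contains(pc)) {
            throw new IllegalStateException("cannot process unwind info for pc " + pc);
        }
    }

    // Collect the pcs of all frames reachable from the top frame.
    static List<Long> walk(long topPc, Map<Long, Long> callerOf, Set<Long> unparsable) {
        List<Long> pcs = new ArrayList<>();
        for (Frame f = new Frame(topPc, false); f != null; f = f.sender(callerOf, unparsable)) {
            pcs.add(f.pc);
        }
        return pcs;
    }
}
```

With a call chain 1 -> 2 -> 3 -> 4 and pc 3 marked as unparsable, the walk in this model reports frames 1, 2 and 3 and then stops, instead of failing the whole stack trace.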
>>>>>>>>
>>>>>>>>
>>>>>>>>>> Also, I don't like that DWARF-specific abbreviations (like CIE, FDE, LSDA, etc.) are used without any comments explaining them.
>>>>>>>>>> The code has to be generally readable without looking into the DWARF spec each time.
>>>>>>>>
>>>>>>>> I added comments for them in this webrev.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>>>> I'm submitting mach5 jobs to make sure the issue has been resolved with your fix.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>>>>>>>>> Thanks Chris!
>>>>>>>>>>> I'm waiting for reviewers for this change.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> The failure is due to JDK-8231634, so not something you need to worry about.
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I uploaded new webrev which includes reverting change for ProblemList:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tested it on submit repo (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>>>>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>>>>>>>>> However, I think it is not caused by this change: ClhsdbJstackXcompStress.java tests `jhsdb jstack` without mixed mode, so it would not parse DWARF.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The test has been problem listed so please add undoing this to your webrev. Here's the diff that problem listed it:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>>>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>>>>>>>>  serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>>>>>>>>  serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>>>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>>>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 solaris-all,linux-all
>>>>>>>>>>>>>>  serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris-all
>>>>>>>>>>>>>>  serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>>>>>>>>  serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This webrev has passed submit repo (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and additional tests.
>>>>>>>>>>>>>>> So please review it:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Thank you so much, David!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I missed a loop condition, so I fixed it and pushed to the submit repo.
>>>>>>>>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Test job resubmitted. Will advise results if it completes before I go to bed :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, tid=13704
>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0640217.suenaga.source, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then go and run additional internal tests (and even more builds) using that job.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet received the result.
>>>>>>>>>>>>>>>>>>>>>> I will share it with you when I get the job ID.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to complete before submitting the additional tests.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the code, but I'm putting the patch through our internal testing.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) (fastdebug build 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e] DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always crash now.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs of the test in linux-x64. I don't see a pattern as to where it fails versus passes.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, some errors have been seen intermittently after that.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I investigated the cause of this and found two concerns:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section data) range check
>>>>>>>>>>>>>>>>>>>>>>>>>>>>    B: Language personality routine and Language Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I added a range check for .eh_frame processing, and ignore the personality routine and LSDA in this webrev.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also I added bailout code for the case where DWARF processing fails due to these concerns.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>>
> 

From claes.redestad at oracle.com  Fri Mar 27 10:26:36 2020
From: claes.redestad at oracle.com (Claes Redestad)
Date: Fri, 27 Mar 2020 11:26:36 +0100
Subject: RFR: 8241585: Remove unused _recursion_counter facility from
 PerfTraceTime
In-Reply-To: <880cc6fc-de02-0891-9f66-46bb6185b04f@oracle.com>
References: <91a8ebbc-522d-bd67-6304-c9e097bd8366@oracle.com>
 <31c866a4-e135-adbc-cb8c-81fbd77bb59e@oracle.com>
 <c8d35386-a132-9439-7df4-f875201b609f@oracle.com>
 <880cc6fc-de02-0891-9f66-46bb6185b04f@oracle.com>
Message-ID: <2a064a78-09cb-4f0e-6bef-4a74ca9712f9@oracle.com>


On 2020-03-27 03:18, David Holmes wrote:
> 
> Yeah they confuse me. Which makes it hard to see what impact your 
> changes may have.

This patch removes some internal, unused code on the JVM end that is not
observable via jstat / jvmstat. I'm happy if serviceability can weigh in
though.

The other RFE I've filed, to remove StatSampler [1], might be more
contentious since it changes what gets periodically stored in the
perfdata shared file. I've not yet decided if it's worth the trouble to
move ahead with that at this point.

/Claes

[1] https://bugs.openjdk.java.net/browse/JDK-8241701

From jan.lahoda at oracle.com  Fri Mar 27 11:31:33 2020
From: jan.lahoda at oracle.com (Jan Lahoda)
Date: Fri, 27 Mar 2020 12:31:33 +0100
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>

Hi Mandy,

Regarding the javac changes - should those be switched on/off depending 
on the Target? Or, if one compiles with e.g. --release 14, will the newly 
generated output still work on JDK 14?

Jan

On 27. 03. 20 0:57, Mandy Chung wrote:
> Please review the implementation of JEP 371: Hidden Classes. The main 
> changes are in core-libs and hotspot runtime area. Small changes are 
> made in javac, VM compiler (intrinsification of Class::isHiddenClass), 
> JFR, JDI, and jcmd. CSR [1] has been reviewed and is in the finalized 
> state (see specdiff and javadoc below for reference).
> 
> Webrev:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 
> 
> 
> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
> point of view, a hidden class is a normal class except for the following:
> 
> - A hidden class has no initiating class loader and is not registered in 
> any dictionary.
> - A hidden class has a name containing an illegal character: 
> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` 
> returns "Lp/Foo.0x1234;".
> - A hidden class is not modifiable, i.e. it cannot be redefined or 
> retransformed. JVM TI IsModifiableClass returns false on a hidden class.
> - Final fields in a hidden class are "final". The value of a final field 
> cannot be overridden via reflection. setAccessible(true) can still be 
> called on reflected objects representing final fields in a hidden class 
> and the access check will be suppressed, but they only have read access 
> (i.e. can do Field::getXXX but not setXXX).
> 
> Brief summary of this patch:
> 
> 1. A new Lookup::defineHiddenClass method is the API to create a hidden 
>    class.
> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG 
>    options that can be specified when creating a hidden class.
> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
> 4. Field::setXXX method will throw IAE on a final field of a hidden class
>    regardless of the value of the accessible flag.
> 5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass
>    and defineHiddenClass to create a class from the given bytes.
> 6. ClassLoaderData implementation is not changed. There is one primary CLD
>    that holds the classes strongly referenced by its defining loader.
>    There can be zero or more additional CLDs - one per weak class.
> 7. Nest host determination is updated per revised JVMS 5.4.4. Access
>    control check no longer throws LinkageError but instead it will throw
>    IAE with a clear message if a class fails to resolve/validate the nest
>    host declared in NestHost/NestMembers attribute.
> 8. JFR, jcmd, JDI are updated to support hidden classes.
> 9. Update javac LambdaToMethod as lambda proxy starts using nestmates
>    and generate a bridge method to desugar a method reference to a
>    protected method in its supertype in a different package.
> 
> This patch also updates StringConcatFactory, LambdaMetafactory, and
> LambdaForms to use hidden classes.  The webrev includes changes in nashorn
> to use hidden classes, and I will update the webrev if JEP 372 removes it
> any time soon.
> 
> We uncovered a bug: the Lookup::defineClass spec throws LinkageError and
> intends to have the newly created class linked.  However, the
> implementation in 14 does not link the class.  A separate CSR [2] proposes
> to update the implementation to match the spec.  This patch fixes the
> implementation.
> 
> The spec update on JVM TI, JDI and Instrumentation will be done as
> a separate RFE [3].  This patch includes new tests for JVM TI and
> java.instrument that validate how the existing APIs work for hidden
> classes.
> 
> javadoc/specdiff
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ 
> 
> 
> JVMS 5.4.4 change:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf 
> 
> 
> CSR:
> https://bugs.openjdk.java.net/browse/JDK-8238359
> 
> Thanks
> Mandy
> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
> [3] https://bugs.openjdk.java.net/browse/JDK-8230502

From forax at univ-mlv.fr  Fri Mar 27 12:00:06 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 27 Mar 2020 13:00:06 +0100 (CET)
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <1271059704.1214239.1585310406005.JavaMail.zimbra@u-pem.fr>

Hi Mandy,
In ReflectionFactory, why is the check for the anonymous class removed in the constructor case?

In BytecodeGenerator, the comment "// bootstrapping issue if using condy"
could be moved to the top of clinit; I asked myself the same question when I saw that a static block was generated.

In AbstractValidatingLambdaMetafactory.java, is the field caller unused after all?

regards,
Rémi


From mandy.chung at oracle.com  Fri Mar 27 15:50:55 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 27 Mar 2020 08:50:55 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <1271059704.1214239.1585310406005.JavaMail.zimbra@u-pem.fr>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <1271059704.1214239.1585310406005.JavaMail.zimbra@u-pem.fr>
Message-ID: <b32743fe-6046-2d76-2edb-6fdaa8fb7f70@oracle.com>

On 3/27/20 5:00 AM, Remi Forax wrote:
> Hi Mandy,
> in ReflectionFactory, why in the case of a constructor the check to the anonymous class is removed ?

Good catch.  Fixed.
>
> in BytecodeGenerator, the comment "// bootstrapping issue if using condy"
> can be promoted on top of clinit, because i ask myself the same question seeing a static block was generated

OK, that's clearer.
>
> in AbstractValidatingLambdaMetafactory.java, the field caller is not used after all ?

Thanks.  Removed.  It was left behind from an early prototype.

Below is the patch.  I will send out a new webrev and delta webrev in 
the next revision.

thanks
Mandy

diff --git a/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java b/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java
--- a/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java
+++ b/src/java.base/share/classes/java/lang/invoke/AbstractValidatingLambdaMetafactory.java
@@ -51,7 +51,6 @@
      *         System.out.printf(">>> %s\n", iii.foo(44));
      * }}
      */
-    final MethodHandles.Lookup caller;
     final Class<?> targetClass;               // The class calling the meta-factory via invokedynamic "class X"
     final MethodType invokedType;             // The type of the invoked method "(CC)II"
     final Class<?> samBase;                   // The type of the returned instance "interface JJ"
@@ -121,7 +120,6 @@
                     "Invalid caller: %s",
                     caller.lookupClass().getName()));
         }
-        this.caller = caller;
         this.targetClass = caller.lookupClass();
         this.invokedType = invokedType;

diff --git a/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java b/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java
--- a/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java
+++ b/src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java
@@ -363,6 +363,10 @@
         clinit(cw, className(), classData);
     }

+    /*
+     * <clinit> to initialize the static final fields with the live class data
+     * LambdaForms can't use condy due to bootstrapping issue.
+     */
     static void clinit(ClassWriter cw, String className, List<ClassData> classData) {
         if (classData.isEmpty())
             return;
@@ -375,7 +379,6 @@

         MethodVisitor mv = cw.visitMethod(Opcodes.ACC_STATIC, "<clinit>", "()V", null, null);
         mv.visitCode();
-        // bootstrapping issue if using condy
         mv.visitLdcInsn(Type.getType("L" + className + ";"));
         mv.visitMethodInsn(Opcodes.INVOKESTATIC, "java/lang/invoke/MethodHandleNatives",
                            "classData", "(Ljava/lang/Class;)Ljava/lang/Object;", false);
diff --git a/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java b/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java
--- a/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java
+++ b/src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java
@@ -245,7 +245,8 @@
             return new BootstrapConstructorAccessorImpl(c);
         }

-        if (noInflation && !c.getDeclaringClass().isHiddenClass()) {
+        if (noInflation && !c.getDeclaringClass().isHiddenClass()
+                && !ReflectUtil.isVMAnonymousClass(c.getDeclaringClass())) {
             return new MethodAccessorGenerator().
                 generateConstructor(c.getDeclaringClass(),
                                     c.getParameterTypes(),

From forax at univ-mlv.fr  Fri Mar 27 15:54:37 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 27 Mar 2020 16:54:37 +0100 (CET)
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <b32743fe-6046-2d76-2edb-6fdaa8fb7f70@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <1271059704.1214239.1585310406005.JavaMail.zimbra@u-pem.fr>
 <b32743fe-6046-2d76-2edb-6fdaa8fb7f70@oracle.com>
Message-ID: <387396409.1414184.1585324477993.JavaMail.zimbra@u-pem.fr>

> From: "mandy chung" <mandy.chung at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "valhalla-dev" <valhalla-dev at openjdk.java.net>, "core-libs-dev"
> <core-libs-dev at openjdk.java.net>, "serviceability-dev"
> <serviceability-dev at openjdk.java.net>, "hotspot-dev"
> <hotspot-dev at openjdk.java.net>
> Sent: Friday, 27 March 2020 16:50:55
> Subject: Re: Review Request: 8238358: Implementation of JEP 371: Hidden Classes

> On 3/27/20 5:00 AM, Remi Forax wrote:

>> Hi Mandy,
>> in ReflectionFactory, why in the case of a constructor the check to the
>> anonymous class is removed ?

> Good catch. Fixed

>> in BytecodeGenerator, the comment "// bootstrapping issue if using condy"
>> can be promoted on top of clinit, because i ask myself the same question seeing
>> a static block was generated

> OK, that's clearer.

>> in AbstractValidatingLambdaMetafactory.java, the field caller is not used after
>> all ?

> Thanks. Removed. It was left behind from an early prototype.

> Below is the patch. I will send out a new webrev and delta webrev in the next
> revision.

Thanks Mandy,
Looks good.

Rémi


From mandy.chung at oracle.com  Fri Mar 27 16:29:46 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 27 Mar 2020 09:29:46 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
Message-ID: <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>

Hi Jan,

Good point.  The javac change only applies to JDK 15 and later, and the 
lambda proxy class is not a nestmate when running on JDK 14 or earlier.

I will probably need help from the langtools team to fix this.  I'll give it 
a try.

Mandy

On 3/27/20 4:31 AM, Jan Lahoda wrote:
> Hi Mandy,
>
> Regarding the javac changes - should those be switched on/off 
> depending on the Target? Or, if one compiles with e.g. --release 14, will 
> the newly generated output still work on JDK 14?
>
> Jan

From shade at redhat.com  Fri Mar 27 16:57:19 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 27 Mar 2020 17:57:19 +0100
Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269
Message-ID: <cc1a65a3-7aa3-0234-002b-0b968795dcef@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8241750

Fix:

diff -r fef47d126675 src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c
--- a/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c      Fri Mar 27 15:33:24 2020 +0100
+++ b/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c      Fri Mar 27 17:47:31 2020 +0100
@@ -70,5 +70,5 @@
       return;
     }
-    *(char**)bagAdd(deletedSignatures) = (char*)tag;
+    *(char**)bagAdd(deletedSignatures) = (char*)jlong_to_ptr(tag);

     debugMonitorExit(classTrackLock);
@@ -118,5 +118,5 @@
         EXIT_ERROR(error,"signature");
     }
-    error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, (jlong)signature);
+    error = JVMTI_FUNC_PTR(trackingEnv, SetTag)(env, klass, ptr_to_jlong(signature));
     if (error != JVMTI_ERROR_NONE) {
         jvmtiDeallocate(signature);

Testing: Linux {x86_64, x86_32} x {builds, vmTestbase_nsk_jdwp}; jdk-submit (running)

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Fri Mar 27 17:03:49 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 27 Mar 2020 18:03:49 +0100
Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269
In-Reply-To: <cc1a65a3-7aa3-0234-002b-0b968795dcef@redhat.com>
References: <cc1a65a3-7aa3-0234-002b-0b968795dcef@redhat.com>
Message-ID: <c0f2998d-c08e-5940-6bf3-f5457b82aac4@redhat.com>

Looks good to me, thanks!

Roman


From chris.plummer at oracle.com  Fri Mar 27 17:08:45 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 27 Mar 2020 10:08:45 -0700
Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269
In-Reply-To: <c0f2998d-c08e-5940-6bf3-f5457b82aac4@redhat.com>
References: <cc1a65a3-7aa3-0234-002b-0b968795dcef@redhat.com>
 <c0f2998d-c08e-5940-6bf3-f5457b82aac4@redhat.com>
Message-ID: <b734b405-6d01-6c64-d862-e6316034d50a@oracle.com>

+1

Chris

On 3/27/20 10:03 AM, Roman Kennke wrote:
> Looks good to me, thanks!
>
> Roman


From shade at redhat.com  Fri Mar 27 17:10:09 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 27 Mar 2020 18:10:09 +0100
Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269
In-Reply-To: <b734b405-6d01-6c64-d862-e6316034d50a@oracle.com>
References: <cc1a65a3-7aa3-0234-002b-0b968795dcef@redhat.com>
 <c0f2998d-c08e-5940-6bf3-f5457b82aac4@redhat.com>
 <b734b405-6d01-6c64-d862-e6316034d50a@oracle.com>
Message-ID: <60fcb978-f7f0-48b9-f438-0fe0c918dd32@redhat.com>

Thanks! Trivial, right?

If so, I'll push as soon as jdk-submit clears.

-Aleksey

On 3/27/20 6:08 PM, Chris Plummer wrote:
> +1
> 
> Chris

From chris.plummer at oracle.com  Fri Mar 27 17:15:55 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 27 Mar 2020 10:15:55 -0700
Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269
In-Reply-To: <60fcb978-f7f0-48b9-f438-0fe0c918dd32@redhat.com>
References: <cc1a65a3-7aa3-0234-002b-0b968795dcef@redhat.com>
 <c0f2998d-c08e-5940-6bf3-f5457b82aac4@redhat.com>
 <b734b405-6d01-6c64-d862-e6316034d50a@oracle.com>
 <60fcb978-f7f0-48b9-f438-0fe0c918dd32@redhat.com>
Message-ID: <6d1a21fa-637a-b1fc-4d12-0032711e45a0@oracle.com>

Yeah, I think given that it fixes a broken build it should be fine to 
push right away.

Chris

On 3/27/20 10:10 AM, Aleksey Shipilev wrote:
> Thanks! Trivial, right?
>
> If so, I'll push as soon as jdk-submit clears.
>
> -Aleksey


From shade at redhat.com  Fri Mar 27 18:05:34 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 27 Mar 2020 19:05:34 +0100
Subject: RFR (XS) 8241750: x86_32 build failure after JDK-8227269
In-Reply-To: <6d1a21fa-637a-b1fc-4d12-0032711e45a0@oracle.com>
References: <cc1a65a3-7aa3-0234-002b-0b968795dcef@redhat.com>
 <c0f2998d-c08e-5940-6bf3-f5457b82aac4@redhat.com>
 <b734b405-6d01-6c64-d862-e6316034d50a@oracle.com>
 <60fcb978-f7f0-48b9-f438-0fe0c918dd32@redhat.com>
 <6d1a21fa-637a-b1fc-4d12-0032711e45a0@oracle.com>
Message-ID: <d1fb2bd2-f98a-4ed9-56bf-575c542166fa@redhat.com>

On 3/27/20 6:15 PM, Chris Plummer wrote:
> Yeah, I think given that it fixes a broken build it should be fine to 
> push right away.

jdk-submit came clean, pushed.

-- 
Thanks,
-Aleksey


From paul.sandoz at oracle.com  Fri Mar 27 18:59:10 2020
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Fri, 27 Mar 2020 11:59:10 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com>

Hi Mandy,

Very thorough, bravo!

Minor suggestions below.

Paul.

MethodHandleNatives.java
?

 142 
 143         /**
 144          * Flags for Lookup.ClassOptions
 145          */
 146         static final int
 147             NESTMATE_CLASS            = 0x00000001,
 148             HIDDEN_CLASS              = 0x00000002,
 149             STRONG_LOADER_LINK        = 0x00000004,
 150             ACCESS_VM_ANNOTATIONS     = 0x00000008;
 151     }
 

Suggest you add a comment to keep the values in sync with the VM component.
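
To make the flag layout concrete, here is a small sketch of how a set of
options could be folded into the int mask handed to the VM. The constant
values are copied from the excerpt above; the `Option` enum, the `toFlags`
helper, and the choice to always set HIDDEN_CLASS are assumptions made for
illustration, not the code in the webrev.

```java
import java.util.EnumSet;
import java.util.Set;

public class ClassOptionFlags {
    // Values copied from the MethodHandleNatives excerpt above; they are
    // only meaningful if they match the constants on the VM side.
    public static final int NESTMATE_CLASS     = 0x00000001;
    public static final int HIDDEN_CLASS       = 0x00000002;
    public static final int STRONG_LOADER_LINK = 0x00000004;

    // Hypothetical stand-in for Lookup.ClassOption.
    public enum Option {
        NESTMATE(NESTMATE_CLASS),
        STRONG(STRONG_LOADER_LINK);

        final int flag;

        Option(int flag) {
            this.flag = flag;
        }
    }

    // Assumption: a hidden class always carries HIDDEN_CLASS, and each
    // requested option ORs in its own bit.
    public static int toFlags(Set<Option> options) {
        int flags = HIDDEN_CLASS;
        for (Option option : options) {
            flags |= option.flag;
        }
        return flags;
    }

    public static void main(String[] args) {
        int flags = toFlags(EnumSet.of(Option.NESTMATE, Option.STRONG));
        System.out.printf("0x%08x%n", flags); // prints 0x00000007
    }
}
```

Because each constant occupies its own bit, a value drifting out of sync
with the VM would silently change which option is requested, which is the
reason for the suggested comment.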


MethodHandles.java
?

1786          * (Given the {@code Lookup} object returned this method, its lookup class
1787          * is a {@code Class} object for which {@link Class#getName()} returns a string
1788          * that is not a binary name.)

?
(The {@code Lookup} object returned from this method has a lookup class that is 
a {@code Class} object whose {@link Class#getName()} returns a string
that is not a binary name.)
?


1902             Set<ClassOption> opts = options.length > 0 ? Set.of(options) : Set.of();

You can just do:

  Set<ClassOption> opts = Set.of(options)

And/or inline it into the subsequent method call.  The implementation of Set.of checks the array length.
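
A quick sketch of why the length check is redundant: Set.of(E...) already
returns an empty set for a zero-length array. The ClassOption enum and the
toSet helper below are stand-ins for illustration, not the real
Lookup.ClassOption or the code in MethodHandles.java.

```java
import java.util.Set;

public class SetOfVarargs {
    // Stand-in for Lookup.ClassOption, for illustration only.
    public enum ClassOption { NESTMATE, STRONG }

    // Passing the varargs array straight through selects Set.of(E...),
    // which handles the empty array itself.
    public static Set<ClassOption> toSet(ClassOption... options) {
        return Set.of(options);
    }

    public static void main(String[] args) {
        System.out.println(toSet().isEmpty());
        System.out.println(toSet(ClassOption.NESTMATE, ClassOption.STRONG).size());
    }
}
```

One thing to keep in mind with this form: Set.of throws
IllegalArgumentException on duplicate elements and NullPointerException on
null elements, so a caller passing the same option twice gets an exception
rather than a silently de-duplicated set.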


2001         ClassDefiner makeHiddenClassDefiner(byte[] bytes,

I think you can telescope the methods for non-name and name accepting since IIUC the name is derived from the byte[], which removes some code duplication; i.e. pull ClassDefiner.className out from ClassDefiner and place the logic in the factory methods.  Alternatively, push the factory methods into ClassDefiner to keep all the logic together.


3797         public enum ClassOption {

Shuffle up to be closer to the defineHiddenClass


3798             /**
3799              * This class option specifies the hidden class be added to
3800              * {@linkplain Class#getNestHost nest} of a lookup class as
3801              * a nestmate.

Suggest:

"This class option specifies the hidden class ..." -> "Specifies that a hidden class ..."


3812              * This class option specifies the hidden class to have a <em>strong</em>

"Specifies that a hidden class have a ..."


3813              * relationship with the class loader marked as its defining loader,
3814              * as a normal class or interface has with its own defining loader.
3815              * This means that the hidden class may be unloaded if and only if
3816              * its defining loader is not reachable and thus may be reclaimed
3817              * by a garbage collector (JLS 12.7).


StringConcatFactory.java
?

 861             // use of @ForceInline no longer has any effect

?

 862             mv.visitAnnotation("Ljdk/internal/vm/annotation/ForceInline;", true);
 863             mv.visitCode();


> On Mar 26, 2020, at 4:57 PM, Mandy Chung <mandy.chung at oracle.com> wrote:
> 
> Please review the implementation of JEP 371: Hidden Classes. The main changes are in core-libs and hotspot runtime area.  Small changes are made in javac, VM compiler (intrinsification of Class::isHiddenClass), JFR, JDI, and jcmd.  CSR [1] has been reviewed and is in the finalized state (see specdiff and javadoc below for reference).
> 
> Webrev:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03
> 
> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's point
> of view, a hidden class is a normal class except for the following:
> 
> - A hidden class has no initiating class loader and is not registered in any dictionary.
> - A hidden class has a name containing an illegal character: `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` returns "Lp/Foo.0x1234;".
> - A hidden class is not modifiable, i.e. cannot be redefined or retransformed. JVM TI IsModifiableClass returns false for a hidden class.
> - Final fields in a hidden class are truly "final".  The value of a final field cannot be overridden via reflection.  setAccessible(true) can still be called on reflected objects representing final fields in a hidden class, and the access check will be suppressed, but such objects have read access only (i.e. Field::getXXX works but setXXX does not).
> 
> Brief summary of this patch:
> 
> 1. A new Lookup::defineHiddenClass method is the API to create a hidden class.
> 2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG options
>    that can be specified when creating a hidden class.
> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
> 4. Field::setXXX method will throw IAE on a final field of a hidden class
>    regardless of the value of the accessible flag.
> 5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass
>    and defineHiddenClass to create a class from the given bytes.
> 6. ClassLoaderData implementation is not changed.  There is one primary CLD
>    that holds the classes strongly referenced by its defining loader.  There
>    can be zero or more additional CLDs - one per weak class.
> 7. Nest host determination is updated per revised JVMS 5.4.4. Access control
>    check no longer throws LinkageError but instead it will throw IAE with
>    a clear message if a class fails to resolve/validate the nest host declared
>    in NestHost/NestMembers attribute.
> 8. JFR, jcmd, JDI are updated to support hidden classes.
> 9. javac LambdaToMethod is updated, as lambda proxy classes start using
>    nestmates, to generate a bridge method to desugar a method reference to a
>    protected method in its supertype in a different package.
> 
> This patch also updates StringConcatFactory, LambdaMetaFactory, and LambdaForms
> to use hidden classes.  The webrev includes changes in nashorn to use hidden
> classes, and I will update the webrev if JEP 372 removes nashorn any time soon.
> 
> We uncovered a bug in Lookup::defineClass: the spec says it throws LinkageError
> and intends the newly created class to be linked.  However, the implementation
> in 14 does not link the class.  A separate CSR [2] proposes to update the
> implementation to match the spec.  This patch fixes the implementation.
> 
> The spec update on JVM TI, JDI and Instrumentation will be done as
> a separate RFE [3].  This patch includes new tests for JVM TI and
> java.instrument that validate how the existing APIs work for hidden classes.
> 
> javadoc/specdiff
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/
> 
> JVMS 5.4.4 change:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf
> 
> CSR:
> https://bugs.openjdk.java.net/browse/JDK-8238359
> 
> Thanks
> Mandy
> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
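
As an illustration of the naming behaviour described in the quoted summary,
the following sketch defines a hidden class from hand-built class bytes and
inspects it. It assumes a JDK 15 or later runtime, where the query method
is named Class::isHidden (the webrev under review calls it isHiddenClass).
The class name "Hidden" and both helper methods are invented for this
example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;

public class HiddenClassDemo {
    // Builds the bytes of a minimal class with no fields or methods,
    // extending java.lang.Object, under the given internal name.
    public static byte[] emptyClassBytes(String internalName) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(0xCAFEBABE);                            // magic
        out.writeShort(0);                                   // minor_version
        out.writeShort(52);                                  // major_version
        out.writeShort(5);                                   // constant_pool_count (entries + 1)
        out.writeByte(7); out.writeShort(2);                 // #1 Class -> #2
        out.writeByte(1); out.writeUTF(internalName);        // #2 Utf8 this-class name
        out.writeByte(7); out.writeShort(4);                 // #3 Class -> #4
        out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8 super name
        out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
        out.writeShort(1);                                   // this_class
        out.writeShort(3);                                   // super_class
        out.writeShort(0);                                   // interfaces_count
        out.writeShort(0);                                   // fields_count
        out.writeShort(0);                                   // methods_count
        out.writeShort(0);                                   // attributes_count
        return buf.toByteArray();
    }

    // The class named in the bytes must be in the lookup class's package,
    // so the internal name is derived from it.
    public static Class<?> defineHidden() throws IOException, IllegalAccessException {
        Lookup lookup = MethodHandles.lookup();
        String pkg = lookup.lookupClass().getPackageName();
        String internalName = pkg.isEmpty() ? "Hidden" : pkg.replace('.', '/') + "/Hidden";
        // initialize=false, no ClassOption: not a nestmate, not strongly linked
        return lookup.defineHiddenClass(emptyClassBytes(internalName), false).lookupClass();
    }

    public static void main(String[] args) throws Exception {
        Class<?> hidden = defineHidden();
        System.out.println(hidden.getName());   // name carries a '/' suffix
        System.out.println(hidden.isHidden());
    }
}
```

The '/' in the reported name is what makes it "not a binary name", so the
string cannot be fed back to a class loader to find the class again.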


From mandy.chung at oracle.com  Fri Mar 27 20:18:07 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 27 Mar 2020 13:18:07 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com>
Message-ID: <c2ae90e1-5bf8-1d6b-a739-eb4aac542790@oracle.com>


On 3/27/20 11:59 AM, Paul Sandoz wrote:
> Hi Mandy,
>
> Very thorough, bravo!

Thanks.
> Minor suggestions below.
>
> Paul.
>
> MethodHandleNatives.java
> ?
>
>   142
>   143         /**
>   144          * Flags for Lookup.ClassOptions
>   145          */
>   146         static final int
>   147             NESTMATE_CLASS            = 0x00000001,
>   148             HIDDEN_CLASS              = 0x00000002,
>   149             STRONG_LOADER_LINK        = 0x00000004,
>   150             ACCESS_VM_ANNOTATIONS     = 0x00000008;
>   151     }
>   
>
> Suggest you add a comment to keep the values in sync with the VM component.

Already in the class spec of this Constants class.  The values of all 
constants defined in this Constants class are verified in sync with VM 
(see verifyConstants).

>
> MethodHandles.java
> ?
>
> 1786          * (Given the {@code Lookup} object returned this method, its lookup class
> 1787          * is a {@code Class} object for which {@link Class#getName()} returns a string
> 1788          * that is not a binary name.)
>
> ?
> (The {@code Lookup} object returned from this method has a lookup class that is
> a {@code Class} object whose {@link Class#getName()} returns a string
> that is not a binary name.)
> ?
>
>
> 1902             Set<ClassOption> opts = options.length > 0 ? Set.of(options) : Set.of();
>
> You can just do:
>
>    Set<ClassOption> opts = Set.of(options)
>
> And/or inline it into the subsequent method call.  The implementation of Set.of checks the array length.

Great to know.  Thanks.
>
> 2001         ClassDefiner makeHiddenClassDefiner(byte[] bytes,
>
> I think you can telescope the methods for non-name and name accepting since IIUC the name is derived from the byte[], which removes some code duplication; i.e. pull ClassDefiner.className out from ClassDefiner and place the logic in the factory methods.  Alternatively, push the factory methods into ClassDefiner to keep all the logic together.
>
Ok.  I will move the className out.
>
> 3797         public enum ClassOption {
>
> Shuffle up to be closer to the defineHiddenClass

Moved before defineHiddenClass.

>
> 3798             /**
> 3799              * This class option specifies the hidden class be added to
> 3800              * {@linkplain Class#getNestHost nest} of a lookup class as
> 3801              * a nestmate.
>
> Suggest:
>
> "This class option specifies the hidden class ..." -> "Specifies that a hidden class ..."
>
> 3812              * This class option specifies the hidden class to have a <em>strong</em>
>
> "Specifies that a hidden class have a ..."

Specifies that a hidden class has a...

>
> 3813              * relationship with the class loader marked as its defining loader,
> 3814              * as a normal class or interface has with its own defining loader.
> 3815              * This means that the hidden class may be unloaded if and only if
> 3816              * its defining loader is not reachable and thus may be reclaimed
> 3817              * by a garbage collector (JLS 12.7).
>
>
> StringConcatFactory.java
> ?
>
>   861             // use of @ForceInline no longer has any effect
>
> ?

Right, I should have explained this [1].

This @ForceInline is used by BytecodeStringBuilderStrategy, which 
generates code with the same StringBuilder chain javac would emit. The 
`@ForceInline` annotation is probably there for performance.  It's 
believed that people rarely use this non-default strategy.  This patch 
changes StringConcatFactory to use the standard defineHiddenClass method, 
and hence `@ForceInline` has no effect in the class generated for this 
non-default strategy.  If that turns out to be an issue, we will 
determine whether it should enable access to VM annotations (I doubt this 
is a supported strategy).

[1] https://bugs.openjdk.java.net/browse/JDK-8241548

From vicente.romero at oracle.com  Fri Mar 27 21:15:29 2020
From: vicente.romero at oracle.com (Vicente Romero)
Date: Fri, 27 Mar 2020 17:15:29 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
 <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
Message-ID: <e9ee79ca-e00a-c1d7-27a1-e110064a8fa3@oracle.com>

Hi Mandy,

The patch for nestmates [1] could be used as a reference. There a new 
method was added to class `com.sun.tools.javac.jvm.Target`, named: 
`hasNestmateAccess` which checks if a target is ready for nestmates or 
not. I think that you can follow a similar approach here.
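
The approach can be sketched roughly as follows. The enum below is a
trimmed-down, hypothetical stand-in for `com.sun.tools.javac.jvm.Target`;
the name `runtimeSpinsNestmateLambdas` and the JDK 15 cut-off are
assumptions taken from this thread, not the actual patch.

```java
public class TargetGateDemo {
    // Hypothetical stand-in for javac's Target enum; constants are
    // declared in release order so compareTo reflects "at least".
    public enum Target {
        JDK1_10, JDK1_11, JDK1_14, JDK1_15;

        // Mirrors the existing check: nestmate access needs target >= 11.
        public boolean hasNestmateAccess() {
            return compareTo(JDK1_11) >= 0;
        }

        // Hypothetical new check: only targets whose runtime spins lambda
        // proxies as nestmates may omit the bridge method.
        public boolean runtimeSpinsNestmateLambdas() {
            return compareTo(JDK1_15) >= 0;
        }
    }

    // A code generator would consult the target before changing its output.
    public static boolean needsBridgeForProtectedMethodRef(Target target) {
        return !target.runtimeSpinsNestmateLambdas();
    }

    public static void main(String[] args) {
        for (Target target : Target.values()) {
            System.out.println(target + " needs bridge: "
                    + needsBridgeForProtectedMethodRef(target));
        }
    }
}
```

Keeping the version test inside Target means a --release 14 compile keeps
emitting the old shape without LambdaToMethod knowing any version numbers.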

Thanks,
Vicente

[1] http://hg.openjdk.java.net/jdk/jdk/rev/2f2af62dfac7

On 3/27/20 12:29 PM, Mandy Chung wrote:
> Hi Jan,
>
> Good point.  The javac change only applies to JDK 15 and later and the 
> lambda proxy class is not a nestmate when running on JDK 14 or earlier.
>
> I probably need the help from langtools team to fix this.  I'll give 
> it a try.
>
> Mandy
>
> On 3/27/20 4:31 AM, Jan Lahoda wrote:
>> Hi Mandy,
>>
>> Regarding the javac changes - should those be switched on/off 
>> depending the Target? Or, if one compiles with e.g. --release 14, 
>> will the newly generated output still work on JDK 14?
>>
>> Jan
>>


From mandy.chung at oracle.com  Fri Mar 27 22:22:19 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 27 Mar 2020 15:22:19 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <c2ae90e1-5bf8-1d6b-a739-eb4aac542790@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com>
 <c2ae90e1-5bf8-1d6b-a739-eb4aac542790@oracle.com>
Message-ID: <41b99011-921a-9fc6-c738-98c47e9959c3@oracle.com>

Hi Paul,

This is the delta incorporating your comment:
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03-delta-psandoz/

This patch also took Alex's comment to make it clear that the hidden 
class is the lookup class of the returned Lookup object and drops the 
sentence you commented on:

On 3/27/20 1:18 PM, Mandy Chung wrote:
>> MethodHandles.java
>> ?
>>
>> 1786          * (Given the {@code Lookup} object returned this method, its lookup class
>> 1787          * is a {@code Class} object for which {@link Class#getName()} returns a string
>> 1788          * that is not a binary name.)
>>
>> ?
>> (The {@code Lookup} object returned from this method has a lookup 
>> class that is
>> a {@code Class} object whose {@link Class#getName()} returns a string
>> that is not a binary name.)
>> ?


Mandy

From mandy.chung at oracle.com  Fri Mar 27 22:29:03 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 27 Mar 2020 15:29:03 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <e9ee79ca-e00a-c1d7-27a1-e110064a8fa3@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
 <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
 <e9ee79ca-e00a-c1d7-27a1-e110064a8fa3@oracle.com>
Message-ID: <dbdeae81-7006-6922-9919-f2d253152006@oracle.com>

Hi Vicente,

hasNestmateAccess indicates whether the VM supports static nestmates, 
i.e. JDK release >= 11.

However, this is about javac --release 14: the compiled classes may 
run on JDK 14, where lambda and string concat spin classes that are not 
nestmates. I have a patch with Jan's help:

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html

(you can apply the above patch on valhalla repo "nestmates" branch)

About testing, I wanted to run the BridgeMethodsForLambdaTest and 
TestLambdaBytecode tests with --release 14, but it turns out not to be 
straightforward.  Any help would be appreciated.

thanks
Mandy



From vicente.romero at oracle.com  Fri Mar 27 22:48:52 2020
From: vicente.romero at oracle.com (Vicente Romero)
Date: Fri, 27 Mar 2020 18:48:52 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <dbdeae81-7006-6922-9919-f2d253152006@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
 <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
 <e9ee79ca-e00a-c1d7-27a1-e110064a8fa3@oracle.com>
 <dbdeae81-7006-6922-9919-f2d253152006@oracle.com>
Message-ID: <60348819-ace5-8e52-f1ff-5f9654c915e0@oracle.com>

Hi Mandy,

On 3/27/20 6:29 PM, Mandy Chung wrote:
> Hi Vicente,
>
> hasNestmateAccess is about VM supports static nestmates on JDK release 
> >= 11.

I was not suggesting the use of `hasNestmateAccess` itself, but to follow 
the same approach: add a new method to class `Target` that checks whether 
the new goodies are available in the given target.
>
> However this is about javac --release 14 and the compiled classes may 
> run on JDK 14 that lambda and string concat spin classes that are not 
> nestmates. I have a patch with Jan's help:
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html

which is what the patch above is doing

>
> (you can apply the above patch on valhalla repo "nestmates" branch)
>
> About testing, I wanted to run BridgeMethodsForLambdaTest and 
> TestLambdaBytecode test with --release 14 but it turns out not 
> straight-forward.  Any help would be appreciated.
>
> thanks
> Mandy

Vicente


From david.holmes at oracle.com  Fri Mar 27 23:01:58 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 28 Mar 2020 09:01:58 +1000
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <dbdeae81-7006-6922-9919-f2d253152006@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
 <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
 <e9ee79ca-e00a-c1d7-27a1-e110064a8fa3@oracle.com>
 <dbdeae81-7006-6922-9919-f2d253152006@oracle.com>
Message-ID: <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>

Hi Mandy,

On 28/03/2020 8:29 am, Mandy Chung wrote:
> Hi Vicente,
> 
> hasNestmateAccess tells whether the VM supports static nestmates, i.e. JDK 
> release >= 11.
> 
> However, this is about javac --release 14: the compiled classes may run on 
> JDK 14, where lambda and string concat spin classes that are not 
> nestmates. I have a patch with Jan's help:
> 
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html 

+             /**
+              * The VM does not support access across nested classes (8010319).
+              * Were that ever to change, this should be removed.
+              */
+             boolean isPrivateInOtherClass() {

I'm not at all sure what this means - access across different nests? 
(I'm not even sure what that means.)

Thanks,
David
-----

> 
> (you can apply the above patch on valhalla repo "nestmates" branch)
> 
> About testing, I wanted to run the BridgeMethodsForLambdaTest and 
> TestLambdaBytecode tests with --release 14, but that turned out not to be 
> straightforward.  Any help would be appreciated.
> 
> thanks
> Mandy
> 
> On 3/27/20 2:15 PM, Vicente Romero wrote:
>> Hi Mandy,
>>
>> The patch for nestmates [1] could be used as a reference. There a new 
>> method was added to class `com.sun.tools.javac.jvm.Target`, named: 
>> `hasNestmateAccess` which checks if a target is ready for nestmates or 
>> not. I think that you can follow a similar approach here.
>>
>> Thanks,
>> Vicente
>>
>> [1] http://hg.openjdk.java.net/jdk/jdk/rev/2f2af62dfac7
>>
>> On 3/27/20 12:29 PM, Mandy Chung wrote:
>>> Hi Jan,
>>>
>>> Good point.  The javac change only applies to JDK 15 and later, and 
>>> the lambda proxy class is not a nestmate when running on JDK 14 or 
>>> earlier.
>>>
>>> I probably need help from the langtools team to fix this.  I'll give 
>>> it a try.
>>>
>>> Mandy
>>>
>>> On 3/27/20 4:31 AM, Jan Lahoda wrote:
>>>> Hi Mandy,
>>>>
>>>> Regarding the javac changes - should those be switched on/off 
>>>> depending the Target? Or, if one compiles with e.g. --release 14, 
>>>> will the newly generated output still work on JDK 14?
>>>>
>>>> Jan
>>>>
>>>> On 27. 03. 20 0:57, Mandy Chung wrote:
>>>>> Please review the implementation of JEP 371: Hidden Classes. The 
>>>>> main changes are in core-libs and hotspot runtime area.? Small 
>>>>> changes are made in javac, VM compiler (intrinsification of 
>>>>> Class::isHiddenClass), JFR, JDI, and jcmd.? CSR [1]has been 
>>>>> reviewed and is in the finalized state (see specdiff and javadoc 
>>>>> below for reference).
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 
>>>>>
>>>>>
>>>>> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's 
>>>>> point
>>>>> of view, a hidden class is a normal class except the following:
>>>>>
>>>>> - A hidden class has no initiating class loader and is not 
>>>>> registered in any dictionary.
>>>>> - A hidden class has a name containing an illegal character 
>>>>> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` 
>>>>> returns "Lp/Foo.0x1234;".
>>>>> - A hidden class is not modifiable, i.e. cannot be redefined or 
>>>>> retransformed. JVM TI IsModifableClass returns false on a hidden.
>>>>> - Final fields in a hidden class is "final".? The value of final 
>>>>> fields cannot be overriden via reflection. setAccessible(true) can 
>>>>> still be called on reflected objects representing final fields in a 
>>>>> hidden class and its access check will be suppressed but only have 
>>>>> read-access (i.e. can do Field::getXXX but not setXXX).
>>>>>
>>>>> Brief summary of this patch:
>>>>>
>>>>> 1. A new Lookup::defineHiddenClass method is the API to create a 
>>>>> hidden class.
>>>>> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG 
>>>>> option that
>>>>> ??? can be specified when creating a hidden class.
>>>>> 3. A new Class::isHiddenClass method tests if a class is a hidden 
>>>>> class.
>>>>> 4. Field::setXXX method will throw IAE on a final field of a hidden 
>>>>> class
>>>>> ??? regardless of the value of the accessible flag.
>>>>> 5. JVM_LookupDefineClass is the new JVM entry point for 
>>>>> Lookup::defineClass
>>>>> ??? and defineHiddenClass to create a class from the given bytes.
>>>>> 6. ClassLoaderData implementation is not changed.? There is one 
>>>>> primary CLD
>>>>> ??? that holds the classes strongly referenced by its defining 
>>>>> loader. There
>>>>> ??? can be zero or more additional CLDs - one per weak class.
>>>>> 7. Nest host determination is updated per revised JVMS 5.4.4. 
>>>>> Access control
>>>>> ??? check no longer throws LinkageError but instead it will throw 
>>>>> IAE with
>>>>> ??? a clear message if a class fails to resolve/validate the nest 
>>>>> host declared
>>>>> ??? in NestHost/NestMembers attribute.
>>>>> 8. JFR, jcmd, JDI are updated to support hidden classes.
>>>>> 9. update javac LambdaToMethod as lambda proxy starts using nestmates
>>>>> ??? and generate a bridge method to desuger a method reference to a 
>>>>> protected
>>>>> ??? method in its supertype in a different package
>>>>>
>>>>> This patch also updates StringConcatFactory, LambdaMetaFactory, and 
>>>>> LambdaForms
>>>>> to use hidden classes.? The webrev includes changes in nashorn to 
>>>>> hidden class
>>>>> and I will update the webrev if JEP 372 removes it any time soon.
>>>>>
>>>>> We uncovered a bug in Lookup::defineClass spec throws LinkageError 
>>>>> and intends
>>>>> to have the newly created class linked.? However, the 
>>>>> implementation in 14
>>>>> does not link the class.? A separate CSR [2] proposes to update the
>>>>> implementation to match the spec.? This patch fixes the 
>>>>> implementation.
>>>>>
>>>>> The spec update on JVM TI, JDI and Instrumentation will be done as
>>>>> a separate RFE [3].? This patch includes new tests for JVM TI and
>>>>> java.instrument that validates how the existing APIs work for 
>>>>> hidden classes.
>>>>>
>>>>> javadoc/specdiff
>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/ 
>>>>>
>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ 
>>>>>
>>>>>
>>>>> JVMS 5.4.4 change:
>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf 
>>>>>
>>>>>
>>>>> CSR:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>>>>
>>>>> Thanks
>>>>> Mandy
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
>>>
>>
> 

From forax at univ-mlv.fr  Fri Mar 27 23:40:59 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 28 Mar 2020 00:40:59 +0100 (CET)
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
 <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
 <e9ee79ca-e00a-c1d7-27a1-e110064a8fa3@oracle.com>
 <dbdeae81-7006-6922-9919-f2d253152006@oracle.com>
 <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>
Message-ID: <405050984.1553152.1585352459094.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "David Holmes" <david.holmes at oracle.com>
> To: "mandy chung" <mandy.chung at oracle.com>, "Vicente Romero" <vicente.romero at oracle.com>, "jan lahoda"
> <jan.lahoda at oracle.com>
> Cc: "serviceability-dev" <serviceability-dev at openjdk.java.net>, "hotspot-dev" <hotspot-dev at openjdk.java.net>,
> "core-libs-dev" <core-libs-dev at openjdk.java.net>, "valhalla-dev" <valhalla-dev at openjdk.java.net>
> Sent: Saturday, March 28, 2020 00:01:58
> Subject: Re: Review Request: 8238358: Implementation of JEP 371: Hidden Classes

> Hi Mandy,

Hi David,

> 
> On 28/03/2020 8:29 am, Mandy Chung wrote:
>> Hi Vicente,
>> 
>> hasNestmateAccess is about VM supports static nestmates on JDK release
>>  >= 11.
>> 
>> However this is about javac --release 14 and the compiled classes may
>> run on JDK 14 that lambda and string concat spin classes that are not
>> nestmates. I have a patch with Jan's help:
>> 
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html
> 
> +             /**
> +              * The VM does not support access across nested classes (8010319).
> +              * Were that ever to change, this should be removed.
> +              */
> +             boolean isPrivateInOtherClass() {
> 
> I'm not at all sure what this means - access across different nests?
> (I'm not even sure what that means.)

Access inside the same nest.
As you know, until now a lambda proxy has been a VM anonymous class that can only see the private fields of the class declaring the lambda (the host class), not the private fields of the other classes in the nest (the enclosing classes, in terms of the Java language).
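
To make "the same nest" concrete, here is a small sketch at the Java language level. All names in it (NestAccessDemo, Inner, hostSecret, innerSecret) are made up for illustration, and it assumes JDK 11 or later for Class::getNestHost and Class::isNestmateOf. It only shows which classes share a nest and that private access between them works; it does not inspect the lambda proxy class itself.

```java
import java.util.function.IntSupplier;

// Hypothetical example classes; requires JDK 11+ for the nest APIs.
class NestAccessDemo {
    private static int hostSecret = 7;

    static class Inner {
        private static int innerSecret = 35;

        // javac desugars the lambda body into a method of Inner, so the
        // class declaring the lambda is Inner, while hostSecret is a
        // private field of the enclosing class in the same nest.
        static IntSupplier sum() {
            return () -> hostSecret + innerSecret;
        }
    }

    // True when Inner and the enclosing class belong to the same nest.
    static boolean sameNest() {
        return Inner.class.getNestHost() == NestAccessDemo.class
                && Inner.class.isNestmateOf(NestAccessDemo.class);
    }

    public static void main(String[] args) {
        System.out.println("same nest: " + sameNest());
        System.out.println("sum: " + Inner.sum().getAsInt());
    }
}
```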

Rémi

> 
> Thanks,
> David
> -----
> 
>> 
>> (you can apply the above patch on valhalla repo "nestmates" branch)
>> 
>> About testing, I wanted to run BridgeMethodsForLambdaTest and
>> TestLambdaBytecode test with --release 14 but it turns out not
>> straight-forward.? Any help would be appreciated.
>> 
>> thanks
>> Mandy
>> 
>> On 3/27/20 2:15 PM, Vicente Romero wrote:
>>> Hi Mandy,
>>>
>>> The patch for nestmates [1] could be used as a reference. There a new
>>> method was added to class `com.sun.tools.javac.jvm.Target`, named:
>>> `hasNestmateAccess` which checks if a target is ready for nestmates or
>>> not. I think that you can follow a similar approach here.
>>>
>>> Thanks,
>>> Vicente
>>>
>>> [1] http://hg.openjdk.java.net/jdk/jdk/rev/2f2af62dfac7
>>>
>>> On 3/27/20 12:29 PM, Mandy Chung wrote:
>>>> Hi Jan,
>>>>
>>>> Good point.? The javac change only applies to JDK 15 and later and
>>>> the lambda proxy class is not a nestmate when running on JDK 14 or
>>>> earlier.
>>>>
>>>> I probably need the help from langtools team to fix this.? I'll give
>>>> it a try.
>>>>
>>>> Mandy
>>>>
>>>> On 3/27/20 4:31 AM, Jan Lahoda wrote:
>>>>> Hi Mandy,
>>>>>
>>>>> Regarding the javac changes - should those be switched on/off
>>>>> depending the Target? Or, if one compiles with e.g. --release 14,
>>>>> will the newly generated output still work on JDK 14?
>>>>>
>>>>> Jan
>>>>>
>>>>> On 27. 03. 20 0:57, Mandy Chung wrote:
>>>>>> Please review the implementation of JEP 371: Hidden Classes. The
>>>>>> main changes are in core-libs and hotspot runtime area.? Small
>>>>>> changes are made in javac, VM compiler (intrinsification of
>>>>>> Class::isHiddenClass), JFR, JDI, and jcmd.? CSR [1]has been
>>>>>> reviewed and is in the finalized state (see specdiff and javadoc
>>>>>> below for reference).
>>>>>>
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03
>>>>>>
>>>>>>
>>>>>> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's
>>>>>> point
>>>>>> of view, a hidden class is a normal class except the following:
>>>>>>
>>>>>> - A hidden class has no initiating class loader and is not
>>>>>> registered in any dictionary.
>>>>>> - A hidden class has a name containing an illegal character
>>>>>> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
>>>>>> returns "Lp/Foo.0x1234;".
>>>>>> - A hidden class is not modifiable, i.e. cannot be redefined or
>>>>>> retransformed. JVM TI IsModifableClass returns false on a hidden.
>>>>>> - Final fields in a hidden class is "final".? The value of final
>>>>>> fields cannot be overriden via reflection. setAccessible(true) can
>>>>>> still be called on reflected objects representing final fields in a
>>>>>> hidden class and its access check will be suppressed but only have
>>>>>> read-access (i.e. can do Field::getXXX but not setXXX).
>>>>>>
>>>>>> Brief summary of this patch:
>>>>>>
>>>>>> 1. A new Lookup::defineHiddenClass method is the API to create a
>>>>>> hidden class.
>>>>>> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG
>>>>>> option that
>>>>>> ??? can be specified when creating a hidden class.
>>>>>> 3. A new Class::isHiddenClass method tests if a class is a hidden
>>>>>> class.
>>>>>> 4. Field::setXXX method will throw IAE on a final field of a hidden
>>>>>> class
>>>>>> ??? regardless of the value of the accessible flag.
>>>>>> 5. JVM_LookupDefineClass is the new JVM entry point for
>>>>>> Lookup::defineClass
>>>>>> ??? and defineHiddenClass to create a class from the given bytes.
>>>>>> 6. ClassLoaderData implementation is not changed.? There is one
>>>>>> primary CLD
>>>>>> ??? that holds the classes strongly referenced by its defining
>>>>>> loader. There
>>>>>> ??? can be zero or more additional CLDs - one per weak class.
>>>>>> 7. Nest host determination is updated per revised JVMS 5.4.4.
>>>>>> Access control
>>>>>> ??? check no longer throws LinkageError but instead it will throw
>>>>>> IAE with
>>>>>> ??? a clear message if a class fails to resolve/validate the nest
>>>>>> host declared
>>>>>> ??? in NestHost/NestMembers attribute.
>>>>>> 8. JFR, jcmd, JDI are updated to support hidden classes.
>>>>>> 9. update javac LambdaToMethod as lambda proxy starts using nestmates
>>>>>> ??? and generate a bridge method to desuger a method reference to a
>>>>>> protected
>>>>>> ??? method in its supertype in a different package
>>>>>>
>>>>>> This patch also updates StringConcatFactory, LambdaMetaFactory, and
>>>>>> LambdaForms
>>>>>> to use hidden classes.? The webrev includes changes in nashorn to
>>>>>> hidden class
>>>>>> and I will update the webrev if JEP 372 removes it any time soon.
>>>>>>
>>>>>> We uncovered a bug in Lookup::defineClass spec throws LinkageError
>>>>>> and intends
>>>>>> to have the newly created class linked.? However, the
>>>>>> implementation in 14
>>>>>> does not link the class.? A separate CSR [2] proposes to update the
>>>>>> implementation to match the spec.? This patch fixes the
>>>>>> implementation.
>>>>>>
>>>>>> The spec update on JVM TI, JDI and Instrumentation will be done as
>>>>>> a separate RFE [3].? This patch includes new tests for JVM TI and
>>>>>> java.instrument that validates how the existing APIs work for
>>>>>> hidden classes.
>>>>>>
>>>>>> javadoc/specdiff
>>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
>>>>>>
>>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/
>>>>>>
>>>>>>
>>>>>> JVMS 5.4.4 change:
>>>>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf
>>>>>>
>>>>>>
>>>>>> CSR:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>>>>>
>>>>>> Thanks
>>>>>> Mandy
>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
>>>>
>>>

From mandy.chung at oracle.com  Fri Mar 27 23:46:00 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 27 Mar 2020 16:46:00 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
 <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
 <e9ee79ca-e00a-c1d7-27a1-e110064a8fa3@oracle.com>
 <dbdeae81-7006-6922-9919-f2d253152006@oracle.com>
 <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>
Message-ID: <09126b64-2aa5-9b0d-ec5a-62753a416ea9@oracle.com>


On 3/27/20 4:01 PM, David Holmes wrote:
> Hi Mandy,
>
> On 28/03/2020 8:29 am, Mandy Chung wrote:
>> Hi Vicente,
>>
>> hasNestmateAccess is about VM supports static nestmates on JDK 
>> release >= 11.
>>
>> However this is about javac --release 14 and the compiled classes may 
>> run on JDK 14 that lambda and string concat spin classes that are not 
>> nestmates. I have a patch with Jan's help:
>>
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html 
>
>
> +             /**
> +              * The VM does not support access across nested classes (8010319).
> +              * Were that ever to change, this should be removed.
> +              */
> +             boolean isPrivateInOtherClass() {
>
> I'm not at all sure what this means - access across different nests? 
> (I'm not even sure what that means.)

This just restores the old code that I had removed.

This method tries to determine whether the code accesses a private member 
of another class in the same nest (nested classes), which needs a 
synthetic bridge method for the access.
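
A rough way to see such a synthetic bridge (accessor) method from the outside is sketched below. The class and method names (BridgeProbe, Holder, readSecret, syntheticAccessors) are invented for this illustration and are not part of the patch. The sketch assumes that javac, when targeting a release without nestmate support, places a synthetic `access$NNN` method in the class that owns the private member; with a target of 11 or later I would expect the reported list to be empty.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example classes, not code from the webrev.
class BridgeProbe {
    static class Holder {
        private int secret = 42;
    }

    // Reads a private field of another class in the same nest.
    static int readSecret() {
        return new Holder().secret;
    }

    // Collects the names of synthetic methods that javac put into Holder.
    // What shows up here depends on the -source/-target/--release in use.
    static List<String> syntheticAccessors() {
        List<String> names = new ArrayList<>();
        for (Method m : Holder.class.getDeclaredMethods()) {
            if (m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("secret = " + readSecret());
        System.out.println("synthetic accessors in Holder: " + syntheticAccessors());
    }
}
```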

Mandy

From david.holmes at oracle.com  Sat Mar 28 01:15:43 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 28 Mar 2020 11:15:43 +1000
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <09126b64-2aa5-9b0d-ec5a-62753a416ea9@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
 <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
 <e9ee79ca-e00a-c1d7-27a1-e110064a8fa3@oracle.com>
 <dbdeae81-7006-6922-9919-f2d253152006@oracle.com>
 <51d0a4cf-2565-2d61-31bb-ff43bca6b0e4@oracle.com>
 <09126b64-2aa5-9b0d-ec5a-62753a416ea9@oracle.com>
Message-ID: <069c76b6-dd85-f8f5-c2dd-1ba994178084@oracle.com>

Hi Mandy,

On 28/03/2020 9:46 am, Mandy Chung wrote:
> 
> 
> On 3/27/20 4:01 PM, David Holmes wrote:
>> Hi Mandy,
>>
>> On 28/03/2020 8:29 am, Mandy Chung wrote:
>>> Hi Vicente,
>>>
>>> hasNestmateAccess is about VM supports static nestmates on JDK 
>>> release >= 11.
>>>
>>> However this is about javac --release 14 and the compiled classes may 
>>> run on JDK 14 that lambda and string concat spin classes that are not 
>>> nestmates. I have a patch with Jan's help:
>>>
>>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/index.html 
>>
>>
>> +             /**
>> +              * The VM does not support access across nested classes (8010319).
>> +              * Were that ever to change, this should be removed.
>> +              */
>> +             boolean isPrivateInOtherClass() {
>>
>> I'm not at all sure what this means - access across different nests? 
>> (I'm not even sure what that means.)
> 
> This just reverts the old code that I removed.

Ah, I see. This is ancient pre-nestmate code. Can we at least fix the 
comment, as it really doesn't make any sense?

> What this method is trying to determine if it accesses a private in 
> another class in the same nest (nested classes) that needs a synthetic 
> bridge method to access.

That would be a good comment to add. Something like:

If compiling for a release where the VM does not support access between
nested classes, this method indicates if a synthetic bridge method is
needed for access.

Thanks,
David

> Mandy

From paul.sandoz at oracle.com  Sat Mar 28 01:39:46 2020
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Fri, 27 Mar 2020 18:39:46 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <41b99011-921a-9fc6-c738-98c47e9959c3@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <72EC6AFD-412B-4238-B98E-90AF8C185D98@oracle.com>
 <c2ae90e1-5bf8-1d6b-a739-eb4aac542790@oracle.com>
 <41b99011-921a-9fc6-c738-98c47e9959c3@oracle.com>
Message-ID: <2667FFDE-44A9-4584-BF16-897B863D89F3@oracle.com>

+1
Paul.

> On Mar 27, 2020, at 3:22 PM, Mandy Chung <mandy.chung at oracle.com> wrote:
> 
> Hi Paul,
> 
> This is the delta incorporating your comment:
>   http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03-delta-psandoz/
> 
> This patch also took Alex's comment to make it clear that the hidden class is the lookup class of the returned Lookup object and drops the sentence you commented on:
> 
> On 3/27/20 1:18 PM, Mandy Chung wrote:
>>> MethodHandles.java 
>>> ? 
>>> 
>>> 1786          * (Given the {@code Lookup} object returned this method, its lookup class 
>>> 1787          * is a {@code Class} object for which {@link Class#getName()} returns a string 
>>> 1788          * that is not a binary name.) 
>>> 
>>> ? 
>>> (The {@code Lookup} object returned from this method has a lookup class that is 
>>> a {@code Class} object whose {@link Class#getName()} returns a string 
>>> that is not a binary name.) 
>>> ? 
> 
> 
> Mandy


From dean.long at oracle.com  Sat Mar 28 02:25:36 2020
From: dean.long at oracle.com (Dean Long)
Date: Fri, 27 Mar 2020 19:25:36 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <13d7ba73-e49f-a55d-7e80-bd10153152a4@oracle.com>

I looked at the AOT, C2, and JVMCI changes and I didn't find any issues.

dl

On 3/26/20 4:57 PM, Mandy Chung wrote:
> Please review the implementation of JEP 371: Hidden Classes. The main 
> changes are in core-libs and hotspot runtime area. Small changes are 
> made in javac, VM compiler (intrinsification of Class::isHiddenClass), 
> JFR, JDI, and jcmd.  CSR [1] has been reviewed and is in the finalized 
> state (see specdiff and javadoc below for reference).
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03
>
> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
> point of view, a hidden class is a normal class except for the following:
>
> - A hidden class has no initiating class loader and is not registered 
> in any dictionary.
> - A hidden class has a name containing an illegal character: 
> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` 
> returns "Lp/Foo.0x1234;".
> - A hidden class is not modifiable, i.e. it cannot be redefined or 
> retransformed. JVM TI IsModifiableClass returns false on a hidden class.
> - Final fields in a hidden class are "final".  The value of final 
> fields cannot be overridden via reflection.  setAccessible(true) can 
> still be called on reflected objects representing final fields in a 
> hidden class, and the access check will be suppressed, but this only 
> grants read access (i.e. Field::getXXX works but setXXX does not).
>
> Brief summary of this patch:
>
> 1. A new Lookup::defineHiddenClass method is the API to create a 
>    hidden class.
> 2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG 
>    options that can be specified when creating a hidden class.
> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
> 4. Field::setXXX methods will throw IAE on a final field of a hidden class
>    regardless of the value of the accessible flag.
> 5. JVM_LookupDefineClass is the new JVM entry point for 
>    Lookup::defineClass and defineHiddenClass to create a class from the
>    given bytes.
> 6. ClassLoaderData implementation is not changed.  There is one primary CLD
>    that holds the classes strongly referenced by its defining loader.
>    There can be zero or more additional CLDs - one per weak class.
> 7. Nest host determination is updated per revised JVMS 5.4.4.  The access
>    control check no longer throws LinkageError; instead it will throw IAE
>    with a clear message if a class fails to resolve/validate the nest host
>    declared in the NestHost/NestMembers attribute.
> 8. JFR, jcmd, JDI are updated to support hidden classes.
> 9. javac LambdaToMethod is updated, as the lambda proxy starts using
>    nestmates, and generates a bridge method to desugar a method reference
>    to a protected method in its supertype in a different package.
>
> This patch also updates StringConcatFactory, LambdaMetaFactory, and 
> LambdaForms to use hidden classes.  The webrev includes changes in nashorn
> to use hidden classes, and I will update the webrev if JEP 372 removes it
> any time soon.
>
> We uncovered a bug in Lookup::defineClass: the spec says it throws
> LinkageError and intends to have the newly created class linked.  However,
> the implementation in 14 does not link the class.  A separate CSR [2]
> proposes to update the implementation to match the spec.  This patch fixes
> the implementation.
>
> The spec update on JVM TI, JDI and Instrumentation will be done as
> a separate RFE [3].  This patch includes new tests for JVM TI and
> java.instrument that validate how the existing APIs work for hidden 
> classes.
>
> javadoc/specdiff
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/
>
> JVMS 5.4.4 change:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf
>
> CSR:
> https://bugs.openjdk.java.net/browse/JDK-8238359
>
> Thanks
> Mandy
> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
> [3] https://bugs.openjdk.java.net/browse/JDK-8230502


From chris.plummer at oracle.com  Sat Mar 28 03:51:37 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 27 Mar 2020 20:51:37 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com>

Hi Mandy,

A couple of very minor nits in the jvmtiRedefineClasses.cpp comments:

  153     // classes for primitives, arrays, hidden and vm unsafe anonymous classes
  154     // cannot be redefined.  Check here so following code can assume these classes
  155     // are InstanceKlass.
  156     if (!is_modifiable_class(mirror)) {
  157       _res = JVMTI_ERROR_UNMODIFIABLE_CLASS;
  158       return false;
  159     }

I think this code and comment predate anonymous classes. Probably before 
anonymous classes the check was not for !is_modifiable_class() but 
instead was just a check for primitive or array class types since they 
are not an InstanceKlass, and would cause issues when cast to one in the 
code that lies below this section. When anonymous classes were added, 
the code got changed to use !is_modifiable_class() and the comment was 
not correctly updated (anonymous classes are an InstanceKlass). Then 
with this webrev the mention of hidden classes was added, also 
incorrectly implying they are not an InstanceKlass. I think you should 
just leave off the last sentence of the comment.

There's some ambiguity in the application of adjectives in the following:

  297   // Cannot redefine or retransform a hidden or an unsafe anonymous class.

I'd suggest:

  297   // Cannot redefine or retransform a hidden class or an unsafe anonymous class.

There are some places in libjdwp that need to be fixed. I spoke to 
Serguei about those this afternoon. Basically the 
convertSignatureToClassname() function needs to be fixed to handle 
hidden classes. Without the fix classname filtering will have problems 
if the filter contains a pattern with a '/' to filter on hidden classes. 
Also CLASS_UNLOAD events will not properly convert hidden class names. 
We also need tests for these cases. I think these are all things that 
can be addressed later.
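
To illustrate the kind of conversion I mean, here is a sketch in Java rather than C. It is not the libjdwp code; SignatureNames and toClassName are hypothetical names. It assumes that a '.' can only appear in a class signature as the separator before a hidden class suffix (as in "Lp/Foo.0x1234;"), and that the matching class name uses '/' at that position (as in p.Foo/0x1234).

```java
// Hypothetical helper for illustration only; not the libjdwp implementation.
class SignatureNames {

    // Converts a JNI-style class signature such as "Ljava/lang/String;"
    // into a class name such as "java.lang.String".
    static String toClassName(String signature) {
        if (signature.length() < 3
                || signature.charAt(0) != 'L'
                || signature.charAt(signature.length() - 1) != ';') {
            throw new IllegalArgumentException("not a class signature: " + signature);
        }
        // Drop the leading 'L' and the trailing ';'.
        char[] name = signature.substring(1, signature.length() - 1).toCharArray();

        // Assumption: a '.' in the signature marks the hidden class suffix.
        int hiddenSeparator = -1;
        for (int i = 0; i < name.length; i++) {
            if (name[i] == '.') {
                hiddenSeparator = i;      // remember it, fix it up after the loop
            } else if (name[i] == '/') {
                name[i] = '.';            // package separator
            }
        }
        if (hiddenSeparator >= 0) {
            name[hiddenSeparator] = '/';  // hidden class names use '/' here
        }
        return new String(name);
    }

    public static void main(String[] args) {
        System.out.println(toClassName("Ljava/lang/String;"));
        System.out.println(toClassName("Lp/Foo.0x1234;"));
    }
}
```

With a conversion along these lines, a filter pattern containing '/' could be compared against the converted name of a hidden class.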

I still need to look over the JVMTI tests.

thanks,

Chris

On 3/26/20 4:57 PM, Mandy Chung wrote:
> Please review the implementation of JEP 371: Hidden Classes. The main 
> changes are in core-libs and hotspot runtime area. Small changes are 
> made in javac, VM compiler (intrinsification of Class::isHiddenClass), 
> JFR, JDI, and jcmd.  CSR [1] has been reviewed and is in the finalized 
> state (see specdiff and javadoc below for reference).
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03
>
> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
> point of view, a hidden class is a normal class except for the following:
>
> - A hidden class has no initiating class loader and is not registered 
> in any dictionary.
> - A hidden class has a name containing an illegal character: 
> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature` 
> returns "Lp/Foo.0x1234;".
> - A hidden class is not modifiable, i.e. it cannot be redefined or 
> retransformed. JVM TI IsModifiableClass returns false on a hidden class.
> - Final fields in a hidden class are "final".  The value of final 
> fields cannot be overridden via reflection.  setAccessible(true) can 
> still be called on reflected objects representing final fields in a 
> hidden class, and the access check will be suppressed, but this only 
> grants read access (i.e. Field::getXXX works but setXXX does not).
>
> Brief summary of this patch:
>
> 1. A new Lookup::defineHiddenClass method is the API to create a 
>    hidden class.
> 2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG 
>    options that can be specified when creating a hidden class.
> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
> 4. Field::setXXX methods will throw IAE on a final field of a hidden class
>    regardless of the value of the accessible flag.
> 5. JVM_LookupDefineClass is the new JVM entry point for 
>    Lookup::defineClass and defineHiddenClass to create a class from the
>    given bytes.
> 6. ClassLoaderData implementation is not changed.  There is one primary CLD
>    that holds the classes strongly referenced by its defining loader.
>    There can be zero or more additional CLDs - one per weak class.
> 7. Nest host determination is updated per revised JVMS 5.4.4.  The access
>    control check no longer throws LinkageError; instead it will throw IAE
>    with a clear message if a class fails to resolve/validate the nest host
>    declared in the NestHost/NestMembers attribute.
> 8. JFR, jcmd, JDI are updated to support hidden classes.
> 9. javac LambdaToMethod is updated, as the lambda proxy starts using
>    nestmates, and generates a bridge method to desugar a method reference
>    to a protected method in its supertype in a different package.
>
> This patch also updates StringConcatFactory, LambdaMetaFactory, and 
> LambdaForms
> to use hidden classes.? The webrev includes changes in nashorn to 
> hidden class
> and I will update the webrev if JEP 372 removes it any time soon.
>
> We uncovered a bug in Lookup::defineClass spec throws LinkageError and 
> intends
> to have the newly created class linked.? However, the implementation in 14
> does not link the class.? A separate CSR [2] proposes to update the
> implementation to match the spec.? This patch fixes the implementation.
>
> The spec update on JVM TI, JDI and Instrumentation will be done as
> a separate RFE [3].? This patch includes new tests for JVM TI and
> java.instrument that validates how the existing APIs work for hidden 
> classes.
>
> javadoc/specdiff
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/
>
> JVMS 5.4.4 change:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf
>
> CSR:
> https://bugs.openjdk.java.net/browse/JDK-8238359
>
> Thanks
> Mandy
> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
> [3] https://bugs.openjdk.java.net/browse/JDK-8230502


From mandy.chung at oracle.com  Mon Mar 30 02:17:27 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Sun, 29 Mar 2020 19:17:27 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com>
Message-ID: <c56eba10-13eb-aa86-11a3-f2dfd1a7b0b0@oracle.com>


On 3/27/20 8:51 PM, Chris Plummer wrote:
> Hi Mandy,
>
> A couple of very minor nits in the jvmtiRedefineClasses.cpp comments:
>
>  153     // classes for primitives, arrays, hidden and vm unsafe anonymous classes
>  154     // cannot be redefined.  Check here so following code can assume these classes
>  155     // are InstanceKlass.
>  156     if (!is_modifiable_class(mirror)) {
>  157       _res = JVMTI_ERROR_UNMODIFIABLE_CLASS;
>  158       return false;
>  159     }
>
> I think this code and comment predate anonymous classes. Probably 
> before anonymous classes the check was not for !is_modifiable_class() 
> but instead was just a check for primitive or array class types since 
> they are not an InstanceKlass, and would cause issues when cast to one 
> in the code that lies below this section. When anonymous classes were 
> added, the code got changed to use !is_modifiable_class() and the 
> comment was not correctly updated (anonymous classes are an 
> InstanceKlass). Then with this webrev the mention of hidden classes 
> was added, also incorrectly implying they are not an InstanceKlass. I 
> think you should just leave off the last sentence of the comment.
>

I agree with you that this comment needs an update. Perhaps it should say
"primitive, array types and hidden classes are non-modifiable. A
modifiable class must be an InstanceKlass."

I'll leave it to Serguei, who may have another opinion.

> There's some ambiguity in the application of adjectives in the following:
>
>  297   // Cannot redefine or retransform a hidden or an unsafe anonymous class.
>
> I'd suggest:
>
>  297   // Cannot redefine or retransform a hidden class or an unsafe anonymous class.
>

+1

> There are some places in libjdwp that need to be fixed. I spoke to 
> Serguei about those this afternoon. Basically the 
> convertSignatureToClassname() function needs to be fixed to handle 
> hidden classes. Without the fix classname filtering will have problems 
> if the filter contains a pattern with a '/' to filter on hidden 
> classes. Also CLASS_UNLOAD events will not properly convert hidden 
> class names. We also need tests for these cases. I think these are all 
> things that can be addressed later.
>

Good catch. I have created a subtask under JDK-8230502:
    https://bugs.openjdk.java.net/browse/JDK-8230502
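
To make the naming issue concrete, here is a minimal sketch of the conversion that a hidden-class-aware convertSignatureToClassname() would have to perform. It is not the libjdwp code: the function name signature_to_class_name is hypothetical, and it assumes that a '.' can only appear in a class signature as the separator before the hidden-class suffix.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not the libjdwp implementation: converts a JVM TI
// class signature such as "Lp/Foo.0x1234;" into the name reported by
// Class::getName, "p.Foo/0x1234".
std::string signature_to_class_name(const std::string& signature) {
    // Only plain class signatures of the form "L...;" are handled here.
    assert(signature.size() >= 2 &&
           signature.front() == 'L' && signature.back() == ';');
    std::string name = signature.substr(1, signature.size() - 2);
    for (char& c : name) {
        if (c == '/') {
            c = '.';  // package separator in the signature
        } else if (c == '.') {
            c = '/';  // assumed to be the separator before the hidden-class suffix
        }
    }
    return name;
}
```

With the example from the review request, signature_to_class_name("Lp/Foo.0x1234;") gives "p.Foo/0x1234", while an ordinary signature such as "Ljava/lang/String;" still maps to "java.lang.String".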

> I still need to look over the JVMTI tests.
>

Thanks
Mandy
> thanks,
>
> Chris
>
> On 3/26/20 4:57 PM, Mandy Chung wrote:
>> Please review the implementation of JEP 371: Hidden Classes. The main
>> changes are in core-libs and hotspot runtime area. Small changes are
>> made in javac, VM compiler (intrinsification of
>> Class::isHiddenClass), JFR, JDI, and jcmd. CSR [1] has been reviewed
>> and is in the finalized state (see specdiff and javadoc below for
>> reference).
>>
>> Webrev:
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03
>>
>> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
>> point of view, a hidden class is a normal class except for the following:
>>
>> - A hidden class has no initiating class loader and is not registered
>>   in any dictionary.
>> - A hidden class has a name containing an illegal character:
>>   `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
>>   returns "Lp/Foo.0x1234;".
>> - A hidden class is not modifiable, i.e. it cannot be redefined or
>>   retransformed. JVM TI IsModifiableClass returns false on a hidden class.
>> - Final fields in a hidden class are "final". The value of final
>>   fields cannot be overridden via reflection. setAccessible(true) can
>>   still be called on reflected objects representing final fields in a
>>   hidden class, and the access check will be suppressed, but such
>>   objects only have read access (i.e. Field::getXXX works but setXXX
>>   does not).
>>
>> Brief summary of this patch:
>>
>> 1. A new Lookup::defineHiddenClass method is the API to create a
>>    hidden class.
>> 2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG
>>    options that can be specified when creating a hidden class.
>> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
>> 4. Field::setXXX methods will throw IAE on a final field of a hidden
>>    class regardless of the value of the accessible flag.
>> 5. JVM_LookupDefineClass is the new JVM entry point for
>>    Lookup::defineClass and defineHiddenClass to create a class from the
>>    given bytes.
>> 6. ClassLoaderData implementation is not changed. There is one primary
>>    CLD that holds the classes strongly referenced by its defining
>>    loader. There can be zero or more additional CLDs - one per weak class.
>> 7. Nest host determination is updated per revised JVMS 5.4.4. The access
>>    control check no longer throws LinkageError; instead it will throw IAE
>>    with a clear message if a class fails to resolve/validate the nest
>>    host declared in the NestHost/NestMembers attribute.
>> 8. JFR, jcmd, JDI are updated to support hidden classes.
>> 9. Update javac LambdaToMethod, as lambda proxies start using nestmates,
>>    and generate a bridge method to desugar a method reference to a
>>    protected method in its supertype in a different package.
>>
>> This patch also updates StringConcatFactory, LambdaMetaFactory, and
>> LambdaForms to use hidden classes. The webrev includes changes in
>> nashorn to use hidden classes, and I will update the webrev if JEP 372
>> removes nashorn any time soon.
>>
>> We uncovered a bug in Lookup::defineClass: the spec says it throws
>> LinkageError and intends the newly created class to be linked. However,
>> the implementation in 14 does not link the class. A separate CSR [2]
>> proposes to update the implementation to match the spec. This patch
>> fixes the implementation.
>>
>> The spec update on JVM TI, JDI and Instrumentation will be done as
>> a separate RFE [3]. This patch includes new tests for JVM TI and
>> java.instrument that validate how the existing APIs work for hidden
>> classes.
>>
>> javadoc/specdiff
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ 
>>
>>
>> JVMS 5.4.4 change:
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf 
>>
>>
>> CSR:
>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>
>> Thanks
>> Mandy
>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
>
>


From serguei.spitsyn at oracle.com  Mon Mar 30 03:40:43 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 29 Mar 2020 20:40:43 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <c56eba10-13eb-aa86-11a3-f2dfd1a7b0b0@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com>
 <c56eba10-13eb-aa86-11a3-f2dfd1a7b0b0@oracle.com>
Message-ID: <7fe9fd65-8bee-beb3-03af-ab56120a4cc1@oracle.com>

Hi Mandy and Chris,


On 3/29/20 19:17, Mandy Chung wrote:
>
>
> On 3/27/20 8:51 PM, Chris Plummer wrote:
>> Hi Mandy,
>>
>> A couple of very minor nits in the jvmtiRedefineClasses.cpp comments:
>>
>>  153     // classes for primitives, arrays, hidden and vm unsafe anonymous classes
>>  154     // cannot be redefined.  Check here so following code can assume these classes
>>  155     // are InstanceKlass.
>>  156     if (!is_modifiable_class(mirror)) {
>>  157       _res = JVMTI_ERROR_UNMODIFIABLE_CLASS;
>>  158       return false;
>>  159     }
>>
>> I think this code and comment predate anonymous classes. Probably 
>> before anonymous classes the check was not for !is_modifiable_class() 
>> but instead was just a check for primitive or array class types since 
>> they are not an InstanceKlass, and would cause issues when cast to 
>> one in the code that lies below this section. When anonymous classes 
>> were added, the code got changed to use !is_modifiable_class() and 
>> the comment was not correctly updated (anonymous classes are an 
>> InstanceKlass). Then with this webrev the mention of hidden classes 
>> was added, also incorrectly implying they are not an InstanceKlass. I 
>> think you should just leave off the last sentence of the comment.
>>
>
> I agree with you that this comment needs an update. Perhaps it should
> say "primitive, array types and hidden classes are non-modifiable. A
> modifiable class must be an InstanceKlass."
>
> I'll leave it to Serguei, who may have another opinion.

We already had a chat with Chris about this.
This suggestion looks right.


>> There's some ambiguity in the application of adjectives in the 
>> following:
>>
>>  297   // Cannot redefine or retransform a hidden or an unsafe anonymous class.
>>
>> I'd suggest:
>>
>>  297   // Cannot redefine or retransform a hidden class or an unsafe anonymous class.
>>
>
> +1

+1

>> There are some places in libjdwp that need to be fixed. I spoke to 
>> Serguei about those this afternoon. Basically the 
>> convertSignatureToClassname() function needs to be fixed to handle 
>> hidden classes. Without the fix classname filtering will have 
>> problems if the filter contains a pattern with a '/' to filter on 
>> hidden classes. Also CLASS_UNLOAD events will not properly convert 
>> hidden class names. We also need tests for these cases. I think these 
>> are all things that can be addressed later.
>>
>
> Good catch. I have created a subtask under JDK-8230502:
>     https://bugs.openjdk.java.net/browse/JDK-8230502

Yes, it is a good catch. Thank you for filing the subtask.
We discussed this with Chris.
This was expected to be found with the new test coverage and fixed in the
JDI chunk of work, which we have decided to separate from JEP 371.


Thanks,
Serguei

>> I still need to look over the JVMTI tests.
>>
>
> Thanks
> Mandy
>> thanks,
>>
>> Chris
>>


From richard.reingruber at sap.com  Mon Mar 30 08:10:42 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 30 Mar 2020 08:10:42 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB333131615D3C17ABB72153449BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi,

this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)

The change affects jvmti, hotspot and c2. Partial reviews are very welcome too.

Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/

Robbin, Martin, please let me know if anything isn't quite as you wanted it. You will also find my
comments on your feedback below.

Robbin, can I count you as Reviewer for the runtime part?

Thanks, Richard.

-----Original Message-----
From: Doerr, Martin <martin.doerr at sap.com> 
Sent: Thursday, 12 March 2020 17:28
To: Reingruber, Richard <richard.reingruber at sap.com>; 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,


I managed to find time for an (almost) complete review of webrev.4. (I'll review the tests separately.)

First of all, the change seems to be of pretty good quality given its significant complexity. I couldn't find any real bugs, but I'd like to propose minor improvements.
I'm convinced that it's mature because we did substantial testing.

I like the new functionality for object deoptimization. It can possibly be reused for future escape analysis based optimizations. So I appreciate having it available in the code base.
In addition to that, your change makes the JVMTI implementation better integrated into the VM.


Now to the details:


src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters. Ok.


src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability. Ok.


src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!


src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.


src/hotspot/share/code/nmethod.cpp
Nice cleanup!


src/hotspot/share/code/pcDesc.hpp
Additional parameters. Ok.


src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters. Ok.


src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.


src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters. Ok.


src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMTI capabilities. Nice!


src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes. Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest factoring out something like has_not_global_escape and has_arg_escape. The code could then look like this:
      SafePointNode* sfn = sfn_worklist.at(next);
      sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
      if (sfn->is_CallJava()) {
        CallJavaNode* call = sfn->as_CallJava();
        call->set_arg_escape(has_arg_escape(call));
      }
This would also allow us to get rid of the found_..._escape_in_args variables, making the loops more readable.
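
To illustrate the shape of that refactoring outside of C2, here is a small self-contained sketch. The types Escape and Node are hypothetical stand-ins (the real PointsToNode and SafePointNode classes are far richer), and the exact meaning of the two predicates is my reading of the flag names, not a copy of the webrev.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for C2's escape states and safepoint nodes.
enum class Escape { NoEscape, ArgEscape, GlobalEscape };

struct Node {
    bool is_call_java = false;
    std::vector<Escape> scope_objects;  // objects referenced by the scope's debug info
    std::vector<Escape> arg_objects;    // objects passed as call arguments
    bool not_global_escape_in_scope = false;
    bool arg_escape = false;
};

// True if any object in the safepoint's scope does not escape globally.
static bool has_not_global_escape(const Node& sfn) {
    for (Escape e : sfn.scope_objects) {
        if (e != Escape::GlobalEscape) return true;
    }
    return false;
}

// True if any argument of the Java call is an ArgEscape object.
static bool has_arg_escape(const Node& call) {
    assert(call.is_call_java);  // only meaningful for Java calls
    for (Escape e : call.arg_objects) {
        if (e == Escape::ArgEscape) return true;
    }
    return false;
}

// The worklist loop shrinks to two setters fed by the predicates, with no
// found_..._escape flag variables left in the loop body.
void annotate(std::vector<Node>& sfn_worklist) {
    for (Node& sfn : sfn_worklist) {
        sfn.not_global_escape_in_scope = has_not_global_escape(sfn);
        if (sfn.is_call_java) {
            sfn.arg_escape = has_arg_escape(sfn);
        }
    }
}
```

Each predicate can return as soon as it finds a match, which is what removes the need for the flag variables.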

It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.


src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations. Ok.


src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters. Ok.


src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!


src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread. Good.


src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a concern here, so it's ok.
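
For readers following along, the ordering described above can be modelled in a few lines. This is only a sketch with hypothetical classes (EscapeBarrierModel, GetOrSetLocalModel) that log events instead of touching threads; it shows the RAII ordering and nothing about the real VM operation machinery.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model, not HotSpot code: records the order of events.
static std::vector<std::string> events;

// Stand-in for EscapeBarrier: suspends the target thread on construction
// and resumes it on destruction.
class EscapeBarrierModel {
public:
    EscapeBarrierModel()  { events.push_back("suspend target"); }
    ~EscapeBarrierModel() { events.push_back("resume target"); }
    EscapeBarrierModel(const EscapeBarrierModel&) = delete;
    EscapeBarrierModel& operator=(const EscapeBarrierModel&) = delete;
};

// Stand-in for VM_GetOrSetLocal: owning the barrier as a member ties the
// suspension to the lifetime of the operation object.
class GetOrSetLocalModel {
    EscapeBarrierModel _eb;        // constructed with the operation object
    bool _prologue_done = false;
public:
    bool doit_prologue() {
        events.push_back("deoptimize objects");
        _prologue_done = true;
        return true;
    }
    void doit() {
        assert(_prologue_done);    // objects must be deoptimized first
        events.push_back("access local");
    }
};

void run_get_or_set_local() {
    GetOrSetLocalModel op;         // member barrier suspends the target
    if (op.doit_prologue()) {      // object deoptimization happens here
        op.doit();                 // the local can now be read or written
    }
}                                  // op destroyed: barrier resumes the target
```

Calling run_get_or_set_local() records the four events in the order suspend, deoptimize, access, resume.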

VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.
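
As a sanity check of that rule, here is a tiny sketch of the frame selection. FrameInfo and frames_to_deoptimize are hypothetical names; in the patch the two flags come from the debug info of compiled frames, which is not modelled here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-frame summary of the two flags discussed in this review.
struct FrameInfo {
    bool not_global_escape_in_scope;  // frame holds objects that do not escape globally
    bool arg_escape;                  // frame passes ArgEscape objects to its callee
};

// Returns the indices of the frames to deoptimize; index 0 is the top frame.
std::vector<std::size_t> frames_to_deoptimize(const std::vector<FrameInfo>& stack) {
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < stack.size(); ++i) {
        const bool is_top = (i == 0);
        // Top frame: any non-globally escaping object matters.
        // Other frames: only objects passed down as arguments matter.
        const bool needs_deopt = is_top ? stack[i].not_global_escape_in_scope
                                        : stack[i].arg_escape;
        if (needs_deopt) {
            result.push_back(i);
        }
    }
    return result;
}
```

A caller frame whose non-escaping objects are never passed down the stack is left alone, since the accessed top frame cannot reach them.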


src/hotspot/share/prims/jvmtiTagMap.cpp
Escape barriers + deoptimize objects for all threads. Ok.


src/hotspot/share/prims/whitebox.cpp
Added WB_IsFrameDeoptimized to API. Ok.


src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals, here.
First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
Comments are sufficient to understand why things are done as they are implemented.

BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
Anyway, looks correct, too.

Typo in comment: "regularily" => "regularly"

Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except the JavaThread destructor). But I think we always go through it, so I can't see a memory leak or similar issues.

EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().

You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.

I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.

Typo in comment: "we must only deoptimize" => "we only have to deoptimize"

"bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.

I'll get back to suspend flags, later.

There are weird cases regarding _self_deoptimization_in_progress.
Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.

I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.

Change in thread_added:
I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
Your version makes new threads come up with the suspend flag set. That looks correct, too. The advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
For now, I'm ok with your version.

I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).

Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
Maybe adding suffixes would help a little bit, but I can also live with what you have.
Implementation looks correct to me.


src/hotspot/share/runtime/deoptimization.hpp
Escape barriers and object deoptimization functions.
Typo in comment: "helt" => "held"


src/hotspot/share/runtime/globals.hpp
Addition of develop flag DeoptimizeObjectsALotInterval. Ok.


src/hotspot/share/runtime/interfaceSupport.cpp
InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.


src/hotspot/share/runtime/interfaceSupport.inline.hpp
Addition of deoptimizeAllObjects. Ok.


src/hotspot/share/runtime/mutexLocker.cpp
src/hotspot/share/runtime/mutexLocker.hpp
Addition of EscapeBarrier_lock. Ok.


src/hotspot/share/runtime/objectMonitor.cpp
Make recursion count relock aware. Ok.


src/hotspot/share/runtime/stackValue.hpp
Better reinitialization in StackValue. Good.


src/hotspot/share/runtime/thread.cpp
src/hotspot/share/runtime/thread.hpp
src/hotspot/share/runtime/thread.inline.hpp
wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.

In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as a temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.

You can use MutexLocker with Thread*.

JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.


src/hotspot/share/runtime/vframe.cpp
Added support for entry frame to new_vframe. Ok.


src/hotspot/share/runtime/vframe_hp.cpp
src/hotspot/share/runtime/vframe_hp.hpp

I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).

jvmtiDeferredLocalVariableSet::update_monitors:
Please add a comment explaining that the owner referenced by the original info may be scalar replaced, but it is deoptimized in the vframe.


src/hotspot/share/utilities/macros.hpp
Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.


test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
New test. Will review separately.


test/jdk/TEST.ROOT
Addition of vm.jvmci as required property. Ok.


test/jdk/com/sun/jdi/EATests.java
test/jdk/com/sun/jdi/EATestsJVMCI.java
New test. Will review separately.


test/lib/sun/hotspot/WhiteBox.java
Added isFrameDeoptimized to API. Ok.


That was it. Best regards,
Martin


> -----Original Message-----
> From: hotspot-compiler-dev <hotspot-compiler-dev-
> bounces at openjdk.java.net> On Behalf Of Reingruber, Richard
> Sent: Tuesday, 3 March 2020 21:23
> To: 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better
> Performance in the Presence of JVMTI Agents
> 
> Hi Robbin,
> 
> > > I understand that Robbin proposed to replace the usage of
> > > _suspend_flag with handshakes. Apparently, async handshakes
> > > are needed to do so. We have been waiting a while for removal
> > > of the _suspend_flag / introduction of async handshakes [2].
> > > What is the status here?
> 
> > I have an old prototype which I would like to continue to work on.
> > So do not assume asynch handshakes will make 15.
> > Even if it would, I think there are a lot more investigate work to remove
> > _suspend_flag.
> 
> Let us know, if we can be of any help to you and be it only testing.
> 
> > >> Full:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> 
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to
> clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Will do.
> 
> > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's
> own
> > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> You are right. It shouldn't be declared in thread.hpp. I will look into that.
> 
> > Note that we also think we may have a bug in deopt:
> > https://bugs.openjdk.java.net/browse/JDK-8238237
> 
> > I think it would be best, if possible, to push after that is resolved.
> 
> Sure.
> 
> > Not even nearly a full review :)
> 
> I know :)
> 
> Anyways, thanks a lot,
> Richard.
> 
> 
> -----Original Message-----
> From: Robbin Ehn <robbin.ehn at oracle.com>
> Sent: Monday, March 2, 2020 11:17 AM
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard
> <richard.reingruber at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi,
> 
> On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> > Hi,
> >
> > I had a look at the progress of this change. Nothing
> > happened since Richard posted his update using more
> > handshakes [1].
> > But we (SAP) would appreciate a lot if this change could
> > be successfully reviewed and pushed.
> >
> > I think there is basic understanding that this
> > change is helpful. It fixes a number of issues with JVMTI,
> > and will deliver the same performance benefits as EA
> > does in current production mode for debugging scenarios.
> >
> > This is important for us as we run our VMs prepared
> > for debugging in production mode.
> >
> > I understand that Robbin proposed to replace the usage of
> > _suspend_flag with handshakes. Apparently, async handshakes
> > are needed to do so. We have been waiting a while for removal
> > of the _suspend_flag / introduction of async handshakes [2].
> > What is the status here?
> 
> I have an old prototype which I would like to continue to work on.
> So do not assume async handshakes will make 15.
> Even if they do, I think there is a lot more investigative work needed to
> remove _suspend_flag.
> 
> >
> > I think we should no longer wait, but proceed with
> > this change. We will look into removing the usage of
> > suspend_flag introduced here once it is possible to implement
> > it with handshakes.
> 
> Yes, sure.
> 
> >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> 
> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its
> own hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> Note that we also think we may have a bug in deopt:
> https://bugs.openjdk.java.net/browse/JDK-8238237
> 
> I think it would be best, if possible, to push after that is resolved.
> 
> Not even nearly a full review :)
> 
> Thanks, Robbin
> 
> 
> >> Incremental:
> >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
> >>
> >> I was not able to eliminate the additional suspend flag now. I'll take care
> >> of this as soon as the existing suspend-resume-mechanism is reworked.
> >>
> >> Testing:
> >>
> >> Nightly tests @SAP:
> >>
> >>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
> >>    Suite, SAP specific tests
> >>    with fastdebug and release builds on all platforms
> >>
> >>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x
> >>    parallel for 24h
> >>
> >> Thanks, Richard.
> >>
> >>
> >> More details on the changes:
> >>
> >> * Hide DeoptimizeObjectsALotThread from external view.
> >>
> >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
> >>    It used to be _safepoint_check_sometimes, which will be eliminated
> >>    sooner or later.
> >>    I added explicit thread state changes with ThreadBlockInVM to code
> >>    paths where we can wait() on EscapeBarrier_lock to become safepoint
> >>    safe.
> >>
> >> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads
> >>    instead of vm operation VM_ThreadSuspendAllForObjDeopt.
> >>
> >> * Removed uses of Threads_lock. When adding a new thread we suspend it iff
> >>    EA optimizations are being reverted. In the previous version we were
> >>    waiting on Threads_lock while EA optimizations were reverted. See
> >>    EscapeBarrier::thread_added().
> >>
> >> * Made tests require Xmixed compilation mode.
> >>
> >> * Made tests agnostic regarding tiered compilation.
> >>    I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or
> >>    disabled.
> >>
> >> * Exercising EATests.java as well with stress test options
> >>    DeoptimizeObjectsALot*
> >>    Due to the non-deterministic deoptimizations some tests need to be
> >>    skipped.
> >>    We do this to prevent bit-rot of the stress test code.
> >>
> >> * Executing EATests.java as well with graal if available. Driver for this is
> >>    EATestsJVMCI.java. Graal cannot pass all tests, because it does not
> >>    provide all the new debug info
> >>    (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
> >>    And graal does not yet support the JVMTI operations force early return
> >>    and pop frame.
> >>
> >> * Removed tracing from new jdi tests in EATests.java. Too much trace output
> >>    before the debugging connection is established can cause deadlock
> >>    because output buffers fill up.
> >>    (See https://bugs.openjdk.java.net/browse/JDK-8173304)
> >>
> >> * Many copyright year changes and smaller clean-up changes of testing code
> >>    (trailing white-space and the like).
> >>
> >>
> >> -----Original Message-----
> >> From: David Holmes <david.holmes at oracle.com>
> >> Sent: Thursday, 19 December 2019 03:12
> >> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-
> >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
> hotspot-
> >> runtime-dev at openjdk.java.net; Vladimir Kozlov
> (vladimir.kozlov at oracle.com)
> >> <vladimir.kozlov at oracle.com>
> >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> Performance in
> >> the Presence of JVMTI Agents
> >>
> >> Hi Richard,
> >>
> >> I think my issue is with the way EliminateNestedLocks works so I'm going
> >> to look into that more deeply.
> >>
> >> Thanks for the explanations.
> >>
> >> David
> >>
> >> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> >>> Hi David,
> >>>
> >>>     > >    > Some further queries/concerns:
> >>>     > >    >
> >>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
> >>>     > >    >
> >>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
> >>>     > >    >
> >>>     > >    > !   _recursions = save      // restore the old recursion count
> >>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
> >>>     > >    > increased by the deferred relock count
> >>>     > >    >
> >>>     > >    > what is the "deferred relock count"? I gather it relates to
> >>>     > >    >
> >>>     > >    > "The code was extended to be able to deoptimize objects of a
> >>>     > > frame that
> >>>     > >    > is not the top frame and to let another thread than the owning
> >>>     > > thread do
> >>>     > >    > it."
> >>>     > >
> >>>     > > Yes, these relate. Currently EA based optimizations are reverted, when a
> >>>     > > compiled frame is replaced with corresponding interpreter frames. Part
> >>>     > > of this is relocking objects with eliminated locking. New with the
> >>>     > > enhancement is that we do this also just before object references are
> >>>     > > acquired through JVMTI. In this case we deoptimize also the owning
> >>>     > > compiled frame C and we register deoptimized objects as deferred
> >>>     > > updates. When control returns to C it gets deoptimized, we notice that
> >>>     > > objects are already deoptimized (reallocated and relocked), so we don't
> >>>     > > do it again (relocking twice would be incorrect of course). Deferred
> >>>     > > updates are copied into the new interpreter frames.
> >>>     > >
> >>>     > > Problem: relocking is not possible if the target thread T is waiting on
> >>>     > > the monitor that needs to be relocked. This happens only with non-local
> >>>     > > objects with EliminateNestedLocks. Instead relocking is deferred until T
> >>>     > > owns the monitor again. This is what the piece of code above does.
> >>>     >
> >>>     >  Sorry I need some more detail here. How can you wait() on an object
> >>>     >  monitor if the object allocation and/or locking was optimised away? And
> >>>     >  what is a "non-local object" in this context? Isn't EA restricted to
> >>>     >  thread-confined objects?
> >>>
> >>> "Non-local object" is an object that escapes its thread. The issue I'm
> >>> addressing with the changes in ObjectMonitor::wait is almost unrelated to
> >>> EA. It is caused by EliminateNestedLocks, where C2 eliminates recursive
> >>> locking of an already owned lock. The lock owning object exists on the
> >>> heap, it is locked and you can call wait() on it.
> >>>
> >>> EliminateLocks is the C2 option that controls lock elimination based on EA.
> >>> Both optimizations have in common that objects with eliminated locking need
> >>> to be relocked when deoptimizing a frame, i.e. when replacing a compiled
> >>> frame with equivalent interpreter frames. Deoptimization::relock_objects
> >>> does that job for /all/ eliminated locks in scope. /All/ can be a mix of
> >>> eliminated nested locks and locks of not-escaping objects.
> >>>
> >>> New with the enhancement: I call relock_objects earlier, just before objects
> >>> potentially escape. But then later when the owning compiled frame gets
> >>> deoptimized, I must not do it again:
> >>>
> >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> >>>
> >>>    if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> >>>        && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> >>>      bool unused;
> >>>      eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> >>>    }
> >>>
> >>> Now when calling relock_objects early it is quite possible that I have to
> >>> relock an object the target thread currently waits for. Obviously I cannot
> >>> relock in this case, instead I chose to introduce relock_count_after_wait to
> >>> JavaThread.
> >>>
> >>>     >  Is it just that some of the locking gets optimized away e.g.
> >>>     >
> >>>     >  synchronized(obj) {
> >>>     >     synchronized(obj) {
> >>>     >       synchronized(obj) {
> >>>     >         obj.wait();
> >>>     >       }
> >>>     >     }
> >>>     >  }
> >>>     >
> >>>     >  If this is reduced to a form as-if it were a single lock of the monitor
> >>>     >  (due to EA) and the wait() triggers a JVM TI event which leads to the
> >>>     >  escape of "obj" then we need to reconstruct the true lock state, and so
> >>>     >  when the wait() internally unblocks and reacquires the monitor it has to
> >>>     >  set the true recursion count to 3, not the 1 that it appeared to be when
> >>>     >  wait() was initially called. Is that the scenario?
> >>>
> >>> Kind of... except that the locking is not eliminated due to EA and there is
> >>> no JVM TI event triggered by wait.
> >>>
> >>> Add
> >>>
> >>> LocalObject l1 = new LocalObject();
> >>>
> >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1.
> >>> This triggers the code in question.
> >>>
> >>> Note that relocking/reallocating is transactional. If it is done, then it is
> >>> done for /all/ objects in scope and at most once. It wouldn't be quite so
> >>> easy to split this into relocking of nested and of EA-based eliminated
> >>> locks.
> >>>
> >>>     >  If so I find this truly awful. Anyone using wait() in a realistic form
> >>>     >  requires a notification and so the object cannot be thread confined. In
> >>>
> >>> It is not thread confined.
> >>>
> >>>     >  which case I would strongly argue that upon hitting the wait() the deopt
> >>>     >  should occur unconditionally and so the lock state is correct before we
> >>>     >  wait and so we don't need to mess with the recursion count internally
> >>>     >  when we reacquire the monitor.
> >>>     >
> >>>     > >
> >>>     > >    > which I don't like the sound of at all when it comes to ObjectMonitor
> >>>     > >    > state. So I'd like to understand in detail exactly what is going on here
> >>>     > >    > and why.  This is a very intrusive change that seems to badly break
> >>>     > >    > encapsulation and impacts future changes to ObjectMonitor that are under
> >>>     > >    > investigation.
> >>>     > >
> >>>     > > I would not regard this as breaking encapsulation. Certainly not badly.
> >>>     > >
> >>>     > > I've added a property relock_count_after_wait to JavaThread. The
> >>>     > > property is well encapsulated. Future ObjectMonitor implementations
> >>>     > > have to deal with recursion too. They are free in choosing a way to do
> >>>     > > that as long as that property is taken into account. This is hardly a
> >>>     > > limitation.
> >>>     >
> >>>     >  I do think this badly breaks encapsulation as you have to add a callout
> >>>     >  from the guts of the ObjectMonitor code to reach into the thread to get
> >>>     >  this lock count adjustment. I understand why you have had to do this but
> >>>     >  I would much rather see a change to the EA optimisation strategy so that
> >>>     >  this is not needed.
> >>>     >
> >>>     > > Note also that the property is a straight forward extension of the
> >>>     > > existing concept of deferred local updates. It is embedded into the
> >>>     > > structure holding them. So not even the footprint of a JavaThread is
> >>>     > > enlarged if no deferred updates are generated.
> >>>     >
> >>>     > [...]
> >>>     >
> >>>     > >
> >>>     > > I'm actually duplicating the existing external suspend mechanism,
> >>>     > > because a thread can be suspended at most once. And hey, I don't like
> >>>     > > that either! But it seems not unlikely that the duplicate can be
> >>>     > > removed together with the original and the new type of handshakes
> >>>     > > that will be used for thread suspend can be used for object
> >>>     > > deoptimization too. See today's discussion in JDK-8227745 [2].
> >>>     >
> >>>     >  I hope that discussion bears some fruit, at the moment it seems not to
> >>>     >  be possible to use handshakes here. :(
> >>>     >
> >>>     >  The external suspend mechanism is a royal pain in the proverbial that we
> >>>     >  have to carefully live with. The idea that we're duplicating that for
> >>>     >  use in another fringe area of functionality does not thrill me at all.
> >>>     >
> >>>     >  To be clear, I understand the problem that exists and that you wish to
> >>>     >  solve, but for the runtime parts I balk at the complexity cost of
> >>>     >  solving it.
> >>>
> >>> I know it's complex, but it is far from rocket science.
> >>>
> >>> Also I find it hard to imagine another fix for JDK-8233915 besides changing
> >>> the JVM TI specification.
> >>>
> >>> Thanks, Richard.
> >>>
> >>> -----Original Message-----
> >>> From: David Holmes <david.holmes at oracle.com>
> >>> Sent: Tuesday, 17 December 2019 08:03
> >>> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-
> >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
> hotspot-
> >> runtime-dev at openjdk.java.net; Vladimir Kozlov
> (vladimir.kozlov at oracle.com)
> >> <vladimir.kozlov at oracle.com>
> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> Performance
> >> in the Presence of JVMTI Agents
> >>>
> >>> <resend as my mailer crashed during last send>
> >>>
> >>> David
> >>>
> >>> On 17/12/2019 4:57 pm, David Holmes wrote:
> >>>> Hi Richard,
> >>>>
> >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
> >>>>> Hi David,
> >>>>>
> >>>>>    > Some further queries/concerns:
> >>>>>    >
> >>>>>    > src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>    >
> >>>>>    > Can you please explain the changes to ObjectMonitor::wait:
> >>>>>    >
> >>>>>    > !   _recursions = save      // restore the old recursion count
> >>>>>    > !                 + jt->get_and_reset_relock_count_after_wait(); //
> >>>>>    > increased by the deferred relock count
> >>>>>    >
> >>>>>    > what is the "deferred relock count"? I gather it relates to
> >>>>>    >
> >>>>>    > "The code was extended to be able to deoptimize objects of a frame that
> >>>>>    > is not the top frame and to let another thread than the owning thread do
> >>>>>    > it."
> >>>>>
> >>>>> Yes, these relate. Currently EA based optimizations are reverted, when
> >>>>> a compiled frame is replaced with corresponding interpreter frames.
> >>>>> Part of this is relocking objects with eliminated locking. New with the
> >>>>> enhancement is that we do this also just before object references are
> >>>>> acquired through JVMTI. In this case we deoptimize also the owning
> >>>>> compiled frame C and we register deoptimized objects as deferred
> >>>>> updates. When control returns to C it gets deoptimized, we notice that
> >>>>> objects are already deoptimized (reallocated and relocked), so we don't
> >>>>> do it again (relocking twice would be incorrect of course). Deferred
> >>>>> updates are copied into the new interpreter frames.
> >>>>>
> >>>>> Problem: relocking is not possible if the target thread T is waiting
> >>>>> on the monitor that needs to be relocked. This happens only with
> >>>>> non-local objects with EliminateNestedLocks. Instead relocking is
> >>>>> deferred until T owns the monitor again. This is what the piece of
> >>>>> code above does.
> >>>>
> >>>> Sorry I need some more detail here. How can you wait() on an object
> >>>> monitor if the object allocation and/or locking was optimised away? And
> >>>> what is a "non-local object" in this context? Isn't EA restricted to
> >>>> thread-confined objects?
> >>>>
> >>>> Is it just that some of the locking gets optimized away e.g.
> >>>>
> >>>> synchronized(obj) {
> >>>>     synchronized(obj) {
> >>>>       synchronized(obj) {
> >>>>         obj.wait();
> >>>>       }
> >>>>     }
> >>>> }
> >>>>
> >>>> If this is reduced to a form as-if it were a single lock of the monitor
> >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
> >>>> escape of "obj" then we need to reconstruct the true lock state, and so
> >>>> when the wait() internally unblocks and reacquires the monitor it has to
> >>>> set the true recursion count to 3, not the 1 that it appeared to be when
> >>>> wait() was initially called. Is that the scenario?
> >>>>
> >>>> If so I find this truly awful. Anyone using wait() in a realistic form
> >>>> requires a notification and so the object cannot be thread confined. In
> >>>> which case I would strongly argue that upon hitting the wait() the deopt
> >>>> should occur unconditionally and so the lock state is correct before we
> >>>> wait and so we don't need to mess with the recursion count internally
> >>>> when we reacquire the monitor.
> >>>>
> >>>>>
> >>>>>    > which I don't like the sound of at all when it comes to ObjectMonitor
> >>>>>    > state. So I'd like to understand in detail exactly what is going on here
> >>>>>    > and why.  This is a very intrusive change that seems to badly break
> >>>>>    > encapsulation and impacts future changes to ObjectMonitor that are under
> >>>>>    > investigation.
> >>>>>
> >>>>> I would not regard this as breaking encapsulation. Certainly not badly.
> >>>>>
> >>>>> I've added a property relock_count_after_wait to JavaThread. The
> >>>>> property is well encapsulated. Future ObjectMonitor implementations
> >>>>> have to deal with recursion too. They are free in choosing a way to do
> >>>>> that as long as that property is taken into account. This is hardly a
> >>>>> limitation.
> >>>>
> >>>> I do think this badly breaks encapsulation as you have to add a callout
> >>>> from the guts of the ObjectMonitor code to reach into the thread to get
> >>>> this lock count adjustment. I understand why you have had to do this but
> >>>> I would much rather see a change to the EA optimisation strategy so that
> >>>> this is not needed.
> >>>>
> >>>>> Note also that the property is a straight forward extension of the
> >>>>> existing concept of deferred
> >>>>> local updates. It is embedded into the structure holding them. So not
> >>>>> even the footprint of a
> >>>>> JavaThread is enlarged if no deferred updates are generated.
> >>>>>
> >>>>>    > ---
> >>>>>    >
> >>>>>    > src/hotspot/share/runtime/thread.cpp
> >>>>>    >
> >>>>>    > Can you please explain why JavaThread::wait_for_object_deoptimization
> >>>>>    > has to be handcrafted in this way rather than using proper transitions.
> >>>>>    >
> >>>>>
> >>>>> I wrote wait_for_object_deoptimization taking
> >>>>> JavaThread::java_suspend_self_with_safepoint_check
> >>>>> as template. So in short: for the same reasons :)
> >>>>>
> >>>>> Threads reach both methods as part of thread state transitions,
> >>>>> therefore special handling is
> >>>>> required to change thread state on top of ongoing transitions.
> >>>>>
> >>>>>    > We got rid of "deopt suspend" some time ago and it is disturbing to see
> >>>>>    > it being added back (effectively). This seems like it may be something
> >>>>>    > that handshakes could be used for.
> >>>>>
> >>>>> Deopt suspend used to be something rather different with a similar
> >>>>> name[1]. It is not being added back.
> >>>>
> >>>> I stand corrected. Despite comments in the code to the contrary
> >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
> >>>> cleanup in this area 13 years ago :)
> >>>>
> >>>>>
> >>>>> I'm actually duplicating the existing external suspend mechanism,
> >>>>> because a thread can be suspended at most once. And hey, I don't like
> >>>>> that either! But it seems not unlikely that the duplicate can be
> >>>>> removed together with the original and the new type of handshakes
> >>>>> that will be used for thread suspend can be used for object
> >>>>> deoptimization too. See today's discussion in JDK-8227745 [2].
> >>>>
> >>>> I hope that discussion bears some fruit, at the moment it seems not to
> >>>> be possible to use handshakes here. :(
> >>>>
> >>>> The external suspend mechanism is a royal pain in the proverbial that we
> >>>> have to carefully live with. The idea that we're duplicating that for
> >>>> use in another fringe area of functionality does not thrill me at all.
> >>>>
> >>>> To be clear, I understand the problem that exists and that you wish to
> >>>> solve, but for the runtime parts I balk at the complexity cost of
> >>>> solving it.
> >>>>
> >>>> Thanks,
> >>>> David
> >>>> -----
> >>>>
> >>>>> Thanks, Richard.
> >>>>>
> >>>>> [1] Deopt suspend was something like an async. handshake for architectures
> >>>>>     with register windows, where patching the return pc for deoptimization
> >>>>>     of a compiled frame was racy if the owner thread was in native code.
> >>>>>     Instead a "deopt" suspend flag was set on which the thread patched its
> >>>>>     own frame upon return from native. So no thread was suspended. It got
> >>>>>     its name only from the name of the flags.
> >>>>>
> >>>>> [2] Discussion about using handshakes to sync. with the target thread:
> >>>>>
> >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>> Sent: Friday, 13 December 2019 00:56
> >>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>>>> serviceability-dev at openjdk.java.net;
> >>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>> Performance in the Presence of JVMTI Agents
> >>>>>
> >>>>> Hi Richard,
> >>>>>
> >>>>> Some further queries/concerns:
> >>>>>
> >>>>> src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>
> >>>>> Can you please explain the changes to ObjectMonitor::wait:
> >>>>>
> >>>>> !   _recursions = save      // restore the old recursion count
> >>>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
> >>>>> increased by the deferred relock count
> >>>>>
> >>>>> what is the "deferred relock count"? I gather it relates to
> >>>>>
> >>>>> "The code was extended to be able to deoptimize objects of a frame that
> >>>>> is not the top frame and to let another thread than the owning thread do
> >>>>> it."
> >>>>>
> >>>>> which I don't like the sound of at all when it comes to ObjectMonitor
> >>>>> state. So I'd like to understand in detail exactly what is going on here
> >>>>> and why.  This is a very intrusive change that seems to badly break
> >>>>> encapsulation and impacts future changes to ObjectMonitor that are under
> >>>>> investigation.
> >>>>>
> >>>>> ---
> >>>>>
> >>>>> src/hotspot/share/runtime/thread.cpp
> >>>>>
> >>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
> >>>>> has to be handcrafted in this way rather than using proper transitions.
> >>>>>
> >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
> >>>>> it being added back (effectively). This seems like it may be something
> >>>>> that handshakes could be used for.
> >>>>>
> >>>>> Thanks,
> >>>>> David
> >>>>> -----
> >>>>>
> >>>>> On 12/12/2019 7:02 am, David Holmes wrote:
> >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> >>>>>>> Hi David,
> >>>>>>>
> >>>>>>>     > Most of the details here are in areas I can comment on in detail,
> >>>>>>>     > but I did take an initial general look at things.
> >>>>>>>
> >>>>>>> Thanks for taking the time!
> >>>>>>
> >>>>>> Apologies the above should read:
> >>>>>>
> >>>>>> "Most of the details here are in areas I *can't* comment on in detail
> >>>>>> ..."
> >>>>>>
> >>>>>> David
> >>>>>>
> >>>>>>>     > The only thing that jumped out at me is that I think the
> >>>>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>     >
> >>>>>>>     > +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Yes, it should. Will add the method like above.
> >>>>>>>
> >>>>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread.
> >>>>>>>     > Without active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
> >>>>>>> workload. I will add a minimal test
> >>>>>>> to keep it fresh.
> >>>>>>>
> >>>>>>>     > Also on the tests I don't understand your @requires clause:
> >>>>>>>     >
> >>>>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>>     > (vm.opt.TieredCompilation != true))
> >>>>>>>     >
> >>>>>>>     > This seems to require that TieredCompilation is disabled, but
> >>>>>>>     > tiered is our normal mode of operation. ??
> >>>>>>>     >
> >>>>>>>
> >>>>>>> I removed the clause. I guess I wanted to target the tests towards the
> >>>>>>> code they are supposed to test, and it's easier to analyze failures w/o
> >>>>>>> tiered compilation and with just one compiler thread.
> >>>>>>>
> >>>>>>> Additionally I will make use of
> >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Richard.
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>> Sent: Wednesday, 11 December 2019 08:03
> >>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>>>>>> serviceability-dev at openjdk.java.net;
> >>>>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>>>> Performance in the Presence of JVMTI Agents
> >>>>>>>
> >>>>>>> Hi Richard,
> >>>>>>>
> >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I would like to get reviews please for
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> >>>>>>>>
> >>>>>>>> Corresponding RFE:
> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
> >>>>>>>>
> >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> >>>>>>>>
> >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
> >>>>>>>> issues (thanks!). In addition the change is being tested at SAP since
> >>>>>>>> I posted the first RFR some months ago.
> >>>>>>>>
> >>>>>>>> The intention of this enhancement is to benefit performance wise from
> >>>>>>>> escape analysis even if JVMTI agents request capabilities that allow
> >>>>>>>> them to access local variable values. E.g. if you start-up with
> >>>>>>>> -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape
> >>>>>>>> analysis is disabled right from the beginning, well before a debugger
> >>>>>>>> attaches -- if ever one should do so. With the enhancement, escape
> >>>>>>>> analysis will remain enabled until and after a debugger attaches. EA
> >>>>>>>> based optimizations are reverted just before an agent acquires the
> >>>>>>>> reference to an object. In the JBS item you'll find more details.
> >>>>>>>
> >>>>>>> Most of the details here are in areas I can comment on in detail, but I
> >>>>>>> did take an initial general look at things.
> >>>>>>>
> >>>>>>> The only thing that jumped out at me is that I think the
> >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>
> >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
> >>>>>>> Without
> >>>>>>> active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> Also on the tests I don't understand your @requires clause:
> >>>>>>>
> >>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>> (vm.opt.TieredCompilation != true))
> >>>>>>>
> >>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
> >>>>>>> our normal mode of operation. ??
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Richard.
> >>>>>>>>
> >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>

From richard.reingruber at sap.com  Mon Mar 30 08:31:30 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 30 Mar 2020 08:31:30 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi,

this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)

The change affects jvmti, hotspot and c2. Partial reviews are very welcome too.

Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/

Robbin, Martin, please let me know if anything isn't quite as you wanted it. You will find my
comments on your feedback below.

Robbin, can I count you as Reviewer for the runtime part?

Thanks, Richard.

--

> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)

Done.

> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its own
> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.

I moved JvmtiDeferredUpdates to vframe_hp.hpp, where the preexisting jvmtiDeferredLocalVariableSet
is declared.

> src/hotspot/share/code/compiledMethod.cpp
> Nice cleanup!

Thanks :)

> src/hotspot/share/code/debugInfoRec.cpp
> src/hotspot/share/code/debugInfoRec.hpp
> Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.

I've been thinking about this too and in the end stayed with not_global_escape_in_scope. It is
supposed to mean that an object whose escape state is not GlobalEscape is in scope.

> src/hotspot/share/compiler/compileBroker.cpp
> src/hotspot/share/compiler/compileBroker.hpp
> Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.

Yes the change would be a little smaller. And if it helps I'll split it off. In general I prefer
patches that bring along a suitable amount of tests.

> src/hotspot/share/opto/c2compiler.cpp
> Make do_escape_analysis independent of JVMTI capabilities. Nice!

It is the main goal of the enhancement. It is done for C2, but could be done for JVMCI compilers
with just a small effort as well.

> src/hotspot/share/opto/escape.cpp
> Annotation for MachSafePointNodes. Your added functionality looks correct.
> But I'd prefer to move the bulky code out of the large function.
> I suggest to factor out something like has_not_global_escape and has_arg_escape. So the code could look like this:
>       SafePointNode* sfn = sfn_worklist.at(next);
>       sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
>       if (sfn->is_CallJava()) {
>         CallJavaNode* call = sfn->as_CallJava();
>         call->set_arg_escape(has_arg_escape(call));
>       }
> This would also allow us to get rid of the found_..._escape_in_args variables making the loops better readable.

Done.

> It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.

Yeah. I copied the snippet.

> src/hotspot/share/prims/jvmtiImpl.cpp
> src/hotspot/share/prims/jvmtiImpl.hpp
> The sequence is pretty complex:
> VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).

Note that the target threads have to be suspended already for VM_GetOrSetLocal*. So it's mainly the
synchronization effect of EscapeBarrier::sync_and_suspend_one() that is required here. Also no extra
_handshake_ is executed, since sync_and_suspend_one() will find the target threads already
suspended.

> VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
> VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
> But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.

> VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.

It's not specifically the top frame, but the frame that is accessed.

> src/hotspot/share/runtime/deoptimization.cpp
> Object deoptimization. I have more comments and proposals, here.
> First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
> Comments are sufficient to understand why things are done as they are implemented.

> BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
> Anyway, looks correct, too.

> Typo in comment: "regularily" => "regularly"

> Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or such kind of issues.

That's correct. The compiled frame for which deferred updates are allocated is always deoptimized
before (see EscapeBarrier::deoptimize_objects()). This is also asserted in
compiledVFrame::update_deferred_value(). I've added the same assertion to
Deoptimization::relock_objects(). So we can be sure that _jvmti_deferred_updates are deallocated
again in fetch_unroll_info_helper().

> EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().

Sure, well spotted!

> You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.

Right, good hint. This was recently introduced with 8235678. I even had to resolve conflicts. Should
have done this then.

> I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.

Done.

> Typo in comment: "we must only deoptimize" => "we only have to deoptimize"

Replaced with "[...] we deoptimize iff local objects are passed as args"

> "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.

Ok. Done.

> I'll get back to suspend flags, later.

> There are weird cases regarding _self_deoptimization_in_progress.
> Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
> I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.

You're right. We discussed that face-to-face, but couldn't find a real issue. But now, thinking again, I reckon I found one:

2808   // Sync with other threads that might be doing deoptimizations
2809   {
2810     // Need to switch to _thread_blocked for the wait() call
2811     ThreadBlockInVM tbivm(_calling_thread);
2812     MonitorLocker ml(EscapeBarrier_lock, Mutex::_no_safepoint_check_flag);
2813     while (_self_deoptimization_in_progress) {
2814       ml.wait();
2815     }
2816 
2817     if (self_deopt()) {
2818       _self_deoptimization_in_progress = true;
2819     }
2820 
2821     while (_deoptee_thread->is_ea_obj_deopt_suspend()) {
2822       ml.wait();
2823     }
2824 
2825     if (self_deopt()) {
2826       return;
2827     }
2828 
2829     // set suspend flag for target thread
2830     _deoptee_thread->set_ea_obj_deopt_flag();
2831   }

- A waits in 2822
- C is suspended
- B notifies all in resume_one()
- A and C wake up
- C wins over A and sets _self_deoptimization_in_progress = true in 2818
- C does the self deoptimization
- A executes 2830 _deoptee_thread->set_ea_obj_deopt_flag()

C will self suspend at some undefined point. The resulting state is illegal.
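
The interleaving above is possible because the two conditions are checked in two separate wait
loops, so another thread can change the first one while a waiter sits in the second. A minimal
sketch of Martin's suggestion to use only one wait, written as standalone C++ with std::mutex and
std::condition_variable instead of HotSpot's MonitorLocker. BarrierState and its fields are
hypothetical stand-ins for _self_deoptimization_in_progress and the per-thread suspend flag of a
single deoptee; this is an illustration under those assumptions, not the code from the webrev:

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical model of the state guarded by EscapeBarrier_lock for ONE deoptee thread.
struct BarrierState {
  std::mutex lock;
  std::condition_variable cv;
  bool self_deopt_in_progress = false;  // models _self_deoptimization_in_progress
  bool deoptee_suspended = false;       // models the deoptee's ea_obj_deopt suspend flag
};

// Both conditions are evaluated in a single wait. After every wakeup the
// predicate is re-checked as a whole while holding the lock, and the caller
// claims its flag before releasing the lock, so a remote deoptimizer and the
// self-deoptimizing thread can never both proceed.
void sync_and_suspend_one(BarrierState& s, bool self_deopt) {
  std::unique_lock<std::mutex> ml(s.lock);
  s.cv.wait(ml, [&] { return !s.self_deopt_in_progress && !s.deoptee_suspended; });
  if (self_deopt) {
    s.self_deopt_in_progress = true;   // the thread deoptimizes its own objects
  } else {
    s.deoptee_suspended = true;        // a remote thread suspends the deoptee
  }
}

// Clears the flag claimed in sync_and_suspend_one and wakes all waiters,
// which then re-evaluate the combined predicate.
void resume_one(BarrierState& s, bool self_deopt) {
  {
    std::lock_guard<std::mutex> ml(s.lock);
    if (self_deopt) {
      s.self_deopt_in_progress = false;
    } else {
      s.deoptee_suspended = false;
    }
  }
  s.cv.notify_all();
}
```

With this shape, when A and C both wake up after B's notification, whichever acquires the lock
first claims its flag and the other goes back to waiting, instead of A setting the suspend flag
while C is already self-deoptimizing.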

> I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.

Yes, would be nice to have the state change only if needed, but for the reason you mentioned it is
not quite as easy as it seems to be. I experimented as well with a second lock, but did not succeed.

> Change in thread_added:
> I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
> Your version makes new threads come up with the suspend flag set. That looks correct, too. The advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
> For now, I'm ok with your version.

I had a version that did what you are suggesting. The current version also has the advantage that
there are fewer places where a thread has to wait for ongoing object deoptimization. This means
fewer places where you have to worry about correct thread state transitions, possible deadlocks,
and whether all oops are properly Handle'ed.

> I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).

Done.

> Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
> Maybe adding suffixes would help a little bit, but I can also live with what you have.
> Implementation looks correct to me.

2 are internal. I added the suffix _internal to them. This leaves 2 to choose from.

> src/hotspot/share/runtime/deoptimization.hpp
> Escape barriers and object deoptimization functions.
> Typo in comment: "helt" => "held"

Done in place already.

> src/hotspot/share/runtime/interfaceSupport.cpp
> InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
> I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.

I never used DeoptimizeObjectsALot = 1 that much. It could be more deterministic in single-threaded
scenarios. I wouldn't object to getting rid of it, though.

> src/hotspot/share/runtime/stackValue.hpp
> Better reinitialization in StackValue. Good.

StackValue::obj_is_scalar_replaced() should not return true after calling set_obj().
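
The invariant stated above can be illustrated with a small standalone model. StackValueSketch and
its members are hypothetical and only mirror the two method names mentioned here; the real
StackValue in stackValue.hpp stores its state differently. A sketch, assuming the scalar-replaced
marker is a plain flag:

```cpp
// Hypothetical, simplified model of a stack value that describes an object slot.
class StackValueSketch {
 public:
  // A value starts out either referring to a real object or marked as scalar replaced.
  explicit StackValueSketch(bool scalar_replaced)
      : _obj(nullptr), _scalar_replaced(scalar_replaced) {}

  bool obj_is_scalar_replaced() const { return _scalar_replaced; }
  const void* get_obj() const { return _obj; }

  // Installing a reallocated object also clears the scalar-replaced marker,
  // so later queries see a consistent state.
  void set_obj(const void* obj) {
    _obj = obj;
    _scalar_replaced = false;
  }

 private:
  const void* _obj;
  bool _scalar_replaced;
};
```

Without the reset in set_obj(), a caller that checks obj_is_scalar_replaced() after object
deoptimization would still see true and might try to reallocate the object a second time.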

> src/hotspot/share/runtime/thread.cpp
> src/hotspot/share/runtime/thread.hpp
> src/hotspot/share/runtime/thread.inline.hpp
> wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.

> In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as a temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.

I'm keen to build the feature on async handshakes when they arrive.

> You can use MutexLocker with Thread*.

Done.

> JVMTIDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.

Done.

> src/hotspot/share/runtime/vframe.cpp
> Added support for entry frame to new_vframe. Ok.


> src/hotspot/share/runtime/vframe_hp.cpp
> src/hotspot/share/runtime/vframe_hp.hpp

> I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).

Done.

> jvmtiDeferredLocalVariableSet::update_monitors:
> Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.

Done.

-----Original Message-----
From: Doerr, Martin <martin.doerr at sap.com> 
Sent: Thursday, 12 March 2020 17:28
To: Reingruber, Richard <richard.reingruber at sap.com>; 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,


I managed to find time for an (almost) complete review of webrev.4. (I'll review the tests separately.)

First of all, the change seems to be of pretty good quality given its significant complexity. I couldn't find any real bugs, but I'd like to propose minor improvements.
I'm convinced that it's mature because we did substantial testing.

I like the new functionality for object deoptimization. It can possibly be reused for future escape analysis based optimizations. So I appreciate having it available in the code base.
In addition to that, your change makes the JVMTI implementation better integrated into the VM.


Now to the details:


src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters. Ok.


src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability. Ok.


src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!


src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.


src/hotspot/share/code/nmethod.cpp
Nice cleanup!


src/hotspot/share/code/pcDesc.hpp
Additional parameters. Ok.


src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters. Ok.


src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.


src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters. Ok.


src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMTI capabilities. Nice!


src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes. Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest to factor out something like has_not_global_escape and has_arg_escape. So the code could look like this:
      SafePointNode* sfn = sfn_worklist.at(next);
      sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
      if (sfn->is_CallJava()) {
        CallJavaNode* call = sfn->as_CallJava();
        call->set_arg_escape(has_arg_escape(call));
      }
This would also allow us to get rid of the found_..._escape_in_args variables making the loops better readable.

It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.


src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations. Ok.


src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters. Ok.


src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!


src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread. Good.


src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.

VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.


src/hotspot/share/prims/jvmtiTagMap.cpp
Escape barriers + deoptimize objects for all threads. Ok.


src/hotspot/share/prims/whitebox.cpp
Added WB_IsFrameDeoptimized to API. Ok.


src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals, here.
First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
Comments are sufficient to understand why things are done as they are implemented.

BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
Anyway, looks correct, too.

Typo in comment: "regularily" => "regularly"

Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or such kind of issues.

EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().

You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.

I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.

Typo in comment: "we must only deoptimize" => "we only have to deoptimize"

"bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.

I'll get back to suspend flags, later.

There are weird cases regarding _self_deoptimization_in_progress.
Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.

I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.

Change in thread_added:
I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
Your version makes new threads come up with the suspend flag set. That looks correct, too. The advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
For now, I'm ok with your version.

I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).

Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
Maybe adding suffixes would help a little bit, but I can also live with what you have.
Implementation looks correct to me.


src/hotspot/share/runtime/deoptimization.hpp
Escape barriers and object deoptimization functions.
Typo in comment: "helt" => "held"


src/hotspot/share/runtime/globals.hpp
Addition of develop flag DeoptimizeObjectsALotInterval. Ok.


src/hotspot/share/runtime/interfaceSupport.cpp
InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.


src/hotspot/share/runtime/interfaceSupport.inline.hpp
Addition of deoptimizeAllObjects. Ok.


src/hotspot/share/runtime/mutexLocker.cpp
src/hotspot/share/runtime/mutexLocker.hpp
Addition of EscapeBarrier_lock. Ok.


src/hotspot/share/runtime/objectMonitor.cpp
Make recursion count relock aware. Ok.


src/hotspot/share/runtime/stackValue.hpp
Better reinitialization in StackValue. Good.


src/hotspot/share/runtime/thread.cpp
src/hotspot/share/runtime/thread.hpp
src/hotspot/share/runtime/thread.inline.hpp
wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.

In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as a temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.

You can use MutexLocker with Thread*.

JVMTIDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.


src/hotspot/share/runtime/vframe.cpp
Added support for entry frame to new_vframe. Ok.


src/hotspot/share/runtime/vframe_hp.cpp
src/hotspot/share/runtime/vframe_hp.hpp

I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).

jvmtiDeferredLocalVariableSet::update_monitors:
Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.


src/hotspot/share/utilities/macros.hpp
Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.


test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
New test. Will review separately.


test/jdk/TEST.ROOT
Addition of vm.jvmci as required property. Ok.


test/jdk/com/sun/jdi/EATests.java
test/jdk/com/sun/jdi/EATestsJVMCI.java
New test. Will review separately.


test/lib/sun/hotspot/WhiteBox.java
Added isFrameDeoptimized to API. Ok.


That was it. Best regards,
Martin


> -----Original Message-----
> From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> On Behalf Of Reingruber, Richard
> Sent: Tuesday, 3 March 2020 21:23
> To: 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> 
> Hi Robbin,
> 
> > > I understand that Robbin proposed to replace the usage of
> > > _suspend_flag with handshakes. Apparently, async handshakes
> > > are needed to do so. We have been waiting a while for removal
> > > of the _suspend_flag / introduction of async handshakes [2].
> > > What is the status here?
> 
> > I have an old prototype which I would like to continue to work on.
> > So do not assume asynch handshakes will make 15.
> > Even if they did, I think there is a lot more investigative work to do to remove
> > _suspend_flag.
> 
> Let us know if we can be of any help to you, even if it is only testing.
> 
> > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> 
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Will do.
> 
> > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its own
> > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> You are right. It shouldn't be declared in thread.hpp. I will look into that.
> 
> > Note that we also think we may have a bug in deopt:
> > https://bugs.openjdk.java.net/browse/JDK-8238237
> 
> > I think it would be best, if possible, to push after that is resolved.
> 
> Sure.
> 
> > Not even nearly a full review :)
> 
> I know :)
> 
> Anyways, thanks a lot,
> Richard.
> 
> 
> -----Original Message-----
> From: Robbin Ehn <robbin.ehn at oracle.com>
> Sent: Monday, March 2, 2020 11:17 AM
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard <richard.reingruber at sap.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> 
> Hi,
> 
> On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> > Hi,
> >
> > I had a look at the progress of this change. Nothing
> > happened since Richard posted his update using more
> > handshakes [1].
> > But we (SAP) would appreciate a lot if this change could
> > be successfully reviewed and pushed.
> >
> > I think there is basic understanding that this
> > change is helpful. It fixes a number of issues with JVMTI,
> > and will deliver the same performance benefits as EA
> > does in current production mode for debugging scenarios.
> >
> > This is important for us as we run our VMs prepared
> > for debugging in production mode.
> >
> > I understand that Robbin proposed to replace the usage of
> > _suspend_flag with handshakes. Apparently, async handshakes
> > are needed to do so. We have been waiting a while for removal
> > of the _suspend_flag / introduction of async handshakes [2].
> > What is the status here?
> 
> I have an old prototype which I would like to continue to work on.
> So do not assume asynch handshakes will make 15.
> Even if they did, I think there is a lot more investigative work to do to remove
> _suspend_flag.
> 
> >
> > I think we should no longer wait, but proceed with
> > this change. We will look into removing the usage of
> > suspend_flag introduced here once it is possible to implement
> > it with handshakes.
> 
> Yes, sure.
> 
> >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> 
> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its own
> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> Note that we also think we may have a bug in deopt:
> https://bugs.openjdk.java.net/browse/JDK-8238237
> 
> I think it would be best, if possible, to push after that is resolved.
> 
> Not even nearly a full review :)
> 
> Thanks, Robbin
> 
> 
> >> Incremental:
> >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
> >>
> >> I was not able to eliminate the additional suspend flag now. I'll take care of this
> >> as soon as the existing suspend-resume-mechanism is reworked.
> >>
> >> Testing:
> >>
> >> Nightly tests @SAP:
> >>
> >>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite,
> >>    SAP specific tests with fastdebug and release builds on all platforms
> >>
> >>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel for 24h
> >>
> >> Thanks, Richard.
> >>
> >>
> >> More details on the changes:
> >>
> >> * Hide DeoptimizeObjectsALotThread from external view.
> >>
> >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
> >>    It used to be _safepoint_check_sometimes, which will be eliminated sooner or later.
> >>    I added explicit thread state changes with ThreadBlockInVM to code paths where we can wait()
> >>    on EscapeBarrier_lock to become safepoint safe.
> >>
> >> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of vm operation
> >>    VM_ThreadSuspendAllForObjDeopt.
> >>
> >> * Removed uses of Threads_lock. When adding a new thread we suspend it iff EA optimizations are
> >>    being reverted. In the previous version we were waiting on Threads_lock while EA optimizations
> >>    were reverted. See EscapeBarrier::thread_added().
> >>
> >> * Made tests require Xmixed compilation mode.
> >>
> >> * Made tests agnostic regarding tiered compilation.
> >>    I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or disabled.
> >>
> >> * Exercising EATests.java as well with stress test options DeoptimizeObjectsALot*
> >>    Due to the non-deterministic deoptimizations some tests need to be skipped.
> >>    We do this to prevent bit-rot of the stress test code.
> >>
> >> * Executing EATests.java as well with graal if available. Driver for this is
> >>    EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all the new debug info
> >>    (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
> >>    And graal does not yet support the JVMTI operations force early return and pop frame.
> >>
> >> * Removed tracing from new jdi tests in EATests.java. Too much trace output before the debugging
> >>    connection is established can cause deadlock because output buffers fill up.
> >>    (See https://bugs.openjdk.java.net/browse/JDK-8173304)
> >>
> >> * Many copyright year changes and smaller clean-up changes of testing code (trailing white-space and
> >>    the like).
> >>
> >>
> >> -----Original Message-----
> >> From: David Holmes <david.holmes at oracle.com>
> >> Sent: Thursday, 19 December 2019 03:12
> >> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
> >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> >>
> >> Hi Richard,
> >>
> >> I think my issue is with the way EliminateNestedLocks works so I'm going
> >> to look into that more deeply.
> >>
> >> Thanks for the explanations.
> >>
> >> David
> >>
> >> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> >>> Hi David,
> >>>
> >>>     > >    > Some further queries/concerns:
> >>>     > >    >
> >>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
> >>>     > >    >
> >>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
> >>>     > >    >
> >>>     > >    > !   _recursions = save      // restore the old recursion count
> >>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> >>>     > >    >
> >>>     > >    > what is the "deferred relock count"? I gather it relates to
> >>>     > >    >
> >>>     > >    > "The code was extended to be able to deoptimize objects of a frame that
> >>>     > >    > is not the top frame and to let another thread than the owning thread do
> >>>     > >    > it."
> >>>     > >
> >>>     > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
> >>>     > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
> >>>     > > locking. New with the enhancement is that we do this also just before object references are
> >>>     > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we
> >>>     > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized,
> >>>     > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again
> >>>     > > (relocking twice would be incorrect of course). Deferred updates are copied into the new
> >>>     > > interpreter frames.
> >>>     > >
> >>>     > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to
> >>>     > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking
> >>>     > > is deferred until T owns the monitor again. This is what the piece of code above does.
> >>>     >
> >>>     >  Sorry I need some more detail here. How can you wait() on an object
> >>>     >  monitor if the object allocation and/or locking was optimised away? And
> >>>     >  what is a "non-local object" in this context? Isn't EA restricted to
> >>>     >  thread-confined objects?
> >>>
> >>> "Non-local object" is an object that escapes its thread. The issue I'm
> >>> addressing with the changes in ObjectMonitor::wait is almost unrelated
> >>> to EA. It is caused by EliminateNestedLocks, where C2 eliminates
> >>> recursive locking of an already owned lock. The lock owning object
> >>> exists on the heap, it is locked and you can call wait() on it.
> >>>
> >>> EliminateLocks is the C2 option that controls lock elimination based on
> >>> EA. Both optimizations have in common that objects with eliminated
> >>> locking need to be relocked when deoptimizing a frame, i.e. when
> >>> replacing a compiled frame with equivalent interpreter frames.
> >>> Deoptimization::relock_objects does that job for /all/ eliminated locks
> >>> in scope. /All/ can be a mix of eliminated nested locks and locks of
> >>> not-escaping objects.
> >>>
> >>> New with the enhancement: I call relock_objects earlier, just before
> >>> objects potentially escape. But then later, when the owning compiled
> >>> frame gets deoptimized, I must not do it again:
> >>>
> >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> >>>
> >>>    if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> >>>        && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> >>>      bool unused;
> >>>      eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> >>>    }
> >>>
> >>> Now when calling relock_objects early it is quite possible that I have to
> >>> relock an object the target thread currently waits for. Obviously I
> >>> cannot relock in this case, instead I chose to introduce
> >>> relock_count_after_wait to JavaThread.
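
An illustrative aside on the deferred relock count described above. The
following is a minimal single-threaded Java model, not HotSpot code. ToyThread,
ToyMonitor, incRelockCountAfterWait and the beginWait/endWait split are
hypothetical names made up for this sketch; only
get_and_reset_relock_count_after_wait and the "save + ..." expression come from
the patch quoted in this thread. As a simplifying assumption, a recursion count
of 0 here means the monitor was entered once.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class DeferredRelockDemo {
    /** One real lock plus two eliminated nested locks that get relocked during wait(). */
    static int recursionsAfterWait() {
        ToyThread t = new ToyThread();
        ToyMonitor m = new ToyMonitor();
        m.enter(t);                    // only the outermost lock was really taken
        m.beginWait(t);                // t gives up the monitor and waits
        t.incRelockCountAfterWait();   // deoptimization wants to relock twice,
        t.incRelockCountAfterWait();   // but t is waiting, so both are deferred
        m.endWait(t);                  // reacquire: saved count plus deferred relocks
        return m.recursions();
    }

    public static void main(String[] args) {
        System.out.println(recursionsAfterWait());
    }
}

/** Toy stand-in for a thread that carries a deferred relock count (hypothetical). */
final class ToyThread {
    private final AtomicInteger relockCountAfterWait = new AtomicInteger();

    /** Called by the deoptimizing side when it cannot relock immediately. */
    void incRelockCountAfterWait() {
        relockCountAfterWait.incrementAndGet();
    }

    /** Returns the pending count and clears it, so it is applied at most once. */
    int getAndResetRelockCountAfterWait() {
        return relockCountAfterWait.getAndSet(0);
    }
}

/** Toy monitor that tracks only an owner and a recursion count (hypothetical). */
final class ToyMonitor {
    private ToyThread owner;
    private int recursions;   // assumption: 0 means "entered once"
    private int saved;

    void enter(ToyThread t) {
        if (owner == t) {
            recursions++;
        } else {
            owner = t;
            recursions = 0;
        }
    }

    /** First half of wait(): remember the recursion count and give up ownership. */
    void beginWait(ToyThread t) {
        saved = recursions;
        recursions = 0;
        owner = null;
    }

    /** Second half of wait(): reacquire and fold in relocks deferred meanwhile. */
    void endWait(ToyThread t) {
        owner = t;
        recursions = saved + t.getAndResetRelockCountAfterWait();
    }

    int recursions() {
        return recursions;
    }
}
```

In this model the count is read and cleared in one step, so a second wait() on
the same thread does not apply the same deferred relocks again.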
> >>>
> >>>     >  Is it just that some of the locking gets optimized away e.g.
> >>>     >
> >>>     >  synchronized (obj) {
> >>>     >     synchronized (obj) {
> >>>     >       synchronized (obj) {
> >>>     >         obj.wait();
> >>>     >       }
> >>>     >     }
> >>>     >  }
> >>>     >
> >>>     >  If this is reduced to a form as-if it were a single lock of the monitor
> >>>     >  (due to EA) and the wait() triggers a JVM TI event which leads to the
> >>>     >  escape of "obj" then we need to reconstruct the true lock state, and
> >>>     >  so when the wait() internally unblocks and reacquires the monitor it
> >>>     >  has to set the true recursion count to 3, not the 1 that it appeared
> >>>     >  to be when wait() was initially called. Is that the scenario?
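
For reference, the Java-level behaviour that has to be preserved in the nested
example above can be observed with a small program. This is only a sketch: it
cannot show whether C2 eliminated the nested locks, it merely checks that after
wait() returns the thread still holds the monitor until the outermost block is
left. NestedWaitDemo and holdsLockAfterWait are names chosen for illustration,
and a timed wait is used so that no notifying thread is needed.

```java
import java.util.Arrays;

final class NestedWaitDemo {
    /**
     * Enters the same monitor three times, waits briefly, and records whether
     * the current thread still holds the monitor at each nesting level afterwards.
     */
    static boolean[] holdsLockAfterWait() {
        Object obj = new Object();
        boolean[] held = new boolean[4];
        try {
            synchronized (obj) {
                synchronized (obj) {
                    synchronized (obj) {
                        obj.wait(10);                 // releases the monitor, reacquires it on return
                        held[0] = Thread.holdsLock(obj);
                    }
                    held[1] = Thread.holdsLock(obj);  // two of three entries still active
                }
                held[2] = Thread.holdsLock(obj);      // outermost entry still active
            }
            held[3] = Thread.holdsLock(obj);          // monitor fully released
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return held;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(holdsLockAfterWait()));
    }
}
```

If the recursion count were restored as 1 instead of 3, the monitor would
already be released on leaving the innermost block, and the checks at the two
outer levels would not hold.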
> >>>
> >>> Kind of... except that the locking is not eliminated due to EA and there
> >>> is no JVM TI event triggered by wait.
> >>>
> >>> Add
> >>>
> >>> LocalObject l1 = new LocalObject();
> >>>
> >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1.
> >>> This triggers the code in question.
> >>>
> >>> Note that relocking/reallocating is transactional: if it is done, then it
> >>> is done for /all/ objects in scope, and it is done at most once. It
> >>> wouldn't be quite so easy to split this into relocking of nested and of
> >>> EA-based eliminated locks.
> >>>
> >>>     >  If so I find this truly awful. Anyone using wait() in a realistic form
> >>>     >  requires a notification and so the object cannot be thread confined. In
> >>>
> >>> It is not thread confined.
> >>>
> >>>     >  which case I would strongly argue that upon hitting the wait() the
> >>>     >  deopt should occur unconditionally and so the lock state is correct
> >>>     >  before we wait and so we don't need to mess with the recursion count
> >>>     >  internally when we reacquire the monitor.
> >>>     >
> >>>     > >
> >>>     > >    > which I don't like the sound of at all when it comes to
> >>>     > >    > ObjectMonitor state. So I'd like to understand in detail exactly
> >>>     > >    > what is going on here and why.  This is a very intrusive change
> >>>     > >    > that seems to badly break encapsulation and impacts future
> >>>     > >    > changes to ObjectMonitor that are under investigation.
> >>>     > >
> >>>     > > I would not regard this as breaking encapsulation. Certainly not badly.
> >>>     > >
> >>>     > > I've added a property relock_count_after_wait to JavaThread. The
> >>>     > > property is well encapsulated. Future ObjectMonitor implementations
> >>>     > > have to deal with recursion too. They are free in choosing a way to
> >>>     > > do that as long as that property is taken into account. This is
> >>>     > > hardly a limitation.
> >>>     >
> >>>     >  I do think this badly breaks encapsulation as you have to add a
> >>>     >  callout from the guts of the ObjectMonitor code to reach into the
> >>>     >  thread to get this lock count adjustment. I understand why you have
> >>>     >  had to do this but I would much rather see a change to the EA
> >>>     >  optimisation strategy so that this is not needed.
> >>>     >
> >>>     > > Note also that the property is a straightforward extension of the
> >>>     > > existing concept of deferred local updates. It is embedded into the
> >>>     > > structure holding them. So not even the footprint of a JavaThread is
> >>>     > > enlarged if no deferred updates are generated.
> >>>     >
> >>>     > [...]
> >>>     >
> >>>     > >
> >>>     > > I'm actually duplicating the existing external suspend mechanism,
> >>>     > > because a thread can be suspended at most once. And hey, I don't
> >>>     > > like that either! But it seems not unlikely that the duplicate can
> >>>     > > be removed together with the original and the new type of
> >>>     > > handshakes that will be used for thread suspend can be used for
> >>>     > > object deoptimization too. See today's discussion in
> >>>     > > JDK-8227745 [2].
> >>>     >
> >>>     >  I hope that discussion bears some fruit, at the moment it seems not
> >>>     >  to be possible to use handshakes here. :(
> >>>     >
> >>>     >  The external suspend mechanism is a royal pain in the proverbial that
> >>>     >  we have to carefully live with. The idea that we're duplicating that
> >>>     >  for use in another fringe area of functionality does not thrill me at
> >>>     >  all.
> >>>     >
> >>>     >  To be clear, I understand the problem that exists and that you wish
> >>>     >  to solve, but for the runtime parts I balk at the complexity cost of
> >>>     >  solving it.
> >>>
> >>> I know it's complex, but it is by far not rocket science.
> >>>
> >>> Also I find it hard to imagine another fix for JDK-8233915 besides
> >>> changing the JVM TI specification.
> >>>
> >>> Thanks, Richard.
> >>>
> >>> -----Original Message-----
> >>> From: David Holmes <david.holmes at oracle.com>
> >>> Sent: Tuesday, 17 December 2019 08:03
> >>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>> serviceability-dev at openjdk.java.net;
> >>> hotspot-compiler-dev at openjdk.java.net;
> >>> hotspot-runtime-dev at openjdk.java.net;
> >>> Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>> Performance in the Presence of JVMTI Agents
> >>>
> >>> <resend as my mailer crashed during last send>
> >>>
> >>> David
> >>>
> >>> On 17/12/2019 4:57 pm, David Holmes wrote:
> >>>> Hi Richard,
> >>>>
> >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
> >>>>> Hi David,
> >>>>>
> >>>>>      > Some further queries/concerns:
> >>>>>      >
> >>>>>      > src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>      >
> >>>>>      > Can you please explain the changes to ObjectMonitor::wait:
> >>>>>      >
> >>>>>      > !   _recursions = save      // restore the old recursion count
> >>>>>      > !                 + jt->get_and_reset_relock_count_after_wait(); //
> >>>>>      > increased by the deferred relock count
> >>>>>      >
> >>>>>      > what is the "deferred relock count"? I gather it relates to
> >>>>>      >
> >>>>>      > "The code was extended to be able to deoptimize objects of a frame
> >>>>>      > that is not the top frame and to let another thread than the owning
> >>>>>      > thread do it."
> >>>>>
> >>>>> Yes, these relate. Currently EA based optimizations are reverted when
> >>>>> a compiled frame is replaced with corresponding interpreter frames.
> >>>>> Part of this is relocking objects with eliminated locking. New with
> >>>>> the enhancement is that we do this also just before object references
> >>>>> are acquired through JVMTI. In this case we deoptimize also the owning
> >>>>> compiled frame C and we register deoptimized objects as deferred
> >>>>> updates. When control returns to C it gets deoptimized, we notice that
> >>>>> objects are already deoptimized (reallocated and relocked), so we
> >>>>> don't do it again (relocking twice would be incorrect of course).
> >>>>> Deferred updates are copied into the new interpreter frames.
> >>>>>
> >>>>> Problem: relocking is not possible if the target thread T is waiting
> >>>>> on the monitor that needs to be relocked. This happens only with
> >>>>> non-local objects with EliminateNestedLocks. Instead relocking is
> >>>>> deferred until T owns the monitor again. This is what the piece of
> >>>>> code above does.
> >>>>
> >>>> Sorry I need some more detail here. How can you wait() on an object
> >>>> monitor if the object allocation and/or locking was optimised away?
> >>>> And what is a "non-local object" in this context? Isn't EA restricted
> >>>> to thread-confined objects?
> >>>>
> >>>> Is it just that some of the locking gets optimized away e.g.
> >>>>
> >>>> synchronized (obj) {
> >>>>   synchronized (obj) {
> >>>>     synchronized (obj) {
> >>>>       obj.wait();
> >>>>     }
> >>>>   }
> >>>> }
> >>>>
> >>>> If this is reduced to a form as-if it were a single lock of the monitor
> >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
> >>>> escape of "obj" then we need to reconstruct the true lock state, and so
> >>>> when the wait() internally unblocks and reacquires the monitor it has to
> >>>> set the true recursion count to 3, not the 1 that it appeared to be when
> >>>> wait() was initially called. Is that the scenario?
> >>>>
> >>>> If so I find this truly awful. Anyone using wait() in a realistic form
> >>>> requires a notification and so the object cannot be thread confined. In
> >>>> which case I would strongly argue that upon hitting the wait() the
> >>>> deopt should occur unconditionally and so the lock state is correct
> >>>> before we wait and so we don't need to mess with the recursion count
> >>>> internally when we reacquire the monitor.
> >>>>
> >>>>>
> >>>>>      > which I don't like the sound of at all when it comes to
> >>>>>      > ObjectMonitor state. So I'd like to understand in detail exactly
> >>>>>      > what is going on here and why.  This is a very intrusive change
> >>>>>      > that seems to badly break encapsulation and impacts future changes
> >>>>>      > to ObjectMonitor that are under investigation.
> >>>>>
> >>>>> I would not regard this as breaking encapsulation. Certainly not badly.
> >>>>>
> >>>>> I've added a property relock_count_after_wait to JavaThread. The
> >>>>> property is well encapsulated. Future ObjectMonitor implementations
> >>>>> have to deal with recursion too. They are free in choosing a way to do
> >>>>> that as long as that property is taken into account. This is hardly a
> >>>>> limitation.
> >>>>
> >>>> I do think this badly breaks encapsulation as you have to add a callout
> >>>> from the guts of the ObjectMonitor code to reach into the thread to get
> >>>> this lock count adjustment. I understand why you have had to do this
> >>>> but I would much rather see a change to the EA optimisation strategy so
> >>>> that this is not needed.
> >>>>
> >>>>> Note also that the property is a straightforward extension of the
> >>>>> existing concept of deferred local updates. It is embedded into the
> >>>>> structure holding them. So not even the footprint of a JavaThread is
> >>>>> enlarged if no deferred updates are generated.
> >>>>>
> >>>>>      > ---
> >>>>>      >
> >>>>>      > src/hotspot/share/runtime/thread.cpp
> >>>>>      >
> >>>>>      > Can you please explain why JavaThread::wait_for_object_deoptimization
> >>>>>      > has to be handcrafted in this way rather than using proper
> >>>>>      > transitions.
> >>>>>      >
> >>>>>
> >>>>> I wrote wait_for_object_deoptimization taking
> >>>>> JavaThread::java_suspend_self_with_safepoint_check as template. So in
> >>>>> short: for the same reasons :)
> >>>>>
> >>>>> Threads reach both methods as part of thread state transitions,
> >>>>> therefore special handling is required to change thread state on top
> >>>>> of ongoing transitions.
> >>>>>
> >>>>>      > We got rid of "deopt suspend" some time ago and it is disturbing to
> >>>>>      > see it being added back (effectively). This seems like it may be
> >>>>>      > something that handshakes could be used for.
> >>>>>
> >>>>> Deopt suspend used to be something rather different with a similar
> >>>>> name[1]. It is not being added back.
> >>>>
> >>>> I stand corrected. Despite comments in the code to the contrary
> >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
> >>>> cleanup in this area 13 years ago :)
> >>>>
> >>>>>
> >>>>> I'm actually duplicating the existing external suspend mechanism,
> >>>>> because a thread can be suspended at most once. And hey, I don't like
> >>>>> that either! But it seems not unlikely that the duplicate can be
> >>>>> removed together with the original and the new type of handshakes that
> >>>>> will be used for thread suspend can be used for object deoptimization
> >>>>> too. See today's discussion in JDK-8227745 [2].
> >>>>
> >>>> I hope that discussion bears some fruit, at the moment it seems not to
> >>>> be possible to use handshakes here. :(
> >>>>
> >>>> The external suspend mechanism is a royal pain in the proverbial that
> >>>> we have to carefully live with. The idea that we're duplicating that
> >>>> for use in another fringe area of functionality does not thrill me at
> >>>> all.
> >>>>
> >>>> To be clear, I understand the problem that exists and that you wish to
> >>>> solve, but for the runtime parts I balk at the complexity cost of
> >>>> solving it.
> >>>>
> >>>> Thanks,
> >>>> David
> >>>> -----
> >>>>
> >>>>> Thanks, Richard.
> >>>>>
> >>>>> [1] Deopt suspend was something like an async. handshake for
> >>>>>     architectures with register windows, where patching the return pc
> >>>>>     for deoptimization of a compiled frame was racy if the owner thread
> >>>>>     was in native code. Instead a "deopt" suspend flag was set on which
> >>>>>     the thread patched its own frame upon return from native. So no
> >>>>>     thread was suspended. It got its name only from the name of the
> >>>>>     flags.
> >>>>>
> >>>>> [2] Discussion about using handshakes to sync. with the target thread:
> >>>>>
> >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>> Sent: Friday, 13 December 2019 00:56
> >>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>>>> serviceability-dev at openjdk.java.net;
> >>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>> Performance in the Presence of JVMTI Agents
> >>>>>
> >>>>> Hi Richard,
> >>>>>
> >>>>> Some further queries/concerns:
> >>>>>
> >>>>> src/hotspot/share/runtime/objectMonitor.cpp
> >>>>>
> >>>>> Can you please explain the changes to ObjectMonitor::wait:
> >>>>>
> >>>>> !   _recursions = save      // restore the old recursion count
> >>>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
> >>>>> increased by the deferred relock count
> >>>>>
> >>>>> what is the "deferred relock count"? I gather it relates to
> >>>>>
> >>>>> "The code was extended to be able to deoptimize objects of a frame
> >>>>> that is not the top frame and to let another thread than the owning
> >>>>> thread do it."
> >>>>>
> >>>>> which I don't like the sound of at all when it comes to ObjectMonitor
> >>>>> state. So I'd like to understand in detail exactly what is going on here
> >>>>> and why.  This is a very intrusive change that seems to badly break
> >>>>> encapsulation and impacts future changes to ObjectMonitor that are
> >>>>> under investigation.
> >>>>>
> >>>>> ---
> >>>>>
> >>>>> src/hotspot/share/runtime/thread.cpp
> >>>>>
> >>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
> >>>>> has to be handcrafted in this way rather than using proper transitions.
> >>>>>
> >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to
> >>>>> see it being added back (effectively). This seems like it may be
> >>>>> something that handshakes could be used for.
> >>>>>
> >>>>> Thanks,
> >>>>> David
> >>>>> -----
> >>>>>
> >>>>> On 12/12/2019 7:02 am, David Holmes wrote:
> >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> >>>>>>> Hi David,
> >>>>>>>
> >>>>>>>       > Most of the details here are in areas I can comment on in
> >>>>>>>       > detail, but I did take an initial general look at things.
> >>>>>>>
> >>>>>>> Thanks for taking the time!
> >>>>>>
> >>>>>> Apologies the above should read:
> >>>>>>
> >>>>>> "Most of the details here are in areas I *can't* comment on in detail
> >>>>>> ..."
> >>>>>>
> >>>>>> David
> >>>>>>
> >>>>>>>       > The only thing that jumped out at me is that I think the
> >>>>>>>       > DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>       >
> >>>>>>>       > +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Yes, it should. Will add the method like above.
> >>>>>>>
> >>>>>>>       > Also I don't see any testing of the DeoptimizeObjectsALotThread.
> >>>>>>>       > Without active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
> >>>>>>> workload. I will add a minimal test
> >>>>>>> to keep it fresh.
> >>>>>>>
> >>>>>>>       > Also on the tests I don't understand your @requires clause:
> >>>>>>>       >
> >>>>>>>       >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>>       >   (vm.opt.TieredCompilation != true))
> >>>>>>>       >
> >>>>>>>       > This seems to require that TieredCompilation is disabled, but
> >>>>>>>       > tiered is our normal mode of operation. ??
> >>>>>>>       >
> >>>>>>>
> >>>>>>> I removed the clause. I guess I wanted to target the tests towards
> >>>>>>> the code they are supposed to test, and it's easier to analyze
> >>>>>>> failures w/o tiered compilation and with just one compiler thread.
> >>>>>>>
> >>>>>>> Additionally I will make use of
> >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Richard.
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>> Sent: Wednesday, 11 December 2019 08:03
> >>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> >>>>>>> serviceability-dev at openjdk.java.net;
> >>>>>>> hotspot-compiler-dev at openjdk.java.net;
> >>>>>>> hotspot-runtime-dev at openjdk.java.net
> >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> >>>>>>> Performance in the Presence of JVMTI Agents
> >>>>>>>
> >>>>>>> Hi Richard,
> >>>>>>>
> >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I would like to get reviews please for
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> >>>>>>>>
> >>>>>>>> Corresponding RFE:
> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
> >>>>>>>>
> >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> >>>>>>>>
> >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
> >>>>>>>> issues (thanks!). In addition the change is being tested at SAP
> >>>>>>>> since I posted the first RFR some months ago.
> >>>>>>>>
> >>>>>>>> The intention of this enhancement is to benefit performance wise
> >>>>>>>> from escape analysis even if JVMTI agents request capabilities that
> >>>>>>>> allow them to access local variable values. E.g. if you start-up
> >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
> >>>>>>>> escape analysis is disabled right from the beginning, well before a
> >>>>>>>> debugger attaches -- if ever one should do so. With the enhancement,
> >>>>>>>> escape analysis will remain enabled until and after a debugger
> >>>>>>>> attaches. EA based optimizations are reverted just before an agent
> >>>>>>>> acquires the reference to an object. In the JBS item you'll find
> >>>>>>>> more details.
> >>>>>>>
> >>>>>>> Most of the details here are in areas I can comment on in detail, but
> >>>>>>> I did take an initial general look at things.
> >>>>>>>
> >>>>>>> The only thing that jumped out at me is that I think the
> >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> >>>>>>>
> >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> >>>>>>>
> >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
> >>>>>>> Without active testing this will just bit-rot.
> >>>>>>>
> >>>>>>> Also on the tests I don't understand your @requires clause:
> >>>>>>>
> >>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> >>>>>>>     (vm.opt.TieredCompilation != true))
> >>>>>>>
> >>>>>>> This seems to require that TieredCompilation is disabled, but tiered
> >>>>>>> is our normal mode of operation. ??
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Richard.
> >>>>>>>>
> >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>

From serguei.spitsyn at oracle.com  Mon Mar 30 09:30:54 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 30 Mar 2020 02:30:54 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <ec8b6266-3eff-6d4a-987a-d46013c715e1@oracle.com>

Hi Mandy,

I have just one comment so far.

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classLoaderHierarchyDCmd.cpp.frames.html

  void add_classes(LoadedClassInfo* first_class, int num_classes, bool has_class_mirror_holder) {
    LoadedClassInfo** p_list_to_add_to;
    bool is_hidden = first_class->_klass->is_hidden();
    if (has_class_mirror_holder) {
      p_list_to_add_to = is_hidden ? &_hidden_weak_classes : &_anon_classes;
    } else {
      p_list_to_add_to = &_classes;
    }
    // Search tail.
    while ((*p_list_to_add_to) != NULL) {
      p_list_to_add_to = &(*p_list_to_add_to)->_next;
    }
    *p_list_to_add_to = first_class;
    if (has_class_mirror_holder) {
      if (is_hidden) {
        _num_hidden_weak_classes += num_classes;
      } else {
        _num_anon_classes += num_classes;
      }
    } else {
      _num_classes += num_classes;
    }
  }

Q1: I'm just curious, what happens if a cld has arrays of hidden classes?
    Is the bottom_klass always expected to be the first?


Thanks,
Serguei


On 3/26/20 16:57, Mandy Chung wrote:
> Please review the implementation of JEP 371: Hidden Classes. The main
> changes are in core-libs and hotspot runtime area.  Small changes are
> made in javac, VM compiler (intrinsification of Class::isHiddenClass),
> JFR, JDI, and jcmd.  CSR [1] has been reviewed and is in the finalized
> state (see specdiff and javadoc below for reference).
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 
>
>
> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
> point of view, a hidden class is a normal class except for the following:
>
> - A hidden class has no initiating class loader and is not registered
>   in any dictionary.
> - A hidden class has a name containing an illegal character:
>   `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
>   returns "Lp/Foo.0x1234;".
> - A hidden class is not modifiable, i.e. cannot be redefined or
>   retransformed. JVM TI IsModifiableClass returns false on a hidden class.
> - Final fields in a hidden class are "final".  The value of final
>   fields cannot be overridden via reflection.  setAccessible(true) can
>   still be called on reflected objects representing final fields in a
>   hidden class and its access check will be suppressed, but they only have
>   read-access (i.e. can do Field::getXXX but not setXXX).
>
> Brief summary of this patch:
>
> 1. A new Lookup::defineHiddenClass method is the API to create a
>    hidden class.
> 2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG
>    options that can be specified when creating a hidden class.
> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
> 4. Field::setXXX method will throw IAE on a final field of a hidden class
>    regardless of the value of the accessible flag.
> 5. JVM_LookupDefineClass is the new JVM entry point for
>    Lookup::defineClass and defineHiddenClass to create a class from the
>    given bytes.
> 6. ClassLoaderData implementation is not changed.  There is one primary
>    CLD that holds the classes strongly referenced by its defining loader.
>    There can be zero or more additional CLDs - one per weak class.
> 7. Nest host determination is updated per revised JVMS 5.4.4. Access
>    control check no longer throws LinkageError but instead it will throw
>    IAE with a clear message if a class fails to resolve/validate the nest
>    host declared in NestHost/NestMembers attribute.
> 8. JFR, jcmd, JDI are updated to support hidden classes.
> 9. Update javac LambdaToMethod as lambda proxy starts using nestmates
>    and generate a bridge method to desugar a method reference to a
>    protected method in its supertype in a different package.
>
> This patch also updates StringConcatFactory, LambdaMetaFactory, and
> LambdaForms to use hidden classes.  The webrev includes changes in
> nashorn to hidden class and I will update the webrev if JEP 372 removes
> it any time soon.
>
> We uncovered a bug: the Lookup::defineClass spec throws LinkageError and
> intends to have the newly created class linked.  However, the
> implementation in 14 does not link the class.  A separate CSR [2] proposes
> to update the implementation to match the spec.  This patch fixes the
> implementation.
>
> The spec update on JVM TI, JDI and Instrumentation will be done as
> a separate RFE [3].  This patch includes new tests for JVM TI and
> java.instrument that validate how the existing APIs work for hidden
> classes.
>
> javadoc/specdiff
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ 
>
>
> JVMS 5.4.4 change:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf 
>
>
> CSR:
> https://bugs.openjdk.java.net/browse/JDK-8238359
>
> Thanks
> Mandy
> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
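
As an illustration of the hidden class properties listed in the quoted
announcement, here is a minimal Java sketch. It is written against the API as
released in JDK 15, where the test method is Class::isHidden rather than the
Class::isHiddenClass named in the webrev under review. HiddenClassDemo and
HiddenTemplate are hypothetical names, and the sketch assumes that both classes
are compiled into the unnamed package so that HiddenTemplate.class can be found
as a resource next to HiddenClassDemo.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.invoke.MethodHandles;

final class HiddenClassDemo {
    /** Defines a hidden class from the bytes of HiddenTemplate and returns it. */
    static Class<?> defineHidden() {
        try (InputStream in = HiddenClassDemo.class.getResourceAsStream("HiddenTemplate.class")) {
            if (in == null) {
                throw new IllegalStateException("HiddenTemplate.class not found next to this class");
            }
            byte[] bytes = in.readAllBytes();
            // No ClassOption is passed, so neither NESTMATE nor STRONG is requested.
            return MethodHandles.lookup().defineHiddenClass(bytes, true).lookupClass();
        } catch (IOException | IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> hidden = defineHidden();
        System.out.println(hidden.isHidden());   // whether the class is a hidden class
        System.out.println(hidden.getName());    // the name contains '/', illegal in a binary name
    }
}

/** Ordinary class whose bytes serve as the template for the hidden class. */
final class HiddenTemplate {
    static String greet() {
        return "hello";
    }
}
```

The numeric suffix after the '/' in the printed name is chosen by the VM, so
the sketch does not assume a particular value.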


From david.holmes at oracle.com  Mon Mar 30 09:54:51 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 30 Mar 2020 19:54:51 +1000
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <ec8b6266-3eff-6d4a-987a-d46013c715e1@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <ec8b6266-3eff-6d4a-987a-d46013c715e1@oracle.com>
Message-ID: <546cf8e4-00e4-1d22-6402-6620b8d7b2db@oracle.com>

Sorry to jump in on this but it caught my eye though I may be missing a 
larger context ...

On 30/03/2020 7:30 pm, serguei.spitsyn at oracle.com wrote:
> Hi Mandy,
> 
> I have just one comment so far.
> 
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classLoaderHierarchyDCmd.cpp.frames.html 
> 
> 
>   void add_classes(LoadedClassInfo* first_class, int num_classes, bool has_class_mirror_holder) {
>     LoadedClassInfo** p_list_to_add_to;
>     bool is_hidden = first_class->_klass->is_hidden();
>     if (has_class_mirror_holder) {
>       p_list_to_add_to = is_hidden ? &_hidden_weak_classes : &_anon_classes;
>     } else {
>       p_list_to_add_to = &_classes;
>     }
>     // Search tail.
>     while ((*p_list_to_add_to) != NULL) {
>       p_list_to_add_to = &(*p_list_to_add_to)->_next;
>     }
>     *p_list_to_add_to = first_class;
>     if (has_class_mirror_holder) {
>       if (is_hidden) {
>         _num_hidden_weak_classes += num_classes;

Why does hidden imply weak here?

David
-----

>       } else {
>         _num_anon_classes += num_classes;
>       }
>     } else {
>       _num_classes += num_classes;
>     }
>   }
> 
> Q1: I'm just curious, what happens if a cld has arrays of hidden classes?
>     Is the bottom_klass always expected to be the first?
> 
> 
> Thanks,
> Serguei
> 
> 
> On 3/26/20 16:57, Mandy Chung wrote:
>> Please review the implementation of JEP 371: Hidden Classes. The main
>> changes are in core-libs and hotspot runtime area.  Small changes are
>> made in javac, VM compiler (intrinsification of Class::isHiddenClass),
>> JFR, JDI, and jcmd.  CSR [1] has been reviewed and is in the finalized
>> state (see specdiff and javadoc below for reference).
>>
>> Webrev:
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 
>>
>>
>> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
>> point of view, a hidden class is a normal class except for the following:
>>
>> - A hidden class has no initiating class loader and is not registered
>>   in any dictionary.
>> - A hidden class has a name containing an illegal character:
>>   `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
>>   returns "Lp/Foo.0x1234;".
>> - A hidden class is not modifiable, i.e. cannot be redefined or
>>   retransformed. JVM TI IsModifiableClass returns false on a hidden class.
>> - Final fields in a hidden class are "final".  The value of final
>>   fields cannot be overridden via reflection.  setAccessible(true) can
>>   still be called on reflected objects representing final fields in a
>>   hidden class and its access check will be suppressed, but they only have
>>   read-access (i.e. can do Field::getXXX but not setXXX).
>>
>> Brief summary of this patch:
>>
>> 1. A new Lookup::defineHiddenClass method is the API to create a
>>    hidden class.
>> 2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG
>>    options that can be specified when creating a hidden class.
>> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
>> 4. Field::setXXX method will throw IAE on a final field of a hidden class
>>    regardless of the value of the accessible flag.
>> 5. JVM_LookupDefineClass is the new JVM entry point for
>>    Lookup::defineClass and defineHiddenClass to create a class from the
>>    given bytes.
>> 6. ClassLoaderData implementation is not changed.  There is one primary
>>    CLD that holds the classes strongly referenced by its defining loader.
>>    There can be zero or more additional CLDs - one per weak class.
>> 7. Nest host determination is updated per revised JVMS 5.4.4. Access
>>    control check no longer throws LinkageError but instead it will throw
>>    IAE with a clear message if a class fails to resolve/validate the nest
>>    host declared in NestHost/NestMembers attribute.
>> 8. JFR, jcmd, JDI are updated to support hidden classes.
>> 9. Update javac LambdaToMethod as lambda proxy starts using nestmates
>>    and generate a bridge method to desugar a method reference to a
>>    protected method in its supertype in a different package.
>>
>> This patch also updates StringConcatFactory, LambdaMetaFactory, and 
>> LambdaForms
>> to use hidden classes.? The webrev includes changes in nashorn to 
>> hidden class
>> and I will update the webrev if JEP 372 removes it any time soon.
>>
>> We uncovered a bug in Lookup::defineClass spec throws LinkageError and 
>> intends
>> to have the newly created class linked.? However, the implementation 
>> in 14
>> does not link the class.? A separate CSR [2] proposes to update the
>> implementation to match the spec.? This patch fixes the implementation.
>>
>> The spec update on JVM TI, JDI and Instrumentation will be done as
>> a separate RFE [3].? This patch includes new tests for JVM TI and
>> java.instrument that validates how the existing APIs work for hidden 
>> classes.
>>
>> javadoc/specdiff
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ 
>>
>>
>> JVMS 5.4.4 change:
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf 
>>
>>
>> CSR:
>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>
>> Thanks
>> Mandy
>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
> 

From magnus.ihse.bursie at oracle.com  Mon Mar 30 12:14:44 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Mon, 30 Mar 2020 14:14:44 +0200
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>
References: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>
Message-ID: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>

No opinions on this?

/Magnus

On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
> Hi everyone,
>
> As a follow-up to the ongoing review for JDK-8241618, I have also 
> looked at fixing the deprecation warnings in jdk.hotspot.agent. These 
> fall in three broad categories:
>
> * Deprecation of the boxing type constructors (e.g. "new Integer(42)").
>
> * Deprecation of java.util.Observer and Observable.
>
> * The rest (mostly Class.newInstance(), and a small number of other odd 
> deprecations)
>
> The first category is trivial to fix. The last category needs some 
> special discussion. But the overwhelming majority of deprecation 
> warnings come from the use of Observer and Observable. This really 
> dwarfs anything else, and needs to be handled first, otherwise it's 
> hard to even spot the other issues.
>
> My analysis of the situation is that the deprecation of Observer and 
> Observable seems a bit harsh, from the PoV of jdk.hotspot.agent. Sure, 
> it might be limited, but I think it does exactly what is needed here. 
> So the migration suggested in Observable (java.beans or 
> java.util.concurrent) seems overkill. If there are genuine threading 
> issues at play here, this assumption might be wrong, and then maybe 
> going the j.u.c. route is correct.
>
> But if that's not the case, the main goal should be to stay with the 
> current implementation. One way to do this is to sprinkle the code with 
> @SuppressWarnings. But I think a better way would be to just implement 
> our own Observer and Observable. After all, the classes are trivial.
>
> I've made a mock-up of this solution, where I just copied the 
> java.util.Observer and Observable, and removed the deprecation 
> annotations. The only thing needed for the rest of the code is to make 
> sure we import these; I've done this for three arbitrarily selected 
> classes just to show what the change would typically look like. Here's 
> the mock-up:
>
> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>
> Let me know what you think.
>
> /Magnus
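
For readers following along, here is a minimal sketch of what "our own
Observer and Observable" could look like. The names SaObserver and
SaObservable are hypothetical and are not taken from the webrev; the
sketch only assumes the same notify-if-changed contract that
java.util.Observable documents, minus the deprecation annotations.

```java
import java.util.ArrayList;
import java.util.List;

// Small demo: a subclass marks itself changed and notifies its observers.
class ObserverDemo {
    static class Counter extends SaObservable {
        void tick(int value) {
            setChanged();
            notifyObservers(value);
        }
    }

    public static void main(String[] args) {
        Counter counter = new Counter();
        counter.addObserver((source, arg) -> System.out.println("got " + arg));
        counter.tick(42);
    }
}

// Hypothetical stand-in for java.util.Observer, without @Deprecated.
interface SaObserver {
    void update(SaObservable source, Object arg);
}

// Hypothetical stand-in for java.util.Observable, without @Deprecated.
class SaObservable {
    private final List<SaObserver> observers = new ArrayList<>();
    private boolean changed = false;

    public synchronized void addObserver(SaObserver o) {
        if (o == null) {
            throw new NullPointerException();
        }
        if (!observers.contains(o)) {
            observers.add(o);
        }
    }

    public synchronized void deleteObserver(SaObserver o) {
        observers.remove(o);
    }

    protected synchronized void setChanged() {
        changed = true;
    }

    public synchronized boolean hasChanged() {
        return changed;
    }

    public void notifyObservers(Object arg) {
        SaObserver[] snapshot;
        synchronized (this) {
            if (!changed) {
                return;
            }
            snapshot = observers.toArray(new SaObserver[0]);
            changed = false;
        }
        // Call observers outside the lock so an observer may re-register safely.
        for (SaObserver o : snapshot) {
            o.update(this, arg);
        }
    }
}
```

Call sites would then only need to import the module-local types
instead of the java.util ones, which matches the "make sure we import
these" step described in the proposal.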


From coleen.phillimore at oracle.com  Mon Mar 30 14:18:58 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Mar 2020 10:18:58 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <546cf8e4-00e4-1d22-6402-6620b8d7b2db@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <ec8b6266-3eff-6d4a-987a-d46013c715e1@oracle.com>
 <546cf8e4-00e4-1d22-6402-6620b8d7b2db@oracle.com>
Message-ID: <9cd71367-edc6-efc8-0a53-2e703ffbbfab@oracle.com>


On 3/30/20 5:54 AM, David Holmes wrote:
> Sorry to jump in on this but it caught my eye though I may be missing 
> a larger context ...
>
> On 30/03/2020 7:30 pm, serguei.spitsyn at oracle.com wrote:
>> Hi Mandy,
>>
>> I have just one comment so far.
>>
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classLoaderHierarchyDCmd.cpp.frames.html 
>>
>>
>>   356   void add_classes(LoadedClassInfo* first_class, int num_classes, bool has_class_mirror_holder) {
>>   357     LoadedClassInfo** p_list_to_add_to;
>>   358     bool is_hidden = first_class->_klass->is_hidden();
>>   359     if (has_class_mirror_holder) {
>>   360       p_list_to_add_to = is_hidden ? &_hidden_weak_classes : &_anon_classes;
>>   361     } else {
>>   362       p_list_to_add_to = &_classes;
>>   363     }
>>   364     // Search tail.
>>   365     while ((*p_list_to_add_to) != NULL) {
>>   366       p_list_to_add_to = &(*p_list_to_add_to)->_next;
>>   367     }
>>   368     *p_list_to_add_to = first_class;
>>   369     if (has_class_mirror_holder) {
>>   370       if (is_hidden) {
>>   371         _num_hidden_weak_classes += num_classes;
>
> Why does hidden imply weak here?

has_class_mirror_holder() implies weak.

Coleen
>
> David
> -----
>
>>   372       } else {
>>   373         _num_anon_classes += num_classes;
>>   374       }
>>   375     } else {
>>   376       _num_classes += num_classes;
>>   377     }
>>   378   }
>>
>>   Q1: I'm just curious, what happens if a cld has arrays of hidden 
>> classes?
>>       Is the bottom_klass always expected to be the first?
>>
>>
>> Thanks,
>> Serguei
>>
>>
>>


From coleen.phillimore at oracle.com  Mon Mar 30 14:20:01 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Mar 2020 10:20:01 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <ebe3de7f-0cba-99a6-8980-ad0b95ff93da@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com>
 <c56eba10-13eb-aa86-11a3-f2dfd1a7b0b0@oracle.com>
 <ebe3de7f-0cba-99a6-8980-ad0b95ff93da@oracle.com>
Message-ID: <ea6e4918-aaca-f64f-e089-d4013e02bf99@oracle.com>

Adding back serviceability-dev. Sometimes reply (and I) remember it 
and sometimes it strips it off...

Coleen

On 3/30/20 10:16 AM, coleen.phillimore at oracle.com wrote:
>
>
> On 3/29/20 10:17 PM, Mandy Chung wrote:
>>
>>
>> On 3/27/20 8:51 PM, Chris Plummer wrote:
>>> Hi Mandy,
>>>
>>> A couple of very minor nits in the jvmtiRedefineClasses.cpp comments:
>>>
>>>  153     // classes for primitives, arrays, hidden and vm unsafe anonymous classes
>>>  154     // cannot be redefined.  Check here so following code can assume these classes
>>>  155     // are InstanceKlass.
>>>  156     if (!is_modifiable_class(mirror)) {
>>>  157       _res = JVMTI_ERROR_UNMODIFIABLE_CLASS;
>>>  158       return false;
>>>  159     }
>>>
>>> I think this code and comment predate anonymous classes. Probably 
>>> before anonymous classes the check was not for 
>>> !is_modifiable_class() but instead was just a check for primitive or 
>>> array class types since they are not an InstanceKlass, and would 
>>> cause issues when cast to one in the code that lies below this 
>>> section. When anonymous classes were added, the code got changed to 
>>> use !is_modifiable_class() and the comment was not correctly updated 
>>> (anonymous classes are an InstanceKlass). Then with this webrev the 
>>> mention of hidden classes was added, also incorrectly implying they 
>>> are not an InstanceKlass. I think you should just leave off the last 
>>> sentence of the comment.
>>>
>>
>> I agree with you that this comment needs an update. Perhaps it should 
>> say "primitive, array types and hidden classes are non-modifiable. A 
>> modifiable class must be an InstanceKlass."
>
> I may have written the last part of that comment (or remember it at 
> least). I think Chris's suggestion to remove the last sentence makes 
> sense. Anything further will just add unnecessary confusion for the 
> reader. Anyone modifying this will get the InstanceKlass::cast() 
> assert soon after if they mess up.
>
> Coleen
>
>>
>> I'll leave it to Serguei, who may have another opinion.
>>
>>> There's some ambiguity in the application of adjectives in the 
>>> following:
>>>
>>>  297   // Cannot redefine or retransform a hidden or an unsafe anonymous class.
>>>
>>> I'd suggest:
>>>
>>>  297   // Cannot redefine or retransform a hidden class or an unsafe anonymous class.
>>>
>>
>> +1
>>
>>> There are some places in libjdwp that need to be fixed. I spoke to 
>>> Serguei about those this afternoon. Basically the 
>>> convertSignatureToClassname() function needs to be fixed to handle 
>>> hidden classes. Without the fix classname filtering will have 
>>> problems if the filter contains a pattern with a '/' to filter on 
>>> hidden classes. Also CLASS_UNLOAD events will not properly convert 
>>> hidden class names. We also need tests for these cases. I think 
>>> these are all things that can be addressed later.
>>>
>>
>> Good catch. I have created a subtask under JDK-8230502:
>>    https://bugs.openjdk.java.net/browse/JDK-8230502
>>
>>> I still need to look over the JVMTI tests.
>>>
>>
>> Thanks
>> Mandy
>>> thanks,
>>>
>>> Chris
>>>
>>>
>>>
>>
>
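
To make the libjdwp point quoted above concrete: the conversion from a
JVM TI class signature to a class name has to treat the hidden-class
suffix specially, because GetClassSignature reports "Lp/Foo.0x1234;"
while Class::getName reports "p.Foo/0x1234". The following is only a
sketch in Java of that string mapping; the helper name
signatureToClassName is hypothetical, and the real
convertSignatureToClassname() is C code in libjdwp. It assumes that a
'.' can only appear in a class signature as the hidden-class suffix
separator.

```java
// Sketch: map a JVM TI class signature to the name Class::getName would report.
class HiddenNames {
    public static void main(String[] args) {
        System.out.println(signatureToClassName("Ljava/lang/String;"));
        System.out.println(signatureToClassName("Lp/Foo.0x1234;"));
    }

    // Hypothetical helper, not the libjdwp function itself.
    static String signatureToClassName(String signature) {
        int len = signature.length();
        if (len < 2 || signature.charAt(0) != 'L' || signature.charAt(len - 1) != ';') {
            throw new IllegalArgumentException("not a class signature: " + signature);
        }
        String body = signature.substring(1, len - 1);
        int suffixDot = body.lastIndexOf('.');
        if (suffixDot < 0) {
            // Ordinary class: only the package separators need converting.
            return body.replace('/', '.');
        }
        // Hidden class: convert package separators, then turn the '.' that
        // introduces the suffix into the '/' used by Class::getName.
        return body.substring(0, suffixDot).replace('/', '.')
                + "/" + body.substring(suffixDot + 1);
    }
}
```

A class-name filter containing a '/' pattern can only match hidden
classes if the conversion produces the second form, which is the
filtering problem described in the quoted text.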


From magnus.ihse.bursie at oracle.com  Mon Mar 30 14:25:13 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Mon, 30 Mar 2020 16:25:13 +0200
Subject: RFR: JDK-8241618 Fix unchecked warning for jdk.hotspot.agent
In-Reply-To: <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com>
References: <8d884fcb-f424-1b54-7ece-5260037b2843@oracle.com>
 <01f77be3-e7d2-a051-80ab-e81c83922cf6@oracle.com>
Message-ID: <a0ad7e59-ae77-c805-9cca-c18ab51a7505@oracle.com>


On 2020-03-25 20:52, Chris Plummer wrote:
> Hi Magnus,
>
> I haven't looked at the changes yet, other than to see that there are many 
> files touched, but after reading below (and only partly understanding 
> since I don't know this area well), I was wondering if this issue 
> wouldn't be better served with multiple passes made to fix the 
> warnings. Start with a straightforward one where you are maybe only 
> making one or two types of changes, but that affect a large number of 
> files and don't cascade into other more complicated changes. This will 
> get a lot of the noise out of the way, and then we can focus on some 
> of the harder issues you bring up below.
Ok, I did just this. Here is an updated webrev. It contains the bulk of 
the changes, but all changes are -- I dare not say trivially obvious, 
but at least no-brainers. Hopefully it should be easier to review so I 
can get this pushed and out of the way.

This also means that it is not possible to turn on the warning just yet.

http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.02

/Magnus
>
> As for testing, I think the following list will capture all of them, 
> but can't say for sure:
>
> open/test/hotspot/jtreg/serviceability/sa
> open/test/hotspot/jtreg/resourcehogs/serviceability/sa
> open/test/jdk/sun/tools/jhsdb
> open/test/jdk/sun/tools/jstack
> open/test/jdk/sun/tools/jmap
> open/test/hotspot/jtreg/gc/metaspace/CompressedClassSpaceSizeInJmapHeap.java
> open/test/hotspot/jtreg/compiler/ciReplay/TestSAClient.java
> open/test/hotspot/jtreg/compiler/ciReplay/TestSAServer.java
>
> Chris
>
> On 3/25/20 12:29 PM, Magnus Ihse Bursie wrote:
>> With the recent fixes in JDK-8241310, JDK-8237746 and JDK-8241073, 
>> and the upcoming fixes to remove the deprecated nashorn and jdk.rmi, 
>> the JDK build is very close to producing no warnings when compiling 
>> the Java classes.
>>
>> The one remaining sinner is jdk.hotspot.agent. Most of the warnings 
>> here are turned off, but unchecked and deprecation cannot be 
>> completely silenced.
>>
>> Since the poor agent does not seem to receive much love nowadays, I 
>> took it upon myself to fix these warnings, so we can finally get a 
>> quiet build.
>>
>> I started to address the unchecked warnings. Unfortunately, this was 
>> a much bigger task than I anticipated. I had to generify most of the 
>> module. On the plus side, the code is so much better now. And most of 
>> the changes were trivial, just tedious.
>>
>> There are a few places where I'm not entirely happy with the current 
>> solution, and that at least merits some discussion.
>>
>> I have resorted to @SuppressWarnings in four classes: ciMethodData, 
>> MethodData, TableModelComparator and VirtualBaseConstructor. All of 
>> them have in common that they are doing slightly fishy things with 
>> classes in collections. I'm not entirely sure they are bug-free, but 
>> this patch leaves the behavior untouched. I made some effort to sort 
>> out the logic, but it turned out to be too hairy for me to fix, and 
>> it will probably require more substantial changes to the workings of 
>> the code.
>>
>> To make the code valid, I have moved ConstMethod to extend Metadata 
>> instead of VMObject. My understanding is that this is benign (and 
>> likely intended), but I really need for someone who knows the code to 
>> confirm this. I have also added a FIXME to signal this. I'll remove 
>> the FIXME as soon as I get confirmation that this is OK.
>> (The reason for this is the following piece of code from 
>> Metadata.java: metadataConstructor.addMapping("ConstMethod", 
>> ConstMethod.class))
>>
>> In ObjectListPanel, there is some code that screams "dead" with this 
>> change. I added a FIXME to point this out:
>>     for (Iterator<Oop> iter = elements.iterator(); iter.hasNext(); ) {
>>       if (iter.next() instanceof Array) {
>>         // FIXME: Does not seem possible to happen
>>         hasArrays = true;
>>         return;
>>       }
>> It seems that if you start pulling this thread, even more dead code 
>> will unravel, so I'm not so eager to touch this in the current patch. 
>> But I can remove the FIXME if you want.
>>
>> My first iteration of this patch tried to generify the IntervalTree 
>> and related class hierarchy. However, this turned out to be 
>> impossible due to some weird usage in AnnotatedMemoryPanel, where 
>> there seemed to be confusion as to whether the tree stored 
>> Annotations or Addresses. I'm not entirely convinced the code is 
>> correct, it certainly looked and smelled very fishy. However, I 
>> reverted these changes since I could not get them to work due to 
>> this, and it was not needed for the goal of just getting rid of the 
>> warning.
>>
>> Finally, I have done no testing apart from verifying that it builds. 
>> Please advise on suitable tests to run.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8241618
>> WebRev: 
>> http://cr.openjdk.java.net/~ihse/JDK-8241618-fix-unchecked-warnings-for-agent/webrev.01
>>
>> /Magnus
>
>
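
As an illustration of why generifying the name-to-class mapping forces
the ConstMethod hierarchy question mentioned in the quoted message,
here is a small sketch. TypeRegistry is a hypothetical stand-in, not
the real VirtualBaseConstructor, and Metadata and ConstMethod are empty
placeholder classes that only borrow the names from the thread. Once
the registry is typed on Metadata, addMapping("ConstMethod",
ConstMethod.class) compiles only if ConstMethod extends Metadata.

```java
import java.util.HashMap;
import java.util.Map;

class RegistryDemo {
    public static void main(String[] args) {
        TypeRegistry<Metadata> registry = new TypeRegistry<>();
        // Compiles only because ConstMethod is a subtype of Metadata.
        registry.addMapping("ConstMethod", ConstMethod.class);
        Metadata m = registry.instantiate("ConstMethod");
        System.out.println(m.getClass().getSimpleName());
    }
}

// Placeholder types; the real SA classes carry far more state.
class Metadata { }
class ConstMethod extends Metadata { }

// Hypothetical generified registry: no raw types, no unchecked casts.
class TypeRegistry<T> {
    private final Map<String, Class<? extends T>> mappings = new HashMap<>();

    void addMapping(String name, Class<? extends T> type) {
        mappings.put(name, type);
    }

    T instantiate(String name) {
        Class<? extends T> type = mappings.get(name);
        if (type == null) {
            throw new IllegalArgumentException("no mapping for " + name);
        }
        try {
            // getDeclaredConstructor().newInstance() avoids the deprecated
            // Class.newInstance().
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a raw Map and raw Class the same code would compile regardless of
the hierarchy, but only with unchecked warnings, which is the trade-off
the patch is removing.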


From coleen.phillimore at oracle.com  Mon Mar 30 15:02:02 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Mar 2020 11:02:02 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <81f90a3e-dfda-0566-a9b2-ea0c4f17e7ac@oracle.com>


Hi, this is great work! I did a prereview and all of my comments were 
addressed. These are a few minor things I noticed.

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/ci/ciInstanceKlass.hpp.udiff.html

Nit. Can you add 'const' to the is_hidden accessor?

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classFileParser.cpp.udiff.html

+ ID annotation_index(const ClassLoaderData* loader_data, const Symbol* 
name, const bool can_access_vm_annotations);


'const' bool is weird and unnecessary. Can you remove const here?

+ if (is_hidden()) { // Mark methods in hidden classes as 'hidden'.
+ m->set_hidden(true);
+ }
+

Could be:

+ // Mark methods in hidden classes as 'hidden'.
+ m->set_hidden(is_hidden());
+
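
A note on the suggested simplification: the two forms behave the same
only when the flag starts out false, which is assumed here for a
freshly created method. A small sketch of that equivalence, written in
Java for brevity even though the patch itself is C++; MethodFlags and
its methods are hypothetical names, not HotSpot code.

```java
class MethodFlags {
    private boolean hidden; // starts out false, as assumed for a new method

    void setHidden(boolean value) { hidden = value; }
    boolean isHidden() { return hidden; }

    // Form used in the webrev: set the flag only when the class is hidden.
    static MethodFlags markConditionally(boolean classIsHidden) {
        MethodFlags m = new MethodFlags();
        if (classIsHidden) {
            m.setHidden(true);
        }
        return m;
    }

    // Suggested form: pass the class property straight through.
    static MethodFlags markDirectly(boolean classIsHidden) {
        MethodFlags m = new MethodFlags();
        m.setHidden(classIsHidden);
        return m;
    }

    public static void main(String[] args) {
        for (boolean classIsHidden : new boolean[] { false, true }) {
            System.out.println(markConditionally(classIsHidden).isHidden()
                    == markDirectly(classIsHidden).isHidden());
        }
    }
}
```

If the flag could already be true before this point, the direct form
would clear it for non-hidden classes while the conditional form would
not, so the rewrite relies on that initial state.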


http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/javaClasses.cpp.udiff.html

+ macro(_classData_offset, k, "classData", object_signature, false); \


Probably should remove trailing backslash here.

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/systemDictionary.cpp.udiff.html

I think in a future RFE, we should add a default parameter to 
register_loader to make the code in the beginning of parse_stream() 
cleaner and remove has_class_mirror_holder_cld().

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/prims/jvm.cpp.udiff.html

+ jboolean is_nestmate = (flags & NESTMATE) == NESTMATE;
+ jboolean is_hidden = (flags & HIDDEN_CLASS) == HIDDEN_CLASS;
+ jboolean is_strong = (flags & STRONG_LOADER_LINK) == STRONG_LOADER_LINK;
+ jboolean vm_annotations = (flags & ACCESS_VM_ANNOTATIONS) == 
ACCESS_VM_ANNOTATION


Instead of jboolean, please use C++ bool here.

+ oop loader = lookup_k->class_loader();
+ Handle class_loader (THREAD, loader);

Can you rewrite it like this to prevent a potential unhandled oop for the oop loader?

+ Handle class_loader (THREAD, lookup_k->class_loader());


Here:

+ InstanceKlass::cast(defined_k)->class_loader_data()->dec_keep_alive();


Don't have to cast defined_k to get class_loader_data(), but you 
probably just want to move this up to remove the rest of the 
InstanceKlass::cast().

+ InstanceKlass* ik = InstanceKlass::cast(defined_k);


http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/runtime/vmStructs.cpp.udiff.html
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/classfile/ClassLoaderData.java.udiff.html

We agreed already that these changes aren't needed by the SA. You can 
revert these.

These are minor changes. I don't need to see another webrev.

Thanks,
Coleen




From coleen.phillimore at oracle.com  Mon Mar 30 15:23:19 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Mar 2020 11:23:19 -0400
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <81f90a3e-dfda-0566-a9b2-ea0c4f17e7ac@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <81f90a3e-dfda-0566-a9b2-ea0c4f17e7ac@oracle.com>
Message-ID: <42af77ec-0f03-a4b0-164d-3b25c14c7f37@oracle.com>

Adding back hotspot-dev.

On 3/30/20 11:02 AM, coleen.phillimore at oracle.com wrote:
>
> Hi,  This is great work!  I did a prereview and all of my comments
> were addressed.  These are a few minor things I noticed.
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/ci/ciInstanceKlass.hpp.udiff.html
>
> Nit. Can you add 'const' to the is_hidden accessor?
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classFileParser.cpp.udiff.html
>
> + ID annotation_index(const ClassLoaderData* loader_data, const Symbol* name, const bool can_access_vm_annotations);
>
> 'const' bool is weird and unnecessary.  Can you remove const here?
>
> + if (is_hidden()) { // Mark methods in hidden classes as 'hidden'.
> + m->set_hidden(true);
> + }
> +
> Could be:
>
> + // Mark methods in hidden classes as 'hidden'.
> + m->set_hidden(is_hidden());
> +
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/javaClasses.cpp.udiff.html
>
> + macro(_classData_offset, k, "classData", object_signature, false); \
>
> Probably should remove trailing backslash here.
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/systemDictionary.cpp.udiff.html
>
> I think in a future RFE, we should add a default parameter to 
> register_loader to make the code in the beginning of parse_stream() 
> cleaner and remove has_class_mirror_holder_cld().
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/prims/jvm.cpp.udiff.html
> + jboolean is_nestmate = (flags & NESTMATE) == NESTMATE;
> + jboolean is_hidden = (flags & HIDDEN_CLASS) == HIDDEN_CLASS;
> + jboolean is_strong = (flags & STRONG_LOADER_LINK) == STRONG_LOADER_LINK;
> + jboolean vm_annotations = (flags & ACCESS_VM_ANNOTATIONS) == 
> ACCESS_VM_ANNOTATION
>
> Instead of jboolean, please use C++ bool here.
>
> + oop loader = lookup_k->class_loader();
> + Handle class_loader (THREAD, loader);
> Can you rewrite this as follows, to prevent a potential unhandled oop for oop loader?
> + Handle class_loader (THREAD, lookup_k->class_loader());
>
> Here:
> + InstanceKlass::cast(defined_k)->class_loader_data()->dec_keep_alive();
>
> Don't have to cast defined_k to get class_loader_data(), but you 
> probably just want to move this up to remove the rest of the 
> InstanceKlass::cast().
>
> + InstanceKlass* ik = InstanceKlass::cast(defined_k);
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/runtime/vmStructs.cpp.udiff.html 
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/classfile/ClassLoaderData.java.udiff.html
>
> We agreed already that these changes aren't needed by the SA.  You can
> revert these.
>
> These are minor changes.  I don't need to see another webrev.
>
> Thanks,
> Coleen
>
>
>
> On 3/26/20 7:57 PM, Mandy Chung wrote:
>> Please review the implementation of JEP 371: Hidden Classes.  The
>> main changes are in core-libs and hotspot runtime area.  Small
>> changes are made in javac, VM compiler (intrinsification of
>> Class::isHiddenClass), JFR, JDI, and jcmd.  CSR [1] has been reviewed
>> and is in the finalized state (see specdiff and javadoc below for
>> reference).
>>
>> Webrev:
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03
>>
>> Hidden class is created via `Lookup::defineHiddenClass`. From JVM's point
>> of view, a hidden class is a normal class except the following:
>>
>> - A hidden class has no initiating class loader and is not registered
>> in any dictionary.
>> - A hidden class has a name containing an illegal character:
>> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
>> returns "Lp/Foo.0x1234;".
>> - A hidden class is not modifiable, i.e. cannot be redefined or
>> retransformed. JVM TI IsModifiableClass returns false on a hidden class.
>> - Final fields in a hidden class are "final".  The value of final
>> fields cannot be overridden via reflection. setAccessible(true) can
>> still be called on reflected objects representing final fields in a
>> hidden class and its access check will be suppressed, but it only gives
>> read access (i.e. can do Field::getXXX but not setXXX).
>>
>> Brief summary of this patch:
>>
>> 1. A new Lookup::defineHiddenClass method is the API to create a 
>> hidden class.
>> 2. A new Lookup.ClassOption enum class defines NESTMATE and STRONG options
>>    that can be specified when creating a hidden class.
>> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
>> 4. Field::setXXX method will throw IAE on a final field of a hidden class
>>    regardless of the value of the accessible flag.
>> 5. JVM_LookupDefineClass is the new JVM entry point for Lookup::defineClass
>>    and defineHiddenClass to create a class from the given bytes.
>> 6. ClassLoaderData implementation is not changed.  There is one primary CLD
>>    that holds the classes strongly referenced by its defining loader.  There
>>    can be zero or more additional CLDs - one per weak class.
>> 7. Nest host determination is updated per revised JVMS 5.4.4. Access control
>>    check no longer throws LinkageError but instead it will throw IAE with
>>    a clear message if a class fails to resolve/validate the nest host declared
>>    in NestHost/NestMembers attribute.
>> 8. JFR, jcmd, JDI are updated to support hidden classes.
>> 9. update javac LambdaToMethod as lambda proxy starts using nestmates
>>    and generate a bridge method to desugar a method reference to a protected
>>    method in its supertype in a different package
>>
>> This patch also updates StringConcatFactory, LambdaMetaFactory, and LambdaForms
>> to use hidden classes.  The webrev includes changes in nashorn to hidden class
>> and I will update the webrev if JEP 372 removes it any time soon.
>>
>> We uncovered a bug in Lookup::defineClass: the spec says it throws LinkageError
>> and intends to have the newly created class linked.  However, the implementation
>> in 14 does not link the class.  A separate CSR [2] proposes to update the
>> implementation to match the spec.  This patch fixes the implementation.
>>
>> The spec update on JVM TI, JDI and Instrumentation will be done as
>> a separate RFE [3].  This patch includes new tests for JVM TI and
>> java.instrument that validate how the existing APIs work for hidden classes.
>>
>> javadoc/specdiff
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/
>>
>> JVMS 5.4.4 change:
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf
>>
>> CSR:
>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>
>> Thanks
>> Mandy
>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
>
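
As an illustration of the naming rule in the quoted summary above
(`Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
returns "Lp/Foo.0x1234;"), here is a minimal sketch of the mapping
between the two forms. The class HiddenNameDemo and the helper
toSignature are hypothetical names made up for this example; they are
not part of the patch, and the sketch only covers the plain
(non-array) case shown in the example names.

```java
class HiddenNameDemo {
    public static void main(String[] args) {
        // Uses the example name from the review request.
        System.out.println(toSignature("p.Foo/0x1234"));
    }

    // Hypothetical helper (not from the patch): converts the name reported by
    // Class::getName for a hidden class into the JVM TI style signature.
    static String toSignature(String hiddenName) {
        int slash = hiddenName.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("not a hidden class name: " + hiddenName);
        }
        // Part before '/' is an ordinary binary name: dots become slashes.
        String binaryName = hiddenName.substring(0, slash);
        // Part after '/' is the suffix; it is joined with '.' in the signature.
        String suffix = hiddenName.substring(slash + 1);
        return "L" + binaryName.replace('.', '/') + "." + suffix + ";";
    }
}
```

The '/' in the getName form and the '.' in the signature form are the
"illegal character" in each notation, which is why the two strings
cannot be converted with a simple global replace.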


From mandy.chung at oracle.com  Mon Mar 30 16:18:48 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 30 Mar 2020 09:18:48 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <ebe3de7f-0cba-99a6-8980-ad0b95ff93da@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <90f9276d-2777-88ba-b1ec-9901711fcf02@oracle.com>
 <c56eba10-13eb-aa86-11a3-f2dfd1a7b0b0@oracle.com>
 <ebe3de7f-0cba-99a6-8980-ad0b95ff93da@oracle.com>
Message-ID: <0b585750-54f9-aaa0-19b3-752f723894d1@oracle.com>


On 3/30/20 7:16 AM, coleen.phillimore at oracle.com wrote:
>> I agree with you that this comment needs an update.  Perhaps it should
>> say "primitive, array types and hidden classes are non-modifiable. A
>> modifiable class must be an InstanceKlass."
>
> I may have written the last part of that comment (or remember it at
> least).  I think Chris's suggestion to remove the last sentence makes
> sense.  Anything further will just add unnecessary confusion for the
> reader.  Anyone modifying this will get the InstanceKlass::cast()
> assert soon after if they mess up.

OK.  That's fine too.

Mandy

From serguei.spitsyn at oracle.com  Mon Mar 30 16:19:19 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 30 Mar 2020 09:19:19 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <ec8b6266-3eff-6d4a-987a-d46013c715e1@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <ec8b6266-3eff-6d4a-987a-d46013c715e1@oracle.com>
Message-ID: <50b1658d-2195-53af-ea0b-e13842e00496@oracle.com>

On 3/30/20 02:30, serguei.spitsyn at oracle.com wrote:
> Hi Mandy,
>
> I have just one comment so far.
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03/src/hotspot/share/classfile/classLoaderHierarchyDCmd.cpp.frames.html 
>
>
> void add_classes(LoadedClassInfo* first_class, int num_classes, bool has_class_mirror_holder) {
>   LoadedClassInfo** p_list_to_add_to;
>   bool is_hidden = first_class->_klass->is_hidden();
>   if (has_class_mirror_holder) {
>     p_list_to_add_to = is_hidden ? &_hidden_weak_classes : &_anon_classes;
>   } else {
>     p_list_to_add_to = &_classes;
>   }
>   // Search tail.
>   while ((*p_list_to_add_to) != NULL) {
>     p_list_to_add_to = &(*p_list_to_add_to)->_next;
>   }
>   *p_list_to_add_to = first_class;
>   if (has_class_mirror_holder) {
>     if (is_hidden) {
>       _num_hidden_weak_classes += num_classes;
>     } else {
>       _num_anon_classes += num_classes;
>     }
>   } else {
>     _num_classes += num_classes;
>   }
> }
>
> Q1: I'm just curious, what happens if a cld has arrays of hidden classes?
>     Is the bottom_klass always expected to be the first?

Please, skip it. I've got the answer.
The array classes are not included in the LoadedClassInfo* list by
classes_do.

Thanks,
Serguei

>
> Thanks,
> Serguei
>
>


From chris.plummer at oracle.com  Mon Mar 30 18:39:19 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 30 Mar 2020 11:39:19 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
 <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
Message-ID: <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>

Hi Leonid,

I haven't gone through all the tests yet.  I've accumulated enough
questions that I'd like to see them answered or addressed before I
continue on.

This isn't directly related to your changes, but I noticed that users of 
JDKToolLauncher do nothing to make sure that default test options are 
used. This means we are never running these tools with the test options 
being specified with the jtreg run. Is that a bug or intentional?

In the problem lists, is it necessary to list the test multiple times 
with #id0, #id1, etc, or could you list it just once and leave that part 
off. It seems very error prone. Also, changing tests like ClhsdbFindPC, 
ClhsdbJstack, and ClhsdbScanOops to split out the testing in this manner 
seems completely unrelated to this CR, especially when the tests do not 
even contain any changes related to the CR.

    public static LingeredApp startApp(String... additionalJvmOpts) throws IOException {

The default test opts are appended to additionalJvmOpts, and if you want 
prepended you need to call Utils.prependTestJavaOpts(). I would have 
thought the opposite would be more desirable and expected default 
behavior. Why did you choose this way? I also find it somewhat confusing 
that there is even a default mode for where the additionalJvmOpts go. 
Maybe it would be best to have startAppAppendJvmArgs() and 
startAppPrependJvmArgs() just to make it explicit. This would also be in 
line with the existing startAppExactJvmOpts().

Is ClhsdbFindPC correct? It used to just use -Xcomp or -Xint,
ignoring any default test opts. You've fixed it to include the default
test opts, but they are appended, possibly overriding the -Xcomp or
-Xint. Don't we want the default test opts prepended? Same for ClhsdbJstack.
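
To make the ordering concern concrete, here is a small sketch. It
assumes, as the question above implies, that when conflicting mode flags
appear on the command line the later one takes effect. The class
JvmOptsOrder and its methods are hypothetical stand-ins written for this
example; they are not the jdk.test.lib.Utils implementation.

```java
import java.util.ArrayList;
import java.util.List;

class JvmOptsOrder {
    public static void main(String[] args) {
        List<String> testOpts = List.of("-Xint");      // what the test asks for
        List<String> defaults = List.of("-Xcomp");     // e.g. options passed to jtreg
        System.out.println(effectiveMode(appendDefaults(testOpts, defaults)));
        System.out.println(effectiveMode(prependDefaults(testOpts, defaults)));
    }

    // Hypothetical: default test options go AFTER the test's own options.
    static List<String> appendDefaults(List<String> testOpts, List<String> defaults) {
        List<String> result = new ArrayList<>(testOpts);
        result.addAll(defaults);
        return result;
    }

    // Hypothetical: default test options go BEFORE the test's own options.
    static List<String> prependDefaults(List<String> testOpts, List<String> defaults) {
        List<String> result = new ArrayList<>(defaults);
        result.addAll(testOpts);
        return result;
    }

    // Assumption: the last execution-mode flag on the command line wins.
    // Returns null if no mode flag is present.
    static String effectiveMode(List<String> commandLine) {
        String mode = null;
        for (String opt : commandLine) {
            if (opt.equals("-Xint") || opt.equals("-Xcomp") || opt.equals("-Xmixed")) {
                mode = opt;
            }
        }
        return mode;
    }
}
```

Under that assumption, appending the defaults lets a jtreg-supplied
-Xcomp replace the -Xint the test asked for, while prepending keeps the
test's own choice.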

thanks,

Chris

On 3/25/20 2:31 PM, Leonid Mesnik wrote:
>
> Igor, Stefan, Ioi
>
> Thank you for your feedback.
>
> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change @run 
> main... to @run driver.
>
> Test ClhsdbJstack.java is updated.
>
> Still waiting for review from SVC team.
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
>
> Leonid
>
> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>> Hi Leonid,
>>
>> not directly related to your patch (but made somewhat more obvious
>> by it): it seems all (or at least almost all) the tests which
>> use LingeredApp should be run in "driver" mode, as they just
>> orchestrate execution of other JVMs, so running them w/ main (let
>> alone main/othervm) just wastes time.
>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for
>> example, will now be executed w/ Xcomp, which will make it very slow for
>> no reason. since you already got your hands dirty w/ these tests,
>> could you please file an RFE to sort this out and list all the
>> affected tests there?
>>
>> re: the patch, could you please update ClhsdbJstack.java test not to
>> be run w/ Xcomp and follow the same pattern you used in other tests
>> (e.g. ClhsdbScanOops)? other than that it looks fine to me. I
>> however wouldn't be able to tell if all svc tests continue to do what
>> they were supposed to, so I'd prefer for someone from svc team
>> to chime in.
>>
>> Thanks,
>> -- Igor
>>
>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik 
>>> <leonid.mesnik at oracle.com <mailto:leonid.mesnik at oracle.com>> wrote:
>>>
>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>>
>>> Please find new webrev: 
>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>>
>>> Renamed startAppVmOpts/runAppVmOpts to
>>> startAppExactJvmOpts/runAppExactJvmOpts. That should make it
>>> very clear that this method doesn't use any of test.java.opts,
>>> test.vm.opts.
>>>
>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java
>>> mentioned by Igor, and removed the null pointer check as Ioi suggested
>>> in startApp method.
>>>
>>> + public static void startApp(LingeredApp theApp, String... 
>>> additionalJvmOpts) throws IOException {
>>> + startAppExactJvmOpts(theApp, 
>>> Utils.appendTestJavaOpts(additionalJvmOpts));
>>> + }
>>>
>>> Leonid
>>>
>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>>> Hi Leonid,
>>>>>
>>>>> I have briefly looked at the patch, a few comments so far:
>>>>>
>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>>   - at L#114, could you please call static method using class name
>>>>> (as the opposite of using instance)? or was it meant to be 
>>>>> theApp.runAppVmOpts(vmArgs) ?
>>>>>
>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>>> - it seems that code indent of startApp(LingeredApp, String[]) 
>>>>> isn't correct
>>>>> - I don't like startAppVmOpts name, but unfortunately don't have a 
>>>>> better suggestion (yet)
>>>>
>>>> I was going to say the same. Jtreg has the concept of "java 
>>>> options" and "vm options". We have had a fair share of bugs and 
>>>> wasted time when tests have been using the "vm options" part 
>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away from 
>>>> using that way to pass options. I recently cleaned up some of this 
>>>> with:
>>>>
>>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>>
>>>> Because of this, I would prefer if we used a name that doesn't 
>>>> include "VmOpts", because it's too alike the other concept. Some 
>>>> suggestions:
>>>>  startAppJavaOptions
>>>>  startAppUsingJavaOptions
>>>>  startAppWithJavaOptions
>>>>  startAppExactJavaOptions
>>>>  startAppJvmOptions
>>>>
>>>> Thanks,
>>>> StefanK
>>>>
>>>>> Thanks,
>>>>> -- Igor
>>>>>
>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>>>>> <leonid.mesnik at oracle.com> wrote:
>>>>>>
>>>>>> Hi
>>>>>>
>>>>>> Could you please review the following fix, which changes LingeredApp to
>>>>>> prepend vm options to java/vm.test.opts when startApp is used and
>>>>>> provides startAppVmOpts to override options completely.
>>>>>>
>>>>>> The intention is to avoid issues like the one in this bug, where test/jtreg
>>>>>> options were ignored by tests. I also fixed some tests where the
>>>>>> intention was to append vm options rather than to override them.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>
>>>>>> Leonid
>>>>>>
>>>>
>>


From coleen.phillimore at oracle.com  Mon Mar 30 19:04:34 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Mar 2020 15:04:34 -0400
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
References: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>
 <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
Message-ID: <a9590dc7-867b-2f13-6772-5bc06c4d1a86@oracle.com>


I was wondering why this is needed when debugging a core file, which is 
the key thing we need the SA for:

  /** This is used by both the debugger and any runtime system. It is
      the basic mechanism by which classes which mimic underlying VM
      functionality cause themselves to be initialized. The given
      observer will be notified (with arguments (null, null)) when the
      VM is re-initialized, as well as when it registers itself with
      the VM. */
  public static void registerVMInitializedObserver(Observer o) {
    vmInitializedObservers.add(o);
    o.update(null, null);
  }

It seems that if it isn't needed, we shouldn't add these classes, and
should remove their use instead.
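
For reference, here is a minimal sketch of what the registration
pattern in the snippet above looks like when it is backed by a locally
defined observer type, along the lines of the mock-up described in the
quoted message below. The names VmObserver, VMRegistry and
notifyReinitialized are hypothetical and chosen for this example only;
this is not the code in the mock-up webrev.

```java
import java.util.ArrayList;
import java.util.List;

class ObserverDemo {
    public static void main(String[] args) {
        int[] calls = {0};
        // The observer is notified once at registration time...
        VMRegistry.registerVMInitializedObserver((source, arg) -> calls[0]++);
        // ...and again whenever the (simulated) VM is re-initialized.
        VMRegistry.notifyReinitialized();
        System.out.println("notifications: " + calls[0]);
    }
}

// Hypothetical local replacement for the deprecated java.util.Observer.
interface VmObserver {
    void update(Object source, Object arg);
}

// Hypothetical registry mirroring the shape of registerVMInitializedObserver.
class VMRegistry {
    private static final List<VmObserver> vmInitializedObservers = new ArrayList<>();

    static void registerVMInitializedObserver(VmObserver o) {
        vmInitializedObservers.add(o);
        o.update(null, null);
    }

    // Stand-in for the point where the VM is re-initialized.
    static void notifyReinitialized() {
        for (VmObserver o : vmInitializedObservers) {
            o.update(null, null);
        }
    }
}
```

A single-method interface like this can be implemented with a lambda,
which is one small difference from extending java.util.Observable.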

Coleen

On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
> No opinions on this?
>
> /Magnus
>
> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>> Hi everyone,
>>
>> As a follow-up to the ongoing review for JDK-8241618, I have also 
>> looked at fixing the deprecation warnings in jdk.hotspot.agent. These 
>> fall in three broad categories:
>>
>> * Deprecation of the boxing type constructors (e.g. "new Integer(42)").
>>
>> * Deprecation of java.util.Observer and Observable.
>>
>> * The rest (mostly Class.newInstance(), and a small number of other odd
>> deprecations)
>>
>> The first category is trivial to fix. The last category needs some
>> special discussion. But the overwhelming majority of deprecation
>> warnings come from the use of Observer and Observable. This really 
>> dwarfs anything else, and needs to be handled first, otherwise it's 
>> hard to even spot the other issues.
>>
>> My analysis of the situation is that the deprecation of Observer and 
>> Observable seems a bit harsh, from the PoV of jdk.hotspot.agent. 
>> Sure, it might be limited, but I think it does exactly what is needed 
>> here. So the migration suggested in Observable (java.beans or 
>> java.util.concurrent) seems overkill. If there are genuine threading 
>> issues at play here, this assumption might be wrong, and then maybe 
>> going the j.u.c. route is correct.
>>
>> But if that's not the case, the main goal should be to stay with the current
>> implementation. One way to do this is to sprinkle the code with
>> @SuppressWarnings. But I think a better way would be to just implement
>> our own Observer and Observable. After all, the classes are trivial.
>>
>> I've made a mock-up of this solution, where I just copied the
>> java.util.Observer and Observable, and removed the deprecation 
>> annotations. The only thing needed for the rest of the code is to 
>> make sure we import these; I've done this for three arbitrarily 
>> selected classes just to show what the change would typically look 
>> like. Here's the mock-up:
>>
>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>
>> Let me know what you think.
>>
>> /Magnus
>


From mandy.chung at oracle.com  Mon Mar 30 19:13:57 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 30 Mar 2020 12:13:57 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <d3d2fa4f-4bf7-94c3-de4b-15dfe75b6370@oracle.com>
 <08191054-8d0a-ae60-ac99-e2849f08ce85@oracle.com>
Message-ID: <528c7933-be32-9863-6cc5-92223a75bbee@oracle.com>

This is the patch to keep the JDK 14 behavior if the target release is 14
(thanks to Jan for helping make the change in javac to get the tests working)
http://cr.openjdk.java.net/~mchung/valhalla/webrevs/8171335/webrev-javac-target-release-14/

Mandy

On 3/27/20 9:29 AM, Mandy Chung wrote:
> Hi Jan,
>
> Good point.  The javac change only applies to JDK 15 and later and the
> lambda proxy class is not a nestmate when running on JDK 14 or earlier.
>
> I probably need the help from langtools team to fix this.  I'll give
> it a try.
>
> Mandy


From daniil.x.titov at oracle.com  Mon Mar 30 19:43:01 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 30 Mar 2020 12:43:01 -0700
Subject: RFR: 8241530: com/sun/jdi tests fail due to network issues on OSX
 10.15
Message-ID: <84D85D3D-AFCA-42BD-BD02-35604E462D5F@oracle.com>

Please review the change [1] that fixes the failure of com/sun/jdi/JdwpListenTest.java 
and com/sun/jdi/JdwpAttachTest.java tests on OSX 10.15.

The problem here is similar to the one solved in [4] by additional filtering
of unusual network interfaces in the test library class jdk.test.lib.NetworkConfiguration.
However, the failing com/sun/jdi tests do not use jdk.test.lib.NetworkConfiguration and
instead repeat the same logic themselves.

The fix changes these tests to use jdk.test.lib.NetworkConfiguration to find all local addresses.

Initially the issue [2] also included 3 other failing tests from the sun/management/jdp package, but these tests fail
for a different reason, so I moved them to the new issue [3] and updated the ProblemList.txt for them.
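
As a rough illustration of the kind of logic involved (enumerating
local addresses while skipping some interfaces), here is a sketch that
uses only java.net.NetworkInterface. It is not the
jdk.test.lib.NetworkConfiguration code. The class name LocalAddressDemo
and the interface-name prefixes in isExcluded are placeholders assumed
for this example; which interfaces actually need filtering is decided
by the test library.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class LocalAddressDemo {
    public static void main(String[] args) {
        try {
            for (InetAddress addr : localAddresses()) {
                System.out.println(addr);
            }
        } catch (SocketException e) {
            System.out.println("cannot enumerate interfaces: " + e.getMessage());
        }
    }

    // Placeholder filter: the prefixes are assumptions for illustration only.
    static boolean isExcluded(String interfaceName) {
        return interfaceName.startsWith("awdl") || interfaceName.startsWith("llw");
    }

    // Collects the addresses of all interfaces that are up and not excluded.
    static List<InetAddress> localAddresses() throws SocketException {
        List<InetAddress> result = new ArrayList<>();
        Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        if (interfaces == null) {
            return result; // no interfaces reported
        }
        for (NetworkInterface nif : Collections.list(interfaces)) {
            if (!nif.isUp() || isExcluded(nif.getName())) {
                continue;
            }
            result.addAll(Collections.list(nif.getInetAddresses()));
        }
        return result;
    }
}
```

Keeping the filter in one shared place is the point of the fix: a test
that repeats this loop on its own does not pick up later changes to the
filter.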


[1] Webrev:  http://cr.openjdk.java.net/~dtitov/8241530/webrev.01/
[2] Jira Issue: https://bugs.openjdk.java.net/browse/JDK-8241530 
[3] https://bugs.openjdk.java.net/browse/JDK-8241865 
[4] https://bugs.openjdk.java.net/browse/JDK-8241336 

Thank you,
Daniil


From alexey.menkov at oracle.com  Mon Mar 30 20:06:48 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 30 Mar 2020 13:06:48 -0700
Subject: RFR: 8241530: com/sun/jdi tests fail due to network issues on OSX
 10.15
In-Reply-To: <84D85D3D-AFCA-42BD-BD02-35604E462D5F@oracle.com>
References: <84D85D3D-AFCA-42BD-BD02-35604E462D5F@oracle.com>
Message-ID: <e2359438-666a-a644-9365-0c088c775df5@oracle.com>

Looks good.

--alex


From leonid.mesnik at oracle.com  Tue Mar 31 00:42:11 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Mon, 30 Mar 2020 17:42:11 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
 <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
 <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>
Message-ID: <cd4bb93e-b41c-a59f-ac9c-c1ea10a6a374@oracle.com>

Hi

See my comments inline. I will update the webrev after going through all your
comments.


On 3/30/20 11:39 AM, Chris Plummer wrote:
> Hi Leonid,
>
> I haven't gone through all the tests yet.  I've accumulated enough
> questions that I'd like to see them answered or addressed before I
> continue on.
>
> This isn't directly related to your changes, but I noticed that users 
> of JDKToolLauncher do nothing to make sure that default test options 
> are used. This means we are never running these tools with the test 
> options being specified with the jtreg run. Is that a bug or intentional?

Which "default test options" do you mean? We have two properties for 
setting JVM options. The idea is to pass test.vm.opts to ALL Java 
processes and test.java.opts only to the tested processes, where 
applicable. For example, we usually don't want to run jcmd with -Xcomp. 
test.vm.opts was used (a long time ago) for options like '-d32/-d64' on 
Solaris, where the JVM doesn't start without the correct version being 
chosen. It is also used to reduce the maximum heap for all JVM instances 
when tests are running concurrently.

So test.vm.opts (or test.vm.tools.opts) should probably be added by 
JDKToolLauncher, but not test.java.opts. That is a separate topic; a lot 
of launchers ignore test.vm.opts now.
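
As a rough illustration of the split described above, the sketch below 
builds separate option lists for a tool and for a tested process. Only 
the property names test.vm.opts and test.java.opts come from this thread; 
the class and helper names (TestOptsSketch, split, toolOptions, 
testedProcessOptions) are hypothetical and are not part of the test 
library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not test-library code: shows which jtreg property
// would reach which kind of child process under the scheme described above.
final class TestOptsSketch {

    // Split a whitespace-separated option string; null or blank yields an empty list.
    static List<String> split(String value) {
        List<String> out = new ArrayList<>();
        if (value == null) {
            return out;
        }
        for (String s : value.trim().split("\\s+")) {
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }

    // Tools such as jcmd: only the options meant for ALL Java processes.
    static List<String> toolOptions(String vmOpts) {
        return split(vmOpts);
    }

    // Tested processes: the options for all processes plus the test-specific ones.
    static List<String> testedProcessOptions(String vmOpts, String javaOpts) {
        List<String> out = split(vmOpts);
        out.addAll(split(javaOpts));
        return out;
    }

    public static void main(String[] args) {
        String vmOpts = System.getProperty("test.vm.opts");
        String javaOpts = System.getProperty("test.java.opts");
        System.out.println("tool:   " + toolOptions(vmOpts));
        System.out.println("tested: " + testedProcessOptions(vmOpts, javaOpts));
    }
}
```

Run with, for example, -Dtest.vm.opts=-Xmx256m -Dtest.java.opts=-Xcomp to 
see that the tool list would not pick up -Xcomp.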

>
> In the problem lists, is it necessary to list the test multiple times 
> with #id0, #id1, etc, or could you list it just once and leave that 
> part off. It seems very error prone. Also, changing tests like 
> ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split out the 
> testing in this manner seems completely unrelated to this CR, 
> especially when the tests do not even contain any changes related to 
> the CR.

I think these changes are related. startApp(...) was updated, so some 
test combinations became invalid or redundant.

ClhsdbFindPC and ClhsdbJstack were always run twice. Now that the test 
options are passed through by the test, it is not necessary to run it 
twice when Xcomp is already set by the user.

ClhsdbScanOops is fixed so that it does not run with an incompatible GC 
combination.

So I should either update these tests by splitting them, or change them 
to startAppExactJvmOpts() if we want to continue to ignore user-given 
test options.

It seems that #idN are required by jtreg now; otherwise it just runs the test.

>
>  426     public static LingeredApp startApp(String... 
> additionalJvmOpts) throws IOException {
>
> The default test opts are appended to additionalJvmOpts, and if you 
> want prepended you need to call Utils.prependTestJavaOpts(). I would 
> have thought the opposite would be more desirable and expected default 
> behavior. Why did you choose this way? I also find it somewhat 
> confusing that there is even a default mode for where the 
> additionalJvmOpts go. Maybe it would be best to have 
> startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make it 
> explicit. This would also be in line with the existing 
> startAppExactJvmOpts().
>
I chose the most common usage, which was Utils.appendTestJavaOpts. 
But I agree that it would be better to change it to prepend. Thanks for 
pointing this out.

I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs(), so 
as not to complicate things. I think that startApp() should be used in 
the cases when test vm options really shouldn't interfere with 
user-provided options or overwrite them. So basically the behavior is 
the same as for ProcessTools.createJavaProcessBuilder(true, ...) and 
jtreg itself.


> Is ClhsdbFindPC correct? It used to use just -Xcomp or -Xint, 
> ignoring any default test opts. You've fixed it to include the default 
> test opts, but they are appended, possibly overriding the -Xcomp or 
> -Xint. Don't we want the default test opts prepended? Same for 
> ClhsdbJstack.

The idea is to avoid mixing Xcomp and Xmixed/Xint by using a requires 
filter. However, ClhsdbFindPC might override Xint with Xmixed if it is 
set explicitly. Switching to prepending will fix that.
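
To make the ordering issue concrete, here is a minimal sketch. It 
assumes that when several of -Xint/-Xcomp/-Xmixed appear on one command 
line, the last one takes effect, which is the behavior the discussion 
above relies on. The class and helpers (OptionOrderSketch, 
appendTestOpts, prependTestOpts, effectiveMode) are hypothetical 
stand-ins written for this example; they are not the real 
Utils.appendTestJavaOpts()/Utils.prependTestJavaOpts() implementations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of how option order decides the execution mode.
final class OptionOrderSketch {

    // Test-specific options first, jtreg-provided test options last.
    static List<String> appendTestOpts(List<String> testJavaOpts, String... additionalJvmOpts) {
        List<String> result = new ArrayList<>(Arrays.asList(additionalJvmOpts));
        result.addAll(testJavaOpts);
        return result;
    }

    // jtreg-provided test options first, test-specific options last.
    static List<String> prependTestOpts(List<String> testJavaOpts, String... additionalJvmOpts) {
        List<String> result = new ArrayList<>(testJavaOpts);
        result.addAll(Arrays.asList(additionalJvmOpts));
        return result;
    }

    // Assumption: the last mode flag on the command line wins; default is mixed mode.
    static String effectiveMode(List<String> opts) {
        String mode = "-Xmixed";
        for (String opt : opts) {
            if (opt.equals("-Xint") || opt.equals("-Xcomp") || opt.equals("-Xmixed")) {
                mode = opt;
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        List<String> testJavaOpts = List.of("-Xmixed"); // as if set explicitly for the jtreg run
        // Appending lets the jtreg option override the one the test asked for.
        System.out.println(effectiveMode(appendTestOpts(testJavaOpts, "-Xint")));  // prints -Xmixed
        // Prepending keeps the option the test asked for.
        System.out.println(effectiveMode(prependTestOpts(testJavaOpts, "-Xint"))); // prints -Xint
    }
}
```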

Leonid

>
> thanks,
>
> Chris
>
> On 3/25/20 2:31 PM, Leonid Mesnik wrote:
>>
>> Igor, Stefan, Ioi
>>
>> Thank you for your feedback.
>>
>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change @run 
>> main... to @run driver.
>>
>> Test ClhsdbJstack.java is updated.
>>
>> Still waiting for review from SVC team.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
>>
>> Leonid
>>
>> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>>> Hi Leonid,
>>>
>>> not related to your patch (but yet somewhat made more 
>>> obvious by it), it seems all (or at least almost all) the tests 
>>> which use LingeredApp should be run in "driver" mode as they just 
>>> orchestrate execution of other JVMs, so running them w/ main (let 
>>> alone main/othervm) just wastes time, 
>>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for 
>>> example, will now be executed w/ Xcomp which will make it very slow 
>>> for no reason. since you already got your hands dirty w/ these tests, 
>>> could you please file an RFE to sort this out and list all the 
>>> affected tests there?
>>>
>>> re: the patch, could you please update ClhsdbJstack.java test not to 
>>> be run w/ Xcomp and follow the same pattern you used in other tests 
>>> (e.g. ClhsdbScanOops) ? other than that it looks fine to me, I 
>>> however wouldn't be able to tell if all svc tests continue to do 
>>> what they were supposed to, so I'd prefer for someone from svc team 
>>> to chime in.
>>>
>>> Thanks,
>>> -- Igor
>>>
>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik 
>>>> <leonid.mesnik at oracle.com <mailto:leonid.mesnik at oracle.com>> wrote:
>>>>
>>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>>>
>>>> Please find new webrev: 
>>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>>>
>>>> startAppVmOpts/runAppVmOpts are renamed to 
>>>> "startAppExactJvmOpts/runAppExactJvmOpts". It should make 
>>>> very clear that this method doesn't use any of test.java.opts, 
>>>> test.vm.opts.
>>>>
>>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java 
>>>> mentioned by Igor, and removed null pointer check as Ioi suggested 
>>>> in startApp method.
>>>>
>>>> + public static void startApp(LingeredApp theApp, String... 
>>>> additionalJvmOpts) throws IOException {
>>>> + startAppExactJvmOpts(theApp, 
>>>> Utils.appendTestJavaOpts(additionalJvmOpts));
>>>> + }
>>>>
>>>> Leonid
>>>>
>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>>>> Hi Leonid,
>>>>>>
>>>>>> I have briefly looked at the patch, a few comments so far:
>>>>>>
>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>>>   - at L#114, could you please call static method using class 
>>>>>> name (as the opposite of using instance)? or was it meant to be 
>>>>>> theApp.runAppVmOpts(vmArgs) ?
>>>>>>
>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) 
>>>>>> isn't correct
>>>>>> - I don't like startAppVmOpts name, but unfortunately don't have 
>>>>>> a better suggestion (yet)
>>>>>
>>>>> I was going to say the same. Jtreg has the concept of "java 
>>>>> options" and "vm options". We have had a fair share of bugs and 
>>>>> wasted time when tests have been using the "vm options" part 
>>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away 
>>>>> from using that way to pass options. I recently cleaned up some of 
>>>>> this with:
>>>>>
>>>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>>>
>>>>> Because of this, I would prefer if we used a name that doesn't 
>>>>> include "VmOpts", because it's too alike the other concept. Some 
>>>>> suggestions:
>>>>>  startAppJavaOptions
>>>>>  startAppUsingJavaOptions
>>>>>  startAppWithJavaOptions
>>>>>  startAppExactJavaOptions
>>>>>  startAppJvmOptions
>>>>>
>>>>> Thanks,
>>>>> StefanK
>>>>>
>>>>>> Thanks,
>>>>>> -- Igor
>>>>>>
>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>>>>>> <leonid.mesnik at oracle.com> wrote:
>>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> Could you please review the following fix, which changes LingeredApp 
>>>>>>> to prepend vm options to java/vm.test.opts when startApp is used 
>>>>>>> and provides startAppVmOpts to override options completely.
>>>>>>>
>>>>>>> The intention is to avoid issue like in this bug where 
>>>>>>> test/jtreg options were ignored by tests. Also I fixed some 
>>>>>>> tests where intention was to append vm options rather than to 
>>>>>>> override them.
>>>>>>>
>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>
>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>
>>>>>>> Leonid
>>>>>>>
>>>>>
>>>
>
>

From chris.plummer at oracle.com  Tue Mar 31 04:43:13 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 30 Mar 2020 21:43:13 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <cd4bb93e-b41c-a59f-ac9c-c1ea10a6a374@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
 <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
 <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>
 <cd4bb93e-b41c-a59f-ac9c-c1ea10a6a374@oracle.com>
Message-ID: <ed4167ad-681a-01f8-add6-c0f01188fefd@oracle.com>

Hi Leonid,

On 3/30/20 5:42 PM, Leonid Mesnik wrote:
> Hi
>
> See my comments inline. I will update the webrev after I go through all 
> your comments.
>
>
> On 3/30/20 11:39 AM, Chris Plummer wrote:
>> Hi Leonid,
>>
>> I haven't gone through all the tests yet. I've accumulated enough 
>> questions that I'd like to see them answered or addressed before I 
>> continue on.
>>
>> This isn't directly related to your changes, but I noticed that users 
>> of JDKToolLauncher do nothing to make sure that default test options 
>> are used. This means we are never running these tools with the test 
>> options being specified with the jtreg run. Is that a bug or 
>> intentional?
>
> Which "default test options" do you mean? We have two properties for 
> setting JVM options. The idea is to pass test.vm.opts to ALL Java 
> processes and test.java.opts only to the tested processes, where 
> applicable. For example, we usually don't want to run jcmd with 
> -Xcomp. test.vm.opts was used (a long time ago) for options like 
> '-d32/-d64' on Solaris, where the JVM doesn't start without the correct 
> version being chosen. It is also used to reduce the maximum heap for 
> all JVM instances when tests are running concurrently.
>
> So test.vm.opts (or test.vm.tools.opts) should probably be added by 
> JDKToolLauncher, but not test.java.opts. That is a separate topic; a 
> lot of launchers ignore test.vm.opts now.
I always get confused about which set of options these properties 
represent, but basically I'm suggesting that if for example we are doing 
a -Xcomp run in mach5, JDKToolLauncher (at least in some cases) should 
be launched with this option. I think this is what you get from 
Utils.getTestJavaOpts().

For example the SA tests use 
JDKToolLauncher.createUsingTestJDK("jhsdb"). jhsdb is what is really 
being tested here, and it should be launched with the test vm options. 
Currently we launch the target process with these options, which is 
probably also a good idea. Also we aren't too concerned with the 
options that the test itself is run with, although I'm guessing they 
also get run with the test java opts. So we have 3 processes here:
  - jhsdb, which should be getting test java opts but is not
  - the target process, which should be getting test java opts and 
currently is
  - the test itself, where options don't really matter, but is getting 
passed test java opts

However, you could argue that tests like jinfo, jstack, and jcmd, all of 
which use the Attach API and the bulk of the work is done on the target 
process, are not that concerned with the options passed to the command, 
but do want the options passed to the target process.
>
>>
>> In the problem lists, is it necessary to list the test multiple times 
>> with #id0, #id1, etc, or could you list it just once and leave that 
>> part off. It seems very error prone. Also, changing tests like 
>> ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split out the 
>> testing in this manner seems completely unrelated to this CR, 
>> especially when the tests do not even contain any changes related to 
>> the CR.
>
> I think these changes are related. startApp(...) was updated, so 
> some test combinations became invalid or redundant.
>
> ClhsdbFindPC and ClhsdbJstack were always run twice. Now that the test 
> options are passed through by the test, it is not necessary to run it 
> twice when Xcomp is already set by the user.
>
Ok. I see now that the second test run, which is the non -Xcomp run, 
adds '@requires vm.compMode != "Xcomp"'. But this also is strange. The 
first test run, which does not have the @requires and is the one that 
makes LingeredApp launch with -Xcomp, will always run whether or not it 
is an -Xcomp test run. So it will run as part of a regular test run 
and as part of a -Xcomp test run. The only difference between the two is 
the -Xcomp run will also run the test with -Xcomp, but that's not really 
needed (I think it will also end up passing -Xcomp to the target 
process twice). Perhaps '@requires vm.compMode == "Xcomp"' should be 
used for the first test run, but that means it no longer gets run until 
later tiers when we use -Xcomp. Why not revert it back to a single test, 
but also add '@requires vm.compMode != "Xcomp"'? Then it gets run both 
ways in an early tier and not run during the -Xcomp run, which isn't 
really needed.

> ClhsdbScanOops is fixed so that it does not run with an incompatible GC 
> combination.
Ok
>
> So I should either update these tests by splitting them, or change them 
> to startAppExactJvmOpts() if we want to continue to ignore user-given 
> test options.
I don't think I was suggesting removing user-given test options. I don't 
see why you would.
>
> It seems that #idN are required by jtreg now; otherwise it just runs the test.
Ok.
>
>>
>>  426     public static LingeredApp startApp(String... 
>> additionalJvmOpts) throws IOException {
>>
>> The default test opts are appended to additionalJvmOpts, and if you 
>> want prepended you need to call Utils.prependTestJavaOpts(). I would 
>> have thought the opposite would be more desirable and expected 
>> default behavior. Why did you choose this way? I also find it 
>> somewhat confusing that there is even a default mode for where the 
>> additionalJvmOpts go. Maybe it would be best to have 
>> startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make it 
>> explicit. This would also be in line with the existing 
>> startAppExactJvmOpts().
>>
> I chose the most common usage, which was 
> Utils.appendTestJavaOpts. But I agree that it would be better to 
> change it to prepend. Thanks for pointing this out.
>
> I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs(), 
> so as not to complicate things. I think that startApp() should be used 
> in the cases when test vm options really shouldn't interfere with 
> user-provided options or overwrite them. So basically the behavior is 
> the same as for ProcessTools.createJavaProcessBuilder(true, ...) and 
> jtreg itself.
>
Ok.
>
>> Is ClhsdbFindPC correct? It used to use just -Xcomp or -Xint, 
>> ignoring any default test opts. You've fixed it to include the 
>> default test opts, but they are appended, possibly overriding the 
>> -Xcomp or -Xint. Don't we want the default test opts prepended? Same 
>> for ClhsdbJstack.
>
> The idea is to avoid mixing Xcomp and Xmixed/Xint by using a requires 
> filter. However, ClhsdbFindPC might override Xint with Xmixed if it is 
> set explicitly. Switching to prepending will fix that.
Yes, that's what I was thinking and one reason I thought that should be 
default behavior.

thanks,

Chris
>
> Leonid
>
>>
>> thanks,
>>
>> Chris
>>
>> On 3/25/20 2:31 PM, Leonid Mesnik wrote:
>>>
>>> Igor, Stefan, Ioi
>>>
>>> Thank you for your feedback.
>>>
>>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 To change 
>>> @run main... to @run driver.
>>>
>>> Test ClhsdbJstack.java is updated.
>>>
>>> Still waiting for review from SVC team.
>>>
>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
>>>
>>> Leonid
>>>
>>> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>>>> Hi Leonid,
>>>>
>>>> not related to your patch (but yet somewhat made more 
>>>> obvious by it), it seems all (or at least almost all) the tests 
>>>> which use LingeredApp should be run in "driver" mode as they just 
>>>> orchestrate execution of other JVMs, so running them w/ main (let 
>>>> alone main/othervm) just wastes time, 
>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for 
>>>> example, will now be executed w/ Xcomp which will make it very slow 
>>>> for no reason. since you already got your hands dirty w/ these 
>>>> tests, could you please file an RFE to sort this out and list all 
>>>> the affected tests there?
>>>>
>>>> re: the patch, could you please update ClhsdbJstack.java test not 
>>>> to be run w/ Xcomp and follow the same pattern you used in other 
>>>> tests (e.g. ClhsdbScanOops) ? other than that it looks fine to me, 
>>>> I however wouldn't be able to tell if all svc tests continue to do 
>>>> what they were supposed to, so I'd prefer for someone from svc team 
>>>> to chime in.
>>>>
>>>> Thanks,
>>>> -- Igor
>>>>
>>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik 
>>>>> <leonid.mesnik at oracle.com <mailto:leonid.mesnik at oracle.com>> wrote:
>>>>>
>>>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>>>>
>>>>> Please find new webrev: 
>>>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>>>>
>>>>> startAppVmOpts/runAppVmOpts are renamed to 
>>>>> "startAppExactJvmOpts/runAppExactJvmOpts". It should make 
>>>>> very clear that this method doesn't use any of test.java.opts, 
>>>>> test.vm.opts.
>>>>>
>>>>> Also, I fixed 
>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java mentioned by 
>>>>> Igor, and removed null pointer check as Ioi suggested in startApp 
>>>>> method.
>>>>>
>>>>> + public static void startApp(LingeredApp theApp, String... 
>>>>> additionalJvmOpts) throws IOException {
>>>>> + startAppExactJvmOpts(theApp, 
>>>>> Utils.appendTestJavaOpts(additionalJvmOpts));
>>>>> + }
>>>>>
>>>>> Leonid
>>>>>
>>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>>>>> Hi Leonid,
>>>>>>>
>>>>>>> I have briefly looked at the patch, a few comments so far:
>>>>>>>
>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>>>>   - at L#114, could you please call static method using class 
>>>>>>> name (as the opposite of using instance)? or was it meant to be 
>>>>>>> theApp.runAppVmOpts(vmArgs) ?
>>>>>>>
>>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) 
>>>>>>> isn't correct
>>>>>>> - I don't like startAppVmOpts name, but unfortunately don't have 
>>>>>>> a better suggestion (yet)
>>>>>>
>>>>>> I was going to say the same. Jtreg has the concept of "java 
>>>>>> options" and "vm options". We have had a fair share of bugs and 
>>>>>> wasted time when tests have been using the "vm options" part 
>>>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away 
>>>>>> from using that way to pass options. I recently cleaned up some 
>>>>>> of this with:
>>>>>>
>>>>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>>>>
>>>>>> Because of this, I would prefer if we used a name that doesn't 
>>>>>> include "VmOpts", because it's too alike the other concept. Some 
>>>>>> suggestions:
>>>>>>  startAppJavaOptions
>>>>>>  startAppUsingJavaOptions
>>>>>>  startAppWithJavaOptions
>>>>>>  startAppExactJavaOptions
>>>>>>  startAppJvmOptions
>>>>>>
>>>>>> Thanks,
>>>>>> StefanK
>>>>>>
>>>>>>> Thanks,
>>>>>>> -- Igor
>>>>>>>
>>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>>>>>>> <leonid.mesnik at oracle.com> wrote:
>>>>>>>>
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Could you please review the following fix, which changes LingeredApp 
>>>>>>>> to prepend vm options to java/vm.test.opts when startApp is 
>>>>>>>> used and provides startAppVmOpts to override options completely.
>>>>>>>>
>>>>>>>> The intention is to avoid issue like in this bug where 
>>>>>>>> test/jtreg options were ignored by tests. Also I fixed some 
>>>>>>>> tests where intention was to append vm options rather than to 
>>>>>>>> override them.
>>>>>>>>
>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>>
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>>
>>>>>>>> Leonid
>>>>>>>>
>>>>>>
>>>>
>>
>>


From suenaga at oss.nttdata.com  Tue Mar 31 10:06:25 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 31 Mar 2020 19:06:25 +0900
Subject: Thread Local Handshake in JVMTI functions
Message-ID: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>

Hi all,

Many JVMTI functions use a VM Operation to get information. However, some of them need to stop only one thread - they don't need to stop all threads.
So I think we can use Thread Local Handshake, as in this webrev. It is an example for GetOneCurrentContendedMonitor().

   http://cr.openjdk.java.net/~ysuenaga/jvmti-thread-local-handshake/

Also, I think we can replace the following VM Operations with Thread Local Handshake:

class VM_GetCurrentLocation
class VM_EnterInterpOnlyMode
class VM_UpdateForPopTopFrame
class VM_SetFramePop
class VM_GetOwnedMonitorInfo
class VM_GetCurrentContendedMonitor
class VM_GetFrameCount
class VM_GetFrameLocation

What do you think?
If it is acceptable, I will file it in JBS and send a review request.


Thanks,

Yasumasa

From david.holmes at oracle.com  Tue Mar 31 10:16:23 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 31 Mar 2020 20:16:23 +1000
Subject: Thread Local Handshake in JVMTI functions
In-Reply-To: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>
References: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>
Message-ID: <e2f3b51c-be7a-cd27-caf9-8d2f5e3afc1c@oracle.com>

Hi Yasumasa,

On 31/03/2020 8:06 pm, Yasumasa Suenaga wrote:
> Hi all,
> 
> Many JVMTI functions use a VM Operation to get information. However, some 
> of them need to stop only one thread - they don't need to stop all threads.
> So I think we can use Thread Local Handshake, as in this webrev. It is 
> an example for GetOneCurrentContendedMonitor().

True, but at the moment handshakes involve the VMThread. There is work 
being done to support direct thread-to-thread handshakes and once that 
is done this kind of conversion should be more easily done. It might be 
worth waiting for that.

>    http://cr.openjdk.java.net/~ysuenaga/jvmti-thread-local-handshake/

An observation, it seems to me that calling_thread is not used when this 
is not a VMOperation.

Cheers,
David

> Also, I think we can replace the following VM Operations with Thread Local 
> Handshake:
> 
> class VM_GetCurrentLocation
> class VM_EnterInterpOnlyMode
> class VM_UpdateForPopTopFrame
> class VM_SetFramePop
> class VM_GetOwnedMonitorInfo
> class VM_GetCurrentContendedMonitor
> class VM_GetFrameCount
> class VM_GetFrameLocation
> 
> What do you think?
> If it is acceptable, I will file it in JBS and send a review request.
> 
> 
> Thanks,
> 
> Yasumasa

From suenaga at oss.nttdata.com  Tue Mar 31 11:40:28 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 31 Mar 2020 20:40:28 +0900
Subject: Thread Local Handshake in JVMTI functions
In-Reply-To: <e2f3b51c-be7a-cd27-caf9-8d2f5e3afc1c@oracle.com>
References: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>
 <e2f3b51c-be7a-cd27-caf9-8d2f5e3afc1c@oracle.com>
Message-ID: <4c9aa3ab-468d-eede-18f7-ac8d352575b6@oss.nttdata.com>

Hi David,

On 2020/03/31 19:16, David Holmes wrote:
> Hi Yasumasa,
> 
> On 31/03/2020 8:06 pm, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Many JVMTI functions use a VM Operation to get information. However, some of them need to stop only one thread - they don't need to stop all threads.
>> So I think we can use Thread Local Handshake, as in this webrev. It is an example for GetOneCurrentContendedMonitor().
> 
> True, but at the moment handshakes involve the VMThread. There is work being done to support direct thread-to-thread handshakes and once that is done this kind of conversion should be more easily done. It might be worth waiting for that.

Thanks, I will come back to this topic when thread-to-thread handshake is done.
At first I wondered why handshakes involve the VMThread. That improvement is welcome for me ;)


Cheers,

Yasumasa


>>    http://cr.openjdk.java.net/~ysuenaga/jvmti-thread-local-handshake/
> 
> An observation, it seems to me that calling_thread is not used when this is not a VMOperation.
> 
> Cheers,
> David
> 
>> Also, I think we can replace the following VM Operations with Thread Local Handshake:
>>
>> class VM_GetCurrentLocation
>> class VM_EnterInterpOnlyMode
>> class VM_UpdateForPopTopFrame
>> class VM_SetFramePop
>> class VM_GetOwnedMonitorInfo
>> class VM_GetCurrentContendedMonitor
>> class VM_GetFrameCount
>> class VM_GetFrameLocation
>>
>> What do you think?
>> If it is acceptable, I will file it in JBS and send a review request.
>>
>>
>> Thanks,
>>
>> Yasumasa

From coleen.phillimore at oracle.com  Tue Mar 31 12:34:46 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Mar 2020 08:34:46 -0400
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <a9590dc7-867b-2f13-6772-5bc06c4d1a86@oracle.com>
References: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>
 <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
 <a9590dc7-867b-2f13-6772-5bc06c4d1a86@oracle.com>
Message-ID: <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com>


To answer my own question, this functionality is used to allow 
detach/reattach from {cl}hsdb, which seems to work on linux but not 
windows with this code removed.

The next question is whether this is useful functionality to justify all 
this code (900+ and this new code that Magnus has added). Can't you 
just exit and restart the clhsdb process on the core file or process?

For the record, this is me playing with python to remove this code.

http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html

Thanks,
Coleen

On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote:
>
> I was wondering why this is needed when debugging a core file, which 
> is the key thing we need the SA for:
>
>   /** This is used by both the debugger and any runtime system. It is
>       the basic mechanism by which classes which mimic underlying VM
>       functionality cause themselves to be initialized. The given
>       observer will be notified (with arguments (null, null)) when the
>       VM is re-initialized, as well as when it registers itself with
>       the VM. */
>   public static void registerVMInitializedObserver(Observer o) {
>     vmInitializedObservers.add(o);
>     o.update(null, null);
>   }
>
> It seems like if it isn't needed, we shouldn't add these classes and 
> should remove their use instead.
>
> Coleen
>
> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
>> No opinions on this?
>>
>> /Magnus
>>
>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>>> Hi everyone,
>>>
>>> As a follow-up to the ongoing review for JDK-8241618, I have also 
>>> looked at fixing the deprecation warnings in jdk.hotspot.agent. 
>>> These fall in three broad categories:
>>>
>>> * Deprecation of the boxing type constructors (e.g. "new Integer(42)").
>>>
>>> * Deprecation of java.util.Observer and Observable.
>>>
>>> * The rest (mostly Class.newInstance(), and a small number of other 
>>> odd deprecations)
>>>
>>> The first category is trivial to fix. The last category needs some 
>>> special discussion. But the overwhelming majority of deprecation 
>>> warnings come from the use of Observer and Observable. This really 
>>> dwarfs anything else, and needs to be handled first, otherwise it's 
>>> hard to even spot the other issues.
>>>
>>> My analysis of the situation is that the deprecation of Observer and 
>>> Observable seems a bit harsh, from the PoV of jdk.hotspot.agent. 
>>> Sure, it might be limited, but I think it does exactly what is 
>>> needed here. So the migration suggested in Observable (java.beans or 
>>> java.util.concurrent) seems overkill. If there are genuine threading 
>>> issues at play here, this assumption might be wrong, and then maybe 
>>> going the j.u.c. route is correct.
>>>
>>> But if that's not the case, the main goal should be to stay with the 
>>> current implementation. One way to do this is to sprinkle the code 
>>> with @SuppressWarnings. But I think a better way would be to just 
>>> implement our own Observer and Observable. After all, the classes 
>>> are trivial.
>>>
>>> I've made a mock-up of this solution, where I just copied the 
>>> java.util.Observer and Observable, and removed the deprecation 
>>> annotations. The only thing needed for the rest of the code is to 
>>> make sure we import these; I've done this for three arbitrarily 
>>> selected classes just to show what the change would typically look 
>>> like. Here's the mock-up:
>>>
>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>>
>>> Let me know what you think.
>>>
>>> /Magnus
>>
>


From magnus.ihse.bursie at oracle.com  Tue Mar 31 12:51:52 2020
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Tue, 31 Mar 2020 14:51:52 +0200
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com>
References: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>
 <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
 <a9590dc7-867b-2f13-6772-5bc06c4d1a86@oracle.com>
 <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com>
Message-ID: <51a5b160-1af8-69a3-1dff-deb04c8a2447@oracle.com>


On 2020-03-31 14:34, coleen.phillimore at oracle.com wrote:
>
> To answer my own question, this functionality is used to allow 
> detach/reattach from {cl}hsdb, which seems to work on linux but not 
> windows with this code removed.
>
> The next question is whether this is useful functionality to justify 
> all this code (900+ and this new code that Magnus has added). Can't 
> you just exit and restart the clhsdb process on the core file or process?

Personally, I'm happy for any solution. All I want to see is that SA 
stops polluting the build log with warnings that cannot be disabled. My 
approach was to minimize the amount of code changes that'd allow for 
this, but if you all can agree that this code is better off removed, 
then I'm completely OK with it. (And as a rule of thumb, dead and 
removed code is good code!)
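
For readers who have not looked at the mock-up, the following is a 
minimal sketch of what a locally owned Observer/Observable pair could 
look like. It is an assumption-laden illustration and not the code in 
the webrev: the class names are placed in the default package here, the 
demo class ObserverSketch is hypothetical, and the "changed" flag that 
java.util.Observable also tracks is deliberately omitted to keep the 
example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo driver; not part of jdk.hotspot.agent.
final class ObserverSketch {
    public static void main(String[] args) {
        Observable source = new Observable();
        source.addObserver((o, arg) -> System.out.println("notified with " + arg));
        source.notifyObservers("vm-initialized"); // prints: notified with vm-initialized
    }
}

// Local replacement for the deprecated java.util.Observer.
interface Observer {
    void update(Observable o, Object arg);
}

// Simplified local replacement for the deprecated java.util.Observable.
// Unlike the original, it has no "changed" flag: every notifyObservers()
// call reaches all registered observers.
class Observable {
    private final List<Observer> observers = new ArrayList<>();

    // Register an observer once; duplicates are ignored.
    public synchronized void addObserver(Observer o) {
        if (o == null) {
            throw new NullPointerException("observer must not be null");
        }
        if (!observers.contains(o)) {
            observers.add(o);
        }
    }

    public synchronized void deleteObserver(Observer o) {
        observers.remove(o);
    }

    // Notify a snapshot of the observers so callbacks may add or remove observers.
    public void notifyObservers(Object arg) {
        List<Observer> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(observers);
        }
        for (Observer o : snapshot) {
            o.update(this, arg);
        }
    }

    public synchronized int countObservers() {
        return observers.size();
    }
}
```

With classes like these in an SA-owned package, callers would only need 
to change their imports, which matches the kind of change described for 
the mock-up.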

/Magnus
>
> For the record, this is me playing with python to remove this code.
>
> http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html
>
> Thanks,
> Coleen
>
> On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote:
>>
>> I was wondering why this is needed when debugging a core file, which 
>> is the key thing we need the SA for:
>>
>>   /** This is used by both the debugger and any runtime system. It is
>>       the basic mechanism by which classes which mimic underlying VM
>>       functionality cause themselves to be initialized. The given
>>       observer will be notified (with arguments (null, null)) when the
>>       VM is re-initialized, as well as when it registers itself with
>>       the VM. */
>>   public static void registerVMInitializedObserver(Observer o) {
>>     vmInitializedObservers.add(o);
>>     o.update(null, null);
>>   }
>>
>> It seems like if it isn't needed, we shouldn't add these classes and 
>> should remove their use instead.
>>
>> Coleen
>>
>> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
>>> No opinions on this?
>>>
>>> /Magnus
>>>
>>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>>>> Hi everyone,
>>>>
>>>> As a follow-up to the ongoing review for JDK-8241618, I have also 
>>>> looked at fixing the deprecation warnings in jdk.hotspot.agent. 
>>>> These fall in three broad categories:
>>>>
>>>> * Deprecation of the boxing type constructors (e.g. "new 
>>>> Integer(42)").
>>>>
>>>> * Deprecation of java.util.Observer and Observable.
>>>>
>>>> * The rest (mostly Class.newInstance(), and a small number of other 
>>>> odd deprecations)
>>>>
>>>> The first category is trivial to fix. The last category needs some 
>>>> special discussion. But the overwhelming majority of deprecation 
>>>> warnings come from the use of Observer and Observable. This really 
>>>> dwarfs anything else, and needs to be handled first, otherwise it's 
>>>> hard to even spot the other issues.
>>>>
>>>> My analysis of the situation is that the deprecation of Observer 
>>>> and Observable seems a bit harsh, from the PoV of 
>>>> jdk.hotspot.agent. Sure, it might be limited, but I think it does 
>>>> exactly what is needed here. So the migration suggested in 
>>>> Observable (java.beans or java.util.concurrent) seems overkill. If 
>>>> there are genuine threading issues at play here, this assumption 
>>>> might be wrong, and then maybe going the j.u.c. route is correct.
>>>>
>>>> But if that's not the case, the main goal should be to stay with the 
>>>> current implementation. One way to do this is to sprinkle the code with 
>>>> @SuppressWarnings. But I think a better way would be to just 
>>>> implement our own Observer and Observable. After all, the classes 
>>>> are trivial.
>>>>
>>>> I've made a mock-up of this solution, where I just copied the 
>>>> java.util.Observer and Observable, and removed the deprecation 
>>>> annotations. The only thing needed for the rest of the code is to 
>>>> make sure we import these; I've done this for three arbitrarily 
>>>> selected classes just to show what the change would typically look 
>>>> like. Here's the mock-up:
>>>>
>>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>>>
>>>> Let me know what you think.
>>>>
>>>> /Magnus
>>>
>>
>
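
For illustration, the "own Observer and Observable" option from the proposal quoted above could look roughly like this. It is a minimal sketch, not the code from the webrev: the names LocalObserver, LocalObservable and VMEvents are hypothetical (the mock-up apparently keeps the original class names and only changes the import), and only the small part of the java.util.Observable API that the sketch needs is reproduced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical local replacement for java.util.Observer (no @Deprecated).
interface LocalObserver {
    void update(LocalObservable source, Object arg);
}

// Hypothetical local replacement for java.util.Observable, reduced to the
// parts this sketch needs.
class LocalObservable {
    private final List<LocalObserver> observers = new ArrayList<>();
    private boolean changed = false;

    public synchronized void addObserver(LocalObserver o) {
        if (o == null) {
            throw new NullPointerException("observer must not be null");
        }
        if (!observers.contains(o)) {
            observers.add(o);
        }
    }

    protected synchronized void setChanged() {
        changed = true;
    }

    public void notifyObservers(Object arg) {
        LocalObserver[] snapshot;
        synchronized (this) {
            if (!changed) {
                return;
            }
            snapshot = observers.toArray(new LocalObserver[0]);
            changed = false;
        }
        // Call observers outside the lock, as java.util.Observable does.
        for (LocalObserver o : snapshot) {
            o.update(this, arg);
        }
    }
}

// Hypothetical stand-in for the VM class in the quoted snippet.
class VMEvents extends LocalObservable {
    // Register, then notify immediately with (null, null), mirroring
    // registerVMInitializedObserver() quoted above.
    void registerVMInitializedObserver(LocalObserver o) {
        addObserver(o);
        o.update(null, null);
    }

    // Models a VM re-initialization event. Note: this passes the source
    // as first argument, unlike the (null, null) in the quoted comment.
    void reinitialized() {
        setChanged();
        notifyObservers(null);
    }
}
```

Since LocalObserver has a single method, existing anonymous Observer classes could stay as they are or become lambdas; callers would only need to switch the import.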


From martin.doerr at sap.com  Tue Mar 31 14:01:02 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 31 Mar 2020 14:01:02 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>

Hi Richard,

thanks for addressing all my points. I've looked over webrev.5 and I'm satisfied with your changes.


I had also promised to review the tests.

test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
Thanks for updating the @summary comment. Looks good in webrev.5.

test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
JVMTI agent for object tagging and heap iteration. Good.

test/jdk/com/sun/jdi/EATests.java
This is a substantial number of tests, which is appropriate for such a large change. Skipping some subtests with UseJVMCICompiler makes sense because it doesn't provide the necessary JVMTI functionality, yet.
Nice work!
I also like that you test with and without BiasedLocking. Your tests will still be fine after BiasedLocking deprecation.

Very minor nits:
- 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are ommitted" (sounds funny)
- You sometimes write "graal" and sometimes "Graal". I guess the capital G is better. (Also in EATestsJVMCI.java.)

test/jdk/com/sun/jdi/EATestsJVMCI.java
EATests with Graal enabled. Nice that you support Graal to some extent. Maybe Graal folks want to enhance them in the future. I think this is a good starting point.


Conclusion: Looks good and not trivial :-)
Now, you have one full review. I'd be ok with covering 2nd review by partial reviews.
Compiler and JVMTI parts are not too complicated IMHO.
Runtime part should get at least one additional careful review.

Best regards,
Martin


> -----Original Message-----
> From: Reingruber, Richard <richard.reingruber at sap.com>
> Sent: Monday, March 30, 2020 10:32
> To: Doerr, Martin <martin.doerr at sap.com>; 'Robbin Ehn'
> <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi,
> 
> this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)
> 
> The change affects jvmti, hotspot and c2. Partial reviews are very welcome
> too.
> 
> Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
> Delta:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/
> 
> Robbin, Martin, please let me know, if anything shouldn't be quite as you
> wanted it. Also find my
> comments on your feedback below.
> 
> Robbin, can I count you as Reviewer for the runtime part?
> 
> Thanks, Richard.
> 
> --
> 
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to
> clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Done.
> 
> > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in its
> own
> > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting
> jvmtiDeferredLocalVariableSet is
> declared.
> 
> > src/hotspot/share/code/compiledMethod.cpp
> > Nice cleanup!
> 
> Thanks :)
> 
> > src/hotspot/share/code/debugInfoRec.cpp
> > src/hotspot/share/code/debugInfoRec.hpp
> > Additional parameters. (Remark: I think "non_global_escape_in_scope"
> would read better than "not_global_escape_in_scope", but your version is
> consistent with existing code, so no change request from my side.) Ok.
> 
> I've been thinking about this too and finally stayed with
> not_global_escape_in_scope. It's supposed
> to mean an object whose escape state is not GlobalEscape is in scope.
> 
> > src/hotspot/share/compiler/compileBroker.cpp
> > src/hotspot/share/compiler/compileBroker.hpp
> > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into
> a follow up change together with the test in order to make this webrev
> smaller, but since it is included, I'm reviewing everything at once. Not a big
> deal.) Ok.
> 
> Yes the change would be a little smaller. And if it helps I'll split it off. In
> general I prefer
> patches that bring along a suitable amount of tests.
> 
> > src/hotspot/share/opto/c2compiler.cpp
> > Make do_escape_analysis independent of JVMCI capabilities. Nice!
> 
> It is the main goal of the enhancement. It is done for C2, but could be done
> for JVMCI compilers
> with just a small effort as well.
> 
> > src/hotspot/share/opto/escape.cpp
> > Annotation for MachSafePointNodes. Your added functionality looks
> correct.
> > But I'd prefer to move the bulky code out of the large function.
> > I suggest to factor out something like has_not_global_escape and
> has_arg_escape. So the code could look like this:
> >       SafePointNode* sfn = sfn_worklist.at(next);
> >       sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
> >       if (sfn->is_CallJava()) {
> >         CallJavaNode* call = sfn->as_CallJava();
> >         call->set_arg_escape(has_arg_escape(call));
> >       }
> > This would also allow us to get rid of the found_..._escape_in_args
> variables, making the loops more readable.
> 
> Done.
> 
> > It's kind of ugly to use strcmp to recognize uncommon trap, but that seems
> to be the way to do it (there are more such places). So it's ok.
> 
> Yeah. I copied the snippet.
> 
> > src/hotspot/share/prims/jvmtiImpl.cpp
> > src/hotspot/share/prims/jvmtiImpl.hpp
> > The sequence is pretty complex:
> > VM_GetOrSetLocal element initialization executes EscapeBarrier code
> which suspends the target thread (extra VM Operation).
> 
> Note that the target threads have to be suspended already for
> VM_GetOrSetLocal*. So it's mainly the
> synchronization effect of EscapeBarrier::sync_and_suspend_one() that is
> required here. Also no extra
> _handshake_ is executed, since sync_and_suspend_one() will find the
> target threads already
> suspended.
> 
> > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM
> Thread to prepare VM Operation with frame deoptimization).
> > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor
> which resumes the target thread.
> > But I don't have any improvement proposal. Performance is probably not a
> concern, here. So it's ok.
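
The construct / deoptimize / destruct sequence described above is a scoped-resource pattern. A rough Java analogy using try-with-resources is sketched below. ScopedBarrier and its methods are hypothetical names chosen for this illustration, the log strings only record the order of the steps, and nothing here reflects the actual C++ EscapeBarrier interface beyond what the review text says.

```java
// Hypothetical Java analogy for the C++ constructor/destructor pairing
// described in the review: suspend on entry, resume on exit.
final class ScopedBarrier implements AutoCloseable {
    private final StringBuilder log;

    ScopedBarrier(StringBuilder log) {
        this.log = log;
        // Models the synchronization done when the barrier is constructed.
        log.append("sync_and_suspend;");
    }

    // Models the object deoptimization done in the prologue.
    void deoptimizeObjects() {
        log.append("deoptimize_objects;");
    }

    // Models the destructor, which resumes the target thread.
    @Override
    public void close() {
        log.append("resume;");
    }
}

final class ScopedBarrierDemo {
    // Runs one "operation" inside the barrier scope and returns the step order.
    static String run() {
        StringBuilder log = new StringBuilder();
        try (ScopedBarrier barrier = new ScopedBarrier(log)) {
            barrier.deoptimizeObjects();
            log.append("get_or_set_local;");
        }
        return log.toString();
    }
}
```

The point of the pattern is that the resume step runs on every exit path from the scope, including early returns and exceptions.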
> 
> > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it
> has non-globally escaping objects and other frames if they have arg escaping
> ones. Good.
> 
> It's not specifically the top frame, but the frame that is accessed.
> 
> > src/hotspot/share/runtime/deoptimization.cpp
> > Object deoptimization. I have more comments and proposals, here.
> > First of all, handling recursive and waiting locks in relock_objects is tricky,
> but looks correct.
> > Comments are sufficient to understand why things are done as they are
> implemented.
> 
> > BiasedLocking related parts are complex, but we may get rid of them in the
> future (with BiasedLocking removal).
> > Anyway, looks correct, too.
> 
> > Typo in comment: "regularily" => "regularly"
> 
> > Deoptimization::fetch_unroll_info_helper is the only place where
> _jvmti_deferred_updates get deallocated (except JavaThread destructor).
> But I think we always go through it, so I can't see a memory leak or such kind
> of issues.
> 
> That's correct. The compiled frame for which deferred updates are allocated
> is always deoptimized
> before (see EscapeBarrier::deoptimize_objects()). This is also asserted in
> compiledVFrame::update_deferred_value(). I've added the same assertion
> to
> Deoptimization::relock_objects(). So we can be sure that
> _jvmti_deferred_updates are deallocated
> again in fetch_unroll_info_helper().
> 
> > EscapeBarrier::deoptimize_objects: ResourceMark should use
> calling_thread().
> 
> Sure, well spotted!
> 
> > You can use MutexLocker and MonitorLocker with Thread* to save the
> Thread::current() call.
> 
> Right, good hint. This was recently introduced with 8235678. I even had to
> resolve conflicts. Should
> have done this then.
> 
> > I'd make set_objs_are_deoptimized static and remove it from the
> EscapeBarrier interface because I think it shouldn't be used outside of
> EscapeBarrier::deoptimize_objects.
> 
> Done.
> 
> > Typo in comment: "we must only deoptimize" => "we only have to
> deoptimize"
> 
> Replaced with "[...] we deoptimize iff local objects are passed as args"
> 
> > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and
> barrier_active() is redundant. Implementation can get moved to hpp file.
> 
> Ok. Done.
> 
> > I'll get back to suspend flags, later.
> 
> > There are weird cases regarding _self_deoptimization_in_progress.
> > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C.
> C can set _self_deoptimization_in_progress while A performs the handshake
> for suspending C. I think this doesn't lead to errors, but it's probably not
> desired.
> > I think it would be better to use only one "wait" call in
> sync_and_suspend_one and sync_and_suspend_all.
> 
> You're right. We've discussed that face-to-face, but couldn't find a real issue.
> But now, thinking again, I reckon I found one:
> 
> 2808   // Sync with other threads that might be doing deoptimizations
> 2809   {
> 2810     // Need to switch to _thread_blocked for the wait() call
> 2811     ThreadBlockInVM tbivm(_calling_thread);
> 2812     MonitorLocker ml(EscapeBarrier_lock,
> Mutex::_no_safepoint_check_flag);
> 2813     while (_self_deoptimization_in_progress) {
> 2814       ml.wait();
> 2815     }
> 2816
> 2817     if (self_deopt()) {
> 2818       _self_deoptimization_in_progress = true;
> 2819     }
> 2820
> 2821     while (_deoptee_thread->is_ea_obj_deopt_suspend()) {
> 2822       ml.wait();
> 2823     }
> 2824
> 2825     if (self_deopt()) {
> 2826       return;
> 2827     }
> 2828
> 2829     // set suspend flag for target thread
> 2830     _deoptee_thread->set_ea_obj_deopt_flag();
> 2831   }
> 
> - A waits in 2822
> - C is suspended
> - B notifies all in resume_one()
> - A and C wake up
> - C wins over A and sets _self_deoptimization_in_progress = true in 2818
> - C does the self deoptimization
> - A executes 2830 _deoptee_thread->set_ea_obj_deopt_flag()
> 
> C will self suspend at some undefined point. The resulting state is illegal.
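
The race in the listing above comes from checking the two conditions in two separate wait loops, so a thread woken in the second loop never re-checks the first condition. A minimal sketch of the single-wait variant Martin suggested, written in Java with a plain monitor instead of HotSpot's MonitorLocker, is given below. DeoptSync, syncAndSuspendOne, resumeOne and isBusy are hypothetical names; the two booleans only model _self_deoptimization_in_progress and the suspend flag of a single deoptee thread; the code is not taken from the webrev.

```java
// Hypothetical model of the synchronization for ONE deoptee thread.
final class DeoptSync {
    private final Object lock = new Object();
    // Models _self_deoptimization_in_progress.
    private boolean selfDeoptInProgress = false;
    // Models the deoptee's object-deoptimization suspend flag.
    private boolean deopteeSuspended = false;

    // selfDeopt is true when the calling thread is the deoptee itself.
    void syncAndSuspendOne(boolean selfDeopt) {
        synchronized (lock) {
            // Single wait: both conditions are re-checked together after
            // every wakeup, so no other thread can claim the deoptee
            // between the check and the update below.
            while (selfDeoptInProgress || deopteeSuspended) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            if (selfDeopt) {
                selfDeoptInProgress = true;
            } else {
                deopteeSuspended = true;
            }
        }
    }

    void resumeOne(boolean selfDeopt) {
        synchronized (lock) {
            if (selfDeopt) {
                selfDeoptInProgress = false;
            } else {
                deopteeSuspended = false;
            }
            // Wake all waiters; each one re-evaluates the combined condition.
            lock.notifyAll();
        }
    }

    // True while some thread holds the deoptee, in either role.
    boolean isBusy() {
        synchronized (lock) {
            return selfDeoptInProgress || deopteeSuspended;
        }
    }
}
```

In the scenario from the list above, A would wake up after C has set the self-deoptimization flag, see the combined condition still true, and go back to waiting instead of setting the suspend flag.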
> 
> > I first thought it'd be better to move ThreadBlockInVM before wait() to
> reduce thread state transitions, but that seems to be problematic because
> ThreadBlockInVM destructor contains a safepoint check which we shouldn't
> do while holding EscapeBarrier_lock. So no change request.
> 
> Yes, would be nice to have the state change only if needed, but for the
> reason you mentioned it is
> not quite as easy as it seems to be. I experimented as well with a second
> lock, but did not succeed.
> 
> > Change in thread_added:
> > I think the sequence would be more comprehensible if we waited for
> deopt_all_threads in Thread::start and all other places where a new thread
> can run into Java code (e.g. JVMTI attach).
> > Your version makes new threads come up with suspend flag set. That looks
> correct, too. Advantage is that you only have to change one place
> (thread_added). It'll be interesting to see how it will look like when we use
> async handshakes instead of suspend flags.
> > For now, I'm ok with your version.
> 
> I had a version that did what you are suggesting. The current version also has
> the advantage, that
> there are fewer places where a thread has to wait for ongoing object
> deoptimization. This means
> fewer places where you have to worry about correct thread state
> transitions, possible deadlocks,
> and if all oops are properly Handle'ed.
> 
> > I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt-
> >is_hidden_from_external_view()).
> 
> Done.
> 
> > Having 4 different deoptimize_objects functions makes it a little hard to
> keep an overview of which one is used for what.
> > Maybe adding suffixes would help a little bit, but I can also live with what
> you have.
> > Implementation looks correct to me.
> 
> 2 are internal. I added the suffix _internal to them. This leaves 2 to choose
> from.
> 
> > src/hotspot/share/runtime/deoptimization.hpp
> > Escape barriers and object deoptimization functions.
> > Typo in comment: "helt" => "held"
> 
> Done in place already.
> 
> > src/hotspot/share/runtime/interfaceSupport.cpp
> > InterfaceSupport::deoptimizeAllObjects() is only used for
> DeoptimizeObjectsALot = 1.
> > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad
> to have DeoptimizeObjectsALot = 1 in addition. Ok.
> 
> I never used DeoptimizeObjectsALot = 1 that much. It could be more
> deterministic in single threaded
> scenarios. I wouldn't object to get rid of it though.
> 
> > src/hotspot/share/runtime/stackValue.hpp
> > Better reinitialization in StackValue. Good.
> 
> StackValue::obj_is_scalar_replaced() should not return true after calling
> set_obj().
> 
> > src/hotspot/share/runtime/thread.cpp
> > src/hotspot/share/runtime/thread.hpp
> > src/hotspot/share/runtime/thread.inline.hpp
> > wait_for_object_deoptimization, suspend flag, deferred updates and test
> feature to deoptimize objects.
> 
> > In the long term, we want to get rid of suspend flags, so it's not so nice to
> introduce a new one. But I agree with Götz that it should be acceptable as a
> temporary solution until async handshakes are available (which takes more
> time). So I'm ok with your change.
> 
> I'm keen to build the feature on async handshakes when they arrive.
> 
> > You can use MutexLocker with Thread*.
> 
> Done.
> 
> > JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class
> out of thread.hpp.
> 
> Done.
> 
> > src/hotspot/share/runtime/vframe.cpp
> > Added support for entry frame to new_vframe. Ok.
> 
> 
> > src/hotspot/share/runtime/vframe_hp.cpp
> > src/hotspot/share/runtime/vframe_hp.hpp
> 
> > I think code()->as_nmethod() in not_global_escape_in_scope() and
> arg_escape() should better be under #ifdef ASSERT or inside the assert
> statement (no need for code cache walking in product build).
> 
> Done.
> 
> > jvmtiDeferredLocalVariableSet::update_monitors:
> > Please add a comment explaining that owner referenced by original info
> may be scalar replaced, but it is deoptimized in the vframe.
> 
> Done.
> 
> -----Original Message-----
> From: Doerr, Martin <martin.doerr at sap.com>
> Sent: Thursday, March 12, 2020 17:28
> To: Reingruber, Richard <richard.reingruber at sap.com>; 'Robbin Ehn'
> <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com)
> <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> 
> I managed to find time for a (almost) complete review of webrev.4. (I'll
> review the tests separately.)
> 
> First of all, the change seems to be in pretty good quality for its significant
> complexity. I couldn't find any real bugs. But I'd like to propose minor
> improvements.
> I'm convinced that it's mature because we did substantial testing.
> 
> I like the new functionality for object deoptimization. It can possibly be
> reused for future escape analysis based optimizations. So I appreciate having
> it available in the code base.
> In addition to that, your change makes the JVMTI implementation better
> integrated into the VM.
> 
> 
> Now to the details:
> 
> 
> src/hotspot/share/c1/c1_IR.hpp
> describe_scope parameters. Ok.
> 
> 
> src/hotspot/share/ci/ciEnv.cpp
> src/hotspot/share/ci/ciEnv.hpp
> Fix for JvmtiExport::can_walk_any_space() capability. Ok.
> 
> 
> src/hotspot/share/code/compiledMethod.cpp
> Nice cleanup!
> 
> 
> src/hotspot/share/code/debugInfoRec.cpp
> src/hotspot/share/code/debugInfoRec.hpp
> Additional parameters. (Remark: I think "non_global_escape_in_scope"
> would read better than "not_global_escape_in_scope", but your version is
> consistent with existing code, so no change request from my side.) Ok.
> 
> 
> src/hotspot/share/code/nmethod.cpp
> Nice cleanup!
> 
> 
> src/hotspot/share/code/pcDesc.hpp
> Additional parameters. Ok.
> 
> 
> src/hotspot/share/code/scopeDesc.cpp
> src/hotspot/share/code/scopeDesc.hpp
> Improved implementation + additional parameters. Ok.
> 
> 
> src/hotspot/share/compiler/compileBroker.cpp
> src/hotspot/share/compiler/compileBroker.hpp
> Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a
> follow up change together with the test in order to make this webrev
> smaller, but since it is included, I'm reviewing everything at once. Not a big
> deal.) Ok.
> 
> 
> src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
> Additional parameters. Ok.
> 
> 
> src/hotspot/share/opto/c2compiler.cpp
> Make do_escape_analysis independent of JVMCI capabilities. Nice!
> 
> 
> src/hotspot/share/opto/callnode.hpp
> Additional fields for MachSafePointNodes. Ok.
> 
> 
> src/hotspot/share/opto/escape.cpp
> Annotation for MachSafePointNodes. Your added functionality looks correct.
> But I'd prefer to move the bulky code out of the large function.
> I suggest to factor out something like has_not_global_escape and
> has_arg_escape. So the code could look like this:
>       SafePointNode* sfn = sfn_worklist.at(next);
>       sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
>       if (sfn->is_CallJava()) {
>         CallJavaNode* call = sfn->as_CallJava();
>         call->set_arg_escape(has_arg_escape(call));
>       }
> This would also allow us to get rid of the found_..._escape_in_args variables,
> making the loops more readable.
> 
> It's kind of ugly to use strcmp to recognize uncommon trap, but that seems
> to be the way to do it (there are more such places). So it's ok.
> 
> 
> src/hotspot/share/opto/machnode.hpp
> Additional fields for MachSafePointNodes. Ok.
> 
> 
> src/hotspot/share/opto/macro.cpp
> Allow elimination of non-escaping allocations. Ok.
> 
> 
> src/hotspot/share/opto/matcher.cpp
> src/hotspot/share/opto/output.cpp
> Copy attribute / pass parameters. Ok.
> 
> 
> src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
> Nice cleanup!
> 
> 
> src/hotspot/share/prims/jvmtiEnv.cpp
> src/hotspot/share/prims/jvmtiEnvBase.cpp
> Escape barriers + deoptimize objects for target thread. Good.
> 
> 
> src/hotspot/share/prims/jvmtiImpl.cpp
> src/hotspot/share/prims/jvmtiImpl.hpp
> The sequence is pretty complex:
> VM_GetOrSetLocal element initialization executes EscapeBarrier code which
> suspends the target thread (extra VM Operation).
> VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM
> Thread to prepare VM Operation with frame deoptimization).
> VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which
> resumes the target thread.
> But I don't have any improvement proposal. Performance is probably not a
> concern, here. So it's ok.
> 
> VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has
> non-globally escaping objects and other frames if they have arg escaping
> ones. Good.
> 
> 
> src/hotspot/share/prims/jvmtiTagMap.cpp
> Escape barriers + deoptimize objects for all threads. Ok.
> 
> 
> src/hotspot/share/prims/whitebox.cpp
> Added WB_IsFrameDeoptimized to API. Ok.
> 
> 
> src/hotspot/share/runtime/deoptimization.cpp
> Object deoptimization. I have more comments and proposals, here.
> First of all, handling recursive and waiting locks in relock_objects is tricky, but
> looks correct.
> Comments are sufficient to understand why things are done as they are
> implemented.
> 
> BiasedLocking related parts are complex, but we may get rid of them in the
> future (with BiasedLocking removal).
> Anyway, looks correct, too.
> 
> Typo in comment: "regularily" => "regularly"
> 
> Deoptimization::fetch_unroll_info_helper is the only place where
> _jvmti_deferred_updates get deallocated (except JavaThread destructor).
> But I think we always go through it, so I can't see a memory leak or such kind
> of issues.
> 
> EscapeBarrier::deoptimize_objects: ResourceMark should use
> calling_thread().
> 
> You can use MutexLocker and MonitorLocker with Thread* to save the
> Thread::current() call.
> 
> I'd make set_objs_are_deoptimized static and remove it from the
> EscapeBarrier interface because I think it shouldn't be used outside of
> EscapeBarrier::deoptimize_objects.
> 
> Typo in comment: "we must only deoptimize" => "we only have to
> deoptimize"
> 
> "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and
> barrier_active() is redundant. Implementation can get moved to hpp file.
> 
> I'll get back to suspend flags, later.
> 
> There are weird cases regarding _self_deoptimization_in_progress.
> Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C
> can set _self_deoptimization_in_progress while A performs the handshake
> for suspending C. I think this doesn't lead to errors, but it's probably not
> desired.
> I think it would be better to use only one "wait" call in
> sync_and_suspend_one and sync_and_suspend_all.
> 
> I first thought it'd be better to move ThreadBlockInVM before wait() to
> reduce thread state transitions, but that seems to be problematic because
> ThreadBlockInVM destructor contains a safepoint check which we shouldn't
> do while holding EscapeBarrier_lock. So no change request.
> 
> Change in thread_added:
> I think the sequence would be more comprehensible if we waited for
> deopt_all_threads in Thread::start and all other places where a new thread
> can run into Java code (e.g. JVMTI attach).
> Your version makes new threads come up with suspend flag set. That looks
> correct, too. Advantage is that you only have to change one place
> (thread_added). It'll be interesting to see how it will look like when we use
> async handshakes instead of suspend flags.
> For now, I'm ok with your version.
> 
> I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt-
> >is_hidden_from_external_view()).
> 
> Having 4 different deoptimize_objects functions makes it a little hard to keep
> an overview of which one is used for what.
> Maybe adding suffixes would help a little bit, but I can also live with what you
> have.
> Implementation looks correct to me.
> 
> 
> src/hotspot/share/runtime/deoptimization.hpp
> Escape barriers and object deoptimization functions.
> Typo in comment: "helt" => "held"
> 
> 
> src/hotspot/share/runtime/globals.hpp
> Addition of develop flag DeoptimizeObjectsALotInterval. Ok.
> 
> 
> src/hotspot/share/runtime/interfaceSupport.cpp
> InterfaceSupport::deoptimizeAllObjects() is only used for
> DeoptimizeObjectsALot = 1.
> I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad
> to have DeoptimizeObjectsALot = 1 in addition. Ok.
> 
> 
> src/hotspot/share/runtime/interfaceSupport.inline.hpp
> Addition of deoptimizeAllObjects. Ok.
> 
> 
> src/hotspot/share/runtime/mutexLocker.cpp
> src/hotspot/share/runtime/mutexLocker.hpp
> Addition of EscapeBarrier_lock. Ok.
> 
> 
> src/hotspot/share/runtime/objectMonitor.cpp
> Make recursion count relock aware. Ok.
> 
> 
> src/hotspot/share/runtime/stackValue.hpp
> Better reinitialization in StackValue. Good.
> 
> 
> src/hotspot/share/runtime/thread.cpp
> src/hotspot/share/runtime/thread.hpp
> src/hotspot/share/runtime/thread.inline.hpp
> wait_for_object_deoptimization, suspend flag, deferred updates and test
> feature to deoptimize objects.
> 
> In the long term, we want to get rid of suspend flags, so it's not so nice to
> introduce a new one. But I agree with Götz that it should be acceptable as a
> temporary solution until async handshakes are available (which takes more
> time). So I'm ok with your change.
> 
> You can use MutexLocker with Thread*.
> 
> JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class out
> of thread.hpp.
> 
> 
> src/hotspot/share/runtime/vframe.cpp
> Added support for entry frame to new_vframe. Ok.
> 
> 
> src/hotspot/share/runtime/vframe_hp.cpp
> src/hotspot/share/runtime/vframe_hp.hpp
> 
> I think code()->as_nmethod() in not_global_escape_in_scope() and
> arg_escape() should better be under #ifdef ASSERT or inside the assert
> statement (no need for code cache walking in product build).
> 
> jvmtiDeferredLocalVariableSet::update_monitors:
> Please add a comment explaining that owner referenced by original info may
> be scalar replaced, but it is deoptimized in the vframe.
> 
> 
> src/hotspot/share/utilities/macros.hpp
> Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.
> 
> 
> test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysi
> sEnabled.java
> test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnal
> ysisEnabled.c
> New test. Will review separately.
> 
> 
> test/jdk/TEST.ROOT
> Addition of vm.jvmci as required property. Ok.
> 
> 
> test/jdk/com/sun/jdi/EATests.java
> test/jdk/com/sun/jdi/EATestsJVMCI.java
> New test. Will review separately.
> 
> 
> test/lib/sun/hotspot/WhiteBox.java
> Added isFrameDeoptimized to API. Ok.
> 
> 
> That was it. Best regards,
> Martin
> 
> 
> > -----Original Message-----
> > From: hotspot-compiler-dev <hotspot-compiler-dev-
> > bounces at openjdk.java.net> On Behalf Of Reingruber, Richard
> > Sent: Tuesday, March 3, 2020 21:23
> > To: 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> > <goetz.lindenmaier at sap.com>; David Holmes
> <david.holmes at oracle.com>;
> > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> > dev at openjdk.java.net
> > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better
> > Performance in the Presence of JVMTI Agents
> >
> > Hi Robbin,
> >
> > > > I understand that Robbin proposed to replace the usage of
> > > > _suspend_flag with handshakes. Apparently, async handshakes
> > > > are needed to do so. We have been waiting a while for removal
> > > > of the _suspend_flag / introduction of async handshakes [2].
> > > > What is the status here?
> >
> > > I have an old prototype which I would like to continue to work on.
> > > So do not assume asynch handshakes will make 15.
> > > Even if it would, I think there is a lot more investigative work to remove
> > > _suspend_flag.
> >
> > Let us know, if we can be of any help to you and be it only testing.
> >
> > > >> Full:
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> >
> > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > > You can move both declaration and definition to that file, no need to
> > clobber
> > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> >
> > Will do.
> >
> > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in
> its
> > own
> > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> >
> > You are right. It shouldn't be declared in thread.hpp. I will look into that.
> >
> > > Note that we also think we may have a bug in deopt:
> > > https://bugs.openjdk.java.net/browse/JDK-8238237
> >
> > > I think it would be best, if possible, to push after that is resolved.
> >
> > Sure.
> >
> > > Not even nearly a full review :)
> >
> > I know :)
> >
> > Anyways, thanks a lot,
> > Richard.
> >
> >
> > -----Original Message-----
> > From: Robbin Ehn <robbin.ehn at oracle.com>
> > Sent: Monday, March 2, 2020 11:17 AM
> > To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber,
> Richard
> > <richard.reingruber at sap.com>; David Holmes
> <david.holmes at oracle.com>;
> > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> > dev at openjdk.java.net
> > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> > in the Presence of JVMTI Agents
> >
> > Hi,
> >
> > On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> > > Hi,
> > >
> > > I had a look at the progress of this change. Nothing
> > > happened since Richard posted his update using more
> > > handshakes [1].
> > > But we (SAP) would appreciate a lot if this change could
> > > be successfully reviewed and pushed.
> > >
> > > I think there is basic understanding that this
> > > change is helpful. It fixes a number of issues with JVMTI,
> > > and will deliver the same performance benefits as EA
> > > does in current production mode for debugging scenarios.
> > >
> > > This is important for us as we run our VMs prepared
> > > for debugging in production mode.
> > >
> > > I understand that Robbin proposed to replace the usage of
> > > _suspend_flag with handshakes. Apparently, async handshakes
> > > are needed to do so. We have been waiting a while for removal
> > > of the _suspend_flag / introduction of async handshakes [2].
> > > What is the status here?
> >
> > I have an old prototype which I would like to continue to work on.
> > So do not assume async handshakes will make JDK 15.
> > Even if they did, I think there is a lot more investigative work needed to remove
> > _suspend_flag.
> >
> > >
> > > I think we should no longer wait, but proceed with
> > > this change. We will look into removing the usage of
> > > suspend_flag introduced here once it is possible to implement
> > > it with handshakes.
> >
> > Yes, sure.
> >
> > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> >
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> >
> > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in
> > its own hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> >
> > Note that we also think we may have a bug in deopt:
> > https://bugs.openjdk.java.net/browse/JDK-8238237
> >
> > I think it would be best, if possible, to push after that is resolved.
> >
> > Not even nearly a full review :)
> >
> > Thanks, Robbin
> >
> >
> > >> Incremental: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
> > >>
> > >> I was not able to eliminate the additional suspend flag now. I'll take care of this
> > >> as soon as the existing suspend-resume-mechanism is reworked.
> > >>
> > >> Testing:
> > >>
> > >> Nightly tests @SAP:
> > >>
> > >>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
> > >>    Suite, SAP specific tests
> > >>    with fastdebug and release builds on all platforms
> > >>
> > >>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel
> > >>    for 24h
> > >>
> > >> Thanks, Richard.
> > >>
> > >>
> > >> More details on the changes:
> > >>
> > >> * Hide DeoptimizeObjectsALotThread from external view.
> > >>
> > >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
> > >>    It used to be _safepoint_check_sometimes, which will be eliminated sooner or later.
> > >>    I added explicit thread state changes with ThreadBlockInVM to code paths where we
> > >>    can wait() on EscapeBarrier_lock to become safepoint safe.
> > >>
> > >> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of
> > >>    vm operation VM_ThreadSuspendAllForObjDeopt.
> > >>
> > >> * Removed uses of Threads_lock. When adding a new thread we suspend it iff EA
> > >>    optimizations are being reverted. In the previous version we were waiting on
> > >>    Threads_lock while EA optimizations were reverted. See EscapeBarrier::thread_added().
> > >>
> > >> * Made tests require Xmixed compilation mode.
> > >>
> > >> * Made tests agnostic regarding tiered compilation.
> > >>    I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or disabled.
> > >>
> > >> * Exercising EATests.java as well with stress test options DeoptimizeObjectsALot*
> > >>    Due to the non-deterministic deoptimizations some tests need to be skipped.
> > >>    We do this to prevent bit-rot of the stress test code.
> > >>
> > >> * Executing EATests.java as well with graal if available. Driver for this is
> > >>    EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all
> > >>    the new debug info (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
> > >>    And graal does not yet support the JVMTI operations force early return and pop frame.
> > >>
> > >> * Removed tracing from new jdi tests in EATests.java. Too much trace output before the
> > >>    debugging connection is established can cause deadlock because output buffers fill up.
> > >>    (See https://bugs.openjdk.java.net/browse/JDK-8173304)
> > >>
> > >> * Many copyright year changes and smaller clean-up changes of testing code
> > >>    (trailing white-space and the like).
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: David Holmes <david.holmes at oracle.com>
> > >> Sent: Donnerstag, 19. Dezember 2019 03:12
> > >> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-
> > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
> > hotspot-
> > >> runtime-dev at openjdk.java.net; Vladimir Kozlov
> > (vladimir.kozlov at oracle.com)
> > >> <vladimir.kozlov at oracle.com>
> > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > Performance in
> > >> the Presence of JVMTI Agents
> > >>
> > >> Hi Richard,
> > >>
> > >> I think my issue is with the way EliminateNestedLocks works so I'm going
> > >> to look into that more deeply.
> > >>
> > >> Thanks for the explanations.
> > >>
> > >> David
> > >>
> > >> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> > >>> Hi David,
> > >>>
> > >>>     > >    > Some further queries/concerns:
> > >>>     > >    >
> > >>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
> > >>>     > >    >
> > >>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
> > >>>     > >    >
> > >>>     > >    > !   _recursions = save      // restore the old recursion count
> > >>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
> > >>>     > >    > increased by the deferred relock count
> > >>>     > >    >
> > >>>     > >    > what is the "deferred relock count"? I gather it relates to
> > >>>     > >    >
> > >>>     > >    > "The code was extended to be able to deoptimize objects of a frame that
> > >>>     > >    > is not the top frame and to let another thread than the owning thread do
> > >>>     > >    > it."
> > >>>     > >
> > >>>     > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
> > >>>     > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
> > >>>     > > locking. New with the enhancement is that we do this also just before object references are
> > >>>     > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we
> > >>>     > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized,
> > >>>     > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again
> > >>>     > > (relocking twice would be incorrect of course). Deferred updates are copied into the new
> > >>>     > > interpreter frames.
> > >>>     > >
> > >>>     > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to
> > >>>     > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking
> > >>>     > > is deferred until T owns the monitor again. This is what the piece of code above does.
> > >>>     >
> > >>>     >  Sorry I need some more detail here. How can you wait() on an object
> > >>>     >  monitor if the object allocation and/or locking was optimised away? And
> > >>>     >  what is a "non-local object" in this context? Isn't EA restricted to
> > >>>     >  thread-confined objects?
> > >>>
> > >>> "Non-local object" is an object that escapes its thread. The issue I'm addressing with the changes
> > >>> in ObjectMonitor::wait is almost unrelated to EA. It is caused by EliminateNestedLocks, where C2
> > >>> eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it
> > >>> is locked and you can call wait() on it.
> > >>>
> > >>> EliminateLocks is the C2 option that controls lock elimination based on EA. Both optimizations have
> > >>> in common that objects with eliminated locking need to be relocked when deoptimizing a frame,
> > >>> i.e. when replacing a compiled frame with equivalent interpreter
> > >>> frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can
> > >>> be a mix of eliminated nested locks and locks of not-escaping objects.
> > >>>
> > >>> New with the enhancement: I call relock_objects earlier, just before objects potentially
> > >>> escape. But then later when the owning compiled frame gets deoptimized, I must not do it again:
> > >>>
> > >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> > >>>
> > >>>    if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> > >>>        && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > >>>      bool unused;
> > >>>      eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> > >>>    }
> > >>>
> > >>> Now when calling relock_objects early it is quite possible that I have to relock an object the
> > >>> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
> > >>> introduce relock_count_after_wait to JavaThread.
> > >>>
> > >>>     >  Is it just that some of the locking gets optimized away e.g.
> > >>>     >
> > >>>     >  synchronized(obj) {
> > >>>     >     synchronized(obj) {
> > >>>     >       synchronized(obj) {
> > >>>     >         obj.wait();
> > >>>     >       }
> > >>>     >     }
> > >>>     >  }
> > >>>     >
> > >>>     >  If this is reduced to a form as-if it were a single lock of the monitor
> > >>>     >  (due to EA) and the wait() triggers a JVM TI event which leads to the
> > >>>     >  escape of "obj" then we need to reconstruct the true lock state, and so
> > >>>     >  when the wait() internally unblocks and reacquires the monitor it has to
> > >>>     >  set the true recursion count to 3, not the 1 that it appeared to be when
> > >>>     >  wait() was initially called. Is that the scenario?
> > >>>
> > >>> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
> > >>> triggered by wait.
> > >>>
> > >>> Add
> > >>>
> > >>> LocalObject l1 = new LocalObject();
> > >>>
> > >>> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
> > >>> question.
> > >>>
> > >>> See that relocking/reallocating is transactional. If it is done then for /all/ objects in scope and it is
> > >>> done at most once. It wouldn't be quite so easy to split this in relocking of nested/EA-based
> > >>> eliminated locks.
> > >>>
> > >>>     >  If so I find this truly awful. Anyone using wait() in a realistic form
> > >>>     >  requires a notification and so the object cannot be thread confined. In
> > >>>
> > >>> It is not thread confined.
> > >>>
> > >>>     >  which case I would strongly argue that upon hitting the wait() the deopt
> > >>>     >  should occur unconditionally and so the lock state is correct before we
> > >>>     >  wait and so we don't need to mess with the recursion count internally
> > >>>     >  when we reacquire the monitor.
> > >>>     >
> > >>>     > >
> > >>>     > >    > which I don't like the sound of at all when it comes to ObjectMonitor
> > >>>     > >    > state. So I'd like to understand in detail exactly what is going on here
> > >>>     > >    > and why.  This is a very intrusive change that seems to badly break
> > >>>     > >    > encapsulation and impacts future changes to ObjectMonitor that are under
> > >>>     > >    > investigation.
> > >>>     > >
> > >>>     > > I would not regard this as breaking encapsulation. Certainly not badly.
> > >>>     > >
> > >>>     > > I've added a property relock_count_after_wait to JavaThread. The property is well
> > >>>     > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free
> > >>>     > > in choosing a way to do that as long as that property is taken into account. This is hardly a
> > >>>     > > limitation.
> > >>>     >
> > >>>     >  I do think this badly breaks encapsulation as you have to add a callout
> > >>>     >  from the guts of the ObjectMonitor code to reach into the thread to get
> > >>>     >  this lock count adjustment. I understand why you have had to do this but
> > >>>     >  I would much rather see a change to the EA optimisation strategy so that
> > >>>     >  this is not needed.
> > >>>     >
> > >>>     > > Note also that the property is a straight forward extension of the existing concept of deferred
> > >>>     > > local updates. It is embedded into the structure holding them. So not even the footprint of a
> > >>>     > > JavaThread is enlarged if no deferred updates are generated.
> > >>>     >
> > >>>     > [...]
> > >>>     >
> > >>>     > >
> > >>>     > > I'm actually duplicating the existing external suspend mechanism, because a thread can be
> > >>>     > > suspended at most once. And hey, I don't like that either! But it seems not unlikely that the
> > >>>     > > duplicate can be removed together with the original and the new type of handshakes that will be
> > >>>     > > used for thread suspend can be used for object deoptimization too. See today's discussion in
> > >>>     > > JDK-8227745 [2].
> > >>>     >
> > >>>     >  I hope that discussion bears some fruit, at the moment it seems not to
> > >>>     >  be possible to use handshakes here. :(
> > >>>     >
> > >>>     >  The external suspend mechanism is a royal pain in the proverbial that we
> > >>>     >  have to carefully live with. The idea that we're duplicating that for
> > >>>     >  use in another fringe area of functionality does not thrill me at all.
> > >>>     >
> > >>>     >  To be clear, I understand the problem that exists and that you wish to
> > >>>     >  solve, but for the runtime parts I balk at the complexity cost of
> > >>>     >  solving it.
> > >>>
> > >>> I know it's complex, but by far no rocket science.
> > >>>
> > >>> Also I find it hard to imagine another fix for JDK-8233915 besides changing
> > >>> the JVM TI specification.
> > >>>
> > >>> Thanks, Richard.
> > >>>
> > >>> -----Original Message-----
> > >>> From: David Holmes <david.holmes at oracle.com>
> > >>> Sent: Dienstag, 17. Dezember 2019 08:03
> > >>> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-
> > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
> > hotspot-
> > >> runtime-dev at openjdk.java.net; Vladimir Kozlov
> > (vladimir.kozlov at oracle.com)
> > >> <vladimir.kozlov at oracle.com>
> > >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > Performance
> > >> in the Presence of JVMTI Agents
> > >>>
> > >>> <resend as my mailer crashed during last send>
> > >>>
> > >>> David
> > >>>
> > >>> On 17/12/2019 4:57 pm, David Holmes wrote:
> > >>>> Hi Richard,
> > >>>>
> > >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
> > >>>>> Hi David,
> > >>>>>
> > >>>>>      > Some further queries/concerns:
> > >>>>>      >
> > >>>>>      > src/hotspot/share/runtime/objectMonitor.cpp
> > >>>>>      >
> > >>>>>      > Can you please explain the changes to ObjectMonitor::wait:
> > >>>>>      >
> > >>>>>      > !   _recursions = save      // restore the old recursion count
> > >>>>>      > !                 + jt->get_and_reset_relock_count_after_wait(); //
> > >>>>>      > increased by the deferred relock count
> > >>>>>      >
> > >>>>>      > what is the "deferred relock count"? I gather it relates to
> > >>>>>      >
> > >>>>>      > "The code was extended to be able to deoptimize objects of a frame that
> > >>>>>      > is not the top frame and to let another thread than the owning thread do
> > >>>>>      > it."
> > >>>>>
> > >>>>> Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is replaced
> > >>>>> with corresponding interpreter frames. Part of this is relocking objects with eliminated
> > >>>>> locking. New with the enhancement is that we do this also just before object references are acquired
> > >>>>> through JVMTI. In this case we deoptimize also the owning compiled frame C and we register
> > >>>>> deoptimized objects as deferred updates. When control returns to C it gets deoptimized, we notice
> > >>>>> that objects are already deoptimized (reallocated and relocked), so we don't do it again (relocking
> > >>>>> twice would be incorrect of course). Deferred updates are copied into the new interpreter frames.
> > >>>>>
> > >>>>> Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to be
> > >>>>> relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking is
> > >>>>> deferred until T owns the monitor again. This is what the piece of code above does.
> > >>>>
> > >>>> Sorry I need some more detail here. How can you wait() on an object
> > >>>> monitor if the object allocation and/or locking was optimised away? And
> > >>>> what is a "non-local object" in this context? Isn't EA restricted to
> > >>>> thread-confined objects?
> > >>>>
> > >>>> Is it just that some of the locking gets optimized away e.g.
> > >>>>
> > >>>> synchronized(obj) {
> > >>>>     synchronized(obj) {
> > >>>>       synchronized(obj) {
> > >>>>         obj.wait();
> > >>>>       }
> > >>>>     }
> > >>>> }
> > >>>>
> > >>>> If this is reduced to a form as-if it were a single lock of the monitor
> > >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
> > >>>> escape of "obj" then we need to reconstruct the true lock state, and so
> > >>>> when the wait() internally unblocks and reacquires the monitor it has to
> > >>>> set the true recursion count to 3, not the 1 that it appeared to be when
> > >>>> wait() was initially called. Is that the scenario?
> > >>>>
> > >>>> If so I find this truly awful. Anyone using wait() in a realistic form
> > >>>> requires a notification and so the object cannot be thread confined. In
> > >>>> which case I would strongly argue that upon hitting the wait() the deopt
> > >>>> should occur unconditionally and so the lock state is correct before we
> > >>>> wait and so we don't need to mess with the recursion count internally
> > >>>> when we reacquire the monitor.
> > >>>>
> > >>>>>
> > >>>>>      > which I don't like the sound of at all when it comes to ObjectMonitor
> > >>>>>      > state. So I'd like to understand in detail exactly what is going on here
> > >>>>>      > and why.  This is a very intrusive change that seems to badly break
> > >>>>>      > encapsulation and impacts future changes to ObjectMonitor that are under
> > >>>>>      > investigation.
> > >>>>>
> > >>>>> I would not regard this as breaking encapsulation. Certainly not badly.
> > >>>>>
> > >>>>> I've added a property relock_count_after_wait to JavaThread. The property is well
> > >>>>> encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free in
> > >>>>> choosing a way to do that as long as that property is taken into account. This is hardly a
> > >>>>> limitation.
> > >>>>
> > >>>> I do think this badly breaks encapsulation as you have to add a callout
> > >>>> from the guts of the ObjectMonitor code to reach into the thread to get
> > >>>> this lock count adjustment. I understand why you have had to do this but
> > >>>> I would much rather see a change to the EA optimisation strategy so that
> > >>>> this is not needed.
> > >>>>
> > >>>>> Note also that the property is a straight forward extension of the existing concept of deferred
> > >>>>> local updates. It is embedded into the structure holding them. So not even the footprint of a
> > >>>>> JavaThread is enlarged if no deferred updates are generated.
> > >>>>>
> > >>>>>      > ---
> > >>>>>      >
> > >>>>>      > src/hotspot/share/runtime/thread.cpp
> > >>>>>      >
> > >>>>>      > Can you please explain why JavaThread::wait_for_object_deoptimization
> > >>>>>      > has to be handcrafted in this way rather than using proper transitions.
> > >>>>>      >
> > >>>>>
> > >>>>> I wrote wait_for_object_deoptimization taking
> > >>>>> JavaThread::java_suspend_self_with_safepoint_check
> > >>>>> as template. So in short: for the same reasons :)
> > >>>>>
> > >>>>> Threads reach both methods as part of thread state transitions,
> > >>>>> therefore special handling is
> > >>>>> required to change thread state on top of ongoing transitions.
> > >>>>>
> > >>>>>      > We got rid of "deopt suspend" some time ago and it is disturbing to see
> > >>>>>      > it being added back (effectively). This seems like it may be something
> > >>>>>      > that handshakes could be used for.
> > >>>>>
> > >>>>> Deopt suspend used to be something rather different with a similar
> > >>>>> name[1]. It is not being added back.
> > >>>>
> > >>>> I stand corrected. Despite comments in the code to the contrary
> > >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
> > >>>> cleanup in this area 13 years ago :)
> > >>>>
> > >>>>>
> > >>>>> I'm actually duplicating the existing external suspend mechanism, because a thread can be suspended
> > >>>>> at most once. And hey, I don't like that either! But it seems not unlikely that the duplicate can
> > >>>>> be removed together with the original and the new type of handshakes that will be used for
> > >>>>> thread suspend can be used for object deoptimization too. See today's
> > >>>>> discussion in JDK-8227745 [2].
> > >>>>
> > >>>> I hope that discussion bears some fruit, at the moment it seems not to
> > >>>> be possible to use handshakes here. :(
> > >>>>
> > >>>> The external suspend mechanism is a royal pain in the proverbial that we
> > >>>> have to carefully live with. The idea that we're duplicating that for
> > >>>> use in another fringe area of functionality does not thrill me at all.
> > >>>>
> > >>>> To be clear, I understand the problem that exists and that you wish to
> > >>>> solve, but for the runtime parts I balk at the complexity cost of
> > >>>> solving it.
> > >>>>
> > >>>> Thanks,
> > >>>> David
> > >>>> -----
> > >>>>
> > >>>>> Thanks, Richard.
> > >>>>>
> > >>>>> [1] Deopt suspend was something like an async. handshake for architectures with register windows,
> > >>>>>     where patching the return pc for deoptimization of a compiled frame was racy if the owner thread
> > >>>>>     was in native code. Instead a "deopt" suspend flag was set on which the thread patched its own
> > >>>>>     frame upon return from native. So no thread was suspended. It got its name only from the name of
> > >>>>>     the flags.
> > >>>>>
> > >>>>> [2] Discussion about using handshakes to sync. with the target thread:
> > >>>>>
> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
> > >>>>>
> > >>>>>
> > >>>>> -----Original Message-----
> > >>>>> From: David Holmes <david.holmes at oracle.com>
> > >>>>> Sent: Freitag, 13. Dezember 2019 00:56
> > >>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> > >>>>> serviceability-dev at openjdk.java.net;
> > >>>>> hotspot-compiler-dev at openjdk.java.net;
> > >>>>> hotspot-runtime-dev at openjdk.java.net
> > >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > >>>>> Performance in the Presence of JVMTI Agents
> > >>>>>
> > >>>>> Hi Richard,
> > >>>>>
> > >>>>> Some further queries/concerns:
> > >>>>>
> > >>>>> src/hotspot/share/runtime/objectMonitor.cpp
> > >>>>>
> > >>>>> Can you please explain the changes to ObjectMonitor::wait:
> > >>>>>
> > >>>>> !   _recursions = save      // restore the old recursion count
> > >>>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
> > >>>>> increased by the deferred relock count
> > >>>>>
> > >>>>> what is the "deferred relock count"? I gather it relates to
> > >>>>>
> > >>>>> "The code was extended to be able to deoptimize objects of a frame that
> > >>>>> is not the top frame and to let another thread than the owning thread do
> > >>>>> it."
> > >>>>>
> > >>>>> which I don't like the sound of at all when it comes to ObjectMonitor
> > >>>>> state. So I'd like to understand in detail exactly what is going on here
> > >>>>> and why.  This is a very intrusive change that seems to badly break
> > >>>>> encapsulation and impacts future changes to ObjectMonitor that are under
> > >>>>> investigation.
> > >>>>>
> > >>>>> ---
> > >>>>>
> > >>>>> src/hotspot/share/runtime/thread.cpp
> > >>>>>
> > >>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
> > >>>>> has to be handcrafted in this way rather than using proper transitions.
> > >>>>>
> > >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
> > >>>>> it being added back (effectively). This seems like it may be something
> > >>>>> that handshakes could be used for.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> David
> > >>>>> -----
> > >>>>>
> > >>>>> On 12/12/2019 7:02 am, David Holmes wrote:
> > >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> > >>>>>>> Hi David,
> > >>>>>>>
> > >>>>>>>       > Most of the details here are in areas I can comment on in detail, but I
> > >>>>>>>       > did take an initial general look at things.
> > >>>>>>>
> > >>>>>>> Thanks for taking the time!
> > >>>>>>
> > >>>>>> Apologies the above should read:
> > >>>>>>
> > >>>>>> "Most of the details here are in areas I *can't* comment on in detail
> > >>>>>> ..."
> > >>>>>>
> > >>>>>> David
> > >>>>>>
> > >>>>>>>       > The only thing that jumped out at me is that I think the
> > >>>>>>>       > DeoptimizeObjectsALotThread should be a hidden thread.
> > >>>>>>>       >
> > >>>>>>>       > +  bool is_hidden_from_external_view() const { return true; }
> > >>>>>>>
> > >>>>>>> Yes, it should. Will add the method like above.
> > >>>>>>>
> > >>>>>>>       > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
> > >>>>>>>       > active testing this will just bit-rot.
> > >>>>>>>
> > >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
> > >>>>>>> workload. I will add a minimal test
> > >>>>>>> to keep it fresh.
> > >>>>>>>
> > >>>>>>>       > Also on the tests I don't understand your @requires clause:
> > >>>>>>>       >
> > >>>>>>>       >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> > >>>>>>>       > (vm.opt.TieredCompilation != true))
> > >>>>>>>       >
> > >>>>>>>       > This seems to require that TieredCompilation is disabled, but tiered is
> > >>>>>>>       > our normal mode of operation. ??
> > >>>>>>>       >
> > >>>>>>>
> > >>>>>>> I removed the clause. I guess I wanted to target the tests towards the code they are supposed to
> > >>>>>>> test, and it's easier to analyze failures w/o tiered compilation and
> > >>>>>>> with just one compiler thread.
> > >>>>>>>
> > >>>>>>> Additionally I will make use of
> > >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Richard.
> > >>>>>>>
> > >>>>>>> -----Original Message-----
> > >>>>>>> From: David Holmes <david.holmes at oracle.com>
> > >>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
> > >>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> > >>>>>>> serviceability-dev at openjdk.java.net;
> > >>>>>>> hotspot-compiler-dev at openjdk.java.net;
> > >>>>>>> hotspot-runtime-dev at openjdk.java.net
> > >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > >>>>>>> Performance in the Presence of JVMTI Agents
> > >>>>>>>
> > >>>>>>> Hi Richard,
> > >>>>>>>
> > >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> I would like to get reviews please for
> > >>>>>>>>
> > >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> > >>>>>>>>
> > >>>>>>>> Corresponding RFE:
> > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
> > >>>>>>>>
> > >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> > >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> > >>>>>>>>
> > >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
> > >>>>>>>> issues (thanks!). In addition the
> > >>>>>>>> change is being tested at SAP since I posted the first RFR some
> > >>>>>>>> months ago.
> > >>>>>>>>
> > >>>>>>>> The intention of this enhancement is to benefit performance wise from
> > >>>>>>>> escape analysis even if JVMTI
> > >>>>>>>> agents request capabilities that allow them to access local variable
> > >>>>>>>> values. E.g. if you start-up
> > >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
> > >>>>>>>> escape analysis is disabled right
> > >>>>>>>> from the beginning, well before a debugger attaches -- if ever one
> > >>>>>>>> should do so. With the
> > >>>>>>>> enhancement, escape analysis will remain enabled until and after a
> > >>>>>>>> debugger attaches. EA based
> > >>>>>>>> optimizations are reverted just before an agent acquires the
> > >>>>>>>> reference to an object. In the JBS item
> > >>>>>>>> you'll find more details.
> > >>>>>>>
> > >>>>>>> Most of the details here are in areas I can comment on in detail, but I
> > >>>>>>> did take an initial general look at things.
> > >>>>>>>
> > >>>>>>> The only thing that jumped out at me is that I think the
> > >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> > >>>>>>>
> > >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> > >>>>>>>
> > >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
> > >>>>>>> Without
> > >>>>>>> active testing this will just bit-rot.
> > >>>>>>>
> > >>>>>>> Also on the tests I don't understand your @requires clause:
> > >>>>>>>
> > >>>>>>>       @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> > >>>>>>> (vm.opt.TieredCompilation != true))
> > >>>>>>>
> > >>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
> > >>>>>>> our normal mode of operation. ??
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> David
> > >>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Richard.
> > >>>>>>>>
> > >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> > >>>>>>>>
> > >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>

From robbin.ehn at oracle.com  Tue Mar 31 14:20:41 2020
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 31 Mar 2020 16:20:41 +0200
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <0a07f87e-ede1-edbd-c754-e7df884e0545@oracle.com>

Thanks for cleaning up thread.hpp!

/Robbin

On 2020-03-30 10:31, Reingruber, Richard wrote:
> Hi,
> 
> this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)
> 
> The change affects jvmti, hotspot and c2. Partial reviews are very welcome too.
> 
> Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
> Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/
> 
> Robbin, Martin, please let me know if anything isn't quite as you wanted it. Also find my
> comments on your feedback below.
> 
> Robbin, can I count you as Reviewer for the runtime part?
> 
> Thanks, Richard.
> 
> --
> 
>> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
>> You can move both declaration and definition to that file, no need to clobber
>> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Done.
> 
>> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its own
>> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting jvmtiDeferredLocalVariableSet is
> declared.
> 
>> src/hotspot/share/code/compiledMethod.cpp
>> Nice cleanup!
> 
> Thanks :)
> 
>> src/hotspot/share/code/debugInfoRec.cpp
>> src/hotspot/share/code/debugInfoRec.hpp
>> Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.
> 
> I've been thinking about this too and finally stayed with not_global_escape_in_scope. It's supposed
> to mean an object whose escape state is not GlobalEscape is in scope.
> 
>> src/hotspot/share/compiler/compileBroker.cpp
>> src/hotspot/share/compiler/compileBroker.hpp
>> Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.
> 
> Yes the change would be a little smaller. And if it helps I'll split it off. In general I prefer
> patches that bring along a suitable amount of tests.
> 
>> src/hotspot/share/opto/c2compiler.cpp
>> Make do_escape_analysis independent of JVMCI capabilities. Nice!
> 
> It is the main goal of the enhancement. It is done for C2, but could be done for JVMCI compilers
> with just a small effort as well.
> 
>> src/hotspot/share/opto/escape.cpp
>> Annotation for MachSafePointNodes. Your added functionality looks correct.
>> But I'd prefer to move the bulky code out of the large function.
>> I suggest factoring out something like has_not_global_escape and has_arg_escape. So the code could look like this:
>>        SafePointNode* sfn = sfn_worklist.at(next);
>>        sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
>>        if (sfn->is_CallJava()) {
>>          CallJavaNode* call = sfn->as_CallJava();
>>          call->set_arg_escape(has_arg_escape(call));
>>        }
>> This would also allow us to get rid of the found_..._escape_in_args variables, making the loops more readable.
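
A minimal, self-contained sketch of the factoring suggested above. It is not the webrev code: SafePoint, ObjRef and annotate are hypothetical stand-ins for the C2 node types, and arg_escape is modelled here as "at least one argument does not escape globally", which is an assumption about the flag's meaning.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the C2 types; not the real HotSpot classes.
enum class EscapeState { NoEscape, ArgEscape, GlobalEscape };

struct ObjRef { EscapeState state; };

struct SafePoint {
  std::vector<ObjRef> scope_objs;  // objects referenced by the debug info
  std::vector<ObjRef> call_args;   // only meaningful if is_call is true
  bool is_call = false;
  bool not_global_escape_in_scope = false;
  bool arg_escape = false;
};

// True if an object whose escape state is not GlobalEscape is in scope.
static bool has_not_global_escape(const SafePoint& sfn) {
  return std::any_of(sfn.scope_objs.begin(), sfn.scope_objs.end(),
      [](const ObjRef& o) { return o.state != EscapeState::GlobalEscape; });
}

// True if at least one call argument does not escape globally (assumption).
static bool has_arg_escape(const SafePoint& call) {
  assert(call.is_call && "only meaningful for call safepoints");
  return std::any_of(call.call_args.begin(), call.call_args.end(),
      [](const ObjRef& o) { return o.state != EscapeState::GlobalEscape; });
}

// The worklist loop shrinks to two assignments per safepoint.
static void annotate(std::vector<SafePoint>& worklist) {
  for (SafePoint& sfn : worklist) {
    sfn.not_global_escape_in_scope = has_not_global_escape(sfn);
    if (sfn.is_call) {
      sfn.arg_escape = has_arg_escape(sfn);
    }
  }
}
```

With the predicates factored out, the loop body carries no found_... bookkeeping variables, and each predicate can be checked in isolation.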
> 
> Done.
> 
>> It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.
> 
> Yeah. I copied the snippet.
> 
>> src/hotspot/share/prims/jvmtiImpl.cpp
>> src/hotspot/share/prims/jvmtiImpl.hpp
>> The sequence is pretty complex:
>> VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
> 
> Note that the target threads have to be suspended already for VM_GetOrSetLocal*. So it's mainly the
> synchronization effect of EscapeBarrier::sync_and_suspend_one() that is required here. Also no extra
> _handshake_ is executed, since sync_and_suspend_one() will find the target threads already
> suspended.
> 
>> VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
>> VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
>> But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.
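
The construct/prologue/destruct sequence described above follows the usual RAII shape. The sketch below only illustrates that shape under simplifying assumptions: TargetThread, EscapeBarrierSketch and GetOrSetLocalSketch are hypothetical names, and "suspension" is reduced to a boolean flag instead of a handshake or suspend flag.

```cpp
#include <cassert>

// Hypothetical stand-in for the target JavaThread.
struct TargetThread {
  bool suspended_for_obj_deopt = false;
};

// RAII guard: the constructor suspends the target, the destructor resumes it.
class EscapeBarrierSketch {
  TargetThread& _deoptee;
 public:
  explicit EscapeBarrierSketch(TargetThread& t) : _deoptee(t) {
    assert(!_deoptee.suspended_for_obj_deopt && "already suspended by another barrier");
    _deoptee.suspended_for_obj_deopt = true;
  }
  ~EscapeBarrierSketch() { _deoptee.suspended_for_obj_deopt = false; }
  EscapeBarrierSketch(const EscapeBarrierSketch&) = delete;
  EscapeBarrierSketch& operator=(const EscapeBarrierSketch&) = delete;

  // Object deoptimization is only legal while the target is held.
  bool deoptimize_objects() { return _deoptee.suspended_for_obj_deopt; }
};

// The operation owns the barrier as a member, so the target stays suspended
// for the whole lifetime of the operation and is resumed implicitly.
struct GetOrSetLocalSketch {
  EscapeBarrierSketch _eb;
  bool prologue_ok = false;
  explicit GetOrSetLocalSketch(TargetThread& t) : _eb(t) {}
  bool doit_prologue() {
    prologue_ok = _eb.deoptimize_objects();
    return prologue_ok;
  }
};
```

Holding the barrier as a member means no explicit resume call is needed on any exit path of the operation.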
> 
>> VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.
> 
> It's not specifically the top frame, but the frame that is accessed.
> 
>> src/hotspot/share/runtime/deoptimization.cpp
>> Object deoptimization. I have more comments and proposals, here.
>> First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
>> Comments are sufficient to understand why things are done as they are implemented.
> 
>> BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
>> Anyway, looks correct, too.
> 
>> Typo in comment: "regularily" => "regularly"
> 
>> Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or such kind of issues.
> 
> That's correct. The compiled frame for which deferred updates are allocated is always deoptimized
> before (see EscapeBarrier::deoptimize_objects()). This is also asserted in
> compiledVFrame::update_deferred_value(). I've added the same assertion to
> Deoptimization::relock_objects(). So we can be sure that _jvmti_deferred_updates are deallocated
> again in fetch_unroll_info_helper().
> 
>> EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().
> 
> Sure, well spotted!
> 
>> You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.
> 
> Right, good hint. This was recently introduced with 8235678. I even had to resolve conflicts. Should
> have done this then.
> 
>> I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.
> 
> Done.
> 
>> Typo in comment: "we must only deoptimize" => "we only have to deoptimize"
> 
> Replaced with "[...] we deoptimize iff local objects are passed as args"
> 
>> "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.
> 
> Ok. Done.
> 
>> I'll get back to suspend flags, later.
> 
>> There are weird cases regarding _self_deoptimization_in_progress.
>> Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
>> I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.
> 
> You're right. We've discussed that face-to-face, but couldn't find a real issue. But now, thinking again, I reckon I found one:
> 
> 2808   // Sync with other threads that might be doing deoptimizations
> 2809   {
> 2810     // Need to switch to _thread_blocked for the wait() call
> 2811     ThreadBlockInVM tbivm(_calling_thread);
> 2812     MonitorLocker ml(EscapeBarrier_lock, Mutex::_no_safepoint_check_flag);
> 2813     while (_self_deoptimization_in_progress) {
> 2814       ml.wait();
> 2815     }
> 2816
> 2817     if (self_deopt()) {
> 2818       _self_deoptimization_in_progress = true;
> 2819     }
> 2820
> 2821     while (_deoptee_thread->is_ea_obj_deopt_suspend()) {
> 2822       ml.wait();
> 2823     }
> 2824
> 2825     if (self_deopt()) {
> 2826       return;
> 2827     }
> 2828
> 2829     // set suspend flag for target thread
> 2830     _deoptee_thread->set_ea_obj_deopt_flag();
> 2831   }
> 
> - A waits in 2822
> - C is suspended
> - B notifies all in resume_one()
> - A and C wake up
> - C wins over A and sets _self_deoptimization_in_progress = true in 2818
> - C does the self deoptimization
> - A executes 2830 _deoptee_thread->set_ea_obj_deopt_flag()
> 
> C will self suspend at some undefined point. The resulting state is illegal.
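
One way to close the window described above is the single wait suggested in the review: test both conditions in one predicate, so a woken thread re-checks them together under the lock. The following is a sketch with standard C++ primitives, not HotSpot code; BarrierState, sync_and_claim and release are hypothetical names, and thread state transitions are left out.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical model of the state guarded by EscapeBarrier_lock.
struct BarrierState {
  std::mutex lock;
  std::condition_variable cv;
  bool self_deopt_in_progress = false;
  bool deoptee_suspended = false;
};

// Single wait covering both conditions. After every wakeup the whole
// predicate is re-evaluated under the lock, so a thread cannot pass the
// first check, sleep on the second, and then act on a stale first check.
// Returns true if the caller has claimed the deoptee and must suspend it,
// false for a self deoptimization.
bool sync_and_claim(BarrierState& s, bool self_deopt) {
  std::unique_lock<std::mutex> ml(s.lock);
  s.cv.wait(ml, [&] {
    return !s.self_deopt_in_progress && !s.deoptee_suspended;
  });
  if (self_deopt) {
    s.self_deopt_in_progress = true;
    return false;
  }
  s.deoptee_suspended = true;
  return true;
}

// Counterpart of resume: clear the claimed flag and wake all waiters.
void release(BarrierState& s, bool self_deopt) {
  {
    std::lock_guard<std::mutex> ml(s.lock);
    if (self_deopt) {
      s.self_deopt_in_progress = false;
    } else {
      s.deoptee_suspended = false;
    }
  }
  s.cv.notify_all();
}
```

In the A/B/C scenario, if C wins after B's notification and sets the self-deoptimization flag, A re-evaluates the combined predicate, sees the flag, and keeps waiting instead of setting the suspend flag.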
> 
>> I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.
> 
> Yes, would be nice to have the state change only if needed, but for the reason you mentioned it is
> not quite as easy as it seems to be. I experimented as well with a second lock, but did not succeed.
> 
>> Change in thread_added:
>> I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
>> Your version makes new threads come up with suspend flag set. That looks correct, too. Advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
>> For now, I'm ok with your version.
> 
> I had a version that did what you are suggesting. The current version also has the advantage that
> there are fewer places where a thread has to wait for ongoing object deoptimization. This means
> fewer places where you have to worry about correct thread state transitions, possible deadlocks,
> and whether all oops are properly Handle'ed.
> 
>> I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).
> 
> Done.
> 
>> Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
>> Maybe adding suffixes would help a little bit, but I can also live with what you have.
>> Implementation looks correct to me.
> 
> 2 are internal. I added the suffix _internal to them. This leaves 2 to choose from.
> 
>> src/hotspot/share/runtime/deoptimization.hpp
>> Escape barriers and object deoptimization functions.
>> Typo in comment: "helt" => "held"
> 
> Done in place already.
> 
>> src/hotspot/share/runtime/interfaceSupport.cpp
>> InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
>> I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.
> 
> I never used DeoptimizeObjectsALot = 1 that much. It could be more deterministic in single threaded
> scenarios. I wouldn't object to get rid of it though.
> 
>> src/hotspot/share/runtime/stackValue.hpp
>> Better reinitialization in StackValue. Good.
> 
> StackValue::obj_is_scalar_replaced() should not return true after calling set_obj().
> 
>> src/hotspot/share/runtime/thread.cpp
>> src/hotspot/share/runtime/thread.hpp
>> src/hotspot/share/runtime/thread.inline.hpp
>> wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.
> 
>> In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.
> 
> I'm keen to build the feature on async handshakes when they arrive.
> 
>> You can use MutexLocker with Thread*.
> 
> Done.
> 
>> JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.
> 
> Done.
> 
>> src/hotspot/share/runtime/vframe.cpp
>> Added support for entry frame to new_vframe. Ok.
> 
> 
>> src/hotspot/share/runtime/vframe_hp.cpp
>> src/hotspot/share/runtime/vframe_hp.hpp
> 
>> I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).
> 
> Done.
> 
>> jvmtiDeferredLocalVariableSet::update_monitors:
>> Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.
> 
> Done.
> 
> -----Original Message-----
> From: Doerr, Martin <martin.doerr at sap.com>
> Sent: Thursday, March 12, 2020 17:28
> To: Reingruber, Richard <richard.reingruber at sap.com>; 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> 
> I managed to find time for an (almost) complete review of webrev.4. (I'll review the tests separately.)
> 
> First of all, the change seems to be of pretty good quality given its significant complexity. I couldn't find any real bugs. But I'd like to propose minor improvements.
> I'm convinced that it's mature because we did substantial testing.
> 
> I like the new functionality for object deoptimization. It can possibly be reused for future escape analysis based optimizations. So I appreciate having it available in the code base.
> In addition to that, your change makes the JVMTI implementation better integrated into the VM.
> 
> 
> Now to the details:
> 
> 
> src/hotspot/share/c1/c1_IR.hpp
> describe_scope parameters. Ok.
> 
> 
> src/hotspot/share/ci/ciEnv.cpp
> src/hotspot/share/ci/ciEnv.hpp
> Fix for JvmtiExport::can_walk_any_space() capability. Ok.
> 
> 
> src/hotspot/share/code/compiledMethod.cpp
> Nice cleanup!
> 
> 
> src/hotspot/share/code/debugInfoRec.cpp
> src/hotspot/share/code/debugInfoRec.hpp
> Additional parameters. (Remark: I think "non_global_escape_in_scope" would read better than "not_global_escape_in_scope", but your version is consistent with existing code, so no change request from my side.) Ok.
> 
> 
> src/hotspot/share/code/nmethod.cpp
> Nice cleanup!
> 
> 
> src/hotspot/share/code/pcDesc.hpp
> Additional parameters. Ok.
> 
> 
> src/hotspot/share/code/scopeDesc.cpp
> src/hotspot/share/code/scopeDesc.hpp
> Improved implementation + additional parameters. Ok.
> 
> 
> src/hotspot/share/compiler/compileBroker.cpp
> src/hotspot/share/compiler/compileBroker.hpp
> Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a follow up change together with the test in order to make this webrev smaller, but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.
> 
> 
> src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
> Additional parameters. Ok.
> 
> 
> src/hotspot/share/opto/c2compiler.cpp
> Make do_escape_analysis independent of JVMCI capabilities. Nice!
> 
> 
> src/hotspot/share/opto/callnode.hpp
> Additional fields for MachSafePointNodes. Ok.
> 
> 
> src/hotspot/share/opto/escape.cpp
> Annotation for MachSafePointNodes. Your added functionality looks correct.
> But I'd prefer to move the bulky code out of the large function.
> I suggest factoring out something like has_not_global_escape and has_arg_escape. So the code could look like this:
>        SafePointNode* sfn = sfn_worklist.at(next);
>        sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
>        if (sfn->is_CallJava()) {
>          CallJavaNode* call = sfn->as_CallJava();
>          call->set_arg_escape(has_arg_escape(call));
>        }
> This would also allow us to get rid of the found_..._escape_in_args variables, making the loops more readable.
> 
> It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to be the way to do it (there are more such places). So it's ok.
> 
> 
> src/hotspot/share/opto/machnode.hpp
> Additional fields for MachSafePointNodes. Ok.
> 
> 
> src/hotspot/share/opto/macro.cpp
> Allow elimination of non-escaping allocations. Ok.
> 
> 
> src/hotspot/share/opto/matcher.cpp
> src/hotspot/share/opto/output.cpp
> Copy attribute / pass parameters. Ok.
> 
> 
> src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
> Nice cleanup!
> 
> 
> src/hotspot/share/prims/jvmtiEnv.cpp
> src/hotspot/share/prims/jvmtiEnvBase.cpp
> Escape barriers + deoptimize objects for target thread. Good.
> 
> 
> src/hotspot/share/prims/jvmtiImpl.cpp
> src/hotspot/share/prims/jvmtiImpl.hpp
> The sequence is pretty complex:
> VM_GetOrSetLocal element initialization executes EscapeBarrier code which suspends the target thread (extra VM Operation).
> VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to prepare VM Operation with frame deoptimization).
> VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which resumes the target thread.
> But I don't have any improvement proposal. Performance is probably not a concern, here. So it's ok.
> 
> VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has non-globally escaping objects and other frames if they have arg escaping ones. Good.
> 
> 
> src/hotspot/share/prims/jvmtiTagMap.cpp
> Escape barriers + deoptimize objects for all threads. Ok.
> 
> 
> src/hotspot/share/prims/whitebox.cpp
> Added WB_IsFrameDeoptimized to API. Ok.
> 
> 
> src/hotspot/share/runtime/deoptimization.cpp
> Object deoptimization. I have more comments and proposals, here.
> First of all, handling recursive and waiting locks in relock_objects is tricky, but looks correct.
> Comments are sufficient to understand why things are done as they are implemented.
> 
> BiasedLocking related parts are complex, but we may get rid of them in the future (with BiasedLocking removal).
> Anyway, looks correct, too.
> 
> Typo in comment: "regularily" => "regularly"
> 
> Deoptimization::fetch_unroll_info_helper is the only place where _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I think we always go through it, so I can't see a memory leak or such kind of issues.
> 
> EscapeBarrier::deoptimize_objects: ResourceMark should use calling_thread().
> 
> You can use MutexLocker and MonitorLocker with Thread* to save the Thread::current() call.
> 
> I'd make set_objs_are_deoptimized static and remove it from the EscapeBarrier interface because I think it shouldn't be used outside of EscapeBarrier::deoptimize_objects.
> 
> Typo in comment: "we must only deoptimize" => "we only have to deoptimize"
> 
> "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and barrier_active() is redundant. Implementation can get moved to hpp file.
> 
> I'll get back to suspend flags, later.
> 
> There are weird cases regarding _self_deoptimization_in_progress.
> Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C. C can set _self_deoptimization_in_progress while A performs the handshake for suspending C. I think this doesn't lead to errors, but it's probably not desired.
> I think it would be better to use only one "wait" call in sync_and_suspend_one and sync_and_suspend_all.
> 
> I first thought it'd be better to move ThreadBlockInVM before wait() to reduce thread state transitions, but that seems to be problematic because ThreadBlockInVM destructor contains a safepoint check which we shouldn't do while holding EscapeBarrier_lock. So no change request.
> 
> Change in thread_added:
> I think the sequence would be more comprehensible if we waited for deopt_all_threads in Thread::start and all other places where a new thread can run into Java code (e.g. JVMTI attach).
> Your version makes new threads come up with suspend flag set. That looks correct, too. Advantage is that you only have to change one place (thread_added). It'll be interesting to see what it will look like when we use async handshakes instead of suspend flags.
> For now, I'm ok with your version.
> 
> I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt->is_hidden_from_external_view()).
> 
> Having 4 different deoptimize_objects functions makes it a little hard to keep an overview of which one is used for what.
> Maybe adding suffixes would help a little bit, but I can also live with what you have.
> Implementation looks correct to me.
> 
> 
> src/hotspot/share/runtime/deoptimization.hpp
> Escape barriers and object deoptimization functions.
> Typo in comment: "helt" => "held"
> 
> 
> src/hotspot/share/runtime/globals.hpp
> Addition of develop flag DeoptimizeObjectsALotInterval. Ok.
> 
> 
> src/hotspot/share/runtime/interfaceSupport.cpp
> InterfaceSupport::deoptimizeAllObjects() is only used for DeoptimizeObjectsALot = 1.
> I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad to have DeoptimizeObjectsALot = 1 in addition. Ok.
> 
> 
> src/hotspot/share/runtime/interfaceSupport.inline.hpp
> Addition of deoptimizeAllObjects. Ok.
> 
> 
> src/hotspot/share/runtime/mutexLocker.cpp
> src/hotspot/share/runtime/mutexLocker.hpp
> Addition of EscapeBarrier_lock. Ok.
> 
> 
> src/hotspot/share/runtime/objectMonitor.cpp
> Make recursion count relock aware. Ok.
> 
> 
> src/hotspot/share/runtime/stackValue.hpp
> Better reinitialization in StackValue. Good.
> 
> 
> src/hotspot/share/runtime/thread.cpp
> src/hotspot/share/runtime/thread.hpp
> src/hotspot/share/runtime/thread.inline.hpp
> wait_for_object_deoptimization, suspend flag, deferred updates and test feature to deoptimize objects.
> 
> In the long term, we want to get rid of suspend flags, so it's not so nice to introduce a new one. But I agree with Götz that it should be acceptable as temporary solution until async handshakes are available (which takes more time). So I'm ok with your change.
> 
> You can use MutexLocker with Thread*.
> 
> JvmtiDeferredUpdates: I agree with Robbin. It'd be nice to move the class out of thread.hpp.
> 
> 
> src/hotspot/share/runtime/vframe.cpp
> Added support for entry frame to new_vframe. Ok.
> 
> 
> src/hotspot/share/runtime/vframe_hp.cpp
> src/hotspot/share/runtime/vframe_hp.hpp
> 
> I think code()->as_nmethod() in not_global_escape_in_scope() and arg_escape() should better be under #ifdef ASSERT or inside the assert statement (no need for code cache walking in product build).
> 
> jvmtiDeferredLocalVariableSet::update_monitors:
> Please add a comment explaining that owner referenced by original info may be scalar replaced, but it is deoptimized in the vframe.
> 
> 
> src/hotspot/share/utilities/macros.hpp
> Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.
> 
> 
> test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
> test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
> New test. Will review separately.
> 
> 
> test/jdk/TEST.ROOT
> Addition of vm.jvmci as required property. Ok.
> 
> 
> test/jdk/com/sun/jdi/EATests.java
> test/jdk/com/sun/jdi/EATestsJVMCI.java
> New test. Will review separately.
> 
> 
> test/lib/sun/hotspot/WhiteBox.java
> Added isFrameDeoptimized to API. Ok.
> 
> 
> That was it. Best regards,
> Martin
> 
> 
>> -----Original Message-----
>> From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> On Behalf Of Reingruber, Richard
>> Sent: Tuesday, March 3, 2020 21:23
>> To: 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
>> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>>
>> Hi Robbin,
>>
>>>> I understand that Robbin proposed to replace the usage of
>>>> _suspend_flag with handshakes. Apparently, async handshakes
>>>> are needed to do so. We have been waiting a while for removal
>>>> of the _suspend_flag / introduction of async handshakes [2].
>>>> What is the status here?
>>
>>> I have an old prototype which I would like to continue to work on.
>>> So do not assume async handshakes will make 15.
>>> Even if they did, I think there is a lot more investigative work needed to remove
>>> _suspend_flag.
>>
>> Let us know if we can be of any help to you, even if it is only testing.
>>
>>>>> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
>>
>>> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
>>> You can move both declaration and definition to that file, no need to clobber
>>> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
>>
>> Will do.
>>
>>> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its own
>>> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
>>
>> You are right. It shouldn't be declared in thread.hpp. I will look into that.
>>
>>> Note that we also think we may have a bug in deopt:
>>> https://bugs.openjdk.java.net/browse/JDK-8238237
>>
>>> I think it would be best, if possible, to push after that is resolved.
>>
>> Sure.
>>
>>> Not even nearly a full review :)
>>
>> I know :)
>>
>> Anyways, thanks a lot,
>> Richard.
>>
>>
>> -----Original Message-----
>> From: Robbin Ehn <robbin.ehn at oracle.com>
>> Sent: Monday, March 2, 2020 11:17 AM
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard <richard.reingruber at sap.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>>
>> Hi,
>>
>> On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
>>> Hi,
>>>
>>> I had a look at the progress of this change. Nothing
>>> happened since Richard posted his update using more
>>> handshakes [1].
>>> But we (SAP) would appreciate a lot if this change could
>>> be successfully reviewed and pushed.
>>>
>>> I think there is basic understanding that this
>>> change is helpful. It fixes a number of issues with JVMTI,
>>> and will deliver the same performance benefits as EA
>>> does in current production mode for debugging scenarios.
>>>
>>> This is important for us as we run our VMs prepared
>>> for debugging in production mode.
>>>
>>> I understand that Robbin proposed to replace the usage of
>>> _suspend_flag with handshakes. Apparently, async handshakes
>>> are needed to do so. We have been waiting a while for removal
>>> of the _suspend_flag / introduction of async handshakes [2].
>>> What is the status here?
>>
>> I have an old prototype which I would like to continue to work on.
>> So do not assume async handshakes will make 15.
>> Even if they did, I think there is a lot more investigative work needed to remove
>> _suspend_flag.
>>
>>>
>>> I think we should no longer wait, but proceed with
>>> this change. We will look into removing the usage of
>>> suspend_flag introduced here once it is possible to implement
>>> it with handshakes.
>>
>> Yes, sure.
>>
>>>> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
>>
>> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
>> You can move both declaration and definition to that file, no need to clobber
>> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
>>
>> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its own
>> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
>>
>> Note that we also think we may have a bug in deopt:
>> https://bugs.openjdk.java.net/browse/JDK-8238237
>>
>> I think it would be best, if possible, to push after that is resolved.
>>
>> Not even nearly a full review :)
>>
>> Thanks, Robbin
>>
>>
>>>> Incremental:
>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
>>>>
>>>> I was not able to eliminate the additional suspend flag now. I'll take care of this as soon as the
>>>> existing suspend-resume-mechanism is reworked.
>>>>
>>>> Testing:
>>>>
>>>> Nightly tests @SAP:
>>>>
>>>>     JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests
>>>>     with fastdebug and release builds on all platforms
>>>>
>>>>     Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel for 24h
>>>> Thanks, Richard.
>>>>
>>>>
>>>> More details on the changes:
>>>>
>>>> * Hide DeoptimizeObjectsALotThread from external view.
>>>>
>>>> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
>>>>     It used to be _safepoint_check_sometimes, which will be eliminated sooner or later.
>>>>     I added explicit thread state changes with ThreadBlockInVM to code paths where we can wait()
>>>>     on EscapeBarrier_lock to become safepoint safe.
>>>>
>>>> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of vm operation
>>>>     VM_ThreadSuspendAllForObjDeopt.
>>>>
>>>> * Removed uses of Threads_lock. When adding a new thread we suspend it iff EA optimizations are
>>>>     being reverted. In the previous version we were waiting on Threads_lock while EA optimizations
>>>>     were reverted. See EscapeBarrier::thread_added().
>>>>
>>>> * Made tests require Xmixed compilation mode.
>>>>
>>>> * Made tests agnostic regarding tiered compilation.
>>>>     I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or disabled.
>>>>
>>>> * Exercising EATests.java as well with stress test options DeoptimizeObjectsALot*
>>>>     Due to the non-deterministic deoptimizations some tests need to be skipped.
>>>>     We do this to prevent bit-rot of the stress test code.
>>>>
>>>> * Executing EATests.java as well with graal if available. Driver for this is
>>>>     EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all the new debug info
>>>>     (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
>>>>     And graal does not yet support the JVMTI operations force early return and pop frame.
>>>>
>>>> * Removed tracing from new jdi tests in EATests.java. Too much trace output before the debugging
>>>>     connection is established can cause deadlock because output buffers fill up.
>>>>     (See https://bugs.openjdk.java.net/browse/JDK-8173304)
>>>>
>>>> * Many copyright year changes and smaller clean-up changes of testing code (trailing white-space and
>>>>     the like).
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: David Holmes <david.holmes at oracle.com>
>>>> Sent: Thursday, December 19, 2019 03:12
>>>> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>>>>
>>>> Hi Richard,
>>>>
>>>> I think my issue is with the way EliminateNestedLocks works so I'm going
>>>> to look into that more deeply.
>>>>
>>>> Thanks for the explanations.
>>>>
>>>> David
>>>>
>>>> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
>>>>> Hi David,
>>>>>
>>>>>      > >    > Some further queries/concerns:
>>>>>      > >    >
>>>>>      > >    > src/hotspot/share/runtime/objectMonitor.cpp
>>>>>      > >    >
>>>>>      > >    > Can you please explain the changes to ObjectMonitor::wait:
>>>>>      > >    >
>>>>>      > >    > !   _recursions = save      // restore the old recursion count
>>>>>      > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>>      > >    > increased by the deferred relock count
>>>>>      > >    >
>>>>>      > >    > what is the "deferred relock count"? I gather it relates to
>>>>>      > >    >
>>>>>      > >    > "The code was extended to be able to deoptimize objects of a
>>>>>      > > frame that
>>>>>      > >    > is not the top frame and to let another thread than the owning
>>>>>      > > thread do
>>>>>      > >    > it."
>>>>>      > >
>>>>>      > > Yes, these relate. Currently EA based optimizations are reverted,
>>>>>      > > when a compiled frame is replaced with corresponding interpreter
>>>>>      > > frames. Part of this is relocking objects with eliminated locking.
>>>>>      > > New with the enhancement is that we do this also just before object
>>>>>      > > references are acquired through JVMTI. In this case we deoptimize
>>>>>      > > also the owning compiled frame C and we register deoptimized objects
>>>>>      > > as deferred updates. When control returns to C it gets deoptimized,
>>>>>      > > we notice that objects are already deoptimized (reallocated and
>>>>>      > > relocked), so we don't do it again (relocking twice would be
>>>>>      > > incorrect of course). Deferred updates are copied into the new
>>>>>      > > interpreter frames.
>>>>>      > >
>>>>>      > > Problem: relocking is not possible if the target thread T is waiting
>>>>>      > > on the monitor that needs to be relocked. This happens only with
>>>>>      > > non-local objects with EliminateNestedLocks. Instead relocking is
>>>>>      > > deferred until T owns the monitor again. This is what the piece of
>>>>>      > > code above does.
>>>>>      >
>>>>>      >  Sorry I need some more detail here. How can you wait() on an object
>>>>>      >  monitor if the object allocation and/or locking was optimised away? And
>>>>>      >  what is a "non-local object" in this context? Isn't EA restricted to
>>>>>      >  thread-confined objects?
>>>>>
>>>>> "Non-local object" is an object that escapes its thread. The issue I'm
>>>>> addressing with the changes in ObjectMonitor::wait is almost unrelated
>>>>> to EA. It is caused by EliminateNestedLocks, where C2 eliminates
>>>>> recursive locking of an already owned lock. The lock owning object
>>>>> exists on the heap, it is locked and you can call wait() on it.
>>>>>
>>>>> EliminateLocks is the C2 option that controls lock elimination based on
>>>>> EA. Both optimizations have in common that objects with eliminated
>>>>> locking need to be relocked when deoptimizing a frame, i.e. when
>>>>> replacing a compiled frame with equivalent interpreter frames.
>>>>> Deoptimization::relock_objects does that job for /all/ eliminated locks
>>>>> in scope. /All/ can be a mix of eliminated nested locks and locks of
>>>>> not-escaping objects.
>>>>>
>>>>> New with the enhancement: I call relock_objects earlier, just before
>>>>> objects potentially escape. But then later when the owning compiled
>>>>> frame gets deoptimized, I must not do it again:
>>>>>
>>>>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
>>>>>
>>>>>     if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
>>>>>         && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
>>>>>       bool unused;
>>>>>       eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
>>>>>     }
>>>>>
>>>>> Now when calling relock_objects early it is quite possible that I have to
>>>>> relock an object the target thread currently waits for. Obviously I
>>>>> cannot relock in this case, instead I chose to introduce
>>>>> relock_count_after_wait to JavaThread.
>>>>>
>>>>>      >  Is it just that some of the locking gets optimized away e.g.
>>>>>      >
>>>>>      >  synchronized(obj) {
>>>>>      >     synchronized(obj) {
>>>>>      >       synchronized(obj) {
>>>>>      >         obj.wait();
>>>>>      >       }
>>>>>      >     }
>>>>>      >  }
>>>>>      >
>>>>>      >  If this is reduced to a form as-if it were a single lock of the monitor
>>>>>      >  (due to EA) and the wait() triggers a JVM TI event which leads to the
>>>>>      >  escape of "obj" then we need to reconstruct the true lock state, and so
>>>>>      >  when the wait() internally unblocks and reacquires the monitor it has to
>>>>>      >  set the true recursion count to 3, not the 1 that it appeared to be when
>>>>>      >  wait() was initially called. Is that the scenario?
>>>>>
>>>>> Kind of... except that the locking is not eliminated due to EA and there
>>>>> is no JVM TI event triggered by wait.
>>>>>
>>>>> Add
>>>>>
>>>>> LocalObject l1 = new LocalObject();
>>>>>
>>>>> in front of the synchronized blocks and assume a JVM TI agent acquires l1.
>>>>> This triggers the code in question.
>>>>>
>>>>> See that relocking/reallocating is transactional. If it is done then for
>>>>> /all/ objects in scope and it is done at most once. It wouldn't be quite
>>>>> so easy to split this in relocking of nested/EA-based eliminated locks.
>>>>>
>>>>>      >  If so I find this truly awful. Anyone using wait() in a realistic form
>>>>>      >  requires a notification and so the object cannot be thread confined. In
>>>>>
>>>>> It is not thread confined.
>>>>>
>>>>>      >  which case I would strongly argue that upon hitting the wait() the deopt
>>>>>      >  should occur unconditionally and so the lock state is correct before we
>>>>>      >  wait and so we don't need to mess with the recursion count internally
>>>>>      >  when we reacquire the monitor.
>>>>>      >
>>>>>      > >
>>>>>      > >    > which I don't like the sound of at all when it comes to ObjectMonitor
>>>>>      > >    > state. So I'd like to understand in detail exactly what is going on here
>>>>>      > >    > and why.  This is a very intrusive change that seems to badly break
>>>>>      > >    > encapsulation and impacts future changes to ObjectMonitor that are under
>>>>>      > >    > investigation.
>>>>>      > >
>>>>>      > > I would not regard this as breaking encapsulation. Certainly not badly.
>>>>>      > >
>>>>>      > > I've added a property relock_count_after_wait to JavaThread. The
>>>>>      > > property is well encapsulated. Future ObjectMonitor implementations
>>>>>      > > have to deal with recursion too. They are free in choosing a way to
>>>>>      > > do that as long as that property is taken into account. This is
>>>>>      > > hardly a limitation.
>>>>>      >
>>>>>      >  I do think this badly breaks encapsulation as you have to add a callout
>>>>>      >  from the guts of the ObjectMonitor code to reach into the thread to get
>>>>>      >  this lock count adjustment. I understand why you have had to do this but
>>>>>      >  I would much rather see a change to the EA optimisation strategy so that
>>>>>      >  this is not needed.
>>>>>      >
>>>>>      > > Note also that the property is a straight forward extension of the
>>>>>      > > existing concept of deferred local updates. It is embedded into the
>>>>>      > > structure holding them. So not even the footprint of a JavaThread is
>>>>>      > > enlarged if no deferred updates are generated.
>>>>>      >
>>>>>      > [...]
>>>>>      >
>>>>>      > >
>>>>>      > > I'm actually duplicating the existing external suspend mechanism,
>>>>>      > > because a thread can be suspended at most once. And hey, I don't
>>>>>      > > like that either! But it seems not unlikely that the duplicate can
>>>>>      > > be removed together with the original and the new type of handshakes
>>>>>      > > that will be used for thread suspend can be used for object
>>>>>      > > deoptimization too. See today's discussion in JDK-8227745 [2].
>>>>>      >
>>>>>      >  I hope that discussion bears some fruit, at the moment it seems not to
>>>>>      >  be possible to use handshakes here. :(
>>>>>      >
>>>>>      >  The external suspend mechanism is a royal pain in the proverbial that we
>>>>>      >  have to carefully live with. The idea that we're duplicating that for
>>>>>      >  use in another fringe area of functionality does not thrill me at all.
>>>>>      >
>>>>>      >  To be clear, I understand the problem that exists and that you wish to
>>>>>      >  solve, but for the runtime parts I balk at the complexity cost of
>>>>>      >  solving it.
>>>>>
>>>>> I know it's complex, but it is far from rocket science.
>>>>>
>>>>> Also I find it hard to imagine another fix for JDK-8233915 besides
>>>>> changing the JVM TI specification.
>>>>>
>>>>> Thanks, Richard.
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>> Sent: Tuesday, 17 December 2019 08:03
>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>> serviceability-dev at openjdk.java.net;
>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>> hotspot-runtime-dev at openjdk.java.net;
>>>>> Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>> Performance in the Presence of JVMTI Agents
>>>>>
>>>>> <resend as my mailer crashed during last send>
>>>>>
>>>>> David
>>>>>
>>>>> On 17/12/2019 4:57 pm, David Holmes wrote:
>>>>>> Hi Richard,
>>>>>>
>>>>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>>       > Some further queries/concerns:
>>>>>>>       >
>>>>>>>       > src/hotspot/share/runtime/objectMonitor.cpp
>>>>>>>       >
>>>>>>>       > Can you please explain the changes to ObjectMonitor::wait:
>>>>>>>       >
>>>>>>>       > !   _recursions = save      // restore the old recursion count
>>>>>>>       > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>>>>       > increased by the deferred relock count
>>>>>>>       >
>>>>>>>       > what is the "deferred relock count"? I gather it relates to
>>>>>>>       >
>>>>>>>       > "The code was extended to be able to deoptimize objects of a frame that
>>>>>>>       > is not the top frame and to let another thread than the owning thread do
>>>>>>>       > it."
>>>>>>>
>>>>>>> Yes, these relate. Currently EA based optimizations are reverted, when
>>>>>>> a compiled frame is replaced with corresponding interpreter frames.
>>>>>>> Part of this is relocking objects with eliminated locking. New with
>>>>>>> the enhancement is that we do this also just before object references
>>>>>>> are acquired through JVMTI. In this case we deoptimize also the owning
>>>>>>> compiled frame C and we register deoptimized objects as deferred
>>>>>>> updates. When control returns to C it gets deoptimized, we notice that
>>>>>>> objects are already deoptimized (reallocated and relocked), so we
>>>>>>> don't do it again (relocking twice would be incorrect of course).
>>>>>>> Deferred updates are copied into the new interpreter frames.
>>>>>>>
>>>>>>> Problem: relocking is not possible if the target thread T is waiting
>>>>>>> on the monitor that needs to be
>>>>>>> relocked. This happens only with non-local objects with
>>>>>>> EliminateNestedLocks. Instead relocking is
>>>>>>> deferred until T owns the monitor again. This is what the piece of
>>>>>>> code above does.
>>>>>>
>>>>>> Sorry I need some more detail here. How can you wait() on an object
>>>>>> monitor if the object allocation and/or locking was optimised away? And
>>>>>> what is a "non-local object" in this context? Isn't EA restricted to
>>>>>> thread-confined objects?
>>>>>>
>>>>>> Is it just that some of the locking gets optimized away e.g.
>>>>>>
>>>>>> synchronized(obj) {
>>>>>>     synchronized(obj) {
>>>>>>       synchronized(obj) {
>>>>>>         obj.wait();
>>>>>>       }
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> If this is reduced to a form as-if it were a single lock of the monitor
>>>>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
>>>>>> escape of "obj" then we need to reconstruct the true lock state, and so
>>>>>> when the wait() internally unblocks and reacquires the monitor it has to
>>>>>> set the true recursion count to 3, not the 1 that it appeared to be when
>>>>>> wait() was initially called. Is that the scenario?
>>>>>>
>>>>>> If so I find this truly awful. Anyone using wait() in a realistic form
>>>>>> requires a notification and so the object cannot be thread confined. In
>>>>>> which case I would strongly argue that upon hitting the wait() the deopt
>>>>>> should occur unconditionally and so the lock state is correct before we
>>>>>> wait and so we don't need to mess with the recursion count internally
>>>>>> when we reacquire the monitor.
>>>>>>
>>>>>>>
>>>>>>>       > which I don't like the sound of at all when it comes to ObjectMonitor
>>>>>>>       > state. So I'd like to understand in detail exactly what is going on here
>>>>>>>       > and why.  This is a very intrusive change that seems to badly break
>>>>>>>       > encapsulation and impacts future changes to ObjectMonitor that are under
>>>>>>>       > investigation.
>>>>>>>
>>>>>>> I would not regard this as breaking encapsulation. Certainly not badly.
>>>>>>>
>>>>>>> I've added a property relock_count_after_wait to JavaThread. The
>>>>>>> property is well
>>>>>>> encapsulated. Future ObjectMonitor implementations have to deal with
>>>>>>> recursion too. They are free in
>>>>>>> choosing a way to do that as long as that property is taken into
>>>>>>> account. This is hardly a
>>>>>>> limitation.
>>>>>>
>>>>>> I do think this badly breaks encapsulation as you have to add a callout
>>>>>> from the guts of the ObjectMonitor code to reach into the thread to get
>>>>>> this lock count adjustment. I understand why you have had to do this but
>>>>>> I would much rather see a change to the EA optimisation strategy so that
>>>>>> this is not needed.
>>>>>>
>>>>>>> Note also that the property is a straight forward extension of the
>>>>>>> existing concept of deferred
>>>>>>> local updates. It is embedded into the structure holding them. So not
>>>>>>> even the footprint of a
>>>>>>> JavaThread is enlarged if no deferred updates are generated.
>>>>>>>
>>>>>>>       > ---
>>>>>>>       >
>>>>>>>       > src/hotspot/share/runtime/thread.cpp
>>>>>>>       >
>>>>>>>       > Can you please explain why JavaThread::wait_for_object_deoptimization
>>>>>>>       > has to be handcrafted in this way rather than using proper transitions.
>>>>>>>       >
>>>>>>>
>>>>>>> I wrote wait_for_object_deoptimization taking
>>>>>>> JavaThread::java_suspend_self_with_safepoint_check
>>>>>>> as template. So in short: for the same reasons :)
>>>>>>>
>>>>>>> Threads reach both methods as part of thread state transitions,
>>>>>>> therefore special handling is
>>>>>>> required to change thread state on top of ongoing transitions.
>>>>>>>
>>>>>>>       > We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>>>>>       > it being added back (effectively). This seems like it may be something
>>>>>>>       > that handshakes could be used for.
>>>>>>>
>>>>>>> Deopt suspend used to be something rather different with a similar
>>>>>>> name[1]. It is not being added back.
>>>>>>
>>>>>> I stand corrected. Despite comments in the code to the contrary
>>>>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
>>>>>> cleanup in this area 13 years ago :)
>>>>>>
>>>>>>>
>>>>>>> I'm actually duplicating the existing external suspend mechanism,
>>>>>>> because a thread can be suspended at most once. And hey, I don't like
>>>>>>> that either! But it seems not unlikely that the duplicate can be
>>>>>>> removed together with the original and the new type of handshakes
>>>>>>> that will be used for thread suspend can be used for object
>>>>>>> deoptimization too. See today's discussion in JDK-8227745 [2].
>>>>>>
>>>>>> I hope that discussion bears some fruit, at the moment it seems not to
>>>>>> be possible to use handshakes here. :(
>>>>>>
>>>>>> The external suspend mechanism is a royal pain in the proverbial that we
>>>>>> have to carefully live with. The idea that we're duplicating that for
>>>>>> use in another fringe area of functionality does not thrill me at all.
>>>>>>
>>>>>> To be clear, I understand the problem that exists and that you wish to
>>>>>> solve, but for the runtime parts I balk at the complexity cost of
>>>>>> solving it.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> Thanks, Richard.
>>>>>>>
>>>>>>> [1] Deopt suspend was something like an async. handshake for
>>>>>>>     architectures with register windows, where patching the return pc
>>>>>>>     for deoptimization of a compiled frame was racy if the owner thread
>>>>>>>     was in native code. Instead a "deopt" suspend flag was set on which
>>>>>>>     the thread patched its own frame upon return from native. So no
>>>>>>>     thread was suspended. It got its name only from the name of the
>>>>>>>     flags.
>>>>>>>
>>>>>>> [2] Discussion about using handshakes to sync. with the target thread:
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>> Sent: Friday, 13 December 2019 00:56
>>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>>>> serviceability-dev at openjdk.java.net;
>>>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>>>> Performance in the Presence of JVMTI Agents
>>>>>>>
>>>>>>> Hi Richard,
>>>>>>>
>>>>>>> Some further queries/concerns:
>>>>>>>
>>>>>>> src/hotspot/share/runtime/objectMonitor.cpp
>>>>>>>
>>>>>>> Can you please explain the changes to ObjectMonitor::wait:
>>>>>>>
>>>>>>> !   _recursions = save      // restore the old recursion count
>>>>>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>>>> increased by the deferred relock count
>>>>>>>
>>>>>>> what is the "deferred relock count"? I gather it relates to
>>>>>>>
>>>>>>> "The code was extended to be able to deoptimize objects of a frame that
>>>>>>> is not the top frame and to let another thread than the owning thread do
>>>>>>> it."
>>>>>>>
>>>>>>> which I don't like the sound of at all when it comes to ObjectMonitor
>>>>>>> state. So I'd like to understand in detail exactly what is going on here
>>>>>>> and why.  This is a very intrusive change that seems to badly break
>>>>>>> encapsulation and impacts future changes to ObjectMonitor that are under
>>>>>>> investigation.
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> src/hotspot/share/runtime/thread.cpp
>>>>>>>
>>>>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
>>>>>>> has to be handcrafted in this way rather than using proper transitions.
>>>>>>>
>>>>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>>>>> it being added back (effectively). This seems like it may be something
>>>>>>> that handshakes could be used for.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>> On 12/12/2019 7:02 am, David Holmes wrote:
>>>>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>>        > Most of the details here are in areas I can comment on in detail, but I
>>>>>>>>>        > did take an initial general look at things.
>>>>>>>>>
>>>>>>>>> Thanks for taking the time!
>>>>>>>>
>>>>>>>> Apologies the above should read:
>>>>>>>>
>>>>>>>> "Most of the details here are in areas I *can't* comment on in detail
>>>>>>>> ..."
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>>        > The only thing that jumped out at me is that I think the
>>>>>>>>>        > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>>>>        >
>>>>>>>>>        > +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>>>
>>>>>>>>> Yes, it should. Will add the method like above.
>>>>>>>>>
>>>>>>>>>        > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>>>>>>        > active testing this will just bit-rot.
>>>>>>>>>
>>>>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>>>>>>> workload. I will add a minimal test
>>>>>>>>> to keep it fresh.
>>>>>>>>>
>>>>>>>>>        > Also on the tests I don't understand your @requires clause:
>>>>>>>>>        >
>>>>>>>>>        >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>>>>        > (vm.opt.TieredCompilation != true))
>>>>>>>>>        >
>>>>>>>>>        > This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>>>>        > our normal mode of operation. ??
>>>>>>>>>        >
>>>>>>>>>
>>>>>>>>> I removed the clause. I guess I wanted to target the tests towards the
>>>>>>>>> code they are supposed to
>>>>>>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>>>>>>> with just one compiler thread.
>>>>>>>>>
>>>>>>>>> Additionally I will make use of
>>>>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>> Sent: Wednesday, 11 December 2019 08:03
>>>>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>>>>>> serviceability-dev at openjdk.java.net;
>>>>>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>>>>>> Performance in the Presence of JVMTI Agents
>>>>>>>>>
>>>>>>>>> Hi Richard,
>>>>>>>>>
>>>>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I would like to get reviews please for
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>>>>>>
>>>>>>>>>> Corresponding RFE:
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>>>>>>
>>>>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>>>>>>
>>>>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>>>>>>> issues (thanks!). In addition the change is being tested at SAP since
>>>>>>>>>> I posted the first RFR some months ago.
>>>>>>>>>>
>>>>>>>>>> The intention of this enhancement is to benefit performance wise from
>>>>>>>>>> escape analysis even if JVMTI agents request capabilities that allow
>>>>>>>>>> them to access local variable values. E.g. if you start-up with
>>>>>>>>>> -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape
>>>>>>>>>> analysis is disabled right from the beginning, well before a debugger
>>>>>>>>>> attaches -- if ever one should do so. With the enhancement, escape
>>>>>>>>>> analysis will remain enabled until and after a debugger attaches. EA
>>>>>>>>>> based optimizations are reverted just before an agent acquires the
>>>>>>>>>> reference to an object. In the JBS item you'll find more details.
>>>>>>>>>
>>>>>>>>> Most of the details here are in areas I can comment on in detail, but I
>>>>>>>>> did take an initial general look at things.
>>>>>>>>>
>>>>>>>>> The only thing that jumped out at me is that I think the
>>>>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>>>>
>>>>>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>>>
>>>>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
>>>>>>>>> Without
>>>>>>>>> active testing this will just bit-rot.
>>>>>>>>>
>>>>>>>>> Also on the tests I don't understand your @requires clause:
>>>>>>>>>
>>>>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>>>> (vm.opt.TieredCompilation != true))
>>>>>>>>>
>>>>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>>>> our normal mode of operation. ??
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Richard.
>>>>>>>>>>
>>>>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
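
The recursion-count question raised in this thread (three nested synchronized blocks around wait(), and which count the monitor must have once wait() returns) can be illustrated with a small Java sketch. Intrinsic monitors do not expose their recursion count to Java code, so the sketch swaps in java.util.concurrent.locks.ReentrantLock, whose getHoldCount() makes the count visible. This is an analogy only, not the ObjectMonitor code under review; the class and method names are made up for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration class; not part of HotSpot or the webrev.
final class HoldCountSketch {

    /** Returns {hold count before the wait, hold count after the wait}. */
    static int[] holdsAroundAwait() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        int[] result = new int[2];

        // Three nested acquisitions, mirroring the three nested
        // synchronized(obj) blocks from the discussion.
        lock.lock();
        try {
            lock.lock();
            try {
                lock.lock();
                try {
                    result[0] = lock.getHoldCount();
                    // The timed await releases every hold while waiting and
                    // restores the full hold count before it returns.
                    cond.await(10, TimeUnit.MILLISECONDS);
                    result[1] = lock.getHoldCount();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } finally {
                    lock.unlock();
                }
            } finally {
                lock.unlock();
            }
        } finally {
            lock.unlock();
        }
        return result;
    }

    public static void main(String[] args) {
        int[] r = holdsAroundAwait();
        System.out.println("before=" + r[0] + " after=" + r[1]);
    }
}
```

The sketch only shows the baseline invariant: the count after the wait matches the full nesting depth. The change under review has to re-establish the same invariant when some of the nested acquisitions were eliminated by the compiler and are relocked while the thread is still waiting.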

From daniel.daugherty at oracle.com  Tue Mar 31 14:41:01 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 31 Mar 2020 10:41:01 -0400
Subject: Thread Local Handshake in JVMTI functions
In-Reply-To: <4c9aa3ab-468d-eede-18f7-ac8d352575b6@oss.nttdata.com>
References: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>
 <e2f3b51c-be7a-cd27-caf9-8d2f5e3afc1c@oracle.com>
 <4c9aa3ab-468d-eede-18f7-ac8d352575b6@oss.nttdata.com>
Message-ID: <64323780-b96e-0e3a-5f61-8f1dd74a1805@oracle.com>

Add Robbin to this thread...


This reminded me of the following RFE that Robbin filed:

    JDK-8201641 JVMTI: GetThreadListStackTraces should use Thread-Local Handshakes
    https://bugs.openjdk.java.net/browse/JDK-8201641

We could update 8201641 to include everything that Yasumasa-san is 
requesting.
Would be a good place to track it...

Dan


On 3/31/20 7:40 AM, Yasumasa Suenaga wrote:
> Hi David,
>
> On 2020/03/31 19:16, David Holmes wrote:
>> Hi Yasumasa,
>>
>> On 31/03/2020 8:06 pm, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> Many JVMTI functions use VM operations to get information. However
>>> some of them need to stop only one thread - they don't need to stop
>>> all threads.
>>> So I think we can use Thread Local Handshake as in this webrev. It is
>>> an example for GetOneCurrentContendedMonitor().
>>
>> True, but at the moment handshakes involve the VMThread. There is 
>> work being done to support direct thread-to-thread handshakes and 
>> once that is done this kind of conversion should be more easily done. 
>> It might be worth waiting for that.
>
> Thanks, I will be back to this topic when thread-to-thread handshake 
> is done.
> I wondered at first why the VMThread is involved in handshakes. This
> improvement is welcome for me ;)
>
>
> Cheers,
>
> Yasumasa
>
>
>>> http://cr.openjdk.java.net/~ysuenaga/jvmti-thread-local-handshake/
>>
>> An observation, it seems to me that calling_thread is not used when 
>> this is not a VMOperation.
>>
>> Cheers,
>> David
>>
>>> Also I think we can replace the following VM operations with Thread
>>> Local Handshakes:
>>>
>>> class VM_GetCurrentLocation
>>> class VM_EnterInterpOnlyMode
>>> class VM_UpdateForPopTopFrame
>>> class VM_SetFramePop
>>> class VM_GetOwnedMonitorInfo
>>> class VM_GetCurrentContendedMonitor
>>> class VM_GetFrameCount
>>> class VM_GetFrameLocation
>>>
>>> What do you think?
>>> If it is acceptable, I will file it in JBS and send a review request.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
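
The difference discussed above (running an operation against one target thread instead of stopping all threads) can be modeled conceptually in plain Java. The sketch below is an assumption-laden toy: HotSpot implements handshakes in C++ with safepoint polls, and at the time of this thread the VMThread was still involved, none of which is modeled here. HandshakeSketch, Worker and handshake() are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical toy model of a per-thread handshake; not HotSpot code.
final class HandshakeSketch {

    /** A worker that checks for a pending operation at each poll point. */
    static final class Worker extends Thread {
        private final AtomicReference<Runnable> pending = new AtomicReference<>();
        private volatile boolean running = true;

        Worker(String name) {
            super(name);
            setDaemon(true); // do not keep the JVM alive for the demo
        }

        @Override
        public void run() {
            while (running) {
                Runnable op = pending.getAndSet(null); // poll point
                if (op != null) {
                    op.run(); // executed by the target thread itself
                }
                Thread.yield(); // stand-in for a unit of real work
            }
        }

        /** Asks only this worker to run op and blocks until it has done so. */
        void handshake(Runnable op) {
            CountDownLatch done = new CountDownLatch(1);
            Runnable wrapped = () -> {
                try {
                    op.run();
                } finally {
                    done.countDown();
                }
            };
            // Wait until the single operation slot is free, then publish.
            while (!pending.compareAndSet(null, wrapped)) {
                Thread.yield();
            }
            try {
                done.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }

        void shutdown() {
            running = false;
        }
    }

    /** Runs one operation on worker-a only; worker-b is never stopped. */
    static String demo() {
        Worker a = new Worker("worker-a");
        Worker b = new Worker("worker-b");
        a.start();
        b.start();
        String[] ranOn = new String[1];
        a.handshake(() -> ranOn[0] = Thread.currentThread().getName());
        a.shutdown();
        b.shutdown();
        return ranOn[0];
    }

    public static void main(String[] args) {
        System.out.println("operation ran on " + demo());
    }
}
```

In this model the requester waits only for the one target thread, which is the property that makes a conversion from VM operations attractive for functions that inspect a single thread.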


From poonam.bajaj at oracle.com  Tue Mar 31 16:19:23 2020
From: poonam.bajaj at oracle.com (Poonam Parhar)
Date: Tue, 31 Mar 2020 09:19:23 -0700
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com>
References: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>
 <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
 <a9590dc7-867b-2f13-6772-5bc06c4d1a86@oracle.com>
 <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com>
Message-ID: <bd869217-46b8-1181-19cc-311d54aadd26@oracle.com>

Hello Coleen,

Does the removal of this code only impact the 'reattach' functionality, 
and it does not affect any commands available in 'clhsdb' once it is 
attached to a core file? If that's true, then I think it should be okay 
to remove this code.

Thanks,
Poonam

On 3/31/20 5:34 AM, coleen.phillimore at oracle.com wrote:
>
> To answer my own question, this functionality is used to allow
> detach/reattach from {cl}hsdb.  Which seems to work on linux but not
> windows with this code removed.
>
> The next question is whether this is useful functionality to justify
> all this code (900+ and this new code that Magnus has added).  Can't
> you just exit and restart the clhsdb process on the core file or process?
>
> For the record, this is me playing with python to remove this code.
>
> http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html
>
> Thanks,
> Coleen
>
> On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote:
>>
>> I was wondering why this is needed when debugging a core file, which 
>> is the key thing we need the SA for:
>>
>>   /** This is used by both the debugger and any runtime system. It is
>>       the basic mechanism by which classes which mimic underlying VM
>>       functionality cause themselves to be initialized. The given
>>       observer will be notified (with arguments (null, null)) when the
>>       VM is re-initialized, as well as when it registers itself with
>>       the VM. */
>>   public static void registerVMInitializedObserver(Observer o) {
>>     vmInitializedObservers.add(o);
>>     o.update(null, null);
>>   }
>>
>> It seems like if it isn't needed, we shouldn't add these classes and
>> should remove their use instead.
>>
>> Coleen
>>
>> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
>>> No opinions on this?
>>>
>>> /Magnus
>>>
>>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>>>> Hi everyone,
>>>>
>>>> As a follow-up to the ongoing review for JDK-8241618, I have also 
>>>> looked at fixing the deprecation warnings in jdk.hotspot.agent. 
>>>> These fall in three broad categories:
>>>>
>>>> * Deprecation of the boxing type constructors (e.g. "new 
>>>> Integer(42)").
>>>>
>>>> * Deprecation of java.util.Observer and Observable.
>>>>
>>>> * The rest (mostly Class.newInstance(), and a small number of other
>>>> odd deprecations)
>>>>
>>>> The first category is trivial to fix. The last category needs some
>>>> special discussion. But the overwhelming majority of deprecation 
>>>> warnings come from the use of Observer and Observable. This really 
>>>> dwarfs anything else, and needs to be handled first, otherwise it's 
>>>> hard to even spot the other issues.
>>>>
>>>> My analysis of the situation is that the deprecation of Observer 
>>>> and Observable seems a bit harsh, from the PoV of 
>>>> jdk.hotspot.agent. Sure, it might be limited, but I think it does 
>>>> exactly what is needed here. So the migration suggested in 
>>>> Observable (java.beans or java.util.concurrent) seems overkill. If 
>>>> there are genuine threading issues at play here, this assumption 
>>>> might be wrong, and then maybe going the j.u.c. route is correct.
>>>>
>>>> But if that's not the case, the main goal should be to stay with the
>>>> current implementation. One way to do this is to sprinkle the code with
>>>> @SuppressWarnings. But I think a better way would be to just
>>>> implement our own Observer and Observable. After all, the classes 
>>>> are trivial.
>>>>
>>>> I've made a mock-up of this solution, where I just copied the
>>>> java.util.Observer and Observable, and removed the deprecation 
>>>> annotations. The only thing needed for the rest of the code is to 
>>>> make sure we import these; I've done this for three arbitrarily 
>>>> selected classes just to show what the change would typically look 
>>>> like. Here's the mock-up:
>>>>
>>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>>>
>>>> Let me know what you think.
>>>>
>>>> /Magnus
>>>
>>
>
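
For reference, the "implement our own Observer and Observable" idea quoted
above could look roughly like the following. This is only a minimal sketch
and not the code from the webrev: the names SimpleObserver, SimpleObservable
and VmInitDemo are made up for illustration, and the behaviour (a "changed"
flag, notification on a snapshot of the observer list) is assumed to mirror
what java.util.Observable does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for java.util.Observer (name is an assumption).
interface SimpleObserver {
    void update(SimpleObservable source, Object arg);
}

// Hypothetical stand-in for java.util.Observable (name is an assumption).
class SimpleObservable {
    private final List<SimpleObserver> observers = new ArrayList<>();
    private boolean changed;

    public synchronized void addObserver(SimpleObserver o) {
        if (o == null) {
            throw new NullPointerException("observer");
        }
        if (!observers.contains(o)) {
            observers.add(o);
        }
    }

    public synchronized void deleteObserver(SimpleObserver o) {
        observers.remove(o);
    }

    protected synchronized void setChanged() {
        changed = true;
    }

    public synchronized boolean hasChanged() {
        return changed;
    }

    // Notifies observers only if setChanged() was called since the last
    // notification; observers are called outside the lock on a snapshot.
    public void notifyObservers(Object arg) {
        SimpleObserver[] snapshot;
        synchronized (this) {
            if (!changed) {
                return;
            }
            snapshot = observers.toArray(new SimpleObserver[0]);
            changed = false;
        }
        for (SimpleObserver o : snapshot) {
            o.update(this, arg);
        }
    }
}

// Illustrative subject, loosely modelled on the "VM re-initialized" event.
class VmInitDemo extends SimpleObservable {
    void reinitialize() {
        setChanged();
        notifyObservers(null);
    }
}
```

With something like this in place, the rest of the code would only need to
change its imports, which is the point made in the proposal.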


From serguei.spitsyn at oracle.com  Tue Mar 31 16:59:52 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 31 Mar 2020 09:59:52 -0700
Subject: Thread Local Handshake in JVMTI functions
In-Reply-To: <64323780-b96e-0e3a-5f61-8f1dd74a1805@oracle.com>
References: <9ecf6856-f5c7-4723-7cc9-7d257e7bb7c0@oss.nttdata.com>
 <e2f3b51c-be7a-cd27-caf9-8d2f5e3afc1c@oracle.com>
 <4c9aa3ab-468d-eede-18f7-ac8d352575b6@oss.nttdata.com>
 <64323780-b96e-0e3a-5f61-8f1dd74a1805@oracle.com>
Message-ID: <8ced0d82-4125-7179-bac2-c8ce54807274@oracle.com>

Hi Yasumasa,

Yes, this work needs to be done.
I'll take a look at your webrev.

Thanks,
Serguei

On 3/31/20 07:41, Daniel D. Daugherty wrote:
> Add Robbin to this thread...
>
>
> This reminded me of the following RFE that Robbin filed:
>
>     JDK-8201641 JVMTI: GetThreadListStackTraces should use Thread-Local Handshakes
>     https://bugs.openjdk.java.net/browse/JDK-8201641
>
> We could update 8201641 to include everything that Yasumasa-san is 
> requesting.
> Would be a good place to track it...
>
> Dan
>
>
> On 3/31/20 7:40 AM, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> On 2020/03/31 19:16, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> On 31/03/2020 8:06 pm, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> Many JVMTI functions use VM Operations to get information. However
>>>> some of them need to stop only one thread - they don't need to stop
>>>> all threads.
>>>> So I think we can use Thread Local Handshake as in this webrev. It is
>>>> an example for GetOneCurrentContendedMonitor().
>>>
>>> True, but at the moment handshakes involve the VMThread. There is 
>>> work being done to support direct thread-to-thread handshakes and 
>>> once that is done this kind of conversion should be more easily 
>>> done. It might be worth waiting for that.
>>
>> Thanks, I will be back to this topic when thread-to-thread handshake 
>> is done.
>> At first I wondered why handshakes involve the VMThread. This improvement
>> is welcome ;)
>>
>>
>> Cheers,
>>
>> Yasumasa
>>
>>
>>>> http://cr.openjdk.java.net/~ysuenaga/jvmti-thread-local-handshake/
>>>
>>> An observation, it seems to me that calling_thread is not used when 
>>> this is not a VMOperation.
>>>
>>> Cheers,
>>> David
>>>
>>>> Also I think we can replace the following VM Operations with Thread
>>>> Local Handshakes:
>>>>
>>>> class VM_GetCurrentLocation
>>>> class VM_EnterInterpOnlyMode
>>>> class VM_UpdateForPopTopFrame
>>>> class VM_SetFramePop
>>>> class VM_GetOwnedMonitorInfo
>>>> class VM_GetCurrentContendedMonitor
>>>> class VM_GetFrameCount
>>>> class VM_GetFrameLocation
>>>>
>>>> What do you think?
>>>> If it is acceptable, I will file it in JBS and send a review request.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>


From mandy.chung at oracle.com  Tue Mar 31 18:06:58 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 31 Mar 2020 11:06:58 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
Message-ID: <940c6907-612e-8744-376c-5362991d4a42@oracle.com>

This patch addresses Joe's feedback on the CSR [1]:

http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03-delta-jdarcy/

Specifically, it adds to the class specification of java.lang.Class to
describe how the relevant methods behave for hidden classes.  In
addition, it uses the new inline @jvms tag.

Thanks
Mandy
[1] https://bugs.openjdk.java.net/browse/JDK-8238359

On 3/26/20 4:57 PM, Mandy Chung wrote:
> Please review the implementation of JEP 371: Hidden Classes. The main
> changes are in core-libs and hotspot runtime area.  Small changes are
> made in javac, VM compiler (intrinsification of Class::isHiddenClass),
> JFR, JDI, and jcmd.  CSR [1] has been reviewed and is in the finalized
> state (see specdiff and javadoc below for reference).
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 
>
>
> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
> point of view, a hidden class is a normal class except for the following:
>
> - A hidden class has no initiating class loader and is not registered
> in any dictionary.
> - A hidden class has a name containing an illegal character:
> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
> returns "Lp/Foo.0x1234;".
> - A hidden class is not modifiable, i.e. cannot be redefined or
> retransformed. JVM TI IsModifiableClass returns false on a hidden class.
> - Final fields in a hidden class are "final".  The value of final
> fields cannot be overridden via reflection.  setAccessible(true) can
> still be called on reflected objects representing final fields in a
> hidden class and its access check will be suppressed, but they only have
> read-access (i.e. can do Field::getXXX but not setXXX).
>
> Brief summary of this patch:
>
> 1. A new Lookup::defineHiddenClass method is the API to create a
>    hidden class.
> 2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG
>    options that can be specified when creating a hidden class.
> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
> 4. Field::setXXX method will throw IAE on a final field of a hidden class
>    regardless of the value of the accessible flag.
> 5. JVM_LookupDefineClass is the new JVM entry point for
>    Lookup::defineClass and defineHiddenClass to create a class from the
>    given bytes.
> 6. ClassLoaderData implementation is not changed.  There is one
>    primary CLD that holds the classes strongly referenced by its
>    defining loader.  There can be zero or more additional CLDs - one
>    per weak class.
> 7. Nest host determination is updated per revised JVMS 5.4.4. Access
>    control check no longer throws LinkageError but instead it will throw
>    IAE with a clear message if a class fails to resolve/validate the
>    nest host declared in NestHost/NestMembers attribute.
> 8. JFR, jcmd, JDI are updated to support hidden classes.
> 9. Update javac LambdaToMethod as lambda proxy starts using nestmates
>    and generate a bridge method to desugar a method reference to a
>    protected method in its supertype in a different package.
>
> This patch also updates StringConcatFactory, LambdaMetaFactory, and
> LambdaForms to use hidden classes.  The webrev includes changes in
> nashorn to use hidden classes, and I will update the webrev if JEP 372
> removes it any time soon.
>
> We uncovered a bug in Lookup::defineClass: the spec throws LinkageError
> and intends to have the newly created class linked.  However, the
> implementation in 14 does not link the class.  A separate CSR [2]
> proposes to update the implementation to match the spec.  This patch
> fixes the implementation.
>
> The spec update on JVM TI, JDI and Instrumentation will be done as
> a separate RFE [3].  This patch includes new tests for JVM TI and
> java.instrument that validate how the existing APIs work for hidden
> classes.
>
> javadoc/specdiff
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ 
>
>
> JVMS 5.4.4 change:
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf 
>
>
> CSR:
> https://bugs.openjdk.java.net/browse/JDK-8238359
>
> Thanks
> Mandy
> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
> [3] https://bugs.openjdk.java.net/browse/JDK-8230502


From leonid.mesnik at oracle.com  Tue Mar 31 19:09:30 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Tue, 31 Mar 2020 12:09:30 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <ed4167ad-681a-01f8-add6-c0f01188fefd@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
 <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
 <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>
 <cd4bb93e-b41c-a59f-ac9c-c1ea10a6a374@oracle.com>
 <ed4167ad-681a-01f8-add6-c0f01188fefd@oracle.com>
Message-ID: <b08d3b9c-3198-2658-1aac-043dbab330bc@oracle.com>

Hi

On 3/30/20 9:43 PM, Chris Plummer wrote:
> Hi Leonid,
>
> On 3/30/20 5:42 PM, Leonid Mesnik wrote:
>> Hi
>>
>> See my comments inline. I will update webrev after go through all 
>> your comments.
>>
>>
>> On 3/30/20 11:39 AM, Chris Plummer wrote:
>>> Hi Leonid,
>>>
>>> I haven't gone through all the tests yet.  I've accumulated enough
>>> questions that I'd like to see them answered or addressed before I 
>>> continue on.
>>>
>>> This isn't directly related to your changes, but I noticed that 
>>> users of JDKToolLauncher do nothing to make sure that default test 
>>> options are used. This means we are never running these tools with 
>>> the test options being specified with the jtreg run. Is that a bug 
>>> or intentional?
>>
>> Which "default test options" do you mean? We have 2 properties to set
>> JVM options. The idea is to pass test.vm.opts to ALL java processes
>> and test.java.opts to only tested processes if applicable. Usually,
>> for example we don't want to run jcmd with -Xcomp. test.vm.opts was
>> used (a long time ago) for options like '-d32/-d64' on Solaris where
>> the JVM doesn't start without choosing the correct version. Also, it is
>> used to reduce maximum heap for all JVM instances when tests are running
>> concurrently.
>>
>> So, probably test.vm.opts (or test.vm.tools.opts) should be added by
>> JDKToolLauncher but not test.java.opts. It is a separate topic; there
>> are a lot of launchers which ignore test.vm.opts now.
> I always get confused about which set of options these properties 
> represent, but basically I'm suggesting that if for example we are 
> doing a -Xcomp run in mach5, JDKToolLauncher (at least in some cases) 
> should be launched with this option. I think this is what you get from
> Utils.getTestJavaOpts().
>
> For example the SA tests use
> JDKToolLauncher.createUsingTestJDK("jhsdb"). jhsdb is what is really
> being tested here, and it should be launched with the test vm options.
> Currently we launch the target process with these options, which is
> probably also a good idea.  Also we aren't too concerned with the
> options that the test itself is run with, although I'm guessing they
> also get run with the test java opts. So we have 3 processes here:
>  - jhsdb, which should be getting test java opts but is not
>  - the target process, which should be getting test java opts and
> currently is
>  - the test itself, where options don't really matter, but is getting
> passed test java opts
>
> However, you could argue that tests like jinfo, jstack, and jcmd, all 
> of which use the Attach API and the bulk of the work is done on the 
> target process, are not that concerned with the options passed to the 
> command, but do want the options passed to the target process.

Well, it is a good question whether we want to run the jhsdb tool itself
with additional slow options like Xcomp. Does it help us to improve coverage?
IIRC the original idea of adding test.java/vm.opts was to avoid wasting
time executing javac and debuggers in slow mode on SPARC.

Anyway, it is a separate question which is out of scope of this change.
We might want to review all debugger/debuggee tests to find a better way to
deal with this.

>>
>>>
>>> In the problem lists, is it necessary to list the test multiple 
>>> times with #id0, #id1, etc, or could you list it just once and leave 
>>> that part off. It seems very error prone. Also, changing tests like 
>>> ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split out the 
>>> testing in this manner seems completely unrelated to this CR, 
>>> especially when the tests do not even contain any changes related to 
>>> the CR.
>>
>> I think that these changes are related. The startApp(...) was updated
>> so some test combinations become invalid or redundant.
>>
>> ClhsdbFindPC and ClhsdbJstack were always run twice. Now, when test
>> options are passed to the test, it is not needed to run it twice when
>> Xcomp is already set by the user.
>>
> Ok. I see now that the second test run, which is the non -Xcomp run, 
> adds '@requires vm.compMode != "Xcomp"'. But this also is strange. The 
> first test run, which does not have the @requires and is the one that 
> makes LingeredApp launch with -Xcomp, will always run whether or not 
> it is an -Xcomp test run. So it will run as part of a regular test
> run and as part of a -Xcomp test run. The only difference between the
> two is the -Xcomp run will also run the test with -Xcomp, but that's
> not really needed (I think it will also end up passing -Xcomp to the
> target process twice). Perhaps '@requires vm.compMode == "Xcomp"'
> should be used for the first test run, but that means it no longer 
> gets run until later tiers when we use -Xcomp. Why not revert it back 
> to a single test, but also add '@requires vm.compMode != "Xcomp"'. 
> Then it gets run both ways in an early tier and not run during the 
> -Xcomp run, which isn't really needed.

There are several flags which are executed with Xcomp only:
"-XX:-DoEscapeAnalysis",  "-XX:-UseBiasedLocking", "-XX:+DeoptimizeALot",
where this test would be skipped. So we would never run the test with these
options.

The original idea is to run the test with the given options and with added
Xcomp.  I left the logic the same and only skip the run with "Xcomp" when it
is already set by the user. I agree that we have some duplication here and it
could be improved, but it could be done separately. If you are ok with
this, let me file a separate RFE for this.

>
>> ClhsdbScanOops is fixed to not allow running an incompatible GC
>> combination.
> Ok
>>
>> So I should update these tests by splitting them or change them to
>> startAppExactJvmOpts() if we want to continue to ignore user-given test
>> options.
> I don't think I was suggesting removing user-given test options. I 
> don't see why you would.

I just wanted to say that these tests are affected by my changes and 
should be fixed anyway.

Leonid

>>
>> It seems that #idN are required by jtreg now, otherwise it just runs
>> the test.
> Ok.
>>
>>>
>>>  426     public static LingeredApp startApp(String...
>>> additionalJvmOpts) throws IOException {
>>>
>>> The default test opts are appended to additionalJvmOpts, and if you 
>>> want prepended you need to call Utils.prependTestJavaOpts(). I would 
>>> have thought the opposite would be more desirable and expected 
>>> default behavior. Why did you choose this way? I also find it 
>>> somewhat confusing that there is even a default mode for where the 
>>> additionalJvmOpts go. Maybe it would be best to have 
>>> startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make it 
>>> explicit. This would also be in line with the existing 
>>> startAppExactJvmOpts().
>>>
>> I've chosen the most popular usage, which was
>> Utils.appendTestJavaOpts. But I agree that it would be better to
>> change it to prepend. Thanks for pointing this out.
>>
>> I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs(),
>> to avoid complicating things. I think that startApp() should be
>> used in the cases when test vm options really shouldn't interfere
>> with user-provided options or overwrite them. So basically the
>> behavior is the same as for
>> ProcessTools.createJavaProcessBuilder(true, ...) and jtreg itself.
>>
> Ok.
>>
>>> Is ClhsdbFindPC correct? It used to just use -Xcomp or -Xint,
>>> ignoring any default test opts. You've fixed it to include the
>>> default test opts, but they are appended, possibly overriding the
>>> -Xcomp or -Xint. Don't we want the default test opts prepended? Same 
>>> for ClhsdbJstack.
>>
>> The idea is to not mix Xcomp and Xmixed/Xint, using the requires filter.
>> However ClhsdbFindPC might override Xint with Xmixed if it is set 
>> explicitly. Switching to prepending will fix it.
> Yes, that's what I was thinking and one reason I thought that should 
> be default behavior.
>
> thanks,
>
> Chris
>>
>> Leonid
>>
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/25/20 2:31 PM, Leonid Mesnik wrote:
>>>>
>>>> Igor, Stefan, Ioi
>>>>
>>>> Thank you for your feedback.
>>>>
>>>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 to change
>>>> @run main... to @run driver.
>>>>
>>>> Test ClhsdbJstack.java is updated.
>>>>
>>>> Still waiting for review from SVC team.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
>>>>
>>>> Leonid
>>>>
>>>> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>>>>> Hi Leonid,
>>>>>
>>>>> not related to your patch (but yet somewhat made more
>>>>> obvious by it), it seems all (or at least almost all) the tests
>>>>> which use LingeredApp should be run in "driver" mode as they just
>>>>> orchestrate execution of other JVMs, so running them w/ main (let
>>>>> alone main/othervm) just wastes time,
>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for
>>>>> example, will now be executed w/ Xcomp which will make it very slow
>>>>> for no reason. since you already got your hands dirty w/ these
>>>>> tests, could you please file an RFE to sort this out and list all
>>>>> the affected tests there?
>>>>>
>>>>> re: the patch, could you please update ClhsdbJstack.java test not
>>>>> to be run w/ Xcomp and follow the same pattern you used in other
>>>>> tests (e.g. ClhsdbScanOops)? other than that it looks fine to me,
>>>>> I however wouldn't be able to tell if all svc tests continue to do
>>>>> what they were supposed to, so I'd prefer for someone from svc
>>>>> team to chime in.
>>>>>
>>>>> Thanks,
>>>>> -- Igor
>>>>>
>>>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik
>>>>>> <leonid.mesnik at oracle.com> wrote:
>>>>>>
>>>>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>>>>>
>>>>>> Please find new webrev: 
>>>>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>>>>>
>>>>>> Renamed startAppVmOpts/runAppVmOpts to
>>>>>> "startAppExactJvmOpts/runAppExactJvmOpts". It should make it
>>>>>> very clear that this method doesn't use any of test.java.opts,
>>>>>> test.vm.opts.
>>>>>>
>>>>>> Also, I fixed
>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java mentioned
>>>>>> by Igor, and removed the null pointer check as Ioi suggested in the
>>>>>> startApp method.
>>>>>>
>>>>>> + public static void startApp(LingeredApp theApp, String... 
>>>>>> additionalJvmOpts) throws IOException {
>>>>>> + startAppExactJvmOpts(theApp, 
>>>>>> Utils.appendTestJavaOpts(additionalJvmOpts));
>>>>>> + }
>>>>>>
>>>>>> Leonid
>>>>>>
>>>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>>>>>> Hi Leonid,
>>>>>>>>
>>>>>>>> I have briefly looked at the patch, a few comments so far:
>>>>>>>>
>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>>>>>   - at L#114, could you please call static method using class
>>>>>>>> name (as opposed to using an instance)? or was it meant to be
>>>>>>>> theApp.runAppVmOpts(vmArgs) ?
>>>>>>>>
>>>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) 
>>>>>>>> isn't correct
>>>>>>>> - I don't like startAppVmOpts name, but unfortunately don't 
>>>>>>>> have a better suggestion (yet)
>>>>>>>
>>>>>>> I was going to say the same. Jtreg has the concept of "java 
>>>>>>> options" and "vm options". We have had a fair share of bugs and 
>>>>>>> wasted time when tests have been using the "vm options" part 
>>>>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away 
>>>>>>> from using that way to pass options. I recently cleaned up some 
>>>>>>> of this with:
>>>>>>>
>>>>>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>>>>>
>>>>>>> Because of this, I would prefer if we used a name that doesn't
>>>>>>> include "VmOpts", because it's too similar to the other concept. Some
>>>>>>> suggestions:
>>>>>>>  startAppJavaOptions
>>>>>>>  startAppUsingJavaOptions
>>>>>>>  startAppWithJavaOptions
>>>>>>>  startAppExactJavaOptions
>>>>>>>  startAppJvmOptions
>>>>>>>
>>>>>>> Thanks,
>>>>>>> StefanK
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -- Igor
>>>>>>>>
>>>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>>>>>>>> <leonid.mesnik at oracle.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> Could you please review the following fix which changes LingeredApp
>>>>>>>>> to prepend vm options to java/vm.test.opts when startApp is
>>>>>>>>> used and provides startAppVmOpts to override options completely.
>>>>>>>>>
>>>>>>>>> The intention is to avoid issues like the one in this bug where
>>>>>>>>> test/jtreg options were ignored by tests. Also I fixed some 
>>>>>>>>> tests where intention was to append vm options rather than to 
>>>>>>>>> override them.
>>>>>>>>>
>>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>>>
>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>>>
>>>>>>>>> Leonid
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>>
>
>
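
To make the prepend-vs-append question discussed above concrete, here is a
small self-contained sketch. It is not the jdk.test.lib code: the class
OptionOrderSketch and its helpers are hypothetical, and effectiveMode() only
models the assumption that, for the mutually exclusive mode flags
(-Xint, -Xcomp, -Xmixed), the last occurrence on the command line is the one
that takes effect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of jdk.test.lib.
class OptionOrderSketch {

    // Harness-provided options come first, the test's own options last,
    // so the test's explicit choice is the last one on the command line.
    static List<String> harnessOptsFirst(List<String> harnessOpts, String... testOpts) {
        List<String> cmd = new ArrayList<>(harnessOpts);
        cmd.addAll(Arrays.asList(testOpts));
        return cmd;
    }

    // The test's own options come first, harness options last, so a
    // harness-provided flag can override what the test asked for.
    static List<String> harnessOptsLast(List<String> harnessOpts, String... testOpts) {
        List<String> cmd = new ArrayList<>(Arrays.asList(testOpts));
        cmd.addAll(harnessOpts);
        return cmd;
    }

    // Models "last one wins" for the execution-mode flags (an assumption
    // stated in the lead-in, used here only to compare the two orderings).
    static String effectiveMode(List<String> cmd) {
        String mode = "-Xmixed";
        for (String opt : cmd) {
            if (opt.equals("-Xint") || opt.equals("-Xcomp") || opt.equals("-Xmixed")) {
                mode = opt;
            }
        }
        return mode;
    }
}
```

With harness options of "-Xmixed -Xmx256m" and a test that asks for "-Xint",
harnessOptsFirst() keeps the test's -Xint in effect, while harnessOptsLast()
lets -Xmixed override it, which is the ClhsdbFindPC concern raised above.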

From mandy.chung at oracle.com  Tue Mar 31 19:25:53 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 31 Mar 2020 12:25:53 -0700
Subject: Review Request: 8238358: Implementation of JEP 371: Hidden Classes
In-Reply-To: <940c6907-612e-8744-376c-5362991d4a42@oracle.com>
References: <0d824617-3eaf-727c-6eb8-be2414111510@oracle.com>
 <940c6907-612e-8744-376c-5362991d4a42@oracle.com>
Message-ID: <cfe9c4ae-f70e-6408-9d8b-70f02158ff3a@oracle.com>

Alex's feedback:  rename isHiddenClass to isHidden as it can be a hidden
class or interface.

`isLocalClass` and `isAnonymousClass` are correct because the Java
language only has local classes and anon classes, not local interfaces
or anon. interfaces.  `isHidden` is like `isSynthetic`, it could be a
class or interface.

Although isHiddenClass seems clearer, I'm okay to rename it to `isHidden`.

Mandy

On 3/31/20 11:06 AM, Mandy Chung wrote:
> This patch addresses Joe's feedback on the CSR [1]:
>
> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03-delta-jdarcy/ 
>
>
> Specifically, it adds to the class specification of java.lang.Class to
> describe how the relevant methods behave for hidden classes.  In
> addition, it uses the new inline @jvms tag.
>
> Thanks
> Mandy
> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>
> On 3/26/20 4:57 PM, Mandy Chung wrote:
>> Please review the implementation of JEP 371: Hidden Classes. The main
>> changes are in core-libs and hotspot runtime area.  Small changes are
>> made in javac, VM compiler (intrinsification of
>> Class::isHiddenClass), JFR, JDI, and jcmd.  CSR [1] has been reviewed
>> and is in the finalized state (see specdiff and javadoc below for
>> reference).
>>
>> Webrev:
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/webrev.03 
>>
>>
>> A hidden class is created via `Lookup::defineHiddenClass`. From the JVM's
>> point of view, a hidden class is a normal class except for the following:
>>
>> - A hidden class has no initiating class loader and is not registered
>> in any dictionary.
>> - A hidden class has a name containing an illegal character:
>> `Class::getName` returns `p.Foo/0x1234` whereas `GetClassSignature`
>> returns "Lp/Foo.0x1234;".
>> - A hidden class is not modifiable, i.e. cannot be redefined or
>> retransformed. JVM TI IsModifiableClass returns false on a hidden class.
>> - Final fields in a hidden class are "final".  The value of final
>> fields cannot be overridden via reflection.  setAccessible(true) can
>> still be called on reflected objects representing final fields in a
>> hidden class and its access check will be suppressed, but they only have
>> read-access (i.e. can do Field::getXXX but not setXXX).
>>
>> Brief summary of this patch:
>>
>> 1. A new Lookup::defineHiddenClass method is the API to create a
>>    hidden class.
>> 2. A new Lookup.ClassOption enum class defines the NESTMATE and STRONG
>>    options that can be specified when creating a hidden class.
>> 3. A new Class::isHiddenClass method tests if a class is a hidden class.
>> 4. Field::setXXX method will throw IAE on a final field of a hidden
>>    class regardless of the value of the accessible flag.
>> 5. JVM_LookupDefineClass is the new JVM entry point for
>>    Lookup::defineClass and defineHiddenClass to create a class from the
>>    given bytes.
>> 6. ClassLoaderData implementation is not changed.  There is one
>>    primary CLD that holds the classes strongly referenced by its
>>    defining loader.  There can be zero or more additional CLDs - one
>>    per weak class.
>> 7. Nest host determination is updated per revised JVMS 5.4.4. Access
>>    control check no longer throws LinkageError but instead it will throw
>>    IAE with a clear message if a class fails to resolve/validate the
>>    nest host declared in NestHost/NestMembers attribute.
>> 8. JFR, jcmd, JDI are updated to support hidden classes.
>> 9. Update javac LambdaToMethod as lambda proxy starts using nestmates
>>    and generate a bridge method to desugar a method reference to a
>>    protected method in its supertype in a different package.
>>
>> This patch also updates StringConcatFactory, LambdaMetaFactory, and
>> LambdaForms to use hidden classes.  The webrev includes changes in
>> nashorn to use hidden classes, and I will update the webrev if JEP 372
>> removes it any time soon.
>>
>> We uncovered a bug in Lookup::defineClass: the spec throws LinkageError
>> and intends to have the newly created class linked.  However, the
>> implementation in 14 does not link the class.  A separate CSR [2]
>> proposes to update the implementation to match the spec.  This patch
>> fixes the implementation.
>>
>> The spec update on JVM TI, JDI and Instrumentation will be done as
>> a separate RFE [3].  This patch includes new tests for JVM TI and
>> java.instrument that validate how the existing APIs work for hidden
>> classes.
>>
>> javadoc/specdiff
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/api/
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/specdiff/ 
>>
>>
>> JVMS 5.4.4 change:
>> http://cr.openjdk.java.net/~mchung/valhalla/webrevs/hidden-classes/Draft-JVMS-HiddenClasses.pdf 
>>
>>
>> CSR:
>> https://bugs.openjdk.java.net/browse/JDK-8238359
>>
>> Thanks
>> Mandy
>> [1] https://bugs.openjdk.java.net/browse/JDK-8238359
>> [2] https://bugs.openjdk.java.net/browse/JDK-8240338
>> [3] https://bugs.openjdk.java.net/browse/JDK-8230502
>
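
As an illustration of the API summarized above, the following is a minimal
sketch of defining a hidden class and observing its name. It assumes a JDK
where the feature has shipped (15 or later) and where the query method has
the name `isHidden` agreed on in this thread. The classes HiddenTemplate and
HiddenClassSketch are made up for the example, and reusing the class file of
an ordinary class as the input bytes is just a convenient way to get valid
bytes; it also assumes that class file can be found as a resource next to
the lookup class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.invoke.MethodHandles;

// Ordinary class whose class file is reused as input (hypothetical name).
class HiddenTemplate {
    @Override
    public String toString() {
        return "hello from a template";
    }
}

// Hypothetical demo class; must be in the same package as HiddenTemplate.
class HiddenClassSketch {

    // Defines a hidden class from HiddenTemplate's bytes and returns it.
    static Class<?> defineHidden() {
        String resource = HiddenTemplate.class.getSimpleName() + ".class";
        try (InputStream in = HiddenClassSketch.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            byte[] bytes = in.readAllBytes();
            // The lookup has HiddenClassSketch as its lookup class, so the
            // hidden class is defined in the same package and loader.
            MethodHandles.Lookup hiddenLookup =
                    MethodHandles.lookup().defineHiddenClass(bytes, true);
            return hiddenLookup.lookupClass();
        } catch (IOException | IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling HiddenClassSketch.defineHidden().getName() should give a name of the
form described above, i.e. the template's name followed by '/' and a suffix,
and isHidden() on the returned class should report true.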


From chris.plummer at oracle.com  Tue Mar 31 20:32:34 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 31 Mar 2020 13:32:34 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <b08d3b9c-3198-2658-1aac-043dbab330bc@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
 <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
 <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>
 <cd4bb93e-b41c-a59f-ac9c-c1ea10a6a374@oracle.com>
 <ed4167ad-681a-01f8-add6-c0f01188fefd@oracle.com>
 <b08d3b9c-3198-2658-1aac-043dbab330bc@oracle.com>
Message-ID: <8d0cfc5c-f622-2ac9-aecc-8d398f6d3f2e@oracle.com>

On 3/31/20 12:09 PM, Leonid Mesnik wrote:
> Hi
>
> On 3/30/20 9:43 PM, Chris Plummer wrote:
>> Hi Leonid,
>>
>> On 3/30/20 5:42 PM, Leonid Mesnik wrote:
>>> Hi
>>>
>>> See my comments inline. I will update webrev after go through all 
>>> your comments.
>>>
>>>
>>> On 3/30/20 11:39 AM, Chris Plummer wrote:
>>>> Hi Leonid,
>>>>
>>>> I haven't gone through all the tests yet.  I've accumulated enough
>>>> questions that I'd like to see them answered or addressed before I 
>>>> continue on.
>>>>
>>>> This isn't directly related to your changes, but I noticed that 
>>>> users of JDKToolLauncher do nothing to make sure that default test 
>>>> options are used. This means we are never running these tools with 
>>>> the test options being specified with the jtreg run. Is that a bug 
>>>> or intentional?
>>>
>>> Which "default test options" do you mean? We have 2 properties to
>>> set JVM options. The idea is to pass test.vm.opts to ALL java
>>> processes and test.java.opts to only tested processes if
>>> applicable. Usually, for example we don't want to run jcmd with
>>> -Xcomp. test.vm.opts was used (a long time ago) for options like
>>> '-d32/-d64' on Solaris where the JVM doesn't start without choosing
>>> the correct version. Also, it is used to reduce maximum heap for all
>>> JVM instances when tests are running concurrently.
>>>
>>> So, probably test.vm.opts (or test.vm.tools.opts) should be added by
>>> JDKToolLauncher but not test.java.opts. It is a separate topic; there
>>> are a lot of launchers which ignore test.vm.opts now.
>> I always get confused about which set of options these properties 
>> represent, but basically I'm suggesting that if for example we are 
>> doing a -Xcomp run in mach5, JDKToolLauncher (at least in some cases) 
>> should be launched with this option. I think this is what you get
>> from Utils.getTestJavaOpts().
>>
>> For example the SA tests use
>> JDKToolLauncher.createUsingTestJDK("jhsdb"). jhsdb is what is really
>> being tested here, and it should be launched with the test vm
>> options. Currently we launch the target process with these options,
>> which is probably also a good idea.  Also we aren't too concerned
>> with the options that the test itself is run with, although I'm
>> guessing they also get run with the test java opts. So we have 3
>> processes here:
>>  - jhsdb, which should be getting test java opts but is not
>>  - the target process, which should be getting test java opts and
>> currently is
>>  - the test itself, where options don't really matter, but is getting
>> passed test java opts
>>
>> However, you could argue that tests like jinfo, jstack, and jcmd, all 
>> of which use the Attach API and the bulk of the work is done on the 
>> target process, are not that concerned with the options passed to the 
>> command, but do want the options passed to the target process.
>
> Well, it is a good question if we want to run jhsdb tool itself with 
> additional slow options like Xcomp. Does it help us to improve 
> coverage? IIRC the original idea of adding test.java/vm.opts was to 
> avoid wasting time executing javac and debuggers in slow mode on SPARC.
>
> Anyway, it is a separate question which is out of scope of this 
> change. We might want to review all debugger/debuggee tests to find 
> a better way to deal with this.
Might be good to get an RFE filed for this.
>
>>>
>>>>
>>>> In the problem lists, is it necessary to list the test multiple 
>>>> times with #id0, #id1, etc, or could you list it just once and 
>>>> leave that part off. It seems very error prone. Also, changing 
>>>> tests like ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split 
>>>> out the testing in this manner seems completely unrelated to this 
>>>> CR, especially when the tests do not even contain any changes 
>>>> related to the CR.
>>>
>>> I think that these changes are related. The startApp(...) was 
>>> updated so some test combinations become invalid or redundant.
>>>
>>> ClhsdbFindPC and ClhsdbJstack were always run twice. Now that test 
>>> options are passed to the test, there is no need to run it twice when 
>>> Xcomp is already set by the user.
>>>
>> Ok. I see now that the second test run, which is the non -Xcomp run, 
>> adds '@requires vm.compMode != "Xcomp"'. But this also is strange. 
>> The first test run, which does not have the @requires and is the one 
>> that makes LingeredApp launch with -Xcomp, will always run whether or 
>> not it is an -Xcomp test run. So it will run as part of a regular 
>> test run and as part of a -Xcomp test run. The only difference 
>> between the two is the -Xcomp run will also run the test with -Xcomp, 
>> but that's not really needed (I think it will also end up passing 
>> -Xcomp to the target process twice). Perhaps '@requires vm.compMode 
>> == "Xcomp"' should be used for the first test run, but that means it 
>> no longer gets run until later tiers when we use -Xcomp. Why not 
>> revert it back to a single test, but also add '@requires vm.compMode 
>> != "Xcomp"'. Then it gets run both ways in an early tier and not run 
>> during the -Xcomp run, which isn't really needed.
>
> There are several flags which are run with Xcomp only: 
> "-XX:-DoEscapeAnalysis", "-XX:-UseBiasedLocking", 
> "-XX:+DeoptimizeALot", where this test is going to be skipped. So we 
> would never run the test with these options.
>
> The original idea is to run test with given options and with added 
> Xcomp. I left the logic the same and only skip the run with "Xcomp" when it 
> is set already by user. I agree that we have some duplication here and 
> it could be improved, but it could be done separately. If you are ok 
> with this let me file separate RFE for this.
Ok.
>
>>
>>> ClhsdbScanOops is fixed so that it does not run with an incompatible 
>>> GC combination.
>> Ok
>>>
>>> So I should update these tests by splitting them or change them to 
>>> startAppExactJvmOpts() if we want to continue to ignore user-given 
>>> test options.
>> I don't think I was suggesting removing user-given test options. I 
>> don't see why you would.
>
> I just wanted to say that these tests are affected by my changes and 
> should be fixed anyway.
Ok.

So I think the one change you agreed to make is have the default be to 
append test vm opts rather than prepend them. Let me know when you have 
a new webrev.
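
To illustrate why the order matters, here is a minimal, self-contained sketch. It does not use the real jdk.test.lib.Utils helpers: TEST_OPTS, prependTestOpts, appendTestOpts and effectiveMode are hypothetical names, and the "last mode flag on the command line wins" rule is an assumption modeled on the ClhsdbFindPC case discussed in this thread, where an appended -Xmixed could override the test's own -Xint.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OptOrderSketch {
    // Hypothetical stand-in for what jtreg passes via test.vm.opts/test.java.opts.
    static final List<String> TEST_OPTS = List.of("-Xmixed");

    // Test options first, the test's own options last.
    static List<String> prependTestOpts(String... ownOpts) {
        List<String> cmd = new ArrayList<>(TEST_OPTS);
        cmd.addAll(Arrays.asList(ownOpts));
        return cmd;
    }

    // The test's own options first, test options last.
    static List<String> appendTestOpts(String... ownOpts) {
        List<String> cmd = new ArrayList<>(Arrays.asList(ownOpts));
        cmd.addAll(TEST_OPTS);
        return cmd;
    }

    // Assumption: among conflicting mode flags, the last one given takes effect.
    static String effectiveMode(List<String> cmd) {
        String mode = "-Xmixed";
        for (String opt : cmd) {
            if (opt.equals("-Xint") || opt.equals("-Xcomp") || opt.equals("-Xmixed")) {
                mode = opt;
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        // The test asks for -Xint while the run was started with -Xmixed.
        System.out.println("prepended: " + effectiveMode(prependTestOpts("-Xint")));
        System.out.println("appended:  " + effectiveMode(appendTestOpts("-Xint")));
    }
}
```

Under that assumption the prepended form keeps the test's own -Xint, while the appended form ends up running in -Xmixed.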

thanks,

Chris
>
> Leonid
>
>>>
>>> It seems that #idN are required by jtreg now, otherwise it just runs 
>>> the test.
>> Ok.
>>>
>>>>
>>>>  426     public static LingeredApp startApp(String... 
>>>> additionalJvmOpts) throws IOException {
>>>>
>>>> The default test opts are appended to additionalJvmOpts, and if you 
>>>> want prepended you need to call Utils.prependTestJavaOpts(). I 
>>>> would have thought the opposite would be more desirable and 
>>>> expected default behavior. Why did you choose this way? I also find 
>>>> it somewhat confusing that there is even a default mode for where 
>>>> the additionalJvmOpts go. Maybe it would be best to have 
>>>> startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make 
>>>> it explicit. This would also be in line with the existing 
>>>> startAppExactJvmOpts().
>>>>
>>> I've chosen the most popular usage, which was 
>>> Utils.appendTestJavaOpts. But I agree, that it would be better to 
>>> change it to prepend. Thanks for pointing to this.
>>>
>>> I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs() 
>>> to avoid complicating things. I think that startApp() should be 
>>> used in the cases when test vm options really shouldn't interfere 
>>> with user-provided options or overwrite them. So basically the 
>>> behavior is the same as for 
>>> ProcessTools.createJavaProcessBuilder(true, ...) and jtreg itself.
>>>
>> Ok.
>>>
>>>> Is ClhsdbFindPC correct? It used to just use -Xcomp or -Xint, 
>>>> ignoring any default test opts. You've fixed it to include the 
>>>> default test opts, but they are appended, possibly overriding the 
>>>> -Xcomp or -Xint. Don't we want the default test opts prepended? 
>>>> Same for ClhsdbJstack.
>>>
>>> The idea is to not mix Xcomp and Xmixed/Xint, using a requires 
>>> filter. However ClhsdbFindPC might override Xint with Xmixed if it 
>>> is set explicitly. Switching to prepending will fix it.
>> Yes, that's what I was thinking and one reason I thought that should 
>> be default behavior.
>>
>> thanks,
>>
>> Chris
>>>
>>> Leonid
>>>
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 3/25/20 2:31 PM, Leonid Mesnik wrote:
>>>>>
>>>>> Igor, Stefan, Ioi
>>>>>
>>>>> Thank you for your feedback.
>>>>>
>>>>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 to change 
>>>>> @run main... to @run driver.
>>>>>
>>>>> Test ClhsdbJstack.java is updated.
>>>>>
>>>>> Still waiting for review from SVC team.
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
>>>>>
>>>>> Leonid
>>>>>
>>>>> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>>>>>> Hi Leonid,
>>>>>>
>>>>>> not related to your patch (but yet somewhat made more 
>>>>>> obvious by it), it seems all (or at least almost all) the tests 
>>>>>> which use LingeredApp should be run in "driver" mode as they just 
>>>>>> orchestrate execution of other JVMs, so running them w/ main (let 
>>>>>> alone main/othervm) just wastes time, 
>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for 
>>>>>> example, will now be executed w/ Xcomp which will make it very slow 
>>>>>> for no reason. since you already got your hands dirty w/ these 
>>>>>> tests, could you please file an RFE to sort this out and list all 
>>>>>> the affected tests there?
>>>>>>
>>>>>> re: the patch, could you please update ClhsdbJstack.java test not 
>>>>>> to be run w/ Xcomp and follow the same pattern you used in other 
>>>>>> tests (e.g. ClhsdbScanOops)? other than that it looks fine to 
>>>>>> me, I however wouldn't be able to tell if all svc tests continue 
>>>>>> to do what they were supposed to, so I'd prefer for someone from 
>>>>>> the svc team to chime in.
>>>>>>
>>>>>> Thanks,
>>>>>> -- Igor
>>>>>>
>>>>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik 
>>>>>>> <leonid.mesnik at oracle.com> wrote:
>>>>>>>
>>>>>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>>>>>>
>>>>>>> Please find new webrev: 
>>>>>>> http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>>>>>>
>>>>>>> Renamed startAppVmOpts/runAppVmOpts to 
>>>>>>> "startAppExactJvmOpts/runAppExactJvmOpts". It should 
>>>>>>> make it very clear that these methods don't use any of 
>>>>>>> test.java.opts, test.vm.opts.
>>>>>>>
>>>>>>> Also, I fixed 
>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java mentioned 
>>>>>>> by Igor, and removed null pointer check as Ioi suggested in 
>>>>>>> startApp method.
>>>>>>>
>>>>>>> + public static void startApp(LingeredApp theApp, String... 
>>>>>>> additionalJvmOpts) throws IOException {
>>>>>>> + startAppExactJvmOpts(theApp, 
>>>>>>> Utils.appendTestJavaOpts(additionalJvmOpts));
>>>>>>> + }
>>>>>>>
>>>>>>> Leonid
>>>>>>>
>>>>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>>>>>>> Hi Leonid,
>>>>>>>>>
>>>>>>>>> I have briefly looked at the patch, a few comments so far:
>>>>>>>>>
>>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>>>>>>   - at L#114, could you please call static method using class 
>>>>>>>>> name (as the opposite of using instance)? or was it meant to 
>>>>>>>>> be theApp.runAppVmOpts(vmArgs) ?
>>>>>>>>>
>>>>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) 
>>>>>>>>> isn't correct
>>>>>>>>> - I don't like startAppVmOpts name, but unfortunately don't 
>>>>>>>>> have a better suggestion (yet)
>>>>>>>>
>>>>>>>> I was going to say the same. Jtreg has the concept of "java 
>>>>>>>> options" and "vm options". We have had a fair share of bugs and 
>>>>>>>> wasted time when tests have been using the "vm options" part 
>>>>>>>> (VM_OPTIONS, test.vm.options, etc), and we've been moving away 
>>>>>>>> from using that way to pass options. I recently cleaned up some 
>>>>>>>> of this with:
>>>>>>>>
>>>>>>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>>>>>>
>>>>>>>> Because of this, I would prefer if we used a name that doesn't 
>>>>>>>> include "VmOpts", because it's too alike the other concept. 
>>>>>>>> Some suggestions:
>>>>>>>>  startAppJavaOptions
>>>>>>>>  startAppUsingJavaOptions
>>>>>>>>  startAppWithJavaOptions
>>>>>>>>  startAppExactJavaOptions
>>>>>>>>  startAppJvmOptions
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> StefanK
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -- Igor
>>>>>>>>>
>>>>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik 
>>>>>>>>>> <leonid.mesnik at oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi
>>>>>>>>>>
>>>>>>>>>> Could you please review the following fix, which changes 
>>>>>>>>>> LingeredApp to prepend vm options to java/vm.test.opts when 
>>>>>>>>>> startApp is used and provide startAppVmOpts to override 
>>>>>>>>>> options completely.
>>>>>>>>>>
>>>>>>>>>> The intention is to avoid issue like in this bug where 
>>>>>>>>>> test/jtreg options were ignored by tests. Also I fixed some 
>>>>>>>>>> tests where intention was to append vm options rather than to 
>>>>>>>>>> override them.
>>>>>>>>>>
>>>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>>>>
>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>>>>
>>>>>>>>>> Leonid
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>>
>>
>>


From coleen.phillimore at oracle.com  Tue Mar 31 20:32:45 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Mar 2020 16:32:45 -0400
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <bd869217-46b8-1181-19cc-311d54aadd26@oracle.com>
References: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>
 <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
 <a9590dc7-867b-2f13-6772-5bc06c4d1a86@oracle.com>
 <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com>
 <bd869217-46b8-1181-19cc-311d54aadd26@oracle.com>
Message-ID: <12e28226-136b-3391-ca01-e9e04058a2a8@oracle.com>


On 3/31/20 12:19 PM, Poonam Parhar wrote:
> Hello Coleen,
>
> Does the removal of this code only impact the 'reattach' 
> functionality, and it does not affect any commands available in 
> 'clhsdb' once it is attached to a core file? If that's true, then I 
> think it should be okay to remove this code.

Hi Poonam, Thank you for answering. Yes, this patch only removes the 
reattach functionality. I tried out the other clhsdb commands from your 
wiki page, and they worked fine, including object and heap inspection.

Thanks,
Coleen
>
> Thanks,
> Poonam
>
> On 3/31/20 5:34 AM, coleen.phillimore at oracle.com wrote:
>>
>> To answer my own question, this functionality is used to allow 
>> detach/reattach from {cl}hsdb. Which seems to work on linux but not 
>> windows with this code removed.
>>
>> The next question is whether this is useful functionality to justify 
>> all this code (900+ and this new code that Magnus has added). Can't 
>> you just exit and restart the clhsdb process on the core file or 
>> process?
>>
>> For the record, this is me playing with python to remove this code.
>>
>> http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html
>>
>> Thanks,
>> Coleen
>>
>> On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> I was wondering why this is needed when debugging a core file, which 
>>> is the key thing we need the SA for:
>>>
>>>   /** This is used by both the debugger and any runtime system. It is
>>>       the basic mechanism by which classes which mimic underlying VM
>>>       functionality cause themselves to be initialized. The given
>>>       observer will be notified (with arguments (null, null)) when the
>>>       VM is re-initialized, as well as when it registers itself with
>>>       the VM. */
>>>   public static void registerVMInitializedObserver(Observer o) {
>>>     vmInitializedObservers.add(o);
>>>     o.update(null, null);
>>>   }
>>>
>>> It seems like if it isn't needed, we shouldn't add these classes and 
>>> should instead remove their use.
>>>
>>> Coleen
>>>
>>> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
>>>> No opinions on this?
>>>>
>>>> /Magnus
>>>>
>>>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>>>>> Hi everyone,
>>>>>
>>>>> As a follow-up to the ongoing review for JDK-8241618, I have also 
>>>>> looked at fixing the deprecation warnings in jdk.hotspot.agent. 
>>>>> These fall in three broad categories:
>>>>>
>>>>> * Deprecation of the boxing type constructors (e.g. "new 
>>>>> Integer(42)").
>>>>>
>>>>> * Deprecation of java.util.Observer and Observable.
>>>>>
>>>>> * The rest (mostly Class.newInstance(), and a small number of other 
>>>>> odd deprecations)
>>>>>
>>>>> The first category is trivial to fix. The last category needs some 
>>>>> special discussion. But the overwhelming majority of deprecation 
>>>>> warnings come from the use of Observer and Observable. This really 
>>>>> dwarfs anything else, and needs to be handled first, otherwise 
>>>>> it's hard to even spot the other issues.
>>>>>
>>>>> My analysis of the situation is that the deprecation of Observer 
>>>>> and Observable seems a bit harsh, from the PoV of 
>>>>> jdk.hotspot.agent. Sure, it might be limited, but I think it does 
>>>>> exactly what is needed here. So the migration suggested in 
>>>>> Observable (java.beans or java.util.concurrent) seems overkill. If 
>>>>> there are genuine threading issues at play here, this assumption 
>>>>> might be wrong, and then maybe going the j.u.c. route is correct.
>>>>>
>>>>> But if that's not the case, the main goal should be to stay with the 
>>>>> current implementation. One way to do this is to sprinkle the code 
>>>>> with @SuppressWarnings. But I think a better way would be to just 
>>>>> implement our own Observer and Observable. After all, the classes 
>>>>> are trivial.
>>>>>
>>>>> I've made a mock-up of this solution, where I just copied the 
>>>>> java.util.Observer and Observable, and removed the deprecation 
>>>>> annotations. The only thing needed for the rest of the code is to 
>>>>> make sure we import these; I've done this for three arbitrarily 
>>>>> selected classes just to show what the change would typically look 
>>>>> like. Here's the mock-up:
>>>>>
>>>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>>>>
>>>>> Let me know what you think.
>>>>>
>>>>> /Magnus
>>>>
>>>
>>
>


From chris.plummer at oracle.com  Tue Mar 31 20:55:57 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 31 Mar 2020 13:55:57 -0700
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <12e28226-136b-3391-ca01-e9e04058a2a8@oracle.com>
References: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>
 <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
 <a9590dc7-867b-2f13-6772-5bc06c4d1a86@oracle.com>
 <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com>
 <bd869217-46b8-1181-19cc-311d54aadd26@oracle.com>
 <12e28226-136b-3391-ca01-e9e04058a2a8@oracle.com>
Message-ID: <71739959-0aaf-c973-10d2-36f4295ddc37@oracle.com>

On 3/31/20 1:32 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 3/31/20 12:19 PM, Poonam Parhar wrote:
>> Hello Coleen,
>>
>> Does the removal of this code only impact the 'reattach' 
>> functionality, and it does not affect any commands available in 
>> 'clhsdb' once it is attached to a core file? If that's true, then I 
>> think it should be okay to remove this code.
>
> Hi Poonam, Thank you for answering. Yes, this patch only removes the 
> reattach functionality. I tried out the other clhsdb commands from 
> your wiki page, and they worked fine, including object and heap 
> inspection.
I'm trying to understand exactly when all these static initializers are 
triggered. Is it only after you do an attach?

The implementation of clhsdb reattach is exactly the same as doing a 
detach followed by an attach to the same process. I'm not sure how much 
value it has, but I think in general the removal of this code means you 
can't detach and then attach to anything, even a different pid. So 
"detach" might as well become "detach-and-exit", because your clhsdb 
session is dead once you detach. Do we really want to do this?
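
To make the question concrete, here is a minimal sketch of the registration pattern described in the javadoc quoted further down: each observer is notified once when it registers and again on every later (re)attach. This is not the SA code. FakeVM, attach() and demo() are hypothetical names, and the Observer interface here is a local stand-in rather than java.util.Observer.

```java
import java.util.ArrayList;
import java.util.List;

public class ObserverSketch {
    /** Local stand-in for java.util.Observer (hypothetical, not the SA's type). */
    interface Observer {
        void update(Object source, Object arg);
    }

    /** Hypothetical stand-in for the class that owns the observer list. */
    static class FakeVM {
        private final List<Observer> vmInitializedObservers = new ArrayList<>();

        // Mirrors the quoted method: remember the observer, then notify it at once.
        void registerVMInitializedObserver(Observer o) {
            vmInitializedObservers.add(o);
            o.update(null, null);
        }

        // Assumed behavior: every (re)attach re-notifies all registered observers.
        void attach() {
            for (Observer o : vmInitializedObservers) {
                o.update(null, null);
            }
        }
    }

    /** Registers one observer, simulates one reattach, returns the notification count. */
    static int demo() {
        FakeVM vm = new FakeVM();
        int[] initCount = {0};
        vm.registerVMInitializedObserver((source, arg) -> initCount[0]++);
        vm.attach(); // simulated detach followed by attach
        return initCount[0];
    }

    public static void main(String[] args) {
        System.out.println("initialize() calls: " + demo());
    }
}
```

In this sketch, dropping the list and the attach() loop leaves only the one-time notification at registration, which is why a second attach in the same session would see stale state.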

Chris
>
> Thanks,
> Coleen
>>
>> Thanks,
>> Poonam
>>
>> On 3/31/20 5:34 AM, coleen.phillimore at oracle.com wrote:
>>>
>>> To answer my own question, this functionality is used to allow 
>>> detach/reattach from {cl}hsdb. Which seems to work on linux but not 
>>> windows with this code removed.
>>>
>>> The next question is whether this is useful functionality to justify 
>>> all this code (900+ and this new code that Magnus has added). Can't 
>>> you just exit and restart the clhsdb process on the core file or 
>>> process?
>>>
>>> For the record, this is me playing with python to remove this code.
>>>
>>> http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html
>>>
>>> Thanks,
>>> Coleen
>>>
>>> On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> I was wondering why this is needed when debugging a core file, 
>>>> which is the key thing we need the SA for:
>>>>
>>>>   /** This is used by both the debugger and any runtime system. It is
>>>>       the basic mechanism by which classes which mimic underlying VM
>>>>       functionality cause themselves to be initialized. The given
>>>>       observer will be notified (with arguments (null, null)) when the
>>>>       VM is re-initialized, as well as when it registers itself with
>>>>       the VM. */
>>>>   public static void registerVMInitializedObserver(Observer o) {
>>>>     vmInitializedObservers.add(o);
>>>>     o.update(null, null);
>>>>   }
>>>>
>>>> It seems like if it isn't needed, we shouldn't add these classes 
>>>> and should instead remove their use.
>>>>
>>>> Coleen
>>>>
>>>> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
>>>>> No opinions on this?
>>>>>
>>>>> /Magnus
>>>>>
>>>>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>>>>>> Hi everyone,
>>>>>>
>>>>>> As a follow-up to the ongoing review for JDK-8241618, I have also 
>>>>>> looked at fixing the deprecation warnings in jdk.hotspot.agent. 
>>>>>> These fall in three broad categories:
>>>>>>
>>>>>> * Deprecation of the boxing type constructors (e.g. "new 
>>>>>> Integer(42)").
>>>>>>
>>>>>> * Deprecation of java.util.Observer and Observable.
>>>>>>
>>>>>> * The rest (mostly Class.newInstance(), and a small number of other 
>>>>>> odd deprecations)
>>>>>>
>>>>>> The first category is trivial to fix. The last category needs some 
>>>>>> special discussion. But the overwhelming majority of deprecation 
>>>>>> warnings come from the use of Observer and Observable. This 
>>>>>> really dwarfs anything else, and needs to be handled first, 
>>>>>> otherwise it's hard to even spot the other issues.
>>>>>>
>>>>>> My analysis of the situation is that the deprecation of Observer 
>>>>>> and Observable seems a bit harsh, from the PoV of 
>>>>>> jdk.hotspot.agent. Sure, it might be limited, but I think it does 
>>>>>> exactly what is needed here. So the migration suggested in 
>>>>>> Observable (java.beans or java.util.concurrent) seems overkill. 
>>>>>> If there are genuine threading issues at play here, this 
>>>>>> assumption might be wrong, and then maybe going the j.u.c. route 
>>>>>> is correct.
>>>>>>
>>>>>> But if that's not the case, the main goal should be to stay with the 
>>>>>> current implementation. One way to do this is to sprinkle the 
>>>>>> code with @SuppressWarnings. But I think a better way would be to 
>>>>>> just implement our own Observer and Observable. After all, the 
>>>>>> classes are trivial.
>>>>>>
>>>>>> I've made a mock-up of this solution, where I just copied the 
>>>>>> java.util.Observer and Observable, and removed the deprecation 
>>>>>> annotations. The only thing needed for the rest of the code is to 
>>>>>> make sure we import these; I've done this for three arbitrarily 
>>>>>> selected classes just to show what the change would typically 
>>>>>> look like. Here's the mock-up:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>>>>>
>>>>>> Let me know what you think.
>>>>>>
>>>>>> /Magnus
>>>>>
>>>>
>>>
>>
>


From coleen.phillimore at oracle.com  Tue Mar 31 21:20:19 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Mar 2020 17:20:19 -0400
Subject: Discussion about fixing deprecation in jdk.hotspot.agent
In-Reply-To: <71739959-0aaf-c973-10d2-36f4295ddc37@oracle.com>
References: <b66dad5c-cd1e-afdc-6c5c-62d1d89fda00@oracle.com>
 <1916207b-de97-1f25-f93c-8830025fad62@oracle.com>
 <a9590dc7-867b-2f13-6772-5bc06c4d1a86@oracle.com>
 <113dd83a-82a3-88fc-8f31-fe9bfd00c12c@oracle.com>
 <bd869217-46b8-1181-19cc-311d54aadd26@oracle.com>
 <12e28226-136b-3391-ca01-e9e04058a2a8@oracle.com>
 <71739959-0aaf-c973-10d2-36f4295ddc37@oracle.com>
Message-ID: <1cc556ce-67ed-1e6f-ee53-36d8227d0e1e@oracle.com>


On 3/31/20 4:55 PM, Chris Plummer wrote:
> On 3/31/20 1:32 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 3/31/20 12:19 PM, Poonam Parhar wrote:
>>> Hello Coleen,
>>>
>>> Does the removal of this code only impact the 'reattach' 
>>> functionality, and it does not affect any commands available in 
>>> 'clhsdb' once it is attached to a core file? If that's true, then I 
>>> think it should be okay to remove this code.
>>
>> Hi Poonam, Thank you for answering. Yes, this patch only removes the 
>> reattach functionality. I tried out the other clhsdb commands from 
>> your wiki page, and they worked fine, including object and heap 
>> inspection.
> I'm trying to understand exactly when all these static initializers are 
> triggered. Is it only after you do an attach?
>
> The implementation of clhsdb reattach is exactly the same as doing a 
> detach followed by an attach to the same process. I'm not sure how 
> much value it has, but I think in general the removal of this code 
> means you can't detach and then attach to anything, even a different 
> pid. So "detach" might as well become "detach-and-exit", because your 
> clhsdb session is dead once you detach. Do we really want to do this?

Well, that was my question. It seems like you could just exit and start 
up jhsdb again and that's more like something someone would do just as 
easily. Given the use cases that we've seen from sustaining, this 
appears to be unneeded functionality.

The original mail was proposing adding more code to work around the 
deprecation messages. It seems like more code should not be added for 
something that is unused.
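
For reference, the other two categories in the original mail have mechanical replacements that are independent of the Observer question. The sketch below shows the standard non-deprecated forms: Integer.valueOf(int) in place of the boxing constructor, and getDeclaredConstructor().newInstance() in place of Class.newInstance(). The class names DeprecationFixSketch and Widget and the helper names are hypothetical and are not taken from jdk.hotspot.agent.

```java
public class DeprecationFixSketch {
    /** Hypothetical class with a public no-arg constructor, used only for the demo. */
    public static class Widget {
        public Widget() {}
    }

    // Replaces the deprecated "new Integer(v)".
    static Integer boxed(int v) {
        return Integer.valueOf(v);
    }

    // Replaces the deprecated "c.newInstance()"; checked reflection failures
    // are wrapped so that callers do not need a throws clause.
    static <T> T instantiate(Class<T> c) {
        try {
            return c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + c.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(boxed(42));
        System.out.println(instantiate(Widget.class).getClass().getSimpleName());
    }
}
```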

thanks,
Coleen

>
> Chris
>>
>> Thanks,
>> Coleen
>>>
>>> Thanks,
>>> Poonam
>>>
>>> On 3/31/20 5:34 AM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> To answer my own question, this functionality is used to allow 
>>>> detach/reattach from {cl}hsdb. Which seems to work on linux but 
>>>> not windows with this code removed.
>>>>
>>>> The next question is whether this is useful functionality to 
>>>> justify all this code (900+ and this new code that Magnus has 
>>>> added). Can't you just exit and restart the clhsdb process on the 
>>>> core file or process?
>>>>
>>>> For the record, this is me playing with python to remove this code.
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/2020/01/webrev/index.html
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 3/30/20 3:04 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> I was wondering why this is needed when debugging a core file, 
>>>>> which is the key thing we need the SA for:
>>>>>
>>>>>   /** This is used by both the debugger and any runtime system. It is
>>>>>       the basic mechanism by which classes which mimic underlying VM
>>>>>       functionality cause themselves to be initialized. The given
>>>>>       observer will be notified (with arguments (null, null)) when the
>>>>>       VM is re-initialized, as well as when it registers itself with
>>>>>       the VM. */
>>>>>   public static void registerVMInitializedObserver(Observer o) {
>>>>>     vmInitializedObservers.add(o);
>>>>>     o.update(null, null);
>>>>>   }
>>>>>
>>>>> It seems like if it isn't needed, we shouldn't add these classes 
>>>>> and should instead remove their use.
>>>>>
>>>>> Coleen
>>>>>
>>>>> On 3/30/20 8:14 AM, Magnus Ihse Bursie wrote:
>>>>>> No opinions on this?
>>>>>>
>>>>>> /Magnus
>>>>>>
>>>>>> On 2020-03-25 23:34, Magnus Ihse Bursie wrote:
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> As a follow-up to the ongoing review for JDK-8241618, I have 
>>>>>>> also looked at fixing the deprecation warnings in 
>>>>>>> jdk.hotspot.agent. These fall in three broad categories:
>>>>>>>
>>>>>>> * Deprecation of the boxing type constructors (e.g. "new 
>>>>>>> Integer(42)").
>>>>>>>
>>>>>>> * Deprecation of java.util.Observer and Observable.
>>>>>>>
>>>>>>> * The rest (mostly Class.newInstance(), and a small number of 
>>>>>>> other odd deprecations)
>>>>>>>
>>>>>>> The first category is trivial to fix. The last category needs 
>>>>>>> some special discussion. But the overwhelming majority of 
>>>>>>> deprecation warnings come from the use of Observer and 
>>>>>>> Observable. This really dwarfs anything else, and needs to be 
>>>>>>> handled first, otherwise it's hard to even spot the other issues.
>>>>>>>
>>>>>>> My analysis of the situation is that the deprecation of Observer 
>>>>>>> and Observable seems a bit harsh, from the PoV of 
>>>>>>> jdk.hotspot.agent. Sure, it might be limited, but I think it 
>>>>>>> does exactly what is needed here. So the migration suggested in 
>>>>>>> Observable (java.beans or java.util.concurrent) seems overkill. 
>>>>>>> If there are genuine threading issues at play here, this 
>>>>>>> assumption might be wrong, and then maybe going the j.u.c. route 
>>>>>>> is correct.
>>>>>>>
>>>>>>> But if that's not the case, the main goal should be to stay with the 
>>>>>>> current implementation. One way to do this is to sprinkle the 
>>>>>>> code with @SuppressWarnings. But I think a better way would be to 
>>>>>>> just implement our own Observer and Observable. After all, the 
>>>>>>> classes are trivial.
>>>>>>>
>>>>>>> I've made a mock-up of this solution, where I just copied the 
>>>>>>> java.util.Observer and Observable, and removed the deprecation 
>>>>>>> annotations. The only thing needed for the rest of the code is 
>>>>>>> to make sure we import these; I've done this for three 
>>>>>>> arbitrarily selected classes just to show what the change would 
>>>>>>> typically look like. Here's the mock-up:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ihse/hotspot-agent-observer/webrev.01
>>>>>>>
>>>>>>> Let me know what you think.
>>>>>>>
>>>>>>> /Magnus
>>>>>>
>>>>>
>>>>
>>>
>>
>
>


From leonid.mesnik at oracle.com  Tue Mar 31 23:12:32 2020
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Tue, 31 Mar 2020 16:12:32 -0700
Subject: RFR: 8240698: LingeredApp does not pass getTestJavaOpts() to the
 children process if vmArguments is already specified
In-Reply-To: <8d0cfc5c-f622-2ac9-aecc-8d398f6d3f2e@oracle.com>
References: <96a10c0a-83d1-5f17-5426-217e3647ffc3@oracle.com>
 <89BA7F8A-000C-4C59-AC04-2DF595F7634E@oracle.com>
 <197eeb11-8328-7d07-3fa0-03494155d3c9@oracle.com>
 <47425e6e-a7ee-5ef5-0285-fcf5289becda@oracle.com>
 <042D1C56-75E3-40B9-8259-035C9A13C13B@oracle.com>
 <47cba4c6-233d-84a9-1d89-b40e3c974c08@oracle.com>
 <8bcf232e-e05c-98ae-767f-26adf18ad3fd@oracle.com>
 <cd4bb93e-b41c-a59f-ac9c-c1ea10a6a374@oracle.com>
 <ed4167ad-681a-01f8-add6-c0f01188fefd@oracle.com>
 <b08d3b9c-3198-2658-1aac-043dbab330bc@oracle.com>
 <8d0cfc5c-f622-2ac9-aecc-8d398f6d3f2e@oracle.com>
Message-ID: <8B98C7C4-C2BD-4E21-B79B-CDAD9C1C2E97@oracle.com>

Here is the new webrev:
http://cr.openjdk.java.net/~lmesnik/8240698/webrev.03/

The only difference is the updated startApp() method and its comments:
http://cr.openjdk.java.net/~lmesnik/8240698/webrev.03/test/lib/jdk/test/lib/apps/LingeredApp.java.udiff.html

Leonid

> On Mar 31, 2020, at 1:32 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
> 
> On 3/31/20 12:09 PM, Leonid Mesnik wrote:
>> Hi
>> 
>> On 3/30/20 9:43 PM, Chris Plummer wrote:
>>> Hi Leonid,
>>> 
>>> On 3/30/20 5:42 PM, Leonid Mesnik wrote:
>>>> Hi
>>>> 
>>>> See my comments inline. I will update the webrev after going through all your comments.
>>>> 
>>>> 
>>>> On 3/30/20 11:39 AM, Chris Plummer wrote:
>>>>> Hi Leonid,
>>>>> 
>>>>> I haven't gone through all the tests yet.  I've accumulated enough questions that I'd like to see them answered or addressed before I continue on.
>>>>> 
>>>>> This isn't directly related to your changes, but I noticed that users of JDKToolLauncher do nothing to make sure that default test options are used. This means we are never running these tools with the test options being specified with the jtreg run. Is that a bug or intentional?
>>>> 
>>>> Which "default test options" do you mean? We have 2 properties to set JVM options. The idea is to pass test.vm.opts to ALL java processes and test.java.opts to only tested processes if applicable. Usually, for example we don't want to run jcmd with -Xcomp. test.vm.opts was used (a long time ago) for options like '-d32/-d64' on Solaris where the JVM doesn't start without choosing the correct version. Also, it is used to reduce maximum heap for all JVM instances when tests are running concurrently.
>>>> 
>>>> So, probably test.vm.opts (or test.vm.tools.opts) should be added by JDKToolLauncher but not test.java.opts. It is separate topic, there are a lot of launchers which ignore test.vm.opts now.
>>> I always get confused about which set of options these properties represent, but basically I'm suggesting that if for example we are doing a -Xcomp run in mach5, JDKToolLauncher (at least in some cases) should be launched with this option. I think this is what you get from Utils.getTestJavaOpts().
>>> 
>>> For example the SA tests use JDKToolLauncher.createUsingTestJDK("jhsdb"). jhsdb is what is really being tested here, and it should be launched with the test vm options. Currently we launch the target process with these options, which is probably also a good idea.  Also we aren't too concerned with the options that the test itself is run with, although I'm guessing they also get run with the test java opts. So we have 3 processes here:
>>>  - jhsdb, which should be getting test java opts but is not
>>>  - the target process, which should be getting test java opts and currently is
>>>  - the test itself, where options don't really matter, but is getting passed test java opts
>>> 
>>> However, you could argue that tests like jinfo, jstack, and jcmd, all of which use the Attach API and the bulk of the work is done on the target process, are not that concerned with the options passed to the command, but do want the options passed to the target process.
>> 
>> Well, it is a good question whether we want to run the jhsdb tool itself with additional slow options like Xcomp. Does it help us to improve coverage? IIRC the original idea of adding test.java/vm.opts was to avoid wasting time executing javac and debuggers in slow mode on SPARC.
>> 
>> Anyway, it is a separate question which is out of scope of this change. We might want to review all debugger/debuggee tests to find a better way to deal with this.
> Might be good to get an RFE filed for this.
>> 
>>>> 
>>>>> 
>>>>> In the problem lists, is it necessary to list the test multiple times with #id0, #id1, etc, or could you list it just once and leave that part off? It seems very error prone. Also, changing tests like ClhsdbFindPC, ClhsdbJstack, and ClhsdbScanOops to split out the testing in this manner seems completely unrelated to this CR, especially when the tests do not even contain any changes related to the CR.
>>>> 
>>>> I think that these changes are related. startApp(...) was updated, so some test combinations became invalid or redundant.
>>>> 
>>>> ClhsdbFindPC and ClhsdbJstack were always run twice. Now that test options are passed to the test, it doesn't need to run twice when Xcomp is already set by the user.
>>>> 
>>> Ok. I see now that the second test run, which is the non -Xcomp run, adds '@requires vm.compMode != "Xcomp"'. But this also is strange. The first test run, which does not have the @requires and is the one that makes LingeredApp launch with -Xcomp, will always run whether or not it is an -Xcomp test run. So it will run as part of a regular test run and as part of a -Xcomp test run. The only difference between the two is the -Xcomp run will also run the test with -Xcomp, but that's not really needed (I think it will also end up passing -Xcomp to the target process twice). Perhaps '@requires vm.compMode == "Xcomp"' should be used for the first test run, but that means it no longer gets run until later tiers when we use -Xcomp. Why not revert it back to a single test, but also add '@requires vm.compMode != "Xcomp"'? Then it gets run both ways in an early tier and not run during the -Xcomp run, which isn't really needed.
>> 
>> There are several flags which are executed with Xcomp only: "-XX:-DoEscapeAnalysis", "-XX:-UseBiasedLocking", "-XX:+DeoptimizeALot", where this test is going to be skipped. So we would never run the test with these options.
>> 
>> The original idea is to run the test with the given options and with Xcomp added. I left the logic the same and only skip the run with "Xcomp" when it is already set by the user. I agree that we have some duplication here and it could be improved, but it could be done separately. If you are ok with this, let me file a separate RFE for this.
> Ok.
>> 
>>> 
>>>> ClhsdbScanOops is fixed so that it doesn't run with incompatible GC combinations.
>>> Ok
>>>> 
>>>> So I should update these tests by splitting them or change them to startAppExactJvmOpts() if we want to continue to ignore user-given test options.
>>> I don't think I was suggesting removing user-given test options. I don't see why you would.
>> 
>> I just wanted to say that these tests are affected by my changes and should be fixed anyway.
> Ok.
> 
> So I think the one change you agreed to make is to have the default be to prepend test vm opts rather than append them. Let me know when you have a new webrev.
> 
> thanks,
> 
> Chris
>> 
>> Leonid
>> 
>>>> 
>>>> It seems that #idN is required by jtreg now; otherwise it just runs the test.
>>> Ok.
>>>> 
>>>>> 
>>>>>  426     public static LingeredApp startApp(String... additionalJvmOpts) throws IOException {
>>>>> 
>>>>> The default test opts are appended to additionalJvmOpts, and if you want prepended you need to call Utils.prependTestJavaOpts(). I would have thought the opposite would be more desirable and expected default behavior. Why did you choose this way? I also find it somewhat confusing that there is even a default mode for where the additionalJvmOpts go. Maybe it would be best to have startAppAppendJvmArgs() and startAppPrependJvmArgs() just to make it explicit. This would also be in line with the existing startAppExactJvmOpts().
>>>>> 
>>>> I've chosen the most popular usage, which was Utils.appendTestJavaOpts. But I agree that it would be better to change it to prepend. Thanks for pointing this out.
>>>> 
>>>> I don't want to add startAppAppendJvmArgs()/startAppPrependJvmArgs(), to avoid complicating things. I think that startApp() should be used in the cases when test vm options really shouldn't interfere with user-provided options or overwrite them. So basically the behavior is the same as for ProcessTools.createJavaProcessBuilder(true, ...) and jtreg itself.
>>>> 
>>> Ok.
>>>> 
>>>>> Is ClhsdbFindPC correct? It used to use just -Xcomp or -Xint, ignoring any default test opts. You've fixed it to include the default test opts, but they are appended, possibly overriding the -Xcomp or -Xint. Don't we want the default test opts prepended? Same for ClhsdbJstack.
>>>> 
>>>> The idea is to avoid mixing Xcomp and Xmixed/Xint by using a requires filter. However, ClhsdbFindPC might override Xint with Xmixed if it is set explicitly. Switching to prepending will fix it.
>>> Yes, that's what I was thinking and one reason I thought that should be default behavior.
>>> 
>>> thanks,
>>> 
>>> Chris
>>>> 
>>>> Leonid
>>>> 
>>>>> 
>>>>> thanks,
>>>>> 
>>>>> Chris
>>>>> 
>>>>> On 3/25/20 2:31 PM, Leonid Mesnik wrote:
>>>>>> 
>>>>>> Igor, Stefan, Ioi
>>>>>> 
>>>>>> Thank you for your feedback.
>>>>>> 
>>>>>> Filed https://bugs.openjdk.java.net/browse/JDK-8241624 to change @run main... to @run driver.
>>>>>> 
>>>>>> Test ClhsdbJstack.java is updated.
>>>>>> 
>>>>>> Still waiting for review from SVC team.
>>>>>> 
>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.02/
>>>>>> 
>>>>>> Leonid
>>>>>> 
>>>>>> On 3/25/20 12:46 PM, Igor Ignatyev wrote:
>>>>>>> Hi Leonid,
>>>>>>> 
>>>>>>> not related to your patch (but yet somewhat made more obvious by it), it seems all (or at least almost all) the tests which use LingeredApp should be run in "driver" mode as they just orchestrate execution of other JVMs, so running them w/ main (let alone main/othervm) just wastes time. test/hotspot/jtreg/serviceability/sa/ClhsdbJstack.java#id1, for example, will now be executed w/ Xcomp, which will make it very slow for no reason. since you already got your hands dirty w/ these tests, could you please file an RFE to sort this out and list all the affected tests there?
>>>>>>> 
>>>>>>> re: the patch, could you please update ClhsdbJstack.java test not to be run w/ Xcomp and follow the same pattern you used in other tests (e.g. ClhsdbScanOops)? other than that it looks fine to me, I however wouldn't be able to tell if all svc tests continue to do what they were supposed to, so I'd prefer for someone from svc team to chime in.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> -- Igor
>>>>>>> 
>>>>>>>> On Mar 25, 2020, at 12:01 PM, Leonid Mesnik <leonid.mesnik at oracle.com <mailto:leonid.mesnik at oracle.com>> wrote:
>>>>>>>> 
>>>>>>>> Added Ioi, who also proposed new version of startAppVmOpts.
>>>>>>>> 
>>>>>>>> Please find new webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.01/
>>>>>>>> 
>>>>>>>> startAppVmOpts/runAppVmOpts are renamed to "startAppExactJvmOpts/runAppExactJvmOpts". It should make it very clear that this method doesn't use any of test.java.opts, test.vm.opts.
>>>>>>>> 
>>>>>>>> Also, I fixed test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java mentioned by Igor, and removed the null pointer check in the startApp method, as Ioi suggested.
>>>>>>>> 
>>>>>>>> + public static void startApp(LingeredApp theApp, String... additionalJvmOpts) throws IOException {
>>>>>>>> + startAppExactJvmOpts(theApp, Utils.appendTestJavaOpts(additionalJvmOpts));
>>>>>>>> + }
>>>>>>>> 
>>>>>>>> Leonid
>>>>>>>> 
>>>>>>>> On 3/25/20 10:14 AM, Stefan Karlsson wrote:
>>>>>>>>> On 2020-03-25 17:40, Igor Ignatyev wrote:
>>>>>>>>>> Hi Leonid,
>>>>>>>>>> 
>>>>>>>>>> I have briefly looked at the patch, a few comments so far:
>>>>>>>>>> 
>>>>>>>>>> test/hotspot/jtreg/serviceability/sa/ClhsdbFlags.java:
>>>>>>>>>>   - at L#114, could you please call the static method using the class name (as opposed to using an instance)? or was it meant to be theApp.runAppVmOpts(vmArgs) ?
>>>>>>>>>> 
>>>>>>>>>> test/lib/jdk/test/lib/apps/LingeredApp.java:
>>>>>>>>>> - it seems that code indent of startApp(LingeredApp, String[]) isn't correct
>>>>>>>>>> - I don't like startAppVmOpts name, but unfortunately don't have a better suggestion (yet)
>>>>>>>>> 
>>>>>>>>> I was going to say the same. Jtreg has the concept of "java options" and "vm options". We have had a fair share of bugs and wasted time when tests have been using the "vm options" part (VM_OPTIONS, test.vm.options, etc), and we've been moving away from passing options that way. I recently cleaned up some of this with:
>>>>>>>>> 
>>>>>>>>> 8237111: LingeredApp should be started with getTestJavaOpts
>>>>>>>>> 
>>>>>>>>> Because of this, I would prefer if we used a name that doesn't include "VmOpts", because it's too similar to the other concept. Some suggestions:
>>>>>>>>>  startAppJavaOptions
>>>>>>>>>  startAppUsingJavaOptions
>>>>>>>>>  startAppWithJavaOptions
>>>>>>>>>  startAppExactJavaOptions
>>>>>>>>>  startAppJvmOptions
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> StefanK
>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> -- Igor
>>>>>>>>>> 
>>>>>>>>>>> On Mar 25, 2020, at 8:55 AM, Leonid Mesnik <leonid.mesnik at oracle.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi
>>>>>>>>>>> 
>>>>>>>>>>> Could you please review the following fix, which changes LingeredApp to prepend vm options to java/vm.test.opts when startApp is used and provides startAppVmOpts to override options completely.
>>>>>>>>>>> 
>>>>>>>>>>> The intention is to avoid issues like the one in this bug, where test/jtreg options were ignored by tests. Also I fixed some tests where the intention was to append vm options rather than to override them.
>>>>>>>>>>> 
>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8240698/webrev.00/
>>>>>>>>>>> 
>>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8240698
>>>>>>>>>>> 
>>>>>>>>>>> Leonid
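
The append-versus-prepend question in this thread comes down to which option ends up last on the command line. The sketch below is only an illustration, not the LingeredApp or Utils code: the class name, TEST_OPTS, withTestOptsPrepended(), withTestOptsAppended() and effectiveMode() are all hypothetical, and it assumes that when several execution-mode flags (-Xint, -Xcomp, -Xmixed) are given, the last one takes effect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OptsOrderSketch {
    // Hypothetical stand-in for the options a test run supplies
    // (e.g. via test.java.opts); nothing is read from real properties here.
    static final List<String> TEST_OPTS = Arrays.asList("-Xmixed", "-Xmx256m");

    // Test options first, test-specific options last.
    static List<String> withTestOptsPrepended(String... specific) {
        List<String> cmd = new ArrayList<>(TEST_OPTS);
        cmd.addAll(Arrays.asList(specific));
        return cmd;
    }

    // Test-specific options first, test options last.
    static List<String> withTestOptsAppended(String... specific) {
        List<String> cmd = new ArrayList<>(Arrays.asList(specific));
        cmd.addAll(TEST_OPTS);
        return cmd;
    }

    // Assumption: the last execution-mode flag on the command line wins,
    // and mixed mode is the default when none is given.
    static String effectiveMode(List<String> cmd) {
        String mode = "-Xmixed";
        for (String opt : cmd) {
            if (opt.equals("-Xint") || opt.equals("-Xcomp") || opt.equals("-Xmixed")) {
                mode = opt;
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        List<String> appended = withTestOptsAppended("-Xint");
        List<String> prepended = withTestOptsPrepended("-Xint");
        System.out.println("appended:  " + appended + " -> " + effectiveMode(appended));
        System.out.println("prepended: " + prepended + " -> " + effectiveMode(prepended));
    }
}
```

Under these assumptions, a test that asks for -Xint keeps it only when the test options are prepended; with appending, an explicit -Xmixed from the test run would replace it, which is the ClhsdbFindPC concern raised above.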
