RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Robbin Ehn robbin.ehn at oracle.com
Mon Mar 2 10:16:44 UTC 2020


Hi,

On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> Hi,
> 
> I had a look at the progress of this change. Nothing
> happened since Richard posted his update using more
> handshakes [1].
> But we (SAP) would appreciate a lot if this change could
> be successfully reviewed and pushed.
> 
> I think there is basic understanding that this
> change is helpful. It fixes a number of issues with JVMTI,
> and will deliver the same performance benefits as EA
> does in current production mode for debugging scenarios.
> 
> This is important for us as we run our VMs prepared
> for debugging in production mode.
> 
> I understand that Robbin proposed to replace the usage of
> _suspend_flag with handshakes. Apparently, async handshakes
> are needed to do so. We have been waiting a while for removal
> of the _suspend_flag / introduction of async handshakes [2].
> What is the status here?

I have an old prototype which I would like to continue to work on.
So do not assume asynch handshakes will make 15.
Even if it would, I think there are a lot more investigate work to remove
_suspend_flag.

> 
> I think we should no longer wait, but proceed with
> this change. We will look into removing the usage of
> suspend_flag introduced here once it is possible to implement
> it with handshakes.

Yes, sure.

>> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/

DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
You can move both declaration and definition to that file, no need to clobber 
thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)

Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's own 
hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.

Note that we also think we may have a bug in deopt:
https://bugs.openjdk.java.net/browse/JDK-8238237

I think it would be best, if possible, to push after that is resolved.

Not even nearly a full review :)

Thanks, Robbin


>> Incremental:
>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
>>
>> I was not able to eliminate the additional suspend flag now. I'll take care of this
>> as soon as the
>> existing suspend-resume-mechanism is reworked.
>>
>> Testing:
>>
>> Nightly tests @SAP:
>>
>>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
>> Suite, SAP specific tests
>>    with fastdebug and release builds on all platforms
>>
>>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel
>> for 24h
>>
>> Thanks, Richard.
>>
>>
>> More details on the changes:
>>
>> * Hide DeoptimizeObjectsALotThread from external view.
>>
>> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
>>    It used to be _safepoint_check_sometimes, which will be eliminated sooner or
>> later.
>>    I added explicit thread state changes with ThreadBlockInVM to code paths
>> where we can wait()
>>    on EscapeBarrier_lock to become safepoint safe.
>>
>> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads
>> instead of vm operation
>>    VM_ThreadSuspendAllForObjDeopt.
>>
>> * Removed uses of Threads_lock. When adding a new thread we suspend it iff
>> EA optimizations are
>>    being reverted. In the previous version we were waiting on Threads_lock
>> while EA optimizations
>>    were reverted. See EscapeBarrier::thread_added().
>>
>> * Made tests require Xmixed compilation mode.
>>
>> * Made tests agnostic regarding tiered compilation.
>>    I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or
>> disabled.
>>
>> * Exercising EATests.java as well with stress test options
>> DeoptimizeObjectsALot*
>>    Due to the non-deterministic deoptimizations some tests need to be skipped.
>>    We do this to prevent bit-rot of the stress test code.
>>
>> * Executing EATests.java as well with graal if available. Driver for this is
>>    EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all
>> the new debug info
>>    (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
>>    And graal does not yet support the JVMTI operations force early return and
>> pop frame.
>>
>> * Removed tracing from new jdi tests in EATests.java. Too much trace output
>> before the debugging
>>    connection is established can cause deadlock because output buffers fill up.
>>    (See https://bugs.openjdk.java.net/browse/JDK-8173304)
>>
>> * Many copyright year changes and smaller clean-up changes of testing code
>> (trailing white-space and
>>    the like).
>>
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Donnerstag, 19. Dezember 2019 03:12
>> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-
>> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-
>> runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
>> <vladimir.kozlov at oracle.com>
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
>> the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> I think my issue is with the way EliminateNestedLocks works so I'm going
>> to look into that more deeply.
>>
>> Thanks for the explanations.
>>
>> David
>>
>> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>     > >    > Some further queries/concerns:
>>>     > >    >
>>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
>>>     > >    >
>>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
>>>     > >    >
>>>     > >    > !   _recursions = save      // restore the old recursion count
>>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>     > >    > increased by the deferred relock count
>>>     > >    >
>>>     > >    > what is the "deferred relock count"? I gather it relates to
>>>     > >    >
>>>     > >    > "The code was extended to be able to deoptimize objects of a
>>>     > > frame that
>>>     > >    > is not the top frame and to let another thread than the owning
>>>     > > thread do
>>>     > >    > it."
>>>     > >
>>>     > > Yes, these relate. Currently EA based optimizations are reverted, when a
>> compiled frame is
>>>     > > replaced with corresponding interpreter frames. Part of this is relocking
>> objects with eliminated
>>>     > > locking. New with the enhancement is that we do this also just before
>> object references are
>>>     > > acquired through JVMTI. In this case we deoptimize also the owning
>> compiled frame C and we
>>>     > > register deoptimized objects as deferred updates. When control returns
>> to C it gets deoptimized,
>>>     > > we notice that objects are already deoptimized (reallocated and
>> relocked), so we don't do it again
>>>     > > (relocking twice would be incorrect of course). Deferred updates are
>> copied into the new
>>>     > > interpreter frames.
>>>     > >
>>>     > > Problem: relocking is not possible if the target thread T is waiting on the
>> monitor that needs to
>>>     > > be relocked. This happens only with non-local objects with
>> EliminateNestedLocks. Instead relocking
>>>     > > is deferred until T owns the monitor again. This is what the piece of
>> code above does.
>>>     >
>>>     >  Sorry I need some more detail here. How can you wait() on an object
>>>     >  monitor if the object allocation and/or locking was optimised away? And
>>>     >  what is a "non-local object" in this context? Isn't EA restricted to
>>>     >  thread-confined objects?
>>>
>>> "Non-local object" is an object that escapes its thread. The issue I'm
>> addressing with the changes
>>> in ObjectMonitor::wait are almost unrelated to EA. They are caused by
>> EliminateNestedLocks, where C2
>>> eliminates recursive locking of an already owned lock. The lock owning object
>> exists on the heap, it
>>> is locked and you can call wait() on it.
>>>
>>> EliminateLocks is the C2 option that controls lock elimination based on EA.
>> Both optimizations have
>>> in common that objects with eliminated locking need to be relocked when
>> deoptimizing a frame,
>>> i.e. when replacing a compiled frame with equivalent interpreter
>>> frames. Deoptimization::relock_objects does that job for /all/ eliminated
>> locks in scope. /All/ can
>>> be a mix of eliminated nested locks and locks of not-escaping objects.
>>>
>>> New with the enhancement: I call relock_objects earlier, just before objects
>> pontentially
>>> escape. But then later when the owning compiled frame gets deoptimized, I
>> must not do it again:
>>>
>>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
>>>
>>>    373   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) &&
>> EliminateLocks))
>>>    374       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
>>>    375     bool unused;
>>>    376     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode,
>> unused);
>>>    377   }
>>>
>>> Now when calling relock_objects early it is quiet possible that I have to relock
>> an object the
>>> target thread currently waits for. Obviously I cannot relock in this case,
>> instead I chose to
>>> introduce relock_count_after_wait to JavaThread.
>>>
>>>     >  Is it just that some of the locking gets optimized away e.g.
>>>     >
>>>     >  synchronised(obj) {
>>>     >     synchronised(obj) {
>>>     >       synchronised(obj) {
>>>     >         obj.wait();
>>>     >       }
>>>     >     }
>>>     >  }
>>>     >
>>>     >  If this is reduced to a form as-if it were a single lock of the monitor
>>>     >  (due to EA) and the wait() triggers a JVM TI event which leads to the
>>>     >  escape of "obj" then we need to reconstruct the true lock state, and so
>>>     >  when the wait() internally unblocks and reacquires the monitor it has to
>>>     >  set the true recursion count to 3, not the 1 that it appeared to be when
>>>     >  wait() was initially called. Is that the scenario?
>>>
>>> Kind of... except that the locking is not eliminated due to EA and there is no
>> JVM TI event
>>> triggered by wait.
>>>
>>> Add
>>>
>>> LocalObject l1 = new LocalObject();
>>>
>>> in front of the synchrnized blocks and assume a JVM TI agent acquires l1. This
>> triggers the code in
>>> question.
>>>
>>> See that relocking/reallocating is transactional. If it is done then for /all/
>> objects in scope and it is
>>> done at most once. It wouldn't be quite so easy to split this in relocking of
>> nested/EA-based
>>> eliminated locks.
>>>
>>>     >  If so I find this truly awful. Anyone using wait() in a realistic form
>>>     >  requires a notification and so the object cannot be thread confined. In
>>>
>>> It is not thread confined.
>>>
>>>     >  which case I would strongly argue that upon hitting the wait() the deopt
>>>     >  should occur unconditionally and so the lock state is correct before we
>>>     >  wait and so we don't need to mess with the recursion count internally
>>>     >  when we reacquire the monitor.
>>>     >
>>>     > >
>>>     > >    > which I don't like the sound of at all when it comes to ObjectMonitor
>>>     > >    > state. So I'd like to understand in detail exactly what is going on here
>>>     > >    > and why.  This is a very intrusive change that seems to badly break
>>>     > >    > encapsulation and impacts future changes to ObjectMonitor that are
>> under
>>>     > >    > investigation.
>>>     > >
>>>     > > I would not regard this as breaking encapsulation. Certainly not badly.
>>>     > >
>>>     > > I've added a property relock_count_after_wait to JavaThread. The
>> property is well
>>>     > > encapsulated. Future ObjectMonitor implementations have to deal with
>> recursion too. They are free
>>>     > > in choosing a way to do that as long as that property is taken into
>> account. This is hardly a
>>>     > > limitation.
>>>     >
>>>     >  I do think this badly breaks encapsulation as you have to add a callout
>>>     >  from the guts of the ObjectMonitor code to reach into the thread to get
>>>     >  this lock count adjustment. I understand why you have had to do this but
>>>     >  I would much rather see a change to the EA optimisation strategy so that
>>>     >  this is not needed.
>>>     >
>>>     > > Note also that the property is a straight forward extension of the
>> existing concept of deferred
>>>     > > local updates. It is embedded into the structure holding them. So not
>> even the footprint of a
>>>     > > JavaThread is enlarged if no deferred updates are generated.
>>>     >
>>>     > [...]
>>>     >
>>>     > >
>>>     > > I'm actually duplicating the existing external suspend mechanism,
>> because a thread can be
>>>     > > suspended at most once. And hey, and don't like that either! But it
>> seems not unlikely that the
>>>     > > duplicate can be removed together with the original and the new type
>> of handshakes that will be
>>>     > > used for thread suspend can be used for object deoptimization too. See
>> today's discussion in
>>>     > > JDK-8227745 [2].
>>>     >
>>>     >  I hope that discussion bears some fruit, at the moment it seems not to
>>>     >  be possible to use handshakes here. :(
>>>     >
>>>     >  The external suspend mechanism is a royal pain in the proverbial that we
>>>     >  have to carefully live with. The idea that we're duplicating that for
>>>     >  use in another fringe area of functionality does not thrill me at all.
>>>     >
>>>     >  To be clear, I understand the problem that exists and that you wish to
>>>     >  solve, but for the runtime parts I balk at the complexity cost of
>>>     >  solving it.
>>>
>>> I know it's complex, but by far no rocket science.
>>>
>>> Also I find it hard to imagine another fix for JDK-8233915 besides changing
>> the JVM TI specification.
>>>
>>> Thanks, Richard.
>>>
>>> -----Original Message-----
>>> From: David Holmes <david.holmes at oracle.com>
>>> Sent: Dienstag, 17. Dezember 2019 08:03
>>> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-
>> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-
>> runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
>> <vladimir.kozlov at oracle.com>
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
>> in the Presence of JVMTI Agents
>>>
>>> <resend as my mailer crashed during last send>
>>>
>>> David
>>>
>>> On 17/12/2019 4:57 pm, David Holmes wrote:
>>>> Hi Richard,
>>>>
>>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
>>>>> Hi David,
>>>>>
>>>>>      > Some further queries/concerns:
>>>>>      >
>>>>>      > src/hotspot/share/runtime/objectMonitor.cpp
>>>>>      >
>>>>>      > Can you please explain the changes to ObjectMonitor::wait:
>>>>>      >
>>>>>      > !   _recursions = save      // restore the old recursion count
>>>>>      > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>>      > increased by the deferred relock count
>>>>>      >
>>>>>      > what is the "deferred relock count"? I gather it relates to
>>>>>      >
>>>>>      > "The code was extended to be able to deoptimize objects of a
>>>>> frame that
>>>>>      > is not the top frame and to let another thread than the owning
>>>>> thread do
>>>>>      > it."
>>>>>
>>>>> Yes, these relate. Currently EA based optimizations are reverted, when
>>>>> a compiled frame is replaced
>>>>> with corresponding interpreter frames. Part of this is relocking
>>>>> objects with eliminated
>>>>> locking. New with the enhancement is that we do this also just before
>>>>> object references are acquired
>>>>> through JVMTI. In this case we deoptimize also the owning compiled
>>>>> frame C and we register
>>>>> deoptimized objects as deferred updates. When control returns to C it
>>>>> gets deoptimized, we notice
>>>>> that objects are already deoptimized (reallocated and relocked), so we
>>>>> don't do it again (relocking
>>>>> twice would be incorrect of course). Deferred updates are copied into
>>>>> the new interpreter frames.
>>>>>
>>>>> Problem: relocking is not possible if the target thread T is waiting
>>>>> on the monitor that needs to be
>>>>> relocked. This happens only with non-local objects with
>>>>> EliminateNestedLocks. Instead relocking is
>>>>> deferred until T owns the monitor again. This is what the piece of
>>>>> code above does.
>>>>
>>>> Sorry I need some more detail here. How can you wait() on an object
>>>> monitor if the object allocation and/or locking was optimised away? And
>>>> what is a "non-local object" in this context? Isn't EA restricted to
>>>> thread-confined objects?
>>>>
>>>> Is it just that some of the locking gets optimized away e.g.
>>>>
>>>> synchronised(obj) {
>>>>      synchronised(obj) {
>>>>        synchronised(obj) {
>>>>          obj.wait();
>>>>        }
>>>>      }
>>>> }
>>>>
>>>> If this is reduced to a form as-if it were a single lock of the monitor
>>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
>>>> escape of "obj" then we need to reconstruct the true lock state, and so
>>>> when the wait() internally unblocks and reacquires the monitor it has to
>>>> set the true recursion count to 3, not the 1 that it appeared to be when
>>>> wait() was initially called. Is that the scenario?
>>>>
>>>> If so I find this truly awful. Anyone using wait() in a realistic form
>>>> requires a notification and so the object cannot be thread confined. In
>>>> which case I would strongly argue that upon hitting the wait() the deopt
>>>> should occur unconditionally and so the lock state is correct before we
>>>> wait and so we don't need to mess with the recursion count internally
>>>> when we reacquire the monitor.
>>>>
>>>>>
>>>>>      > which I don't like the sound of at all when it comes to
>>>>> ObjectMonitor
>>>>>      > state. So I'd like to understand in detail exactly what is going
>>>>> on here
>>>>>      > and why.  This is a very intrusive change that seems to badly break
>>>>>      > encapsulation and impacts future changes to ObjectMonitor that
>>>>> are under
>>>>>      > investigation.
>>>>>
>>>>> I would not regard this as breaking encapsulation. Certainly not badly.
>>>>>
>>>>> I've added a property relock_count_after_wait to JavaThread. The
>>>>> property is well
>>>>> encapsulated. Future ObjectMonitor implementations have to deal with
>>>>> recursion too. They are free in
>>>>> choosing a way to do that as long as that property is taken into
>>>>> account. This is hardly a
>>>>> limitation.
>>>>
>>>> I do think this badly breaks encapsulation as you have to add a callout
>>>> from the guts of the ObjectMonitor code to reach into the thread to get
>>>> this lock count adjustment. I understand why you have had to do this but
>>>> I would much rather see a change to the EA optimisation strategy so that
>>>> this is not needed.
>>>>
>>>>> Note also that the property is a straight forward extension of the
>>>>> existing concept of deferred
>>>>> local updates. It is embedded into the structure holding them. So not
>>>>> even the footprint of a
>>>>> JavaThread is enlarged if no deferred updates are generated.
>>>>>
>>>>>      > ---
>>>>>      >
>>>>>      > src/hotspot/share/runtime/thread.cpp
>>>>>      >
>>>>>      > Can you please explain why
>>>>> JavaThread::wait_for_object_deoptimization
>>>>>      > has to be handcrafted in this way rather than using proper
>>>>> transitions.
>>>>>      >
>>>>>
>>>>> I wrote wait_for_object_deoptimization taking
>>>>> JavaThread::java_suspend_self_with_safepoint_check
>>>>> as template. So in short: for the same reasons :)
>>>>>
>>>>> Threads reach both methods as part of thread state transitions,
>>>>> therefore special handling is
>>>>> required to change thread state on top of ongoing transitions.
>>>>>
>>>>>      > We got rid of "deopt suspend" some time ago and it is disturbing
>>>>> to see
>>>>>      > it being added back (effectively). This seems like it may be
>>>>> something
>>>>>      > that handshakes could be used for.
>>>>>
>>>>> Deopt suspend used to be something rather different with a similar
>>>>> name[1]. It is not being added back.
>>>>
>>>> I stand corrected. Despite comments in the code to the contrary
>>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
>>>> cleanup in this area 13 years ago :)
>>>>
>>>>>
>>>>> I'm actually duplicating the existing external suspend mechanism,
>>>>> because a thread can be suspended
>>>>> at most once. And hey, and don't like that either! But it seems not
>>>>> unlikely that the duplicate can
>>>>> be removed together with the original and the new type of handshakes
>>>>> that will be used for
>>>>> thread suspend can be used for object deoptimization too. See today's
>>>>> discussion in JDK-8227745 [2].
>>>>
>>>> I hope that discussion bears some fruit, at the moment it seems not to
>>>> be possible to use handshakes here. :(
>>>>
>>>> The external suspend mechanism is a royal pain in the proverbial that we
>>>> have to carefully live with. The idea that we're duplicating that for
>>>> use in another fringe area of functionality does not thrill me at all.
>>>>
>>>> To be clear, I understand the problem that exists and that you wish to
>>>> solve, but for the runtime parts I balk at the complexity cost of
>>>> solving it.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Thanks, Richard.
>>>>>
>>>>> [1] Deopt suspend was something like an async. handshake for
>>>>> architectures with register windows,
>>>>>        where patching the return pc for deoptimization of a compiled
>>>>> frame was racy if the owner thread
>>>>>        was in native code. Instead a "deopt" suspend flag was set on
>>>>> which the thread patched its own
>>>>>        frame upon return from native. So no thread was suspended. It got
>>>>> its name only from the name of
>>>>>        the flags.
>>>>>
>>>>> [2] Discussion about using handshakes to sync. with the target thread:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-
>> 8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.syste
>> m.issuetabpanels:comment-tabpanel#comment-14306727
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>> Sent: Freitag, 13. Dezember 2019 00:56
>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>> serviceability-dev at openjdk.java.net;
>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>> Performance in the Presence of JVMTI Agents
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> Some further queries/concerns:
>>>>>
>>>>> src/hotspot/share/runtime/objectMonitor.cpp
>>>>>
>>>>> Can you please explain the changes to ObjectMonitor::wait:
>>>>>
>>>>> !   _recursions = save      // restore the old recursion count
>>>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>>> increased by the deferred relock count
>>>>>
>>>>> what is the "deferred relock count"? I gather it relates to
>>>>>
>>>>> "The code was extended to be able to deoptimize objects of a frame that
>>>>> is not the top frame and to let another thread than the owning thread do
>>>>> it."
>>>>>
>>>>> which I don't like the sound of at all when it comes to ObjectMonitor
>>>>> state. So I'd like to understand in detail exactly what is going on here
>>>>> and why.  This is a very intrusive change that seems to badly break
>>>>> encapsulation and impacts future changes to ObjectMonitor that are under
>>>>> investigation.
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/thread.cpp
>>>>>
>>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
>>>>> has to be handcrafted in this way rather than using proper transitions.
>>>>>
>>>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>>> it being added back (effectively). This seems like it may be something
>>>>> that handshakes could be used for.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>> On 12/12/2019 7:02 am, David Holmes wrote:
>>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>>       > Most of the details here are in areas I can comment on in detail,
>>>>>>> but I
>>>>>>>       > did take an initial general look at things.
>>>>>>>
>>>>>>> Thanks for taking the time!
>>>>>>
>>>>>> Apologies the above should read:
>>>>>>
>>>>>> "Most of the details here are in areas I *can't* comment on in detail
>>>>>> ..."
>>>>>>
>>>>>> David
>>>>>>
>>>>>>>       > The only thing that jumped out at me is that I think the
>>>>>>>       > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>>       >
>>>>>>>       > +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>
>>>>>>> Yes, it should. Will add the method like above.
>>>>>>>
>>>>>>>       > Also I don't see any testing of the DeoptimizeObjectsALotThread.
>>>>>>> Without
>>>>>>>       > active testing this will just bit-rot.
>>>>>>>
>>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>>>>> workload. I will add a minimal test
>>>>>>> to keep it fresh.
>>>>>>>
>>>>>>>       > Also on the tests I don't understand your @requires clause:
>>>>>>>       >
>>>>>>>       >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled
>> &
>>>>>>>       > (vm.opt.TieredCompilation != true))
>>>>>>>       >
>>>>>>>       > This seems to require that TieredCompilation is disabled, but
>>>>>>> tiered is
>>>>>>>       > our normal mode of operation. ??
>>>>>>>       >
>>>>>>>
>>>>>>> I removed the clause. I guess I wanted to target the tests towards the
>>>>>>> code they are supposed to
>>>>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>>>>> with just one compiler thread.
>>>>>>>
>>>>>>> Additionally I will make use of
>>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
>>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>>>> serviceability-dev at openjdk.java.net;
>>>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>>>> Performance in the Presence of JVMTI Agents
>>>>>>>
>>>>>>> Hi Richard,
>>>>>>>
>>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I would like to get reviews please for
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>>>>
>>>>>>>> Corresponding RFE:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>>>>
>>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>>>>
>>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>>>>> issues (thanks!). In addition the
>>>>>>>> change is being tested at SAP since I posted the first RFR some
>>>>>>>> months ago.
>>>>>>>>
>>>>>>>> The intention of this enhancement is to benefit performance wise from
>>>>>>>> escape analysis even if JVMTI
>>>>>>>> agents request capabilities that allow them to access local variable
>>>>>>>> values. E.g. if you start-up
>>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>>>>> escape analysis is disabled right
>>>>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>>>>> should do so. With the
>>>>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>>>>> debugger attaches. EA based
>>>>>>>> optimizations are reverted just before an agent acquires the
>>>>>>>> reference to an object. In the JBS item
>>>>>>>> you'll find more details.
>>>>>>>
>>>>>>> Most of the details here are in areas I can comment on in detail, but I
>>>>>>> did take an initial general look at things.
>>>>>>>
>>>>>>> The only thing that jumped out at me is that I think the
>>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>>
>>>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>>>
>>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
>>>>>>> Without
>>>>>>> active testing this will just bit-rot.
>>>>>>>
>>>>>>> Also on the tests I don't understand your @requires clause:
>>>>>>>
>>>>>>>       @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>> (vm.opt.TieredCompilation != true))
>>>>>>>
>>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>> our normal mode of operation. ??
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>>>>
>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patc
>> h
>>>>>>>>
>>>>>>>>
>>>>>>>>


More information about the serviceability-dev mailing list