RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v3]
Richard Reingruber
rrich at openjdk.java.net
Tue Oct 6 14:09:13 UTC 2020
On Tue, 29 Sep 2020 13:57:26 GMT, Richard Reingruber <rrich at openjdk.org> wrote:
>>>
>>>
>>> The minor updates in response to my comments are fine.
>>>
>>> The more major updates ... I can't really comment on.
>>
>> Thanks for looking at the changes and for giving feedback.
>
> Hi Serguei,
>
> thanks for providing feedback! I've pushed the changes based on it now but I
> have not yet merged master again. This needs a little work...
>
> Please find my replies to your comments below.
>
> Thanks, Richard.
>
>> Could you consider to place the classes EscapeBarrier and JvmtiDeferredUpdates
>> into theyr own .hpp/.cpp files? The class JvmtiDeferredUpdates would be better
>> to put into the folder 'prims' then.
>
> Done. In addition I moved preexisting class jvmtiDeferredLocalVariableSet and
> class jvmtiDeferredLocalVariable from runtime/vframe_hp.hpp to
> prims/jvmtiDeferredUpdates.hpp. Please let me know if not ok.
>
>> src/hotspot/share/opto/macro.cpp:
>>
>> ```
>> @@ -1091,11 +1091,11 @@
>> bool PhaseMacroExpand::eliminate_allocate_node(AllocateNode *alloc) {
>> // Don't do scalar replacement if the frame can be popped by JVMTI:
>> // if reallocation fails during deoptimization we'll pop all
>> // interpreter frames for this compiled frame and that won't play
>> // nice with JVMTI popframe.
>> - if (!EliminateAllocations || JvmtiExport::can_pop_frame() || !alloc->_is_non_escaping) {
>> + if (!EliminateAllocations || !alloc->_is_non_escaping) {
>> return false;
>> }
>> ```
>>
>> I wonder if the comment is still correct after you removed the check for JvmtiExport::can_pop_frame().
>
> Good catch. I fixed it previously with
> https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3
>
>> src/hotspot/share/runtime/deoptimization.hpp:
>>
>> ```
>> + EscapeBarrier(JavaThread* calling_thread, JavaThread* deoptee_thread, bool barrier_active)
>> + : _calling_thread(calling_thread), _deoptee_thread(deoptee_thread),
>> + _barrier_active(barrier_active && (JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
>> + COMPILER2_PRESENT(|| DoEscapeAnalysis)))
>> . . . . . . . . .
>> +
>> + // Revert ea based optimizations for all java threads
>> + EscapeBarrier(JavaThread* calling_thread, bool barrier_active)
>> + : _calling_thread(calling_thread), _deoptee_thread(NULL),
>> ```
>>
>> Nit: would better to make the parameter deoptee_thread to be the 3rd to better mach the seconf constructor.
>
> I have shuffled the parameters and moved barrier_active at first position. Would
> that be ok?
>
>>
>> ```
>> + bool all_threads() const { return _deoptee_thread == NULL; } // Should revert optimizations for all
>> threads. + bool self_deopt() const { return _calling_thread == _deoptee_thread; } // Current thread deoptimizes
>> its own objects. + bool barrier_active() const { return _barrier_active; } // Inactive barriers are
>> created if no local objects can escape. ```
>>
>> I'd suggest to put comments in a line before function definitions as it is done for other declarations/definitions.
>
> Done. // Note that there are quite a few locations with the comment on the same line ;)
>
>> src/hotspot/share/runtime/deoptimization.cpp:
>>
>> ```
>> @@ -349,12 +408,12 @@
>>
>> // Now that the vframeArray has been created if we have any deferred local writes
>> // added by jvmti then we can free up that structure as the data is now in the
>> // vframeArray
>>
>> - if (thread->deferred_locals() != NULL) {
>> - GrowableArray<jvmtiDeferredLocalVariableSet*>* list = thread->deferred_locals();
>> + if (JvmtiDeferredUpdates::deferred_locals(thread) != NULL) {
>> + GrowableArray<jvmtiDeferredLocalVariableSet*>* list = JvmtiDeferredUpdates::deferred_locals(thread);
>> int i = 0;
>> do {
>> // Because of inlining we could have multiple vframes for a single frame
>> // and several of the vframes could have deferred writes. Find them all.
>> if (list->at(i)->id() == array->original().id()) {
>>
>> @@ -365,13 +424,14 @@
>> } else {
>> i++;
>> }
>> } while ( i < list->length() );
>> if (list->length() == 0) {
>> - thread->set_deferred_locals(NULL);
>> - // free the list and elements back to C heap.
>> - delete list;
>> + JvmtiDeferredUpdates* updates = thread->deferred_updates();
>> + thread->set_deferred_updates(NULL);
>> + // free deferred updates.
>> + delete updates;
>> }
>> ```
>>
>> It is not clear why the 'list' is not deleted anymore. If it is intentional then could you, please, add a comment with
>> an explanation?
>
> 'list' is now embedded in JvmtiDeferredUpdates. It es deleted as part of the
> JvmtiDeferredUpdates instance when there are no more deferred updates.
>
> class JvmtiDeferredUpdates : public CHeapObj<mtCompiler> {
>
> [...]
>
> // Deferred updates of locals, expressions, and monitors
> GrowableArray<jvmtiDeferredLocalVariableSet*> _deferred_locals_updates;
>
> [...]
>
> };
>
> I introduced JvmtiDeferredUpdates because this patch introduces a new type of
> deferred update: _relock_count_after_wait.
>
> I tried to improve the encapsulation of class JvmtiDeferredUpdates and
> simplified the location you are referring to.
>
> So when is memory for deferred updates freed?
>
> (A) Deferred local variable updates are deleted when the compiled target frame is
> replaced with corresponding interpreter frames.
> See JvmtiDeferredUpdates::delete_updates_for_frame().
>
> (B) A thread's JvmtiDeferredUpdates instance is deleted if all updates where
> delivered. All updates where delivered when JvmtiDeferredUpdates::count()
> returns 0. This is checked whenever updates are delivered. See call sites in
> JvmtiDeferredUpdates::delete_updates_for_frame() and
> JvmtiDeferredUpdates::get_and_reset_relock_count_after_wait().
>
> (C) Besides (B) a thread's JvmtiDeferredUpdates instance is also deleted when
> the thread is destroyed. All not yet delivered updates are deleted then
> too. See JavaThread::~JavaThread() and JvmtiDeferredUpdates::~JvmtiDeferredUpdates().
>
>> If you are okay to separate the EscapeBarrier class into its own hpp/cpp files
>> then the class EscapeBarrierSuspendHandshake is better to be colocated with
>> it.
>
> Done.
>
>> The below functions EscapeBarrier::sync_and_suspend_one() and do_thread() make a call to the set_obj_deopt_flag() which
>> seems to be a duplication. At least, it is not clear why this duplication exist and so, needs to be explained in a
>> comment. ```
>> +void EscapeBarrier::sync_and_suspend_one() {
>> + assert(_calling_thread != NULL, "calling thread must not be NULL");
>> + assert(_deoptee_thread != NULL, "deoptee thread must not be NULL");
>> + assert(barrier_active(), "should not call");
>> +
>> + // Sync with other threads that might be doing deoptimizations
>> + {
>> + // Need to switch to _thread_blocked for the wait() call
>> + ThreadBlockInVM tbivm(_calling_thread);
>> + MonitorLocker ml(_calling_thread, EscapeBarrier_lock, Mutex::_no_safepoint_check_flag);
>> + while (_self_deoptimization_in_progress || _deoptee_thread->is_obj_deopt_suspend()) {
>> + ml.wait();
>> + }
>> +
>> + if (self_deopt()) {
>> + _self_deoptimization_in_progress = true;
>> + return;
>> + }
>> +
>> + // set suspend flag for target thread
>> + _deoptee_thread->set_obj_deopt_flag();
>> + }
>> +
>> + // suspend target thread
>> + EscapeBarrierSuspendHandshake sh(NULL, "EscapeBarrierSuspendOne");
>> + Handshake::execute_direct(&sh, _deoptee_thread);
>> + assert(!_deoptee_thread->has_last_Java_frame() || _deoptee_thread->frame_anchor()->walkable(),
>> + "stack should be walkable now");
>> +}
>> . . . . .
>> +class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>> + JavaThread* _excluded_thread;
>> + public:
>> + EscapeBarrierSuspendHandshake(JavaThread* excluded_thread, const char* name) :
>> + HandshakeClosure(name),
>> + _excluded_thread(excluded_thread) {}
>> + void do_thread(Thread* th) {
>> + if (th->is_Java_thread() && !th->is_hidden_from_external_view() && (th != _excluded_thread)) {
>> + th->set_obj_deopt_flag();
>> + }
>> + }
>> +};
>> ```
>
> I previously removed the set_obj_deopt_flag() call from
> EscapeBarrierSuspendHandshake::do_thread() in [1]. For synchronization it is
> better to set_obj_deopt_flag() before the handshake (see comment in
> EscapeBarrier::sync_and_suspend_all()).
>
> [1] https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3
>
>> /src/hotspot/share/prims/jvmtiImpl.cpp:
>>
>> ```
>> 421 // Constructor for non-object getter
>> 422 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type)
>> 423 : _thread(thread)
>> 424 , _calling_thread(NULL)
>> 425 , _depth(depth)
>> 426 , _index(index)
>> 427 , _type(type)
>> 428 , _jvf(NULL)
>> 429 , _set(false)
>> 430 , _eb(NULL, NULL, type == T_OBJECT)
>> 431 , _result(JVMTI_ERROR_NONE)
>> 432 {
>> 433 }
>> 434
>> 435 // Constructor for object or non-object setter
>> 436 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type, jvalue value)
>> 437 : _thread(thread)
>> 438 , _calling_thread(NULL)
>> 439 , _depth(depth)
>> 440 , _index(index)
>> 441 , _type(type)
>> 442 , _value(value)
>> 443 , _jvf(NULL)
>> 444 , _set(true)
>> 445 , _eb(JavaThread::current(), thread, type == T_OBJECT)
>> 446 , _result(JVMTI_ERROR_NONE)
>> 447 {
>> 448 }
>> 449
>> 450 // Constructor for object getter
>> 451 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, JavaThread* calling_thread, jint depth, int index)
>> 452 : _thread(thread)
>> 453 , _calling_thread(calling_thread)
>> 454 , _depth(depth)
>> 455 , _index(index)
>> 456 , _type(T_OBJECT)
>> 457 , _jvf(NULL)
>> 458 , _set(false)
>> 459 , _eb(calling_thread, thread, true)
>> 460 , _result(JVMTI_ERROR_NONE)
>> 461 {
>> 462 }
>> ```
>>
>> I think, false has to be passed to the constructors of non-object getters instead of expression:
>> "type == T_OBJECT".
>> The type can not be T_OBJECT for non-object getters.
>
> I used to do that. Then I changed it because the c++ compiler can fold the
> comparison to "false" and if somebody changes the non-object getter to get
> objects too then it would still be correct.
>
> Let me know if you still think it is better to pass false. Maybe add an
> assertion type == T_OBJECT then?
>
>> Q: Is an EscapeBarrier useful if false is passed as the barrier_active parameter?
>
> The EscapeBarrier is not needed then. In the case of the non-object getter above
> I'd hope that most of the constructor/desctructor of EscapeBarrier is eliminated
> by the c++ compiler then.
>
> Besides the changes you suggested I have made a bugfix in
> test/jdk/com/sun/jdi/EATests.java to prevent ObjectCollectedException.
>
> Thanks, Richard.
Hi Serguei
(@sspitsyn)
are you ok with the changes I made based on your comments?
Will you further review the change?
Thanks, Richard.
-------------
PR: https://git.openjdk.java.net/jdk/pull/119
More information about the core-libs-dev
mailing list