RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v3]

Richard Reingruber rrich at openjdk.java.net
Tue Sep 29 13:59:51 UTC 2020


On Fri, 25 Sep 2020 12:28:13 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

>> The minor updates in response to my comments are fine.
>> 
>> The more major updates ... I can't really comment on.
>
>> 
>> 
>> The minor updates in response to my comments are fine.
>> 
>> The more major updates ... I can't really comment on.
> 
> Thanks for looking at the changes and for giving feedback.

Hi Serguei,

thanks for providing feedback! I've pushed the changes based on it now but I
have not yet merged master again. This needs a little work...

Please find my replies to your comments below.

Thanks, Richard.

> Could you consider to place the classes EscapeBarrier and JvmtiDeferredUpdates
> into theyr own .hpp/.cpp files? The class JvmtiDeferredUpdates would be better
> to put into the folder 'prims' then.

Done. In addition I moved preexisting class jvmtiDeferredLocalVariableSet and
class jvmtiDeferredLocalVariable from runtime/vframe_hp.hpp to
prims/jvmtiDeferredUpdates.hpp. Please let me know if not ok.

> src/hotspot/share/opto/macro.cpp:
> 
> ```
> @@ -1091,11 +1091,11 @@
>  bool PhaseMacroExpand::eliminate_allocate_node(AllocateNode *alloc) {
>    // Don't do scalar replacement if the frame can be popped by JVMTI:
>    // if reallocation fails during deoptimization we'll pop all
>    // interpreter frames for this compiled frame and that won't play
>    // nice with JVMTI popframe.
> -  if (!EliminateAllocations || JvmtiExport::can_pop_frame() || !alloc->_is_non_escaping) {
> +  if (!EliminateAllocations || !alloc->_is_non_escaping) {
>      return false;
>    }
> ```
> 
> I wonder if the comment is still correct after you removed the check for JvmtiExport::can_pop_frame().

Good catch. I fixed it previously with
https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3

> src/hotspot/share/runtime/deoptimization.hpp:
> 
> ```
> +  EscapeBarrier(JavaThread* calling_thread, JavaThread* deoptee_thread, bool barrier_active)
> +    : _calling_thread(calling_thread), _deoptee_thread(deoptee_thread),
> +      _barrier_active(barrier_active && (JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
> +                      COMPILER2_PRESENT(|| DoEscapeAnalysis)))
> . . . . . . . . .
> +
> +  // Revert ea based optimizations for all java threads
> +  EscapeBarrier(JavaThread* calling_thread, bool barrier_active)
> +    : _calling_thread(calling_thread), _deoptee_thread(NULL),
> ```
> 
> Nit: would better to make the parameter deoptee_thread to be the 3rd to better mach the seconf constructor.

I have shuffled the parameters and moved barrier_active at first position. Would
that be ok?

> 
> ```
> +  bool all_threads()    const { return _deoptee_thread == NULL; }            // Should revert optimizations for all
> threads. +  bool self_deopt()     const { return _calling_thread == _deoptee_thread; } // Current thread deoptimizes
> its own objects. +  bool barrier_active() const { return _barrier_active; }                    // Inactive barriers are
> created if no local objects can escape. ```
> 
> I'd suggest to put comments in a line before function definitions as it is done for other declarations/definitions.

Done. // Note that there are quite a few locations with the comment on the same line ;)

> src/hotspot/share/runtime/deoptimization.cpp:
> 
> ```
>  @@ -349,12 +408,12 @@
>  
>    // Now that the vframeArray has been created if we have any deferred local writes
>    // added by jvmti then we can free up that structure as the data is now in the
>    // vframeArray
>  
> -  if (thread->deferred_locals() != NULL) {
> -    GrowableArray<jvmtiDeferredLocalVariableSet*>* list = thread->deferred_locals();
> +  if (JvmtiDeferredUpdates::deferred_locals(thread) != NULL) {
> +    GrowableArray<jvmtiDeferredLocalVariableSet*>* list = JvmtiDeferredUpdates::deferred_locals(thread);
>      int i = 0;
>      do {
>        // Because of inlining we could have multiple vframes for a single frame
>        // and several of the vframes could have deferred writes. Find them all.
>        if (list->at(i)->id() == array->original().id()) {
> 
> @@ -365,13 +424,14 @@
>        } else {
>          i++;
>        }
>      } while ( i < list->length() );
>      if (list->length() == 0) {
> -      thread->set_deferred_locals(NULL);
> -      // free the list and elements back to C heap.
> -      delete list;
> +      JvmtiDeferredUpdates* updates = thread->deferred_updates();
> +      thread->set_deferred_updates(NULL);
> +      // free deferred updates.
> +      delete updates;
>      }
> ```
> 
> It is not clear why the 'list' is not deleted anymore. If it is intentional then could you, please, add a comment with
> an explanation?

'list' is now embedded in JvmtiDeferredUpdates. It es deleted as part of the
JvmtiDeferredUpdates instance when there are no more deferred updates.

class JvmtiDeferredUpdates : public CHeapObj<mtCompiler> {

  [...]

  // Deferred updates of locals, expressions, and monitors
  GrowableArray<jvmtiDeferredLocalVariableSet*> _deferred_locals_updates;

  [...]

};

I introduced JvmtiDeferredUpdates because this patch introduces a new type of
deferred update: _relock_count_after_wait.

I tried to improve the encapsulation of class JvmtiDeferredUpdates and
simplified the location you are referring to.

So when is memory for deferred updates freed?

(A) Deferred local variable updates are deleted when the compiled target frame is
    replaced with corresponding interpreter frames.
    See JvmtiDeferredUpdates::delete_updates_for_frame().

(B) A thread's JvmtiDeferredUpdates instance is deleted if all updates where
    delivered. All updates where delivered when JvmtiDeferredUpdates::count()
    returns 0. This is checked whenever updates are delivered. See call sites in
    JvmtiDeferredUpdates::delete_updates_for_frame() and
    JvmtiDeferredUpdates::get_and_reset_relock_count_after_wait().

(C) Besides (B) a thread's JvmtiDeferredUpdates instance is also deleted when
    the thread is destroyed. All not yet delivered updates are deleted then
    too. See JavaThread::~JavaThread() and JvmtiDeferredUpdates::~JvmtiDeferredUpdates().

> If you are okay to separate the EscapeBarrier class into its own hpp/cpp files
> then the class EscapeBarrierSuspendHandshake is better to be colocated with
> it.

Done.

> The below functions EscapeBarrier::sync_and_suspend_one() and do_thread() make a call to the set_obj_deopt_flag() which
> seems to be a duplication. At least, it is not clear why this duplication exist and so, needs to be explained in a
> comment.  ```
> +void EscapeBarrier::sync_and_suspend_one() {
> +  assert(_calling_thread != NULL, "calling thread must not be NULL");
> +  assert(_deoptee_thread != NULL, "deoptee thread must not be NULL");
> +  assert(barrier_active(), "should not call");
> +
> +  // Sync with other threads that might be doing deoptimizations
> +  {
> +    // Need to switch to _thread_blocked for the wait() call
> +    ThreadBlockInVM tbivm(_calling_thread);
> +    MonitorLocker ml(_calling_thread, EscapeBarrier_lock, Mutex::_no_safepoint_check_flag);
> +    while (_self_deoptimization_in_progress || _deoptee_thread->is_obj_deopt_suspend()) {
> +      ml.wait();
> +    }
> +
> +    if (self_deopt()) {
> +      _self_deoptimization_in_progress = true;
> +      return;
> +    }
> +
> +    // set suspend flag for target thread
> +    _deoptee_thread->set_obj_deopt_flag();
> +  }
> +
> +  // suspend target thread
> +  EscapeBarrierSuspendHandshake sh(NULL, "EscapeBarrierSuspendOne");
> +  Handshake::execute_direct(&sh, _deoptee_thread);
> +  assert(!_deoptee_thread->has_last_Java_frame() || _deoptee_thread->frame_anchor()->walkable(),
> +         "stack should be walkable now");
> +}
> . . . . .
> +class EscapeBarrierSuspendHandshake : public HandshakeClosure {
> +  JavaThread* _excluded_thread;
> + public:
> +  EscapeBarrierSuspendHandshake(JavaThread* excluded_thread, const char* name) :
> +    HandshakeClosure(name),
> +    _excluded_thread(excluded_thread) {}
> +  void do_thread(Thread* th) {
> +    if (th->is_Java_thread() && !th->is_hidden_from_external_view() && (th != _excluded_thread)) {
> +      th->set_obj_deopt_flag();
> +    }
> +  }
> +};
> ```

I previously removed the set_obj_deopt_flag() call from
EscapeBarrierSuspendHandshake::do_thread() in [1]. For synchronization it is
better to set_obj_deopt_flag() before the handshake (see comment in
EscapeBarrier::sync_and_suspend_all()).

[1] https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3

> /src/hotspot/share/prims/jvmtiImpl.cpp:
> 
> ```
> 421 // Constructor for non-object getter
>  422 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type)
>  423   : _thread(thread)
>  424   , _calling_thread(NULL)
>  425   , _depth(depth)
>  426   , _index(index)
>  427   , _type(type)
>  428   , _jvf(NULL)
>  429   , _set(false)
>  430   , _eb(NULL, NULL, type == T_OBJECT)
>  431   , _result(JVMTI_ERROR_NONE)
>  432 {
>  433 }
>  434
>  435 // Constructor for object or non-object setter
>  436 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type, jvalue value)
>  437   : _thread(thread)
>  438   , _calling_thread(NULL)
>  439   , _depth(depth)
>  440   , _index(index)
>  441   , _type(type)
>  442   , _value(value)
>  443   , _jvf(NULL)
>  444   , _set(true)
>  445   , _eb(JavaThread::current(), thread, type == T_OBJECT)
>  446   , _result(JVMTI_ERROR_NONE)
>  447 {
>  448 }
>  449
>  450 // Constructor for object getter
>  451 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, JavaThread* calling_thread, jint depth, int index)
>  452   : _thread(thread)
>  453   , _calling_thread(calling_thread)
>  454   , _depth(depth)
>  455   , _index(index)
>  456   , _type(T_OBJECT)
>  457   , _jvf(NULL)
>  458   , _set(false)
>  459   , _eb(calling_thread, thread, true)
>  460   , _result(JVMTI_ERROR_NONE)
>  461 {
>  462 }
> ```
> 
> I think, false has to be passed to the constructors of non-object getters instead of expression:
> "type == T_OBJECT".
> The type can not be T_OBJECT for non-object getters.

I used to do that. Then I changed it because the c++ compiler can fold the
comparison to "false" and if somebody changes the non-object getter to get
objects too then it would still be correct.

Let me know if you still think it is better to pass false. Maybe add an
assertion type == T_OBJECT then?

> Q: Is an EscapeBarrier useful if false is passed as the barrier_active parameter?

The EscapeBarrier is not needed then. In the case of the non-object getter above
I'd hope that most of the constructor/desctructor of EscapeBarrier is eliminated
by the c++ compiler then.

Besides the changes you suggested I have made a bugfix in
test/jdk/com/sun/jdi/EATests.java to prevent ObjectCollectedException.

Thanks, Richard.

-------------

PR: https://git.openjdk.java.net/jdk/pull/119


More information about the core-libs-dev mailing list