RFR: 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents [v3]
Serguei Spitsyn
sspitsyn at openjdk.java.net
Wed Oct 7 04:31:16 UTC 2020
On Tue, 6 Oct 2020 14:06:35 GMT, Richard Reingruber <rrich at openjdk.org> wrote:
>> Hi Serguei,
>>
>> thanks for providing feedback! I've pushed the changes based on it now but I
>> have not yet merged master again. This needs a little work...
>>
>> Please find my replies to your comments below.
>>
>> Thanks, Richard.
>>
>>> Could you consider to place the classes EscapeBarrier and JvmtiDeferredUpdates
>>> into theyr own .hpp/.cpp files? The class JvmtiDeferredUpdates would be better
>>> to put into the folder 'prims' then.
>>
>> Done. In addition I moved preexisting class jvmtiDeferredLocalVariableSet and
>> class jvmtiDeferredLocalVariable from runtime/vframe_hp.hpp to
>> prims/jvmtiDeferredUpdates.hpp. Please let me know if not ok.
>>
>>> src/hotspot/share/opto/macro.cpp:
>>>
>>> ```
>>> @@ -1091,11 +1091,11 @@
>>> bool PhaseMacroExpand::eliminate_allocate_node(AllocateNode *alloc) {
>>> // Don't do scalar replacement if the frame can be popped by JVMTI:
>>> // if reallocation fails during deoptimization we'll pop all
>>> // interpreter frames for this compiled frame and that won't play
>>> // nice with JVMTI popframe.
>>> - if (!EliminateAllocations || JvmtiExport::can_pop_frame() || !alloc->_is_non_escaping) {
>>> + if (!EliminateAllocations || !alloc->_is_non_escaping) {
>>> return false;
>>> }
>>> ```
>>>
>>> I wonder if the comment is still correct after you removed the check for JvmtiExport::can_pop_frame().
>>
>> Good catch. I fixed it previously with
>> https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3
>>
>>> src/hotspot/share/runtime/deoptimization.hpp:
>>>
>>> ```
>>> + EscapeBarrier(JavaThread* calling_thread, JavaThread* deoptee_thread, bool barrier_active)
>>> + : _calling_thread(calling_thread), _deoptee_thread(deoptee_thread),
>>> + _barrier_active(barrier_active && (JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
>>> + COMPILER2_PRESENT(|| DoEscapeAnalysis)))
>>> . . . . . . . . .
>>> +
>>> + // Revert ea based optimizations for all java threads
>>> + EscapeBarrier(JavaThread* calling_thread, bool barrier_active)
>>> + : _calling_thread(calling_thread), _deoptee_thread(NULL),
>>> ```
>>>
>>> Nit: would better to make the parameter deoptee_thread to be the 3rd to better mach the seconf constructor.
>>
>> I have shuffled the parameters and moved barrier_active at first position. Would
>> that be ok?
>>
>>>
>>> ```
>>> + bool all_threads() const { return _deoptee_thread == NULL; } // Should revert optimizations for all
>>> threads. + bool self_deopt() const { return _calling_thread == _deoptee_thread; } // Current thread deoptimizes
>>> its own objects. + bool barrier_active() const { return _barrier_active; } // Inactive barriers are
>>> created if no local objects can escape. ```
>>>
>>> I'd suggest to put comments in a line before function definitions as it is done for other declarations/definitions.
>>
>> Done. // Note that there are quite a few locations with the comment on the same line ;)
>>
>>> src/hotspot/share/runtime/deoptimization.cpp:
>>>
>>> ```
>>> @@ -349,12 +408,12 @@
>>>
>>> // Now that the vframeArray has been created if we have any deferred local writes
>>> // added by jvmti then we can free up that structure as the data is now in the
>>> // vframeArray
>>>
>>> - if (thread->deferred_locals() != NULL) {
>>> - GrowableArray<jvmtiDeferredLocalVariableSet*>* list = thread->deferred_locals();
>>> + if (JvmtiDeferredUpdates::deferred_locals(thread) != NULL) {
>>> + GrowableArray<jvmtiDeferredLocalVariableSet*>* list = JvmtiDeferredUpdates::deferred_locals(thread);
>>> int i = 0;
>>> do {
>>> // Because of inlining we could have multiple vframes for a single frame
>>> // and several of the vframes could have deferred writes. Find them all.
>>> if (list->at(i)->id() == array->original().id()) {
>>>
>>> @@ -365,13 +424,14 @@
>>> } else {
>>> i++;
>>> }
>>> } while ( i < list->length() );
>>> if (list->length() == 0) {
>>> - thread->set_deferred_locals(NULL);
>>> - // free the list and elements back to C heap.
>>> - delete list;
>>> + JvmtiDeferredUpdates* updates = thread->deferred_updates();
>>> + thread->set_deferred_updates(NULL);
>>> + // free deferred updates.
>>> + delete updates;
>>> }
>>> ```
>>>
>>> It is not clear why the 'list' is not deleted anymore. If it is intentional then could you, please, add a comment with
>>> an explanation?
>>
>> 'list' is now embedded in JvmtiDeferredUpdates. It es deleted as part of the
>> JvmtiDeferredUpdates instance when there are no more deferred updates.
>>
>> class JvmtiDeferredUpdates : public CHeapObj<mtCompiler> {
>>
>> [...]
>>
>> // Deferred updates of locals, expressions, and monitors
>> GrowableArray<jvmtiDeferredLocalVariableSet*> _deferred_locals_updates;
>>
>> [...]
>>
>> };
>>
>> I introduced JvmtiDeferredUpdates because this patch introduces a new type of
>> deferred update: _relock_count_after_wait.
>>
>> I tried to improve the encapsulation of class JvmtiDeferredUpdates and
>> simplified the location you are referring to.
>>
>> So when is memory for deferred updates freed?
>>
>> (A) Deferred local variable updates are deleted when the compiled target frame is
>> replaced with corresponding interpreter frames.
>> See JvmtiDeferredUpdates::delete_updates_for_frame().
>>
>> (B) A thread's JvmtiDeferredUpdates instance is deleted if all updates where
>> delivered. All updates where delivered when JvmtiDeferredUpdates::count()
>> returns 0. This is checked whenever updates are delivered. See call sites in
>> JvmtiDeferredUpdates::delete_updates_for_frame() and
>> JvmtiDeferredUpdates::get_and_reset_relock_count_after_wait().
>>
>> (C) Besides (B) a thread's JvmtiDeferredUpdates instance is also deleted when
>> the thread is destroyed. All not yet delivered updates are deleted then
>> too. See JavaThread::~JavaThread() and JvmtiDeferredUpdates::~JvmtiDeferredUpdates().
>>
>>> If you are okay to separate the EscapeBarrier class into its own hpp/cpp files
>>> then the class EscapeBarrierSuspendHandshake is better to be colocated with
>>> it.
>>
>> Done.
>>
>>> The below functions EscapeBarrier::sync_and_suspend_one() and do_thread() make a call to the set_obj_deopt_flag() which
>>> seems to be a duplication. At least, it is not clear why this duplication exist and so, needs to be explained in a
>>> comment. ```
>>> +void EscapeBarrier::sync_and_suspend_one() {
>>> + assert(_calling_thread != NULL, "calling thread must not be NULL");
>>> + assert(_deoptee_thread != NULL, "deoptee thread must not be NULL");
>>> + assert(barrier_active(), "should not call");
>>> +
>>> + // Sync with other threads that might be doing deoptimizations
>>> + {
>>> + // Need to switch to _thread_blocked for the wait() call
>>> + ThreadBlockInVM tbivm(_calling_thread);
>>> + MonitorLocker ml(_calling_thread, EscapeBarrier_lock, Mutex::_no_safepoint_check_flag);
>>> + while (_self_deoptimization_in_progress || _deoptee_thread->is_obj_deopt_suspend()) {
>>> + ml.wait();
>>> + }
>>> +
>>> + if (self_deopt()) {
>>> + _self_deoptimization_in_progress = true;
>>> + return;
>>> + }
>>> +
>>> + // set suspend flag for target thread
>>> + _deoptee_thread->set_obj_deopt_flag();
>>> + }
>>> +
>>> + // suspend target thread
>>> + EscapeBarrierSuspendHandshake sh(NULL, "EscapeBarrierSuspendOne");
>>> + Handshake::execute_direct(&sh, _deoptee_thread);
>>> + assert(!_deoptee_thread->has_last_Java_frame() || _deoptee_thread->frame_anchor()->walkable(),
>>> + "stack should be walkable now");
>>> +}
>>> . . . . .
>>> +class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>>> + JavaThread* _excluded_thread;
>>> + public:
>>> + EscapeBarrierSuspendHandshake(JavaThread* excluded_thread, const char* name) :
>>> + HandshakeClosure(name),
>>> + _excluded_thread(excluded_thread) {}
>>> + void do_thread(Thread* th) {
>>> + if (th->is_Java_thread() && !th->is_hidden_from_external_view() && (th != _excluded_thread)) {
>>> + th->set_obj_deopt_flag();
>>> + }
>>> + }
>>> +};
>>> ```
>>
>> I previously removed the set_obj_deopt_flag() call from
>> EscapeBarrierSuspendHandshake::do_thread() in [1]. For synchronization it is
>> better to set_obj_deopt_flag() before the handshake (see comment in
>> EscapeBarrier::sync_and_suspend_all()).
>>
>> [1] https://github.com/openjdk/jdk/pull/119/commits/18dd54b4e6f17ca723e4ae1a1e8dc57e81878dd3
>>
>>> /src/hotspot/share/prims/jvmtiImpl.cpp:
>>>
>>> ```
>>> 421 // Constructor for non-object getter
>>> 422 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type)
>>> 423 : _thread(thread)
>>> 424 , _calling_thread(NULL)
>>> 425 , _depth(depth)
>>> 426 , _index(index)
>>> 427 , _type(type)
>>> 428 , _jvf(NULL)
>>> 429 , _set(false)
>>> 430 , _eb(NULL, NULL, type == T_OBJECT)
>>> 431 , _result(JVMTI_ERROR_NONE)
>>> 432 {
>>> 433 }
>>> 434
>>> 435 // Constructor for object or non-object setter
>>> 436 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type, jvalue value)
>>> 437 : _thread(thread)
>>> 438 , _calling_thread(NULL)
>>> 439 , _depth(depth)
>>> 440 , _index(index)
>>> 441 , _type(type)
>>> 442 , _value(value)
>>> 443 , _jvf(NULL)
>>> 444 , _set(true)
>>> 445 , _eb(JavaThread::current(), thread, type == T_OBJECT)
>>> 446 , _result(JVMTI_ERROR_NONE)
>>> 447 {
>>> 448 }
>>> 449
>>> 450 // Constructor for object getter
>>> 451 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, JavaThread* calling_thread, jint depth, int index)
>>> 452 : _thread(thread)
>>> 453 , _calling_thread(calling_thread)
>>> 454 , _depth(depth)
>>> 455 , _index(index)
>>> 456 , _type(T_OBJECT)
>>> 457 , _jvf(NULL)
>>> 458 , _set(false)
>>> 459 , _eb(calling_thread, thread, true)
>>> 460 , _result(JVMTI_ERROR_NONE)
>>> 461 {
>>> 462 }
>>> ```
>>>
>>> I think, false has to be passed to the constructors of non-object getters instead of expression:
>>> "type == T_OBJECT".
>>> The type can not be T_OBJECT for non-object getters.
>>
>> I used to do that. Then I changed it because the c++ compiler can fold the
>> comparison to "false" and if somebody changes the non-object getter to get
>> objects too then it would still be correct.
>>
>> Let me know if you still think it is better to pass false. Maybe add an
>> assertion type == T_OBJECT then?
>>
>>> Q: Is an EscapeBarrier useful if false is passed as the barrier_active parameter?
>>
>> The EscapeBarrier is not needed then. In the case of the non-object getter above
>> I'd hope that most of the constructor/desctructor of EscapeBarrier is eliminated
>> by the c++ compiler then.
>>
>> Besides the changes you suggested I have made a bugfix in
>> test/jdk/com/sun/jdi/EATests.java to prevent ObjectCollectedException.
>>
>> Thanks, Richard.
>
> Hi Serguei
> (@sspitsyn)
>
> are you ok with the changes I made based on your comments?
> Will you further review the change?
>
> Thanks, Richard.
Hi Richard,
Thank you for making the refactoring. I like it more now. :)
So, the fix looks good to me in general.
But could I ask you, to adjust some formatting, please?
There are several things that can be done to improve the code readability.
src/hotspot/share/prims/jvmtiDeferredUpdates.hpp:
I'd suggest to add an empty line before lines 40, 71, 73, 93, 95, 109 to make class definitions and function
declarations/definitions with comments more catchable by eyes. The following lines can be removed: 81, 82, 103
Also, there is inconsistency in function definitions formatting:
- some functions have big indent between the type and name
- some functions have no indent between the type and name but a big indent between name and body
I'd suggest to either to remove all indents or make it reasonably smaller but consistent.
It seems, there is no reason to keep these class declarations:
38 class jvmtiDeferredLocalVariable;
108 class jvmtiDeferredLocalVariableSet;
src/hotspot/share/prims/jvmtiDeferredUpdates.cpp:
82 // Free deferred updates.
83 // (Note the 'list' of local variable updates is embedded in 'updates')
A suggestion to change the line 83 as follows:
` 83 // Note, the 'list' of local variable updates is embedded in 'updates'.`
src/hotspot/share/runtime/escapeBarrier.hpp:
Add dots at the end of comments at lines 97, 99, 103.
I'd suggest to add an empty line before lines 39, 40, 80, 81, 93, 94, 99, 119, 121.
src/hotspot/share/runtime/escapeBarrier.cpp:
The following class declaration is not needed:
` 49 class jvmtiDeferredLocalVariableSet;`
because you already added this line:
` 29 #include "prims/jvmtiDeferredUpdates.hpp"`
The lines below deserve a refactoring. It can be separate functions for locals, expressions and monitors, or just one
function for the whole fragment:
345 GrowableArray* scopeLocals = cvf->scope()->locals();
346 StackValueCollection* locals = cvf->locals();
347 if (locals != NULL) {
348 for (int i2 = 0; i2 < locals->size(); i2++) {
349 StackValue* var = locals->at(i2);
350 if (var->type() == T_OBJECT && scopeLocals->at(i2)->is_object()) {
351 jvalue val;
352 val.l = cast_from_oop(locals->at(i2)->get_obj()());
353 cvf->update_local(T_OBJECT, i2, val);
354 }
355 }
356 }
357
358 // expressions
359 GrowableArray* scopeExpressions = cvf->scope()->expressions();
360 StackValueCollection* expressions = cvf->expressions();
361 if (expressions != NULL) {
362 for (int i2 = 0; i2 < expressions->size(); i2++) {
363 StackValue* var = expressions->at(i2);
364 if (var->type() == T_OBJECT && scopeExpressions->at(i2)->is_object()) {
365 jvalue val;
366 val.l = cast_from_oop(expressions->at(i2)->get_obj()());
367 cvf->update_stack(T_OBJECT, i2, val);
368 }
369 }
370 }
371
372 // monitors
373 GrowableArray* monitors = cvf->monitors();
374 if (monitors != NULL) {
375 for (int i2 = 0; i2 < monitors->length(); i2++) {
376 if (monitors->at(i2)->eliminated()) {
377 assert(!monitors->at(i2)->owner_is_scalar_replaced(),
378 "reallocation failure, should not update");
379 cvf->update_monitor(i2, monitors->at(i2));
380 }
381 }
382 }
src/hotspot/share/prims/jvmtiImpl.cpp:
420 // Constructor for non-object getter
421 VM_GetOrSetLocal::VM_GetOrSetLocal(JavaThread* thread, jint depth, jint index, BasicType type)
422 : _thread(thread)
423 , _calling_thread(NULL)
424 , _depth(depth)
425 , _index(index)
426 , _type(type)
427 , _jvf(NULL)
428 , _set(false)
429 , _eb(type == T_OBJECT, NULL, NULL)
430 , _result(JVMTI_ERROR_NONE)
431 {
432 }
I still think, that the line 429 is going to cause confusions.
It is a non-object getter, so the type should never be T_OBJECT.
It won't change in the future to allow the T_OBJECT types.
The only way to allow it is to merge the constructors for object and non-object getters.
So, I'm suggesting to replace this line with:
` 429 , _eb(false, NULL, NULL)`
-------------
PR: https://git.openjdk.java.net/jdk/pull/119
More information about the core-libs-dev
mailing list