RFR(s): 8234086: VM operation can be simplified

Mon Nov 25 12:48:35 UTC 2019

Hi Serguei, thanks for having a look.

AFAIK:
Today one of these three can happen when returning to agent (native).
- The target thread for stop have not yet installed the async exception.
- The target thread have installed the async exception, but not yet stopped.
- The target thread have installed the async exception and already stopped.

A agent must handle all three possible scenarios.
This patch just removes the first scenario, and makes the installation part 
synchrone.

Hope that helps!

Note, I don't see the method "StopThread" being documented as either synchrone
or asynchrone itself.
The exception is documented as beeing asynchrone.

And I don't think we follow the specs now, since we ignore the result of:
void VM_ThreadStop::doit()
One would expect e.g. JVMTI_ERROR_THREAD_NOT_ALIVE if we never deilver the async
exception. So making the async exception installation part synchrone would be a
step to fix that issue.

Thanks, Robbin

On 2019-11-25 12:45, serguei.spitsyn at oracle.com wrote:
> Please, skip my reply below.
> I need to read all emails carefully.
> 
> Thanks,
> Serguei
> 
> On 11/25/19 03:35, serguei.spitsyn at oracle.com wrote:
>> Hi Dan and Robbin,
>>
>> I can be wrong and missing something but it feels like there is no issue for 
>> JVMTI with this fix.
>>
>> > Off the top of head, I can't think of a way for a caller of
>> > Thread::send_async_exception() to determine that the call is now
>> > synchronous instead of asynchronous, but ...
>>
>> There can be some confusion here about what is synchronous relative to.
>> I read it this way:
>>  It synchronous for the current thread which calls the send_async_exception().
>>  However, it is asynchronous for the target thread that needs to be stopped.
>>  So that the fix does not break the JVMTI spec requirements.
>>
>> Please, let me know if you agree (or not) with this reading.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 11/22/19 13:50, Daniel D. Daugherty wrote:
>>> Hi Robbin,
>>>
>>> Sorry I'm late to this review thread...
>>>
>>> I'm adding Serguei to this email thread since I'm making comments
>>> about the JVM/TI parts of this changeset...
>>>
>>>
>>>> http://cr.openjdk.java.net/~rehn/8234086/v4/full/webrev/index.html 
>>>
>>>
>>> src/hotspot/share/runtime/vmOperations.hpp
>>>     No comments.
>>>
>>> src/hotspot/share/runtime/vmOperations.cpp
>>>     No comments.
>>>
>>> src/hotspot/share/runtime/vmThread.hpp
>>>     L148:   // The ever running loop for the VMThread
>>>     L149:   void loop();
>>>     L150:   static void check_cleanup();
>>>         nit - Feels like an odd place to add check_cleanup().
>>>
>>>         Update: Now that I've seen what clean_up(), it needs a
>>>         better name. Perhaps check_for_forced_cleanup()? And since
>>>         it is supposed to affect the running loop for the VMThread
>>>         I'm okay with its location now.
>>>
>>> src/hotspot/share/runtime/vmThread.cpp
>>>     L382:   event->set_blocking(true);
>>>         Probably have to keep the 'blocking' attribute in the event
>>>         for backward compatibility in the JFR record format?
>>>
>>>     L478:         // wait with a timeout to guarantee safepoints at regular 
>>> intervals
>>>         Is this comment true anymore (even before this changeset)?
>>>         Adding this on the next line might help:
>>>
>>>                   // (if there is cleanup work to do)
>>>
>>>         since I _think_ that's how the policy has been evolved...
>>>
>>>     L479:         mu_queue.wait(GuaranteedSafepointInterval);
>>>         Please prefix with "(void)" to make it clear you are
>>>         intentionally ignoring the return value.
>>>
>>>     old L627-634 (We want to make sure that we get to a safepoint regularly)
>>>         I think this now old code is covered by your change above:
>>>
>>>         L488:         // If the queue contains a safepoint VM op,
>>>         L489:         // clean up will be done so we can skip this part.
>>>         L490:         if (!_vm_queue->peek_at_safepoint_priority()) {
>>>
>>>         Please confirm that our thinking is the same here.
>>>
>>>     L661:     int ticket =  t->vm_operation_ticket();
>>>         nit - extra space after '='
>>>
>>>     Okay. Definitely simpler code.
>>>
>>> src/hotspot/share/runtime/handshake.cpp
>>>     No comments.
>>>
>>> src/hotspot/share/runtime/safepoint.hpp
>>>     No comments.
>>>
>>> src/hotspot/share/runtime/safepoint.cpp
>>>     Definitely got my attention with
>>>     ObjectSynchronizer::needs_monitor_scavenge().
>>>
>>> src/hotspot/share/runtime/synchronizer.hpp
>>>     No comments.
>>>
>>> src/hotspot/share/runtime/synchronizer.cpp
>>>     L921:     log_info(monitorinflation)("Monitor scavenge needed, triggering 
>>> safepoint cleanup.");
>>>         Thanks for adding the logging line.
>>>
>>>         Update: As Kim pointed out, this code goes away when
>>>         MonitorBound is made obsolete (JDK-8230940). I'm looking
>>>         forward to making that change.
>>>
>>>     L1003:   if (Atomic::load(&_forceMonitorScavenge) == 0 && Atomic::xchg 
>>> (1, &_forceMonitorScavenge) == 0) {
>>>         nit - extra space between 'xchg ('
>>>
>>>         Since InduceScavenge() is only called when the deprecated
>>>         MonitorBound is specified, I think you could use cmpxchg()
>>>         for clarity. Of course, you might be thinking that the
>>>         pattern is a useful example for other folks to copy...
>>>
>>> src/hotspot/share/runtime/thread.cpp
>>>     old L527: // Enqueue a VM_Operation to do the job for us - sometime later
>>>     L527: void Thread::send_async_exception(oop java_thread, oop 
>>> java_throwable) {
>>>     L528:   VM_ThreadStop vm_stop(java_thread, java_throwable);
>>>     L529:   VMThread::execute(&vm_stop);
>>>     L530: }
>>>        Okay so you deleted the comment about the call being async and the
>>>        VM op is no longer async, but does that break the expectation of
>>>        any callers?
>>>
>>>        Off the top of head, I can't think of a way for a caller of
>>>        Thread::send_async_exception() to determine that the call is now
>>>        synchronous instead of asynchronous, but ...
>>>
>>>        Update: Just took a look at JvmtiEnv::StopThread() which calls
>>>        Thread::send_async_exception(). If JVM/TI StopThread() is being
>>>        used to throw an exception at the calling thread, I suspect that
>>>        in the baseline, the call would always return JVMTI_ERROR_NONE.
>>>        With the exception throwing now being synchronous, would that
>>>        affect the return value of the JVM/TI StopThread() call?
>>>
>>>        Looks like the JVM/TI wrapper (see gensrc/jvmtifiles/jvmtiEnter.cpp
>>>        in the build directory) uses ThreadInVMfromNative so the calling
>>>        thread is in VM when it requests the now synchronous VM operation.
>>>        When it requests the VM op, the calling thread will block which
>>>        should allow the VM thread to execute the op. No worries there so
>>>        far...
>>>
>>>        It looks like the code also uses CautiouslyPreserveExceptionMark
>>>        so I think if the exception is delivered to the calling thread
>>>        it won't affect the return from jvmti_env->StopThread(), i.e., we
>>>        will have our return value. The CautiouslyPreserveExceptionMark
>>>        destructor won't kick in until we return from jvmti_StopThread()
>>>        (the JVM/TI wrapper from the build).
>>>
>>>        However, that might cause this assertion to fire:
>>>
>>>        src/hotspot/share/utilities/preserveException.cpp:
>>>        assert(!_thread->has_pending_exception(), "unexpected exception 
>>> generated");
>>>
>>>        because it is now detecting that an exception was thrown
>>>        while executing a JVM/TI call. This is pure theory here.
>>>
>>> src/hotspot/share/jfr/leakprofiler/utilities/vmOperation.hpp
>>>     No comments.
>>>
>>> src/hotspot/share/jfr/recorder/service/jfrRecorderService.cpp
>>>     No comments.
>>>
>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>     old L85:     // Use async VM operation to avoid blocking the Watcher thread.
>>>         Again, you've deleted the comment, but is there going to
>>>         be any unexpected side effects from the change? Looks like
>>>         the work consists of:
>>>
>>>         L70: ClassLoaderDataGraph::dictionary_classes_do(enable_biased_locking);
>>>
>>>         Is that going to be a problem for the WatcherThread?
>>>
>>> test/hotspot/gtest/threadHelper.inline.hpp
>>>     No comments.
>>>
>>> As David H. likes to say: the proof is in the building and testing.
>>>
>>> Thumbs up on the overall idea and implementation. There might be an
>>> issue lurking there in JVM/TI StopThread(), but that's just a theory
>>> on my part...
>>>
>>> Dan
>>>
>>>
>>>
>>> On 11/22/19 9:39 AM, Robbin Ehn wrote:
>>>> Hi David,
>>>>
>>>> On 11/22/19 7:13 AM, David Holmes wrote:
>>>>> Hi Robbin,
>>>>>
>>>>> On 21/11/2019 9:50 pm, Robbin Ehn wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Here is v3:
>>>>>>
>>>>>> Full:
>>>>>> http://cr.openjdk.java.net/~rehn/8234086/v3/full/webrev/
>>>>>
>>>>> src/hotspot/share/runtime/synchronizer.cpp
>>>>>
>>>>> Looking at the highly discussed:
>>>>>
>>>>> if (Atomic::load(&ForceMonitorScavenge) == 0 && Atomic::xchg (1, 
>>>>> &ForceMonitorScavenge) == 0) {
>>>>>
>>>>> why isn't that just:
>>>>>
>>>>> if (Atomic::cmpxchg(1, &ForceMonitorScavenge,0) == 0) {
>>>>>
>>>>> ??
>>>>
>>>> I assumed someone had seen contention on ForceMonitorScavenge.
>>>> Many threads can be enter and re-enter here.
>>>> I don't know if that's still the case.
>>>>
>>>> Since we only hit this path when the deprecated MonitorsBound is set, I 
>>>> think I can change it?
>>>>
>>>>>
>>>>> Also while we are here can we clean this up further:
>>>>>
>>>>> static volatile int ForceMonitorScavenge = 0;
>>>>>
>>>>> becomes
>>>>>
>>>>> static int _forceMonitorScavenge = 0;
>>>>>
>>>>> so the variable doesn't look like it came from globals.hpp :)
>>>>>
>>>>
>>>> Sure!
>>>>
>>>>> Just to be clear, I understand the changes around monitor scavenging now, 
>>>>> though I'm not sure getting rid of async VM ops and replacing with a new 
>>>>> way to directly wakeup the VMThread really amounts to a simplification.
>>>>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/vmOperations.hpp
>>>>>
>>>>> I still think getting rid of Mode altogether would be a good 
>>>>> simplification. :)
>>>>
>>>> Sure!
>>>>
>>>> Here is v4, inc:
>>>> http://cr.openjdk.java.net/~rehn/8234086/v4/inc/webrev/index.html
>>>> Full:
>>>> http://cr.openjdk.java.net/~rehn/8234086/v4/full/webrev/index.html
>>>>
>>>> Tested t1-3
>>>>
>>>> Thanks, Robbin
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>
>>>>>> Inc:
>>>>>> http://cr.openjdk.java.net/~rehn/8234086/v3/inc/webrev/
>>>>>>
>>>>>> Tested t1-3
>>>>>>
>>>>>> Thanks, Robbin
>>>>>>
>>>>>> On 2019-11-19 12:05, Robbin Ehn wrote:
>>>>>>> Hi all, please review.
>>>>>>>
>>>>>>> CMS was the last real user of the more advantage features of VM operation.
>>>>>>> VM operation can be simplified to always be an stack object and thus 
>>>>>>> either be
>>>>>>> of safepoint or no safepoint type.
>>>>>>>
>>>>>>> VM_EnableBiasedLocking is executed once by watcher thread, if needed 
>>>>>>> (default not used). Making it synchrone doesn't matter.
>>>>>>> VM_ThreadStop is executed by a JavaThread, that thread should stop for 
>>>>>>> the safepoint anyways, no real point in not stopping direct.
>>>>>>> VM_ScavengeMonitors is only used to trigger a safepoint cleanup, the VM 
>>>>>>> op is not needed. Arguably this thread should actually stop here, since 
>>>>>>> we are about to safepoint.
>>>>>>>
>>>>>>> There is also a small cleanup in vmThread.cpp where an unused method is 
>>>>>>> removed.
>>>>>>> And the extra safepoint is removed:
>>>>>>> "// We want to make sure that we get to a safepoint regularly"
>>>>>>> No we don't :)
>>>>>>>
>>>>>>> Issue:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8234086
>>>>>>> Change-set:
>>>>>>> http://cr.openjdk.java.net/~rehn/8234086/v1/webrev/index.html
>>>>>>>
>>>>>>> Tested scavenge manually, passes t1-2.
>>>>>>>
>>>>>>> Thanks, Robbin
>>>
>>
>