RFR (S): 8023461: Thread holding lock at safepoint that vm can block on: MethodCompileQueue_lock

Igor Veresov igor.veresov at oracle.com
Wed Mar 12 00:50:56 UTC 2014


What I mean is a strategy like this:

diff --git a/src/share/vm/compiler/compileBroker.cpp b/src/share/vm/compiler/compileBroker.cpp
--- a/src/share/vm/compiler/compileBroker.cpp
+++ b/src/share/vm/compiler/compileBroker.cpp
@@ -247,6 +247,7 @@

   _is_complete = false;
   _is_success = false;
+  _skip_compile = false;
   _code_handle = NULL;

   _hot_method = NULL;
@@ -1691,9 +1692,14 @@
     // Assign the task to the current thread.  Mark this compilation
     // thread as active for the profiler.
     CompileTaskWrapper ctw(task);
+    methodHandle method(thread, task->method());
+    if (task->skip_compile()) {
+      method->clear_queued_for_compilation();
+      continue;
+    }
+
     nmethodLocker result_handle;  // (handle for the nmethod produced by this task)
     task->set_code_handle(&result_handle);
-    methodHandle method(thread, task->method());

     // Never compile a method if breakpoints are present in it
     if (method()->number_of_breakpoints() == 0) {
diff --git a/src/share/vm/compiler/compileBroker.hpp b/src/share/vm/compiler/compileBroker.hpp
--- a/src/share/vm/compiler/compileBroker.hpp
+++ b/src/share/vm/compiler/compileBroker.hpp
@@ -48,6 +48,7 @@
   bool         _is_complete;
   bool         _is_success;
   bool         _is_blocking;
+  bool         _skip_compile;
   int          _comp_level;
   int          _num_inlined_bytecodes;
   nmethodLocker* _code_handle;  // holder of eventual result
@@ -78,6 +79,8 @@
   bool         is_blocking() const               { return _is_blocking; }
   bool         is_success() const                { return _is_success; }

+  bool         skip_compile() const              { return _skip_compile; }
+  void         set_skip_compile()                { _skip_compile = true; }
   nmethodLocker* code_handle() const             { return _code_handle; }
   void         set_code_handle(nmethodLocker* l) { _code_handle = l; }
   nmethod*     code() const;                     // _code_handle->code()
diff --git a/src/share/vm/runtime/advancedThresholdPolicy.cpp b/src/share/vm/runtime/advancedThresholdPolicy.cpp
--- a/src/share/vm/runtime/advancedThresholdPolicy.cpp
+++ b/src/share/vm/runtime/advancedThresholdPolicy.cpp
@@ -174,11 +174,8 @@
         if (PrintTieredEvents) {
           print_event(REMOVE_FROM_QUEUE, method, method, task->osr_bci(), (CompLevel)task->comp_level());
         }
-        CompileTaskWrapper ctw(task); // Frees the task
-        compile_queue->remove(task);
-        method->clear_queued_for_compilation();
-        task = next_task;
-        continue;
+        task->set_skip_compile();
+        return task;
       }

       // Select a method with a higher rate
On Mar 11, 2014, at 5:30 PM, Igor Veresov <igor.veresov at oracle.com> wrote:

> Yes, I think you’re right, this is a bug. select_task() obviously assumes there can be no safepoints. May be we can introduce a flag in CompileTask that would indicate that it needs to be removed, and then return it from select_task() and correspondingly from CompileQueue::get() and then free it with the existing CompileTaskWrapper that is CompileBroker::compiler_thread_loop() instead of invoking a compile. That way, combined with your existing fixes select_task() will be lock-free. Does that make sense?
> 
> igor
> 
> On Mar 11, 2014, at 4:34 PM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
> 
>> Igor,
>> 
>>> I vaguely remember that is was allowed before. That’s basically the reason why everything has handles in the policy. I need to recall how that works...
>> It's there for a long time, but I converted the check from VM warning to fatal error only recently.
>> 
>> AdvancedThresholdPolicy::select_task operates on raw Method*. As I can see in the sources, handles are used only in Method::build_method_counters. Lazy allocation of method counters wasn't there originally. It was added by 8010862.
>> 
>>> Btw, I may be wrong but it seems like there could be a race in MethodCounters creation. There is a similar problem with MDO, but we take a lock for it to avoid races.
>> You are right. There's a window in Method::build_method_counters when counters can be allocated twice. We need to grab a lock / use CAS to avoid memory leak here.
>> 
>> Best regards,
>> Vladimir Ivanov
>> 
>> [1] https://bugs.openjdk.java.net/browse/JDK-8010862
>> 
>>> 
>>> igor
>>> 
>>> On Mar 11, 2014, at 3:04 PM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
>>> 
>>>> The policy for a thread is not to hold any locks VM can block on when entering a safepoint (see Thread::check_for_valid_safepoint_state).
>>>> 
>>>> Otherwise we would need to be very careful about what code can be executed during a safepoint to avoid deadlocks.
>>>> 
>>>> There are exceptions (like Threads_lock and Compile_lock), but generally we try to adhere the rule.
>>>> 
>>>> Making an exception for MethodCompileQueue looks safe (I went through the code and didn't find any scenarios when VM can attempt to grab it during a safepoint), but I'd like to avoid it if possible.
>>>> 
>>>> Best regards,
>>>> Vladimir Ivanov
>>>> 
>>>> On 3/11/14 10:50 PM, Igor Veresov wrote:
>>>>> Could you please remind me why we can’t enter a safepoint while holding the MethodCompileQueue_lock?
>>>>> 
>>>>> igor
>>>>> 
>>>>> On Mar 11, 2014, at 8:50 AM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
>>>>> 
>>>>>> Unfortunately, it's not enough. There's another safepoint check.
>>>>>> 
>>>>>> For blocking compilation requests of stale methods CompileTaskWrapper (see AdvancedThresholdPolicy::select_task) sends a notification to blocked threads after cancelling the compilation. It can safepoint while locking on compile task before sending notification.
>>>>>> 
>>>>>> I don't see how to avoid this situation. Any ideas?
>>>>>> Otherwise, I need to exclude MethodCompileQueue from the check in Thread::check_for_valid_safepoint_state.
>>>>>> 
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>> 
>>>>>> On 3/11/14 11:58 AM, Vladimir Ivanov wrote:
>>>>>>> Igor, Vladimir, thanks for review.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Vladimir Ivanov
>>>>>>> 
>>>>>>> On 3/11/14 7:31 AM, Igor Veresov wrote:
>>>>>>>> I think it’s a reasonable fix.
>>>>>>>> 
>>>>>>>> igor
>>>>>>>> 
>>>>>>>> On Mar 10, 2014, at 4:57 PM, Vladimir Ivanov
>>>>>>>> <vladimir.x.ivanov at oracle.com> wrote:
>>>>>>>> 
>>>>>>>>> Vladimir, thanks for the review.
>>>>>>>>> 
>>>>>>>>> You are absolutely right about
>>>>>>>>> Method::increment_interpreter_invocation_count. Reverted the change.
>>>>>>>>> 
>>>>>>>>> Updated fix:
>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023461/webrev.01/
>>>>>>>>> 
>>>>>>>>> Yes, Igor's feedback on this change would be invaluable.
>>>>>>>>> 
>>>>>>>>> Best regards,
>>>>>>>>> Vladimir Ivanov
>>>>>>>>> 
>>>>>>>>> On 3/11/14 2:33 AM, Vladimir Kozlov wrote:
>>>>>>>>>> The method Method::increment_interpreter_invocation_count(TRAP) changes
>>>>>>>>>> are incorrect. It is used by C++ Interpreter and you did not modified
>>>>>>>>>> code there. I would leave this method unchanged.
>>>>>>>>>> 
>>>>>>>>>> The rest looks fine to me but Igor should know better this code.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir K
>>>>>>>>>> 
>>>>>>>>>> On 3/7/14 8:26 AM, Vladimir Ivanov wrote:
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023461/webrev.00
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8023461
>>>>>>>>>>> 42 lines changed: 13 ins; 1 del; 28 mod
>>>>>>>>>>> 
>>>>>>>>>>> The rule of thumb for VM is that a thread shouldn't hold any VM lock
>>>>>>>>>>> when it reaches a safepoint. It's not the case for
>>>>>>>>>>> MethodCompileQueue_lock now.
>>>>>>>>>>> 
>>>>>>>>>>> The problem is that AdvancedThresholdPolicy updates task's rate when
>>>>>>>>>>> iterating compiler queue. It holds MethodCompileQueue_lock while doing
>>>>>>>>>>> so. Method counters are allocated lazily. If method counters aren't
>>>>>>>>>>> there and VM fails to allocate them, GC is initiated (see
>>>>>>>>>>> CollectorPolicy::satisfy_failed_metadata_allocation) and a thead
>>>>>>>>>>> entering a safepoint holding MethodCompileQueue lock.
>>>>>>>>>>> 
>>>>>>>>>>> Normally, counters are initialized during method interpretation,
>>>>>>>>>>> but in
>>>>>>>>>>> Xcomp mode it's not the case. That's the mode where the failures are
>>>>>>>>>>> observed.
>>>>>>>>>>> 
>>>>>>>>>>> The fix is to skip the update, if counters aren't allocated yet.
>>>>>>>>>>> 
>>>>>>>>>>> Testing: added No_Safepoint_Verifier, JPRT, failing tests from nightly
>>>>>>>>>>> testing (in progress).
>>>>>>>>>>> 
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Vladimir Ivanov
>>>>>>>> 
>>>>> 
>>> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20140311/29d3cc2b/attachment-0001.html>


More information about the hotspot-compiler-dev mailing list