RFR (S): 8023461: Thread holding lock at safepoint that vm can block on: MethodCompileQueue_lock
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Wed Mar 12 11:58:20 UTC 2014
Igor, thanks for the suggestion.
I thought about this approach and my concern was that we need multiple
passes over the queue to clear stale compile requests (worst case is
O(n^2)). Currently, the queue is cleared out in a single pass.
But the only option I see is to accumulate canceled tasks in a side
structure and send bulk notifications between iterations of compiler loop.
Best regards,
Vladimir Ivanov
On 3/12/14 4:50 AM, Igor Veresov wrote:
> What I mean is a strategy like this:
>
> diff --git a/src/share/vm/compiler/compileBroker.cpp
> b/src/share/vm/compiler/compileBroker.cpp
> --- a/src/share/vm/compiler/compileBroker.cpp
> +++ b/src/share/vm/compiler/compileBroker.cpp
> @@ -247,6 +247,7 @@
>
> _is_complete = false;
> _is_success = false;
> + _skip_compile = false;
> _code_handle = NULL;
>
> _hot_method = NULL;
> @@ -1691,9 +1692,14 @@
> // Assign the task to the current thread. Mark this compilation
> // thread as active for the profiler.
> CompileTaskWrapper ctw(task);
> + methodHandle method(thread, task->method());
> + if (task->skip_compile()) {
> + method->clear_queued_for_compilation();
> + continue;
> + }
> +
> nmethodLocker result_handle; // (handle for the nmethod produced
> by this task)
> task->set_code_handle(&result_handle);
> - methodHandle method(thread, task->method());
>
> // Never compile a method if breakpoints are present in it
> if (method()->number_of_breakpoints() == 0) {
> diff --git a/src/share/vm/compiler/compileBroker.hpp
> b/src/share/vm/compiler/compileBroker.hpp
> --- a/src/share/vm/compiler/compileBroker.hpp
> +++ b/src/share/vm/compiler/compileBroker.hpp
> @@ -48,6 +48,7 @@
> bool _is_complete;
> bool _is_success;
> bool _is_blocking;
> + bool _skip_compile;
> int _comp_level;
> int _num_inlined_bytecodes;
> nmethodLocker* _code_handle; // holder of eventual result
> @@ -78,6 +79,8 @@
> bool is_blocking() const { return _is_blocking; }
> bool is_success() const { return _is_success; }
>
> + bool skip_compile() const { return _skip_compile; }
> + void set_skip_compile() { _skip_compile = true; }
> nmethodLocker* code_handle() const { return _code_handle; }
> void set_code_handle(nmethodLocker* l) { _code_handle = l; }
> nmethod* code() const; // _code_handle->code()
> diff --git a/src/share/vm/runtime/advancedThresholdPolicy.cpp
> b/src/share/vm/runtime/advancedThresholdPolicy.cpp
> --- a/src/share/vm/runtime/advancedThresholdPolicy.cpp
> +++ b/src/share/vm/runtime/advancedThresholdPolicy.cpp
> @@ -174,11 +174,8 @@
> if (PrintTieredEvents) {
> print_event(REMOVE_FROM_QUEUE, method, method,
> task->osr_bci(), (CompLevel)task->comp_level());
> }
> - CompileTaskWrapper ctw(task); // Frees the task
> - compile_queue->remove(task);
> - method->clear_queued_for_compilation();
> - task = next_task;
> - continue;
> + task->set_skip_compile();
> + return task;
> }
>
> // Select a method with a higher rate
> On Mar 11, 2014, at 5:30 PM, Igor Veresov <igor.veresov at oracle.com
> <mailto:igor.veresov at oracle.com>> wrote:
>
>> Yes, I think you’re right, this is a bug. select_task() obviously
>> assumes there can be no safepoints. May be we can introduce a flag in
>> CompileTask that would indicate that it needs to be removed, and then
>> return it from select_task() and correspondingly from
>> CompileQueue::get() and then free it with the existing
>> CompileTaskWrapper that is CompileBroker::compiler_thread_loop()
>> instead of invoking a compile. That way, combined with your existing
>> fixes select_task() will be lock-free. Does that make sense?
>>
>> igor
>>
>> On Mar 11, 2014, at 4:34 PM, Vladimir Ivanov
>> <vladimir.x.ivanov at oracle.com <mailto:vladimir.x.ivanov at oracle.com>>
>> wrote:
>>
>>> Igor,
>>>
>>>> I vaguely remember that is was allowed before. That’s basically the
>>>> reason why everything has handles in the policy. I need to recall
>>>> how that works...
>>> It's there for a long time, but I converted the check from VM warning
>>> to fatal error only recently.
>>>
>>> AdvancedThresholdPolicy::select_task operates on raw Method*. As I
>>> can see in the sources, handles are used only in
>>> Method::build_method_counters. Lazy allocation of method counters
>>> wasn't there originally. It was added by 8010862.
>>>
>>>> Btw, I may be wrong but it seems like there could be a race in
>>>> MethodCounters creation. There is a similar problem with MDO, but we
>>>> take a lock for it to avoid races.
>>> You are right. There's a window in Method::build_method_counters when
>>> counters can be allocated twice. We need to grab a lock / use CAS to
>>> avoid memory leak here.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8010862
>>>
>>>>
>>>> igor
>>>>
>>>> On Mar 11, 2014, at 3:04 PM, Vladimir Ivanov
>>>> <vladimir.x.ivanov at oracle.com <mailto:vladimir.x.ivanov at oracle.com>>
>>>> wrote:
>>>>
>>>>> The policy for a thread is not to hold any locks VM can block on
>>>>> when entering a safepoint (see
>>>>> Thread::check_for_valid_safepoint_state).
>>>>>
>>>>> Otherwise we would need to be very careful about what code can be
>>>>> executed during a safepoint to avoid deadlocks.
>>>>>
>>>>> There are exceptions (like Threads_lock and Compile_lock), but
>>>>> generally we try to adhere the rule.
>>>>>
>>>>> Making an exception for MethodCompileQueue looks safe (I went
>>>>> through the code and didn't find any scenarios when VM can attempt
>>>>> to grab it during a safepoint), but I'd like to avoid it if possible.
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
>>>>>
>>>>> On 3/11/14 10:50 PM, Igor Veresov wrote:
>>>>>> Could you please remind me why we can’t enter a safepoint while
>>>>>> holding the MethodCompileQueue_lock?
>>>>>>
>>>>>> igor
>>>>>>
>>>>>> On Mar 11, 2014, at 8:50 AM, Vladimir Ivanov
>>>>>> <vladimir.x.ivanov at oracle.com
>>>>>> <mailto:vladimir.x.ivanov at oracle.com>> wrote:
>>>>>>
>>>>>>> Unfortunately, it's not enough. There's another safepoint check.
>>>>>>>
>>>>>>> For blocking compilation requests of stale methods
>>>>>>> CompileTaskWrapper (see AdvancedThresholdPolicy::select_task)
>>>>>>> sends a notification to blocked threads after cancelling the
>>>>>>> compilation. It can safepoint while locking on compile task
>>>>>>> before sending notification.
>>>>>>>
>>>>>>> I don't see how to avoid this situation. Any ideas?
>>>>>>> Otherwise, I need to exclude MethodCompileQueue from the check in
>>>>>>> Thread::check_for_valid_safepoint_state.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Vladimir Ivanov
>>>>>>>
>>>>>>> On 3/11/14 11:58 AM, Vladimir Ivanov wrote:
>>>>>>>> Igor, Vladimir, thanks for review.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Vladimir Ivanov
>>>>>>>>
>>>>>>>> On 3/11/14 7:31 AM, Igor Veresov wrote:
>>>>>>>>> I think it’s a reasonable fix.
>>>>>>>>>
>>>>>>>>> igor
>>>>>>>>>
>>>>>>>>> On Mar 10, 2014, at 4:57 PM, Vladimir Ivanov
>>>>>>>>> <vladimir.x.ivanov at oracle.com
>>>>>>>>> <mailto:vladimir.x.ivanov at oracle.com>> wrote:
>>>>>>>>>
>>>>>>>>>> Vladimir, thanks for the review.
>>>>>>>>>>
>>>>>>>>>> You are absolutely right about
>>>>>>>>>> Method::increment_interpreter_invocation_count. Reverted the
>>>>>>>>>> change.
>>>>>>>>>>
>>>>>>>>>> Updated fix:
>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023461/webrev.01/
>>>>>>>>>>
>>>>>>>>>> Yes, Igor's feedback on this change would be invaluable.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>
>>>>>>>>>> On 3/11/14 2:33 AM, Vladimir Kozlov wrote:
>>>>>>>>>>> The method
>>>>>>>>>>> Method::increment_interpreter_invocation_count(TRAP) changes
>>>>>>>>>>> are incorrect. It is used by C++ Interpreter and you did not
>>>>>>>>>>> modified
>>>>>>>>>>> code there. I would leave this method unchanged.
>>>>>>>>>>>
>>>>>>>>>>> The rest looks fine to me but Igor should know better this code.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vladimir K
>>>>>>>>>>>
>>>>>>>>>>> On 3/7/14 8:26 AM, Vladimir Ivanov wrote:
>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/8023461/webrev.00
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8023461
>>>>>>>>>>>> 42 lines changed: 13 ins; 1 del; 28 mod
>>>>>>>>>>>>
>>>>>>>>>>>> The rule of thumb for VM is that a thread shouldn't hold any
>>>>>>>>>>>> VM lock
>>>>>>>>>>>> when it reaches a safepoint. It's not the case for
>>>>>>>>>>>> MethodCompileQueue_lock now.
>>>>>>>>>>>>
>>>>>>>>>>>> The problem is that AdvancedThresholdPolicy updates task's
>>>>>>>>>>>> rate when
>>>>>>>>>>>> iterating compiler queue. It holds MethodCompileQueue_lock
>>>>>>>>>>>> while doing
>>>>>>>>>>>> so. Method counters are allocated lazily. If method counters
>>>>>>>>>>>> aren't
>>>>>>>>>>>> there and VM fails to allocate them, GC is initiated (see
>>>>>>>>>>>> CollectorPolicy::satisfy_failed_metadata_allocation) and a thead
>>>>>>>>>>>> entering a safepoint holding MethodCompileQueue lock.
>>>>>>>>>>>>
>>>>>>>>>>>> Normally, counters are initialized during method interpretation,
>>>>>>>>>>>> but in
>>>>>>>>>>>> Xcomp mode it's not the case. That's the mode where the
>>>>>>>>>>>> failures are
>>>>>>>>>>>> observed.
>>>>>>>>>>>>
>>>>>>>>>>>> The fix is to skip the update, if counters aren't allocated yet.
>>>>>>>>>>>>
>>>>>>>>>>>> Testing: added No_Safepoint_Verifier, JPRT, failing tests
>>>>>>>>>>>> from nightly
>>>>>>>>>>>> testing (in progress).
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>
>>>>>>
>>>>
>>
>
More information about the hotspot-compiler-dev
mailing list