[13] RFR(M): 8163511: Allocation of compile task fails with assert: "Leaking compilation tasks?"
coleen.phillimore at oracle.com
coleen.phillimore at oracle.com
Wed Feb 27 16:25:02 UTC 2019
On 2/27/19 7:13 AM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> thanks for the review!
>
> On 26.02.19 19:36, Vladimir Kozlov wrote:
>> My only concern is CompileTask::select_for_compilation() where you may have small window between
>> releasing weak reference and adding strong. Can we add strong reference fist before removing weak?
>>
>> With concurrent class unloading we may hit the assert in select_for_compilation() I think. May be
>> make the method get task as argument and return NULL if is_unloaded() is true. What do you think?
> I thought we should be safe because we can't have a safepoint after the is_unloaded() check in
> CompilationPolicy::policy()->select_task() and this assert (there is a NoSafepointVerifier in the
> caller). But I'm not sure what the guarantees are with concurrent class unloading.
>
> I've changed the code to create the (local) handle before removing the weak handles:
> http://cr.openjdk.java.net/~thartmann/8163511/webrev.01/
>
> Now I'm wondering what will happen if concurrent class unloading kicks in right after the
> is_unloaded() check and before we create the method_holder handle.
>
> I think someone from GC/runtime (Erik Österlund?) should have a look at this.
I don't know for sure (wait for Erik!) but it should be ok because we
have other places where we make objects alive with a Handle to
klass_holder().
http://cr.openjdk.java.net/~thartmann/8163511/webrev.01/src/hotspot/share/compiler/compileTask.cpp.udiff.html
Nit, can you use // comments.
Also, there should be a HandleMark here.
I think this method also needs to check for is_unloading() before
touching method().
// RedefineClasses support
void CompileTask::metadata_do(void f(Metadata*)) {
f(method());
if (hot_method() != NULL && hot_method() != method()) {
f(hot_method());
}
}
This change makes sense to me. It would be nice if these were in VM
mode, if you could use WeakHandles instead of JNI, but you would need
your own weak handle area. For future consideration, anyway.
Thanks,
Coleen
>
>> Do you kept output format for flag -XX:+PrintCompilation and changed only hs_err output? Or you
>> changed both?
> I've changed both because both are using CompileTask::print -> CompileTask::print_impl. What about
> guarding the additional output by -XX:+Verbose (see new webrev)?
>
> Best regards,
> Tobias
>
> P.S. I'll be out until Wednesday next week.
>
>
>> On 2/26/19 8:39 AM, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please review the following patch:
>>> https://bugs.openjdk.java.net/browse/JDK-8163511
>>> http://cr.openjdk.java.net/~thartmann/8163511/webrev.00/
>>>
>>> We hit an assert while trying to allocate a compile task because the compile queue already contains
>>> 10.000 tasks. The root cause is that the compiler threads are not able to keep up with the high
>>> number of compilation requests. The (internal) test that triggers this runs many different tests in
>>> the same VM by using different class loaders. Many of these tests use the same framework and
>>> therefore the same methods are enqueued for compilation again and again (up to 100x).
>>>
>>> The bug shows up more often when running with -XX:-TieredCompilation which is due to a lower number
>>> of compiler threads and a different threshold policy that does not remove stale (unused) methods
>>> from the compile queue. But the same problem can and does happen with TieredCompilation enabled (for
>>> example, we never remove "old" methods that were executed a lot, even if they are stale).
>>>
>>> After a test has been executed, the class loader becomes dead and therefore all test classes/methods
>>> should be unloaded. However, a compile task in the queue keeps the referenced method alive through a
>>> global handle (see CompileTask::initialize -> JNIHandles::make_global) until compilation finished or
>>> the task was removed from the queue for another reason. This causes compilation of thousands of dead
>>> methods.
>>>
>>> The regression test TestOverloadCompileQueues.java triggers this reliably for the C1 and C2 compile
>>> queues. The blue lines show the growth of the number of tasks in the queue until we hit the assert,
>>> the red lines show the growth with the fix:
>>> http://cr.openjdk.java.net/~thartmann/8163511/8163511_results.pdf
>>>
>>> The proposed fix uses weak handles for the compile tasks that don't prevent unloading but can be
>>> used to check if the referenced method is still alive (CompileTask::is_unloaded()). Unloaded tasks
>>> are aggressively removed with both threshold policies. If the method is still alive when the task is
>>> selected for compilation, the weak references are replaced by strong references (to avoid unloading
>>> during compilation which would crash the compiler).
>>>
>>> Of course, overloading the queue is still possible if classes are not unloaded fast enough or not at
>>> all (in the regression test, simply remove the System.gc() call). But I think this is very unlikely
>>> for real world applications and this assert does not affect product builds.
>>>
>>> The fix also changes the printing of compile tasks such that the time when they were enqueued and
>>> the time when compilation started is shown in the hs_err file:
>>>
>>> Threads with active compile tasks:
>>> C2 CompilerThread0 22446 18824 200 5372 4 java.lang.reflect.Method::<init> (68 bytes)
>>> C1 CompilerThread0 22446 21474 199 3377 2 jdk.internal.loader.URLClassPath$3::run (172 bytes)
>>>
>>> I'm currently running extended testing.
>>>
>>> Thanks,
>>> Tobias
>>>
More information about the hotspot-dev
mailing list