JDK-8230459: Test failed to resume JVMCI CompilerThread
David Holmes
david.holmes at oracle.com
Mon Nov 4 23:18:48 UTC 2019
On 4/11/2019 9:12 pm, Doerr, Martin wrote:
> Hi all,
>
> @Dean
>> changing can_remove() to CompileBroker::can_remove()?
> Yes. That would be an option.
>
> @Kim, David
> I think there's another problem with this implementation.
> It introduces a use-after-free pattern due to concurrency.
> Compiler threads may still read the oops from the handles after one of them has called destroy_global until next safepoint. It doesn't matter which values they get in this case, but the VM should not crash. I believe that OopStorage allows freeing storage without safepoints, so this may be unsafe. Right?
I don't understand what you mean. If a compiler thread holds an oop, any
oop, it must hold it in a Handle to ensure it can't be gc'd.
David
> If so, I think replacing the oops in the handles (and keeping the handles alive) would be better. And also much more simple.
>
> Best regards,
> Martin
>
>
>> -----Original Message-----
>> From: dean.long at oracle.com <dean.long at oracle.com>
>> Sent: Samstag, 2. November 2019 08:36
>> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes
>> <david.holmes at oracle.com>; Kim Barrett <kim.barrett at oracle.com>
>> Cc: Vladimir Kozlov (vladimir.kozlov at oracle.com)
>> <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: JDK-8230459: Test failed to resume JVMCI CompilerThread
>>
>> Hi Martin,
>>
>> On 10/30/19 3:18 AM, Doerr, Martin wrote:
>>> Hi David,
>>>
>>>> I don't think factoring out CompileBroker::clear_compiler2_object when
>>>> it is only used once was warranted, but that's call for compiler team to
>>>> make.
>>> I did that because _compiler2_objects is private and there's currently no
>> setter available.
>>> But let's see what the compiler folks think.
>>
>> how about changing can_remove() to CompileBroker::can_remove()? Then
>> you
>> can access _compiler2_objects directly, right?
>>
>> dl
>>>> Otherwise changes seem fine and I have noted the use of the
>>>> MutexUnlocker as per your direct email.
>>> Thanks a lot for reviewing. It was not a trivial one
>>>
>>> You had noticed an incorrect usage of the CHECK macro. I've created a new
>> bug for that:
>>> https://bugs.openjdk.java.net/browse/JDK-8233193
>>> Would be great if you could take a look if that's what you meant and made
>> adaptions if needed.
>>>
>>> Best regards,
>>> Martin
>>>
>>>
>>>> -----Original Message-----
>>>> From: David Holmes <david.holmes at oracle.com>
>>>> Sent: Mittwoch, 30. Oktober 2019 05:47
>>>> To: Doerr, Martin <martin.doerr at sap.com>; Kim Barrett
>>>> <kim.barrett at oracle.com>
>>>> Cc: Vladimir Kozlov (vladimir.kozlov at oracle.com)
>>>> <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net;
>>>> David Holmes <David.Holmes at oracle.com>
>>>> Subject: Re: JDK-8230459: Test failed to resume JVMCI CompilerThread
>>>>
>>>> Hi Martin,
>>>>
>>>> On 29/10/2019 12:06 am, Doerr, Martin wrote:
>>>>> Hi David and Kim,
>>>>>
>>>>> I think it's easier to talk about code. So here's a new webrev:
>>>>>
>>>>
>> http://cr.openjdk.java.net/~mdoerr/8230459_JVMCI_thread_objects/webr
>>>> ev.03/
>>>>
>>>> I don't think factoring out CompileBroker::clear_compiler2_object when
>>>> it is only used once was warranted, but that's call for compiler team to
>>>> make. Otherwise changes seem fine and I have noted the use of the
>>>> MutexUnlocker as per your direct email.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> @Kim:
>>>>> Thanks for looking at the handle related parts. It's ok if you don't want to
>>>> be a reviewer of the whole change.
>>>>>> I think it's weird that can_remove() is a predicate with optional side
>>>>>> effects. I think it would be simpler to have it be a pure predicate,
>>>>>> and have the one caller with do_it = true perform the updates. That
>>>>>> should include NULLing out the handle pointer (perhaps debug-only,
>> but
>>>>>> it doesn't cost much to cleanly maintain the data structure).
>>>>> Nevertheless, it has the advantage that it enforces the update to be
>>>> consistent.
>>>>> A caller could use it without holding the lock or mess it up otherwise.
>>>>> In addition, I don't what to change that as part of this fix.
>>>>>
>>>>>> So far as I can tell, THREAD == NULL here.
>>>>> This is a very tricky part (not my invention):
>>>>> EXCEPTION_MARK contains an ExceptionMark constructor call which
>> sets
>>>> __the_thread__ to Thread::current().
>>>>> I don't want to publish my opinion about this
>>>>>
>>>>> @David:
>>>>> Seems like this option is preferred over option 3
>>>> (possibly_add_compiler_threads part of webrev.02 and leave the
>>>> initialization as is).
>>>>> So when you're ok with it, I'll request a 2nd review from the compiler
>> folks
>>>> (I should change the subject to contain RFR).
>>>>> Thanks,
>>>>> Martin
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>> Sent: Montag, 28. Oktober 2019 05:04
>>>>>> To: Kim Barrett <kim.barrett at oracle.com>
>>>>>> Cc: Doerr, Martin <martin.doerr at sap.com>; Vladimir Kozlov
>>>>>> (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; hotspot-
>>>>>> compiler-dev at openjdk.java.net
>>>>>> Subject: Re: JDK-8230459: Test failed to resume JVMCI CompilerThread
>>>>>>
>>>>>> On 28/10/2019 1:42 pm, Kim Barrett wrote:
>>>>>>>> On Oct 27, 2019, at 6:04 PM, David Holmes
>>>> <david.holmes at oracle.com>
>>>>>> wrote:
>>>>>>>> On 24/10/2019 12:47 am, Doerr, Martin wrote:
>>>>>>>>> Hi Kim,
>>>>>>>>> I didn't like using the OopStorage stuff directly, either. I just have
>> not
>>>>>> seen how to allocate a global handle and add the oop later.
>>>>>>>>> Thanks for pointing me to JVMCI::make_global. I was not aware of
>>>> that.
>>>>>>>>> So I can imagine 3 ways to implement it:
>>>>>>>>> 1. Use JNIHandles::destroy_global to release the handles. I just
>> added
>>>>>> that to
>>>>
>> http://cr.openjdk.java.net/~mdoerr/8230459_JVMCI_thread_objects/webr
>>>>>> ev.01/
>>>>>>>>> We may want to improve that further by setting the handle pointer
>> to
>>>>>> NULL and asserting that it is NULL before adding the new one.
>>>>>>>>> I had been concerned about NULLs in the array, but looks like the
>>>>>> existing code can deal with that.
>>>>>>>> I think it would be cleaner to both destroy the global handle and
>> NULL it
>>>> in
>>>>>> the array at the same time.
>>>>>>>> This comment
>>>>>>>>
>>>>>>>> 325 // Old j.l.Thread object can die here.
>>>>>>>>
>>>>>>>> Isn't quite accurate. The j.l.Thread is still referenced via ct-
>>> threadObj()
>>>> so
>>>>>> can't "die" until that is also cleared during the actual termination
>> process.
>>>>>>> I think if there is such a thread here that it can't die, because the
>>>>>>> death predicate (the can_remove stuff) won't see that old thread as
>>>>>>> the last thread in _compiler2_objects. That's what I meant by this:
>>>>>>>
>>>>>>>> On Oct 25, 2019, at 4:05 PM, Kim Barrett <kim.barrett at oracle.com>
>>>>>> wrote:
>>>>>>>> I also think that here:
>>>>>>>>
>>>>>>>> 947 jobject thread_handle =
>>>> JNIHandles::make_global(thread_oop);
>>>>>>>> 948 _compiler2_objects[i] = thread_handle;
>>>>>>>>
>>>>>>>> should assert _compiler2_objects[i] == NULL. Or if that isn't a valid
>>>>>>>> assertion then I think there are other problems.
>>>>>>> I think either that comment about an old thread is wrong (and the
>> NULL
>>>>>>> assertion I suggested is okay), or I think the whole mechanism here
>>>>>>> has problems. Or at least I was unable to figure out how it could
>> work...
>>>>>>>
>>>>>> I'm not following sorry. You can't assert NULL unless it's actually set
>>>>>> to NULL which it presently isn't. But it could be set NULL as Martin
>>>>>> suggested:
>>>>>>
>>>>>> "We may want to improve that further by setting the handle pointer to
>>>>>> NULL and asserting that it is NULL before adding the new one."
>>>>>>
>>>>>> and which I also supported. But that aside once the delete_global has
>>>>>> been called that JNIHandle no longer references the j.l.Thread that it
>>>>>> did, at which point it is only reachable via the threadObj() of the
>>>>>> CompilerThread.
>>>>>>
>>>>>> David
>
More information about the hotspot-compiler-dev
mailing list