JDK-8230459: Test failed to resume JVMCI CompilerThread

Doerr, Martin martin.doerr at sap.com
Tue Nov 5 13:37:09 UTC 2019


Hi David

> With JVMCI compiler threads, each getting a new j.l.Thread oop that 
> lasts for the lifetime of that compiler thread (just like a regular 
> JavaThread) do we even actually need these arrays? I'm unclear what 
> purpose they serve when we are not trying to reuse the oops stored in 
> the array. ??

Compiler threads can lookup j.l.Thread objects of live compilers by iterating over the arrays.
That's used to find the last compiler alive or to find a log instance for a compiler.
Could get designed differently, but that would make the change even bigger.

Best regards,
Martin


> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Dienstag, 5. November 2019 13:33
> To: Doerr, Martin <martin.doerr at sap.com>; dean.long at oracle.com; Kim
> Barrett <kim.barrett at oracle.com>
> Cc: Vladimir Kozlov (vladimir.kozlov at oracle.com)
> <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: JDK-8230459: Test failed to resume JVMCI CompilerThread
> 
> On 5/11/2019 6:40 pm, Doerr, Martin wrote:
> > Hi David
> >
> >> I don't understand what you mean. If a compiler thread holds an oop, any
> >> oop, it must hold it in a Handle to ensure it can't be gc'd.
> >
> > The problem is not related to gc.
> > My change introduces destroy_global for the handles. This means that the
> OopStorage portion which has held the oop can get freed.
> > However, other compiler threads are running concurrently. They may
> execute code which reads the oop from the handle which is freed by this
> thread. Reading stale data is not a problem here, but reading freed memory
> may assert or even crash in general.
> > I can't see how OopStorage supports reading from handles which were
> freed by destroy_global.
> 
> With JVMCI compiler threads, each getting a new j.l.Thread oop that
> lasts for the lifetime of that compiler thread (just like a regular
> JavaThread) do we even actually need these arrays? I'm unclear what
> purpose they serve when we are not trying to reuse the oops stored in
> the array. ??
> 
> David
> -----
> 
> > I think it would be safe if the freeing only occurred at safepoints, but I don't
> think this is the case.
> >
> > Best regards,
> > Martin
> >
> >
> >> -----Original Message-----
> >> From: David Holmes <david.holmes at oracle.com>
> >> Sent: Dienstag, 5. November 2019 00:19
> >> To: Doerr, Martin <martin.doerr at sap.com>; dean.long at oracle.com; Kim
> >> Barrett <kim.barrett at oracle.com>
> >> Cc: Vladimir Kozlov (vladimir.kozlov at oracle.com)
> >> <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
> >> Subject: Re: JDK-8230459: Test failed to resume JVMCI CompilerThread
> >>
> >> On 4/11/2019 9:12 pm, Doerr, Martin wrote:
> >>> Hi all,
> >>>
> >>> @Dean
> >>>> changing can_remove() to CompileBroker::can_remove()?
> >>> Yes. That would be an option.
> >>>
> >>> @Kim, David
> >>> I think there's another problem with this implementation.
> >>> It introduces a use-after-free pattern due to concurrency.
> >>> Compiler threads may still read the oops from the handles after one of
> >> them has called destroy_global until next safepoint. It doesn't matter
> which
> >> values they get in this case, but the VM should not crash. I believe that
> >> OopStorage allows freeing storage without safepoints, so this may be
> >> unsafe. Right?
> >>
> >> I don't understand what you mean. If a compiler thread holds an oop, any
> >> oop, it must hold it in a Handle to ensure it can't be gc'd.
> >>
> >> David
> >>
> >>> If so, I think replacing the oops in the handles (and keeping the handles
> >> alive) would be better. And also much more simple.
> >>>
> >>> Best regards,
> >>> Martin
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: dean.long at oracle.com <dean.long at oracle.com>
> >>>> Sent: Samstag, 2. November 2019 08:36
> >>>> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes
> >>>> <david.holmes at oracle.com>; Kim Barrett <kim.barrett at oracle.com>
> >>>> Cc: Vladimir Kozlov (vladimir.kozlov at oracle.com)
> >>>> <vladimir.kozlov at oracle.com>; hotspot-compiler-
> dev at openjdk.java.net
> >>>> Subject: Re: JDK-8230459: Test failed to resume JVMCI CompilerThread
> >>>>
> >>>> Hi Martin,
> >>>>
> >>>> On 10/30/19 3:18 AM, Doerr, Martin wrote:
> >>>>> Hi David,
> >>>>>
> >>>>>> I don't think factoring out CompileBroker::clear_compiler2_object
> >> when
> >>>>>> it is only used once was warranted, but that's call for compiler team
> to
> >>>>>> make.
> >>>>> I did that because _compiler2_objects is private and there's currently
> no
> >>>> setter available.
> >>>>> But let's see what the compiler folks think.
> >>>>
> >>>> how about changing can_remove() to CompileBroker::can_remove()?
> >> Then
> >>>> you
> >>>> can access _compiler2_objects directly, right?
> >>>>
> >>>> dl
> >>>>>> Otherwise changes seem fine and I have noted the use of the
> >>>>>> MutexUnlocker as per your direct email.
> >>>>> Thanks a lot for reviewing. It was not a trivial one ��
> >>>>>
> >>>>> You had noticed an incorrect usage of the CHECK macro. I've created a
> >> new
> >>>> bug for that:
> >>>>> https://bugs.openjdk.java.net/browse/JDK-8233193
> >>>>> Would be great if you could take a look if that's what you meant and
> >> made
> >>>> adaptions if needed.
> >>>>>
> >>>>> Best regards,
> >>>>> Martin
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>> Sent: Mittwoch, 30. Oktober 2019 05:47
> >>>>>> To: Doerr, Martin <martin.doerr at sap.com>; Kim Barrett
> >>>>>> <kim.barrett at oracle.com>
> >>>>>> Cc: Vladimir Kozlov (vladimir.kozlov at oracle.com)
> >>>>>> <vladimir.kozlov at oracle.com>; hotspot-compiler-
> >> dev at openjdk.java.net;
> >>>>>> David Holmes <David.Holmes at oracle.com>
> >>>>>> Subject: Re: JDK-8230459: Test failed to resume JVMCI
> CompilerThread
> >>>>>>
> >>>>>> Hi Martin,
> >>>>>>
> >>>>>> On 29/10/2019 12:06 am, Doerr, Martin wrote:
> >>>>>>> Hi David and Kim,
> >>>>>>>
> >>>>>>> I think it's easier to talk about code. So here's a new webrev:
> >>>>>>>
> >>>>>>
> >>>>
> >>
> http://cr.openjdk.java.net/~mdoerr/8230459_JVMCI_thread_objects/webr
> >>>>>> ev.03/
> >>>>>>
> >>>>>> I don't think factoring out CompileBroker::clear_compiler2_object
> >> when
> >>>>>> it is only used once was warranted, but that's call for compiler team
> to
> >>>>>> make. Otherwise changes seem fine and I have noted the use of the
> >>>>>> MutexUnlocker as per your direct email.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> David
> >>>>>> -----
> >>>>>>
> >>>>>>> @Kim:
> >>>>>>> Thanks for looking at the handle related parts. It's ok if you don't
> want
> >> to
> >>>>>> be a reviewer of the whole change.
> >>>>>>>> I think it's weird that can_remove() is a predicate with optional
> side
> >>>>>>>> effects.  I think it would be simpler to have it be a pure predicate,
> >>>>>>>> and have the one caller with do_it = true perform the updates.
> That
> >>>>>>>> should include NULLing out the handle pointer (perhaps debug-
> only,
> >>>> but
> >>>>>>>> it doesn't cost much to cleanly maintain the data structure).
> >>>>>>> Nevertheless, it has the advantage that it enforces the update to
> be
> >>>>>> consistent.
> >>>>>>> A caller could use it without holding the lock or mess it up
> otherwise.
> >>>>>>> In addition, I don't what to change that as part of this fix.
> >>>>>>>
> >>>>>>>> So far as I can tell, THREAD == NULL here.
> >>>>>>> This is a very tricky part (not my invention):
> >>>>>>> EXCEPTION_MARK contains an ExceptionMark constructor call
> which
> >>>> sets
> >>>>>> __the_thread__ to Thread::current().
> >>>>>>> I don't want to publish my opinion about this ��
> >>>>>>>
> >>>>>>> @David:
> >>>>>>> Seems like this option is preferred over option 3
> >>>>>> (possibly_add_compiler_threads part of webrev.02 and leave the
> >>>>>> initialization as is).
> >>>>>>> So when you're ok with it, I'll request a 2nd review from the
> compiler
> >>>> folks
> >>>>>> (I should change the subject to contain RFR).
> >>>>>>> Thanks,
> >>>>>>> Martin
> >>>>>>>
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>>> Sent: Montag, 28. Oktober 2019 05:04
> >>>>>>>> To: Kim Barrett <kim.barrett at oracle.com>
> >>>>>>>> Cc: Doerr, Martin <martin.doerr at sap.com>; Vladimir Kozlov
> >>>>>>>> (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>;
> >> hotspot-
> >>>>>>>> compiler-dev at openjdk.java.net
> >>>>>>>> Subject: Re: JDK-8230459: Test failed to resume JVMCI
> >> CompilerThread
> >>>>>>>>
> >>>>>>>> On 28/10/2019 1:42 pm, Kim Barrett wrote:
> >>>>>>>>>> On Oct 27, 2019, at 6:04 PM, David Holmes
> >>>>>> <david.holmes at oracle.com>
> >>>>>>>> wrote:
> >>>>>>>>>> On 24/10/2019 12:47 am, Doerr, Martin wrote:
> >>>>>>>>>>> Hi Kim,
> >>>>>>>>>>> I didn't like using the OopStorage stuff directly, either. I just
> have
> >>>> not
> >>>>>>>> seen how to allocate a global handle and add the oop later.
> >>>>>>>>>>> Thanks for pointing me to JVMCI::make_global. I was not
> aware
> >> of
> >>>>>> that.
> >>>>>>>>>>> So I can imagine 3 ways to implement it:
> >>>>>>>>>>> 1. Use JNIHandles::destroy_global to release the handles. I
> just
> >>>> added
> >>>>>>>> that to
> >>>>>>
> >>>>
> >>
> http://cr.openjdk.java.net/~mdoerr/8230459_JVMCI_thread_objects/webr
> >>>>>>>> ev.01/
> >>>>>>>>>>> We may want to improve that further by setting the handle
> >> pointer
> >>>> to
> >>>>>>>> NULL and asserting that it is NULL before adding the new one.
> >>>>>>>>>>> I had been concerned about NULLs in the array, but looks like
> >> the
> >>>>>>>> existing code can deal with that.
> >>>>>>>>>> I think it would be cleaner to both destroy the global handle
> and
> >>>> NULL it
> >>>>>> in
> >>>>>>>> the array at the same time.
> >>>>>>>>>> This comment
> >>>>>>>>>>
> >>>>>>>>>> 325         // Old j.l.Thread object can die here.
> >>>>>>>>>>
> >>>>>>>>>> Isn't quite accurate. The j.l.Thread is still referenced via ct-
> >>>>> threadObj()
> >>>>>> so
> >>>>>>>> can't "die" until that is also cleared during the actual termination
> >>>> process.
> >>>>>>>>> I think if there is such a thread here that it can't die, because the
> >>>>>>>>> death predicate (the can_remove stuff) won't see that old
> thread
> >> as
> >>>>>>>>> the last thread in _compiler2_objects.  That's what I meant by
> this:
> >>>>>>>>>
> >>>>>>>>>> On Oct 25, 2019, at 4:05 PM, Kim Barrett
> >> <kim.barrett at oracle.com>
> >>>>>>>> wrote:
> >>>>>>>>>> I also think that here:
> >>>>>>>>>>
> >>>>>>>>>> 947         jobject thread_handle =
> >>>>>> JNIHandles::make_global(thread_oop);
> >>>>>>>>>> 948         _compiler2_objects[i] = thread_handle;
> >>>>>>>>>>
> >>>>>>>>>> should assert _compiler2_objects[i] == NULL.  Or if that isn't a
> >> valid
> >>>>>>>>>> assertion then I think there are other problems.
> >>>>>>>>> I think either that comment about an old thread is wrong (and
> the
> >>>> NULL
> >>>>>>>>> assertion I suggested is okay), or I think the whole mechanism
> here
> >>>>>>>>> has problems.  Or at least I was unable to figure out how it could
> >>>> work...
> >>>>>>>>>
> >>>>>>>> I'm not following sorry. You can't assert NULL unless it's actually
> set
> >>>>>>>> to NULL which it presently isn't. But it could be set NULL as Martin
> >>>>>>>> suggested:
> >>>>>>>>
> >>>>>>>> "We may want to improve that further by setting the handle
> pointer
> >> to
> >>>>>>>> NULL and asserting that it is NULL before adding the new one."
> >>>>>>>>
> >>>>>>>> and which I also supported. But that aside once the delete_global
> >> has
> >>>>>>>> been called that JNIHandle no longer references the j.l.Thread
> that it
> >>>>>>>> did, at which point it is only reachable via the threadObj() of the
> >>>>>>>> CompilerThread.
> >>>>>>>>
> >>>>>>>> David
> >>>


More information about the hotspot-compiler-dev mailing list