RFR(xs): 8213834: JVMTI ResourceExhausted should not be posted in CompilerThread
Thomas Stüfe
thomas.stuefe at gmail.com
Mon Nov 26 11:32:41 UTC 2018
Okay, I created an RFE for this:
https://bugs.openjdk.java.net/browse/JDK-8214294
And for now I will prepare a new minimal patch based on our
discussions and post it shortly.
..Thomas
On Mon, Nov 26, 2018 at 12:28 PM David Holmes <david.holmes at oracle.com> wrote:
>
> On 26/11/2018 6:58 pm, Thomas Stüfe wrote:
> > Hi all,
> >
> > just a general question, would a
> >
> > virtual bool Thread::can_post_JVMTI_events();
> >
> > (similar to can_call_java()) not be a clearer solution? It clearly
> > expresses what we want, and we can, in every Thread child class,
> > explicitly specify the behavior.
>
> That would be good for when we update the specification to actually
> define what threads can post JVM TI events. But I'd prefer for now to
> see just a comment and the can_call_java check.
>
> David
>
> > ..Thomas
> >
> >
> >
> >
> > On Mon, Nov 26, 2018 at 9:51 AM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> >>
> >> On Mon, Nov 26, 2018 at 4:58 AM serguei.spitsyn at oracle.com
> >> <serguei.spitsyn at oracle.com> wrote:
> >>>
> >>> Hi guys,
> >>>
> >>> On 11/25/18 06:17, Daniel D. Daugherty wrote:
> >>>> On 11/25/18 6:57 AM, David Holmes wrote:
> >>>>>
> >>>>>
> >>>>> On 25/11/2018 12:41 am, Daniel D. Daugherty wrote:
> >>>>>> On 11/22/18 5:24 PM, David Holmes wrote:
> >>>>>>> Hi Thomas,
> >>>>>>>
> >>>>>>> On 23/11/2018 2:16 am, Thomas Stüfe wrote:
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> would such a patch be acceptable:
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8213834-jvmti-reourceexhausted-shall-not-be-posted-from-compiler-thread/webrev.01/webrev/
> >>>>>>>>
> >>>>>>>> ?
> >>>>>>>>
> >>>>>>>> As has been proposed, I am now using
> >>>>>>>> thread->is_hidden_from_external_view() to suppress the event.
> >>>>>>>
> >>>>>>> I still disagree with this for reason previously outlined. It not
> >>>>>>> only excludes the CompilerThreads that can't run Java but all
> >>>>>>> CompilerThreads, and so excludes JVMCI compiler threads that do
> >>>>>>> everything in Java and so will quite possibly trigger resource
> >>>>>>> exhaustion.
> >>>>>>
> >>>>>> I thought the CompilerThread is_hidden_from_external_view() is built on
> >>>>>> top of can_call_java():
> >>>>>>
> >>>>>> // Hide native compiler threads from external view.
> >>>>>> bool is_hidden_from_external_view() const { return
> >>>>>> !can_call_java(); }
> >>>>>>
> >>>>>> so is_hidden_from_external_view() does not exclude JVMCI...
> >>>>>
> >>>>> Yes sorry my mistake.
> >>>
> >>> There are many confusions around this.
> >>> Unfortunately, I make the same mistakes. :(
> >>>
> >>>>>> Of course, the question of whether JVMCI threads should be
> >>>>>> participating
> >>>>>> in JVM/TI events and other stuff is being debugged and explored at the
> >>>>>> current time.
> >>>>>>
> >>>>>>
> >>>>>>> It also excludes the ServiceThread and CodeCacheSweeperThread.
> >>>>>>> While the latter seems insignificant I think the ServiceThread is
> >>>>>>> more significant. Those threads are not excluded from posting other
> >>>>>>> events AFAICS so I don't see why this event should be treated
> >>>>>>> specially for them.
> >>>>>>
> >>>>>> Hmmm... I don't think the ServiceThread should be posting JVM/TI events
> >>>>>
> >>>>> But it already does! It's already posting deferred JVM TI events
> >>>>> explicitly as part of its job, and I see nothing that excludes it
> >>>>> from posting JVM TI events as part of any other Java code it runs.
> >>>>
> >>>> You're right. I completely forgot about the ServiceThread doing proxy
> >>>> event posting duty. So I think where we are is switching back to using
> >>>>
> >>>> can_call_java()
> >>>>
> >>>> to restrict the posting of ResourceExhausted events. Less limiting than
> >>>> !is_hidden_from_external_view(), but still limited. I'm reluctantly okay
> >>>> with this, but we need to hear from Serguei.
> >>>
> >>> I've read all the discussion thread.
> >>> Thank you to everyone participating in this discussion - it is very helpful.
> >>> Now, I'm convinced that using can_call_java (not
> >>> is_hidden_from_external_view)
> >>> is right (thanks a lot to David for defending this point!)
> >>>
> >>> I still have a concern though.
> >>> If we skip a ResourceExhausted event this way (as in the webrev above)
> >>> then can we expect similar event posted on another (executing Java) thread?
> >>
> >> I think this usually would happen, but there is no guarantee.
> >>
> >> Speaking for metaspace: Before metaspace OOM happens, we already tried
> >> to GC and that failed to recover metaspace so there is no immediate
> >> hope and chances are high subsequent metaspace allocations will fail
> >> too.
> >>
> >> If the JIT runs into metaspace OOM, it will cope by bailing out and
> >> leave that particular method uncompiled. I argue that this is not
> >> observable to the user and therefore ResourceExhausted can be
> >> suppressed.
> >>
> >> If we, as you fear, do not resend ResourceExhausted, that means that
> >> no-one else tried to allocate metaspace, and that this JIT compilation
> >> was by pure chance the last one to ever want metaspace. Unlikely as it
> >> is, in that case I think it okay suppressing ResourceExhausted and
> >> going on with our lives.
> >>
> >> But if Metaspace OOM keep occurring, they most probably will occur
> >> from regular java threads too and that will trigger ResourceExhausted.
> >>
> >> Finally, I am not sure whether a pathological case is possible where
> >> only the JIT were to allocate metaspace over and over again, failing
> >> each time, silently swallowing all OOMs and just falling back to
> >> interpreted. But that problem exists today too.
> >>
> >> Thanks, Thomas
> >>
> >>> I have a doubt that we skip it correctly (I need to learn the code more).
> >>>
> >>>
> >>> Thanks,
> >>> Serguei
> >>>
> >>>> Dan
> >>>>
> >>>>
> >>>>>
> >>>>> David
> >>>>> -----
> >>>>>
> >>>>>> either. We definitely don't want the ServiceThread running "arbitrary"
> >>>>>> event handler code since taking down the ServiceThread could mess up
> >>>>>> various things that rely on it running.
> >>>>>>
> >>>>>> I don't have an opinion about the CodeCacheSweeperThread.
> >>>>>>
> >>>>>> dan
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>> Side note, this function apparently should only ever be called from
> >>>>>>>> JavaThreads? To my knowledge we do not guard metaspace against
> >>>>>>>> allocation from non-java-threads, should we then do that?
> >>>>>>>
> >>>>>>> Not sure how it is arranged but certainly Metaspace::allocate
> >>>>>>> expects to only be called from a JavaThread as it immediately
> >>>>>>> checks for
> >>>>>>>
> >>>>>>> if (HAS_PENDING_EXCEPTION) {
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> David
> >>>>>>> -----
> >>>>>>>
> >>>>>>>> I attempted to create a jtreg test for this but this turns out to
> >>>>>>>> be difficult:
> >>>>>>>>
> >>>>>>>> Attempting to trigger a metaspace OOM from a compiler thread proved
> >>>>>>>> futile - chances for that to happen are just too low compared to
> >>>>>>>> other
> >>>>>>>> metaspace users. Only way to do this reliably would be to
> >>>>>>>> artificially
> >>>>>>>> allocate a lot of metaspace from the compiler thread - triggered by a
> >>>>>>>> test switch or similar? That would be very ugly though. And then, I
> >>>>>>>> would need to add a jvmti agent to jtreg, or adapt
> >>>>>>>> jtreg/vmTestbase/nsk/jvmti?
> >>>>>>>>
> >>>>>>>> Thanks, Thomas
> >>>>>>>> On Thu, Nov 22, 2018 at 4:22 PM Daniel D. Daugherty
> >>>>>>>> <daniel.daugherty at oracle.com> wrote:
> >>>>>>>>>
> >>>>>>>>> On 11/22/18 2:42 AM, David Holmes wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 22/11/2018 5:36 pm, Thomas Stüfe wrote:
> >>>>>>>>>>> Hi JC,
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Nov 21, 2018 at 6:03 PM JC Beyler <jcbeyler at google.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Thomas,
> >>>>>>>>>>>>
> >>>>>>>>>>> <snip>
> >>>>>>>>>>>>
> >>>>>>>>>>>> I actually agree with not using the service thread to be honest,
> >>>>>>>>>>>> resource exhaustion seems to be something you'd want to know
> >>>>>>>>>>>> sooner
> >>>>>>>>>>>> than later.
> >>>>>>>>>>>>
> >>>>>>>>>>>> How about we do both?
> >>>>>>>>>>>> - Fix it now so that the compiler thread does not post the
> >>>>>>>>>>>> event
> >>>>>>>>>>>> because we'd rather not have crashes and that can get backported
> >>>>>>>>>>>> - Put out a RFE that would add the information to
> >>>>>>>>>>>> ThreadInfo and
> >>>>>>>>>>>> work through the process of a CSR/etc. to get it working for
> >>>>>>>>>>>> future
> >>>>>>>>>>>> versions?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Jc
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Makes sense, sure. But both Dan and Serguei voiced opposition
> >>>>>>>>>>> to this
> >>>>>>>>>>> patch. Dan, Serguei, what do you think?
> >>>>>>>>>>
> >>>>>>>>>> I note neither Dan nor Serguei responded to my response to them :)
> >>>>>>>>>
> >>>>>>>>> I didn't think a response from me was needed. The last thing I
> >>>>>>>>> said was:
> >>>>>>>>>
> >>>>>>>>>> I would have no problem suppressing the event from a "hidden"
> >>>>>>>>>> thread
> >>>>>>>>>> because those threads intentionally try to hide from JVM/TI.
> >>>>>>>>>> That would
> >>>>>>>>>> cover the case for the C1 and C2 CompilerThreads, but I don't know
> >>>>>>>>>> about these newer JVM/CI Compiler threads that actually run Java
> >>>>>>>>>> code...
> >>>>>>>>>
> >>>>>>>>> I read your combined response to this as not in conflict with
> >>>>>>>>> what I said.
> >>>>>>>>>
> >>>>>>>>> The last line of your response:
> >>>>>>>>>
> >>>>>>>>> > So my preferred point solution is still to skip posting
> >>>>>>>>> ResourceExhausted
> >>>>>>>>> > if the thread can not run Java code.
> >>>>>>>>>
> >>>>>>>>> matches what I said since the C1 and C2 compiler threads are
> >>>>>>>>> hidden and
> >>>>>>>>> cannot run Java code. We're in agreement.
> >>>>>>>>>
> >>>>>>>>> Dan
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> David
> >>>>>>>>>> -----
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> If we do not find an agreement, I would rather close this bug
> >>>>>>>>>>> as WNF
> >>>>>>>>>>> than push it in without consensus. I would then just fix it
> >>>>>>>>>>> downstream
> >>>>>>>>>>> in our port.
> >>>>>>>>>>>
> >>>>>>>>>>> Your proposed RFE would still make sense, but not in jdk12
> >>>>>>>>>>> anymore,
> >>>>>>>>>>> let alone in older releases.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks, Thomas
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>>
> >>>
More information about the serviceability-dev
mailing list