RFR(xs): 8213834: JVMTI ResourceExhausted should not be posted in CompilerThread
David Holmes
david.holmes at oracle.com
Thu Nov 15 01:47:02 UTC 2018
Hi Dan, Serguei,
I'm going to combine my response to you both into one as it's the same
discussion ...
On 15/11/2018 9:28 am, Daniel D. Daugherty wrote:
> On 11/14/18 6:01 PM, David Holmes wrote:
>> On 15/11/2018 6:13 am, Daniel D. Daugherty wrote:
>>> I do not have a better idea at the moment.
>>>
>>> The ResourceExhausted event is one of many JVM/TI events. So we put in a
>>> special case for this event because an agent out in the world tries
>>> to do
>>> something that is not expressly forbidden by the spec, but it is a bad
>>> thing to do with HotSpot. Okay. What about the next event? What about
>>> the next agent? Once you've added a special case, where do you stop?
>>
>> I had proposed more broadly not posting any events in threads that can
>> not run Java but that is potentially more damaging to existing agents
>> that just happen to work in the compiler thread today.
>>
>> But it goes beyond just running Java. Can these other threads make JNI
>> calls, or other JVM TI calls? If they can run Java they can do all
>> those things. If they can't run Java they may not be able to.
>>
>> There is no perfect solution here because JVM TI allows too much
>> freedom. It doesn't put any constraints on the threads that can
>> execute callbacks, and has very few constraints on what callbacks can
>> be executed when.
>>
>> On the other hand an implementation has complete freedom as to when
>> exactly resource exhaustion events get posted. Does it make sense for
>> an internal VM thread to post this ResourceExhausted event? Is it just
>> an accident of implementation that the code path decided to throw OOME
>> and so also posts the event? You can easily imagine this code not
>> doing that - and as Thomas points out the compiler thread swallows the
>> exception anyway!
>>
>> As there is no way to know what an agent may do ahead of time we are
>> faced with either:
>> - breaking some agents and crashing the VM; or
>> - masking an event
>>
>> I vote for no breakage.
>
> I would have no problem suppressing the event from a "hidden" thread
> because those threads intentionally try to hide from JVM/TI.
Yes but only for certain things:
- not visible in thread stack dumps
- not able to suspend/resume
- not visible in thread lists/groups
- no thread lifecycle events
- not roots for Heap walking
But as far as I can see "hidden" threads still generate all other JVM TI
events, so excluding them here seems to be too broad and marks a change
in behaviour that seems unjustified.
> That would
> cover the case for the C1 and C2 CompilerThreads, but I don't know
> about these newer JVM/CI Compiler threads that actually run Java code...
The JVMCI threads are not hidden. As Serguei points out in his response
the hidden-ness of a compiler thread is based on its ability to run
Java. So although it comes down to the same thing for compiler threads I
still thing the correct test is for can_call_Java().
[Aside: I think JVMCI needs to re-think how it expects internal compiler
threads to interact with things like JVM TI - we're already seeing
issues with tests 'accidentally' suspending compiler threads and then
getting into deadlocks! ]
JC Beyler made a suggestion in Thomas's original query thread: perhaps
we should defer these events for the ServiceThread to execute? That
would still have to be gated on either the thread being hidden or
(preferably to me) not being able to run Java, but at least it should
avoid the problem whilst still causing the event to appear.
That said, ResourceExhausted events have no context information and to
me they appear to be expected to be synchronous events thrown in the
thread that encountered the resource exhaustion. So to have them post in
the ServiceThread might still be somewhat surprising to an agent.
So my preferred point solution is still to skip posting
ResourceExhausted if the thread can not run Java code.
Cheers,
David
-----
> Dan
>
>
>>
>> Cheers,
>> David
>>
>>> Dan
>>>
>>>
>>> On 11/14/18 2:59 PM, Thomas Stüfe wrote:
>>>> Hi Dan,
>>>>
>>>> On Wed, Nov 14, 2018 at 8:32 PM Daniel D. Daugherty
>>>> <daniel.daugherty at oracle.com> wrote:
>>>>> I have a philosophical problem with a fix like this.
>>>>>
>>>>> The fix is making the assumption that the handler for this event
>>>>> will want
>>>>> to run Java code and if the event is posted from a Java thread that
>>>>> cannot
>>>>> run Java code, then we skip posting the event.
>>>>>
>>>>> If I happen to have a more conservative agent that does not run
>>>>> Java code
>>>>> in its handlers, then my agent won't receive this event even though it
>>>>> should not cause my agent any problems. That might be an unexpected
>>>>> change
>>>>> in behavior on the part of my agent.
>>>>>
>>>>> Dan
>>>>>
>>>> I was thinking about that too. In fact, the JVMTI agent I am trying to
>>>> add this fix for exists in two variants. The base form,
>>>> airlift/jvmkill, just does a plain kill(2). So not problem there. Only
>>>> the forked version by cloudfoundry/jvmkill do this fancy stuff. (Note
>>>> that both projects are on github, if you care to look them up. They
>>>> are not by me. I am just using them).
>>>>
>>>> But the problem is, the JVMTI spec says nothing about what you can or
>>>> cannot do in reaction to a ResourceExhausted event. So what
>>>> cloudfoundry/jvmkill does is valid and not forbidden.
>>>>
>>>> Therefore I think suppressing ResourceExhausted in this case is the
>>>> only choice we have. One might fine-tune the conditions under which we
>>>> suppress sending ResourceExhausted: maybe suppress for
>>>> CompilerThreads, or only for CompilerThreads getting MetaspaceOOMs...
>>>>
>>>> But I think we should do something here. By neglecting to add
>>>> restrictions to the JVMTI spec, we encouraged JVMTI agents to do these
>>>> kind of things. The least we can do is minimize the damage.
>>>>
>>>> And then, usually this will not be the last opportunity for
>>>> ResourceExhausted to be posted, no? Chances are high that there will
>>>> be more OOMs following, in real java threads.
>>>>
>>>> Finally, I find swallowing the ResourceExhausted for compiler threads
>>>> is symmetric to the way the compiler thread swallows the OOME itself.
>>>> They do not report the OOME but ignore and clear it.
>>>>
>>>> But as you can see, I see your point. Do you have a better proposal?
>>>>
>>>> ..Thomas
>>>>
>>>>>
>>>>> On 11/14/18 10:06 AM, Daniel D. Daugherty wrote:
>>>>>> Adding serviceability-dev at ...
>>>>>>
>>>>>> Since the proposed solution to the bug is to not post an event, I
>>>>>> think
>>>>>> the Serviceability Team (which owns JVM/TI) needs to be involved
>>>>>> directly
>>>>>> in the review.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On 11/14/18 9:28 AM, Thomas Stüfe wrote:
>>>>>>> Dear all,
>>>>>>>
>>>>>>> may I please have reviews on this small patch. Note that this is
>>>>>>> borderline serviceability. I try to avoid crossposting, but if you
>>>>>>> think this should be looked at by serviceability feel free to
>>>>>>> forward
>>>>>>> it there.
>>>>>>>
>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8213834
>>>>>>>
>>>>>>> CR:
>>>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8213834-jvmti-reourceexhausted-shall-not-be-posted-from-compiler-thread/webrev.00/webrev/
>>>>>>>
>>>>>>>
>>>>>>> Short description: we may post ResourceExhausted from Compiler
>>>>>>> threads. Handlers of this event may call back into the JVM, which
>>>>>>> may
>>>>>>> cause problems since we run in the compiler thread. The proposed
>>>>>>> solution is not to post the event in that case.
>>>>>>>
>>>>>>> See previous discussion here:
>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-November/025898.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks, Thomas
>>>
>
More information about the serviceability-dev
mailing list