RFR (S): 8213397: Stack dump should show more clearly when a thread is blocked on a class initialization monitor

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Jan 17 14:33:11 UTC 2019


On 1/16/19 9:28 PM, David Holmes wrote:
> On 17/01/2019 6:57 am, coleen.phillimore at oracle.com wrote:
>> On 1/16/19 11:52 AM, Daniel D. Daugherty wrote:
>>> On 1/14/19 8:26 AM, David Holmes wrote:
>>>> On 14/01/2019 11:21 pm, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 1/14/19 8:13 AM, David Holmes wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> On 14/01/2019 10:25 pm, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>> This is a nice improvement.
>>>>>>
>>>>>> Thanks for taking a look.
>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dholmes/8213397/webrev/src/hotspot/share/runtime/vframe.cpp.udiff.html 
>>>>>>>
>>>>>>>
>>>>>>> + // top-frame, so we should be waiting on a Class 
>>>>>>> initialization monitor.
>>>>>>> + InstanceKlass* k = thread()->class_to_be_initialized();
>>>>>>> + if (k != NULL) {
>>>>>>> + st->print_cr("\t- waiting on the Class initialization monitor 
>>>>>>> for %s", k->external_name());
>>>>>>> + }
>>>>>>> + else {
>>>>>>> + assert(false, "OBJECT_WAIT state is unexpected");
>>>>>>> + }
>>>>>>>
>>>>>>>
>>>>>>> Should be } else {
>>>>>>>
>>>>>>> And I don't think this should be an assert.  There are a couple 
>>>>>>> of other places that we lock an object monitor from the VM, but 
>>>>>>> I don't think we wait in these places.  In any case, if we add a 
>>>>>>> wait, it would take a lot of testing and a long time to get to 
>>>>>>> this assert to find it needs to add another case here.  So if 
>>>>>>> you remove the 'else' completely and change the comment to "see 
>>>>>>> if we are waiting on a class initialization monitor" you don't 
>>>>>>> need to fix the {}.
>>>>>>
>>>>>> Initially I thought the "else" might be some JNI or JVM TI case, 
>>>>>> but that's not so as there's no direct API to do wait(), it just 
>>>>>> requires invoking the Object.wait method. So at the moment the 
>>>>>> implicit else is "unreachable" and I wanted to convey that. I 
>>>>>> chose the assert in case I was wrong, or in the unlikely event a 
>>>>>> new VM internal call to wait() was ever added and we happened to 
>>>>>> encounter it. Leaving it empty to me raises the question "how do 
>>>>>> we get here?".
>>>>>
>>>>> The thing is, that leaving an assert to find some future case is 
>>>>> only good if that code is going to execute frequently and we find 
>>>>> it in testing.  This is not a frequent path.   Having the assert 
>>>>> might be hit much later down the road when we've forgotten this.  
>>>>> It's not helpful.
>>>>
>>>> It may not be very helpful but it is more helpful than nothing.
>>>>
>>>> But I guess I can remove it.
>>>
>>> Just for the record, I would recommend keeping it. I find it helpful
>>> to have assertions in places where the code has reached something
>>> unexpected. Instead of:
>>>
>>>   assert(false, "OBJECT_WAIT state is unexpected");
>>>
>>> I would do it as:
>>>
>>>   fatal("OBJECT_WAIT state is unexpected");
>>>
>>> to catch it in any form of bits (product, debug, etc).
>>
>> No, this is even worse! 
>
> I have to agree with Coleen here - at most an assert is warranted so 
> that during testing if we hit such an unexpected condition it is 
> brought to our attention. But if the unexpected condition hits in the 
> wild we do not want to crash the VM! Nothing bad happens if we find 
> the unexpected condition, we just won't report anything.

Thanks for clarifying.


> For the record it was pushed with no assert, as Coleen requested.

I'm good with that.


>
> David
>
>> Do you really want some rarely reproduceable crash coming from a 
>> customer?

Are you forgetting that I like chasing rare bugs? However, you have
a good point about it coming from a customer.


>> The assertion isn't even checking for some error in processing, it's 
>> for a helpful diagnostic.

Hmmm... okay. I read it as the code isn't written to handle this
condition implying that something bad would happen if we saw this
state. Maybe I inferred too much...


>> The only reason for guarantees is that something *very* bad will 
>> happen later.

Right. Data corruption, lost data, deadlock. Yup.


>> Anyway, this is somewhat of a minor point whether there should have 
>> been an assert or not.

Agreed.

Dan


>> My recommendation was what I already wrote.
>> thanks,
>> Coleen
>>>
>>> As for tracking back a future occurrence to this review, I often
>>> check to see what changeset added an assertion to see the context
>>> in which the assertion was placed.
>>>
>>> The fix is already pushed so again this is just for the record.
>>>
>>> Dan
>>>
>>>
>>>
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 1/14/19 1:01 AM, David Holmes wrote:
>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213397
>>>>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8213397/webrev/
>>>>>>>>
>>>>>>>> Please review this simple enhancement to improve the stack 
>>>>>>>> trace information when a thread is blocked doing a (logical) 
>>>>>>>> Object.wait on the monitor associated with Class static 
>>>>>>>> initialization.
>>>>>>>>
>>>>>>>> Details are in the bug report but basically we record, in a 
>>>>>>>> thread-local variable, the name of the class before the "wait" 
>>>>>>>> and clear it after, thus allowing the stack dump code to query 
>>>>>>>> it if it appears we may be in that situation.
>>>>>>>>
>>>>>>>> In InstanceKlass::initialize_impl I also refactored the code a 
>>>>>>>> tiny bit to make use of the JavaThread reference to the current 
>>>>>>>> thread.
>>>>>>>>
>>>>>>>> A new test using jstack was also added.
>>>>>>>>
>>>>>>>> Testing:
>>>>>>>>   - mach5 tiers 1 - 3
>>>>>>>>   - the new test on Windows/Linux/OSX x86 and Solaris sparc, in 
>>>>>>>> regular and Xcomp mode
>>>>>>>>   - All jstack/stack-dump using tests I could see (linux x64):
>>>>>>>>     - runtime/Thread
>>>>>>>>     - serviceability/sa/
>>>>>>>>     - serviceability/tmtools/jstack/
>>>>>>>>     - vmTestbase/nsk/jvmti/scenarios/sampling/
>>>>>>>>     - vmTestbase/nsk/jvmti/MonitorWaited/ .
>>>>>>>>     - vmTestbase/nsk/jvmti/GetOwnedMonitorInfo/
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>
>>>>>
>>>
>>



More information about the hotspot-runtime-dev mailing list