RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output

David Holmes david.holmes at oracle.com
Fri Apr 5 01:28:13 UTC 2019


Hi Chris,

On 5/04/2019 1:48 am, Chris Plummer wrote:
> Hi David,
> 
> On 4/4/19 12:14 AM, David Holmes wrote:
>> On 4/04/2019 4:35 pm, Chris Plummer wrote:
>>> On 4/3/19 11:23 PM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote:
>>>>> Hi David,
>>>>>
>>>>> I have concerns that this will hide some of the other bugs I've 
>>>>> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs 
>>>>> result in 1 or two frames appearing in the stacktrace that should 
>>>>> be skipped. Notably NativeCallStack::NativeCallStack() and 
>>>>> os::get_native_stack().
>>>>
>>>> The test still checks those are not present first:
>>>>
>>>> 73         // We should never see either of these frames because 
>>>> they are supposed to be skipped. */
>>>> 74 output.shouldNotContain("NativeCallStack::NativeCallStack");
>>>> 75         output.shouldNotContain("os::get_native_stack");
>>> Ah yes. I skimmed over the test looking for it but missed it.
>>>>
>>>>> Also, AllocateHeap() should normally not be in the stack trace, but 
>>>>> the test has specifically allowed for it for windows and solaris 
>>>>> slowdebug builds. Although these builds should have honored the 
>>>>> ALWAYSINLINE directive, it was deemed acceptable that it was not in 
>>>>> slowdebug builds. However, I would not want to allow AllocateHeap() 
>>>>> to appear in a product build, and best not to see it in fastdebug 
>>>>> either.
>>>>
>>>> This is a test of NMT detail not a test of whether a given compiler 
>>>> chooses to inline something like AllocateHeap. I don't think it is 
>>>> the job of this test to be checking for something specific to the 
>>>> native compiler. The previous handling of AllocateHeap seemed to be 
>>>> there simply because it was the only way to deal with an optional 
>>>> frame - but now that's handled generically.
>>> It's appearance means you effectively only have 3 frames to identity 
>>> callsites instead of 4. 
>>
>> Both stacktraces in the old test had 4 elements and expected 4 
>> matches. The current bug is that one of those (new_entry) could 
>> actually be inlined as well, resulting in only 3 matches. So that is 
>> what the revised test checks for: at least 3 matches. Often there will 
>> be 4 matches.
> I think you misunderstood my "3 frames" comment. I was referring to how 
> many frames NMT uses to identify the callsite. It wants to use 4, but if 
> AllocateHeap() doesn't get inlined, it effectively is using 3. The test 
> should detect when this happens so the NMT implementation can address 
> the issue.

You're right I don't understand this part as I don't know how/what NMT 
detail is doing in this regard.

>>
>> Hmmm but now I'm wondering why this trace:
>>
>>   50     public static String stackTraceAllocateHeap =
>>   51         ".*AllocateHeap.*\n" +
>>   52         ".*ModuleEntryTable.*new_entry.*\n" +
>>
>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it getting 
>> inlined already when AllocateHeap was not? Even so we still end up 
>> with 4 frames matching normally.
> I noticed that last night also and scratch my head over it for a while 
> and then went to bed. The only explanation I could come up with is that 
> allocate_new_entry() is getting inlined, and as a result (due to being a 
> slowdebug build and doing minimal inlining) AllocateHeap() was not inlined.
>>
>>> If it does appear in a product build, a solution should be looked 
>>> into to get rid of it. If the port owner decides it can't get rid of 
>>> it (or is unwilling to), then an exception should be added to the 
>>> test like was done for solaris and windows slowdebug builds.
>>
>> Are we specifically trying to test the compiler's ability to inline 
>> that function and just happen to be using this test to verify that? 
>> Doesn't seem like a suitable place to do this - and why do we need to 
>> do it? The Visual Studio docs state:
>>
>> "You cannot force the compiler to inline a particular function, even 
>> with the __forceinline keyword."
>>
>> so ALWAYSINLINE is just a hint even in product builds and could change 
>> with any update to the compiler.
>>
>> For Solaris Studio it is again not guaranteed to inline - specifically 
>> -xinline only has an effect at –xO3 or higher. Which likely explains 
>> why it is ignored in slowdebug. And there are other cases where it 
>> won't honour the ALWAYSINLINE.
>>
>> Even with gcc we seem to be misusing the attribute if we want to 
>> ensure inlining when not optimising:
>>
>> "GCC does not inline any functions when not optimizing unless you 
>> specify the ‘always_inline’ attribute for the function, like this:
>>
>> /* Prototype.  */
>> inline void foo (const char) __attribute__((always_inline));"
>>
>> and we don't write it that way.
>>
>> So if we're that concerned about release builds guaranteeing to inline 
>> AllocateHeap then I think we need something a bit more explicit than 
>> this test to determine that.
> With respect to the 3 methods/functions we don't want to see in the 
> callsite stacktrace, NMT has made a number of assumptions on inlining. 
> One of the things the test is doing is making sure those assumptions are 
> correct. If incorrect, then you run into issues like I mentioned above 
> where callsite backtraces effectively only have 3 unique frames rather 
> than 4 (actually before some bug fixes it was often just 2 unique 
> frames). So I think it's appropriate to have a test to make sure we are 
> not seeing any of these 3 methods/functions.

Okay I get the gist of that. Is there somewhere I can clearly see what 
this inlining assumptions are that NMT makes? Are they clearly documented?

> Now the test also has made inlining assumptions beyond what NMT has 
> made, and that is really what this bug is about. In general I think your 
> fix is fine in the way it relaxes which frames are actually found, but 
> as Thomas points out, it suffers from not actually looking at a single 
> stacktrace, but just looking for the specified frames somewhere in the 
> output (and in the order specified.) You should probably address this.

Right that was an error on my part. I thought the existing MULTILINE 
pattern matching with .* would also find non-sequential lines and so I 
was acting similarly. I will re-think this.

Thanks,
David

> thanks,
> 
> Chris
>>
>> Thanks,
>> David
>>
>>> thanks,
>>>
>>> Chris
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Given the changes you made to allow more flexibly in which frames 
>>>>> appear, I think you need to now also make sure the above 3 
>>>>> mentioned frames are not present, except for allowing 
>>>>> AllocateHeap() in slowdebug builds.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 4/3/19 10:53 PM, David Holmes wrote:
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458
>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/
>>>>>>
>>>>>> The actual stack trace reported by NMT detail is affected by the 
>>>>>> inlining decisions of the native compiler, and on the type of 
>>>>>> build. So we define an "ideal" stacktrace and then allow for some 
>>>>>> frames to be missing based on empirical observations. So to date 
>>>>>> we have seen two frames that may or may not be inlined and so we 
>>>>>> allow for 2 non-matching entries.
>>>>>>
>>>>>> The special-casing of AllocateHeap is removed as now it is just an 
>>>>>> optional frame.
>>>>>>
>>>>>> Chris: does this maintain the "spirit" of the test as you intended?
>>>>>>
>>>>>> Zhengyu: can you test this on your system(s) please.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>
>>>>>
>>>
>>>
> 
> 


More information about the hotspot-runtime-dev mailing list