RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output

David Holmes david.holmes at oracle.com
Sat Apr 6 04:13:24 UTC 2019


Hi Chris,

On 6/04/2019 3:09 am, Chris Plummer wrote:
> Hi David,
> 
> Why was the JVM_DefineModule frame left off of stackTraceAlternate?

?? That isn't part of any of the existing stacktraces.

> Since you've added the following:
> 
>   103         if (!okToHaveAllocateHeap) {
>   104             output.shouldNotContain("AllocateHeap");
>   105         }

I didn't add that - see old code line 80.

> You can simplify the following:
> 
>   123         if (okToHaveAllocateHeap) {
>   124             expectedStackTrace = stackTraceAllocateHeap;
>   125             if (stackTraceMatches(expectedStackTrace, output)) {
>   126                 return;
>   127             }
>   128         } else {
> 
> The is no need for the okToHaveAllocateHeap check here anymore. Just 
> check all 3 allowed stacktraces until one passes. This is a slight 
> improvement in flexibility in that it would no longer require the 
> slowdebug builds to match stackTraceAllocateHeap. They could match any 
> of the 3. You could then put all 3 allowed stacktraces in an array and 
> check them in a loop if you wish.

The only change I have made (which might be obscured by the structure) 
is that if stackTraceDefault fails to match I then try 
stackTraceAlternate. The handling of okToHaveAllocateHeap is unchanged.

By the same argument you made I think it best to only expect the 
AllocateHeap stack on those slowdebug platforms, so that we can notice 
when something changes - again I've mode no change in this regard.

> The following is no longer correct:
> 
>   140         throw new RuntimeException("Expected stack trace missing 
> from output: " + expectedStackTrace);
> 
> In your current approach, expectedStackTrace is just the last stacktrace 
> we tried. Since we may try more than one, maybe all the ones that failed 
> to match should be listed (or none listed if just too messy).

It reports the last failing stacktrace, out of a possible two. Perhaps I 
can print both ... you want something in the jtr file so that it can be 
triaged without having to go and look up the test code.

Thanks,
David

> thanks,
> 
> Chris
> 
> On 4/5/19 12:04 AM, David Holmes wrote:
>> Hi Chris,
>>
>> Updated webrev:
>>
>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/
>>
>> Checks for alternate stack now. Added lots of comments and misc fixups.
>>
>> Zhengyu: please re-test (I can't test any slowdebug except linux-x64).
>>
>> Thanks,
>> David
>>
>> On 5/04/2019 4:01 pm, Chris Plummer wrote:
>>> Thinking about this a bit more, there is still the potential for some 
>>> confusion if this test fails again in the future due to the top frame 
>>> missing. Is it missing because it got inlined or is it missing 
>>> because the frame skipping code skipped an extra frame? Hopefully 
>>> whoever deals with it doesn't just hastily add another valid 
>>> stacktrace to the test but instead investigates to make sure the 
>>> issue is indeed that the method got inlined.
>>>
>>> Chris
>>>
>>> On 4/4/19 10:56 PM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> Okay I will simply check for the third alternative.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote:
>>>>> Hi David,
>>>>>
>>>>> For the callsite that this test is checking for, right now there 
>>>>> appear to be 3 possible stacktraces: the "normal" one, the one that 
>>>>> includes AllocateHeap() on solaris and windows slowdebug builds, 
>>>>> and the one Zhengyu is now seeing on linux-x64. You would need to 
>>>>> check for all 3, limiting the AllocateHeap() one to just being 
>>>>> allowed on solaris and windows slowdebug as it is now. So basically 
>>>>> this test needs to cover all (allowable) stacktraces that we've 
>>>>> seen for this callsite, and be updated in the future as needed. Not 
>>>>> ideal, but I don't see a better solution. It's similar to the 
>>>>> situation described in JDK-8163899 which covered the fragility of 
>>>>> the NMT frame skipping code. In the end it was decided it would be 
>>>>> easier to just deal fix issues as they came up rather then engineer 
>>>>> a solution that wasn't as fragile. I think this test falls in the 
>>>>> same category.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 4/4/19 10:11 PM, David Holmes wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Thanks for the explanation about the frame counting from 
>>>>>> os::malloc - now I get it. But I don't understand your final comment:
>>>>>>
>>>>>> > Looking at this code also reminds me of a reason to have the test
>>>>>> > continue to check for all 4 specific frames. If the frame 
>>>>>> skipping code
>>>>>> > skips an extra frame, then the callsite will be missing a needed 
>>>>>> frame
>>>>>> > at the top. The way the test was written it would detect this. 
>>>>>> With your
>>>>>> > changes it will not. It would just revert to always matching on 
>>>>>> 3 frames
>>>>>> > instead of 4, and the frame skipping bug would go unnoticed.
>>>>>>
>>>>>> How can I fix this bug if I have to check for 4 specific frames 
>>>>>> but one (or more) may be missing - i.e how can I tell the 
>>>>>> different between "Frame A was inlined" and "Frame A was skipped 
>>>>>> by mistake" ??
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>
>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote:
>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote:
>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote:
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote:
>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have concerns that this will hide some of the other bugs 
>>>>>>>>>>>>> I've mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. 
>>>>>>>>>>>>> These bugs result in 1 or two frames appearing in the 
>>>>>>>>>>>>> stacktrace that should be skipped. Notably 
>>>>>>>>>>>>> NativeCallStack::NativeCallStack() and os::get_native_stack().
>>>>>>>>>>>>
>>>>>>>>>>>> The test still checks those are not present first:
>>>>>>>>>>>>
>>>>>>>>>>>> 73         // We should never see either of these frames 
>>>>>>>>>>>> because they are supposed to be skipped. */
>>>>>>>>>>>> 74 output.shouldNotContain("NativeCallStack::NativeCallStack");
>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack");
>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but missed it.
>>>>>>>>>>>>
>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the stack 
>>>>>>>>>>>>> trace, but the test has specifically allowed for it for 
>>>>>>>>>>>>> windows and solaris slowdebug builds. Although these builds 
>>>>>>>>>>>>> should have honored the ALWAYSINLINE directive, it was 
>>>>>>>>>>>>> deemed acceptable that it was not in slowdebug builds. 
>>>>>>>>>>>>> However, I would not want to allow AllocateHeap() to appear 
>>>>>>>>>>>>> in a product build, and best not to see it in fastdebug 
>>>>>>>>>>>>> either.
>>>>>>>>>>>>
>>>>>>>>>>>> This is a test of NMT detail not a test of whether a given 
>>>>>>>>>>>> compiler chooses to inline something like AllocateHeap. I 
>>>>>>>>>>>> don't think it is the job of this test to be checking for 
>>>>>>>>>>>> something specific to the native compiler. The previous 
>>>>>>>>>>>> handling of AllocateHeap seemed to be there simply because 
>>>>>>>>>>>> it was the only way to deal with an optional frame - but now 
>>>>>>>>>>>> that's handled generically.
>>>>>>>>>>> It's appearance means you effectively only have 3 frames to 
>>>>>>>>>>> identity callsites instead of 4. 
>>>>>>>>>>
>>>>>>>>>> Both stacktraces in the old test had 4 elements and expected 4 
>>>>>>>>>> matches. The current bug is that one of those (new_entry) 
>>>>>>>>>> could actually be inlined as well, resulting in only 3 
>>>>>>>>>> matches. So that is what the revised test checks for: at least 
>>>>>>>>>> 3 matches. Often there will be 4 matches.
>>>>>>>>> I think you misunderstood my "3 frames" comment. I was 
>>>>>>>>> referring to how many frames NMT uses to identify the callsite. 
>>>>>>>>> It wants to use 4, but if AllocateHeap() doesn't get inlined, 
>>>>>>>>> it effectively is using 3. The test should detect when this 
>>>>>>>>> happens so the NMT implementation can address the issue.
>>>>>>>>
>>>>>>>> You're right I don't understand this part as I don't know 
>>>>>>>> how/what NMT detail is doing in this regard.
>>>>>>>
>>>>>>> An NMT callsite is simply the 4 most recent frames (afters some 
>>>>>>> pruning) that led to the os:malloc() call. "4" is somewhat 
>>>>>>> arbitrary as Thomas pointed out, and is controlled by 
>>>>>>> NMT_TrackingStackDepth. Making NMT_TrackingStackDepth bigger 
>>>>>>> means more refinement of the callsites (thus more callsites), but 
>>>>>>> a clearer picture of what actually led to the os:malloc().
>>>>>>>
>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have a() 
>>>>>>> calls b() calls c() calls d() calls os:malloc(), and foo() and 
>>>>>>> bar() both call a(), the NMT detail output will not distinguish 
>>>>>>> between these two calls paths to os:mallco(), and will consider 
>>>>>>> both paths to be the same callsite. The 4 frames in the NMT 
>>>>>>> detail output would always be a, b, c, and d. However, bump up 
>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as two 
>>>>>>> separate callsites, one with foo() as the bottom frame and one 
>>>>>>> with bar() as the bottom frame, and both with a, b, c, and d as 
>>>>>>> the other 4 frames.
>>>>>>>
>>>>>>> So my point is if AllocateHeap() is not inlined, then every 
>>>>>>> allocation that is the result of doing a "new" of any CHeapObj 
>>>>>>> subtype will have AllocateHeap() in its callsite, which 
>>>>>>> effectively lowers they callsite refinement by 1.
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hmmm but now I'm wondering why this trace:
>>>>>>>>>>
>>>>>>>>>>   50     public static String stackTraceAllocateHeap =
>>>>>>>>>>   51         ".*AllocateHeap.*\n" +
>>>>>>>>>>   52         ".*ModuleEntryTable.*new_entry.*\n" +
>>>>>>>>>>
>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it 
>>>>>>>>>> getting inlined already when AllocateHeap was not? Even so we 
>>>>>>>>>> still end up with 4 frames matching normally.
>>>>>>>>> I noticed that last night also and scratch my head over it for 
>>>>>>>>> a while and then went to bed. The only explanation I could come 
>>>>>>>>> up with is that allocate_new_entry() is getting inlined, and as 
>>>>>>>>> a result (due to being a slowdebug build and doing minimal 
>>>>>>>>> inlining) AllocateHeap() was not inlined.
>>>>>>>>>>
>>>>>>>>>>> If it does appear in a product build, a solution should be 
>>>>>>>>>>> looked into to get rid of it. If the port owner decides it 
>>>>>>>>>>> can't get rid of it (or is unwilling to), then an exception 
>>>>>>>>>>> should be added to the test like was done for solaris and 
>>>>>>>>>>> windows slowdebug builds.
>>>>>>>>>>
>>>>>>>>>> Are we specifically trying to test the compiler's ability to 
>>>>>>>>>> inline that function and just happen to be using this test to 
>>>>>>>>>> verify that? Doesn't seem like a suitable place to do this - 
>>>>>>>>>> and why do we need to do it? The Visual Studio docs state:
>>>>>>>>>>
>>>>>>>>>> "You cannot force the compiler to inline a particular 
>>>>>>>>>> function, even with the __forceinline keyword."
>>>>>>>>>>
>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and 
>>>>>>>>>> could change with any update to the compiler.
>>>>>>>>>>
>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline - 
>>>>>>>>>> specifically -xinline only has an effect at –xO3 or higher. 
>>>>>>>>>> Which likely explains why it is ignored in slowdebug. And 
>>>>>>>>>> there are other cases where it won't honour the ALWAYSINLINE.
>>>>>>>>>>
>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we want 
>>>>>>>>>> to ensure inlining when not optimising:
>>>>>>>>>>
>>>>>>>>>> "GCC does not inline any functions when not optimizing unless 
>>>>>>>>>> you specify the ‘always_inline’ attribute for the function, 
>>>>>>>>>> like this:
>>>>>>>>>>
>>>>>>>>>> /* Prototype.  */
>>>>>>>>>> inline void foo (const char) __attribute__((always_inline));"
>>>>>>>>>>
>>>>>>>>>> and we don't write it that way.
>>>>>>>>>>
>>>>>>>>>> So if we're that concerned about release builds guaranteeing 
>>>>>>>>>> to inline AllocateHeap then I think we need something a bit 
>>>>>>>>>> more explicit than this test to determine that.
>>>>>>>>> With respect to the 3 methods/functions we don't want to see in 
>>>>>>>>> the callsite stacktrace, NMT has made a number of assumptions 
>>>>>>>>> on inlining. One of the things the test is doing is making sure 
>>>>>>>>> those assumptions are correct. If incorrect, then you run into 
>>>>>>>>> issues like I mentioned above where callsite backtraces 
>>>>>>>>> effectively only have 3 unique frames rather than 4 (actually 
>>>>>>>>> before some bug fixes it was often just 2 unique frames). So I 
>>>>>>>>> think it's appropriate to have a test to make sure we are not 
>>>>>>>>> seeing any of these 3 methods/functions.
>>>>>>>>
>>>>>>>> Okay I get the gist of that. Is there somewhere I can clearly 
>>>>>>>> see what this inlining assumptions are that NMT makes? Are they 
>>>>>>>> clearly documented?
>>>>>>>
>>>>>>> Not that I know of. I discovered them while looking at the 
>>>>>>> various bugs that led to NativeCallStack::NativeCallStack() and 
>>>>>>> os::get_native_stack() (and sometimes both) being in the 
>>>>>>> callsite. Reviewing the bugs I referred to will give you an idea 
>>>>>>> of where to look. One good place to look at 
>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case code 
>>>>>>> there that controls how many frames to skip based on on the 
>>>>>>> platform and whether optimized or not. Also some comments there 
>>>>>>> to help you out. I did a lot of bug fixing in this method.
>>>>>>>
>>>>>>> Looking at this code also reminds me of a reason to have the test 
>>>>>>> continue to check for all 4 specific frames. If the frame 
>>>>>>> skipping code skips an extra frame, then the callsite will be 
>>>>>>> missing a needed frame at the top. The way the test was written 
>>>>>>> it would detect this. With your changes it will not. It would 
>>>>>>> just revert to always matching on 3 frames instead of 4, and the 
>>>>>>> frame skipping bug would go unnoticed.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>>>
>>>>>>>>> Now the test also has made inlining assumptions beyond what NMT 
>>>>>>>>> has made, and that is really what this bug is about. In general 
>>>>>>>>> I think your fix is fine in the way it relaxes which frames are 
>>>>>>>>> actually found, but as Thomas points out, it suffers from not 
>>>>>>>>> actually looking at a single stacktrace, but just looking for 
>>>>>>>>> the specified frames somewhere in the output (and in the order 
>>>>>>>>> specified.) You should probably address this.
>>>>>>>>
>>>>>>>> Right that was an error on my part. I thought the existing 
>>>>>>>> MULTILINE pattern matching with .* would also find 
>>>>>>>> non-sequential lines and so I was acting similarly. I will 
>>>>>>>> re-think this.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>> Given the changes you made to allow more flexibly in which 
>>>>>>>>>>>>> frames appear, I think you need to now also make sure the 
>>>>>>>>>>>>> above 3 mentioned frames are not present, except for 
>>>>>>>>>>>>> allowing AllocateHeap() in slowdebug builds.
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote:
>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458
>>>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is affected 
>>>>>>>>>>>>>> by the inlining decisions of the native compiler, and on 
>>>>>>>>>>>>>> the type of build. So we define an "ideal" stacktrace and 
>>>>>>>>>>>>>> then allow for some frames to be missing based on 
>>>>>>>>>>>>>> empirical observations. So to date we have seen two frames 
>>>>>>>>>>>>>> that may or may not be inlined and so we allow for 2 
>>>>>>>>>>>>>> non-matching entries.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now it is 
>>>>>>>>>>>>>> just an optional frame.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as you 
>>>>>>>>>>>>>> intended?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
> 
> 


More information about the hotspot-runtime-dev mailing list