RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output
David Holmes
david.holmes at oracle.com
Sun Apr 7 06:06:51 UTC 2019
On 6/04/2019 4:24 pm, Chris Plummer wrote:
> On 4/5/19 9:13 PM, David Holmes wrote:
>> Hi Chris,
>>
>> On 6/04/2019 3:09 am, Chris Plummer wrote:
>>> Hi David,
>>>
>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate?
>>
>> ?? That isn't part of any of the existing stacktraces.
> See the following comment from Zhengyu in the CR:
>
> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865
That comment simply includes a fragment of a stack which happens to
include JVM_DefineModule and makes no further mention of it. I don't
recall anyone saying that we should now be including that frame in the
check.
Do you want the test extended to also check for that frame?
>>
>>> Since you've added the following:
>>>
>>> 103 if (!okToHaveAllocateHeap) {
>>> 104 output.shouldNotContain("AllocateHeap");
>>> 105 }
>>
>> I didn't add that - see old code line 80.
> Ok, but my comment below still applies since this check is in place.
>>
>>> You can simplify the following:
>>>
>>> 123 if (okToHaveAllocateHeap) {
>>> 124 expectedStackTrace = stackTraceAllocateHeap;
>>> 125 if (stackTraceMatches(expectedStackTrace, output)) {
>>> 126 return;
>>> 127 }
>>> 128 } else {
>>>
>>> The is no need for the okToHaveAllocateHeap check here anymore. Just
>>> check all 3 allowed stacktraces until one passes. This is a slight
>>> improvement in flexibility in that it would no longer require the
>>> slowdebug builds to match stackTraceAllocateHeap. They could match
>>> any of the 3. You could then put all 3 allowed stacktraces in an
>>> array and check them in a loop if you wish.
>>
>> The only change I have made (which might be obscured by the structure)
>> is that if stackTraceDefault fails to match I then try
>> stackTraceAlternate. The handling of okToHaveAllocateHeap is unchanged.
>>
>> By the same argument you made I think it best to only expect the
>> AllocateHeap stack on those slowdebug platforms, so that we can notice
>> when something changes - again I've mode no change in this regard.
> Since line 104 already verified that AllocateHeap does not appear except
> possibly in slow debug heaps, it is harmless to check all builds against
> the stacktrace that includes AllocateHeap.
"Harmless" but a waste of time checking for a stack that we know can't
match. The current version was at your suggestion:
"You would need to check for all 3, limiting the AllocateHeap() one to
just being allowed on solaris and windows slowdebug as it is now."
Checking all three returns to my original version (modulo not removing
the check for the AllocateHeap frame, and fixing the matching logic).
> Also, if a slowdebug platform
> were to change to no longer include AllocateHeap, checking it against
> the other two stacktraces would allow the test to continue to pass
> without modification.
This is counter to your earlier argument that we should be using this
test to specifically check for such changes in compiler behaviour and
update the platform specific guards accordingly. If you allow it to go
either way then we would never remove the guard even when it was no
longer needed on any platform.
> For these two reasons I was suggesting just always
> check all 3 stacktraces until one passes. It would simplify the logic some.
I'd need to change a number of other things make the main logic simpler
(ie loop over all three stacks) but the error reporting part will be
more awkward. And Thomas already complained about the number of times we
scan the entire process output doing this matching, so this would make
it worse - unless I completely change the way we do the matching, which
then introduces more complexity and more likelihood of introducing new bugs.
Let me know how you want to proceed.
Thanks,
David
-----
>>
>>> The following is no longer correct:
>>>
>>> 140 throw new RuntimeException("Expected stack trace
>>> missing from output: " + expectedStackTrace);
>>>
>>> In your current approach, expectedStackTrace is just the last
>>> stacktrace we tried. Since we may try more than one, maybe all the
>>> ones that failed to match should be listed (or none listed if just
>>> too messy).
>>
>> It reports the last failing stacktrace, out of a possible two. Perhaps
>> I can print both ... you want something in the jtr file so that it can
>> be triaged without having to go and look up the test code.
> Yeah, just pointing out that only printing one stacktrace might lead the
> .jtr reader down the wrong path.
>
> thanks,
>
> Chris
>>
>> Thanks,
>> David
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 4/5/19 12:04 AM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> Updated webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/
>>>>
>>>> Checks for alternate stack now. Added lots of comments and misc fixups.
>>>>
>>>> Zhengyu: please re-test (I can't test any slowdebug except linux-x64).
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote:
>>>>> Thinking about this a bit more, there is still the potential for
>>>>> some confusion if this test fails again in the future due to the
>>>>> top frame missing. Is it missing because it got inlined or is it
>>>>> missing because the frame skipping code skipped an extra frame?
>>>>> Hopefully whoever deals with it doesn't just hastily add another
>>>>> valid stacktrace to the test but instead investigates to make sure
>>>>> the issue is indeed that the method got inlined.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 4/4/19 10:56 PM, David Holmes wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Okay I will simply check for the third alternative.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> For the callsite that this test is checking for, right now there
>>>>>>> appear to be 3 possible stacktraces: the "normal" one, the one
>>>>>>> that includes AllocateHeap() on solaris and windows slowdebug
>>>>>>> builds, and the one Zhengyu is now seeing on linux-x64. You would
>>>>>>> need to check for all 3, limiting the AllocateHeap() one to just
>>>>>>> being allowed on solaris and windows slowdebug as it is now. So
>>>>>>> basically this test needs to cover all (allowable) stacktraces
>>>>>>> that we've seen for this callsite, and be updated in the future
>>>>>>> as needed. Not ideal, but I don't see a better solution. It's
>>>>>>> similar to the situation described in JDK-8163899 which covered
>>>>>>> the fragility of the NMT frame skipping code. In the end it was
>>>>>>> decided it would be easier to just deal fix issues as they came
>>>>>>> up rather then engineer a solution that wasn't as fragile. I
>>>>>>> think this test falls in the same category.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Thanks for the explanation about the frame counting from
>>>>>>>> os::malloc - now I get it. But I don't understand your final
>>>>>>>> comment:
>>>>>>>>
>>>>>>>> > Looking at this code also reminds me of a reason to have the test
>>>>>>>> > continue to check for all 4 specific frames. If the frame
>>>>>>>> skipping code
>>>>>>>> > skips an extra frame, then the callsite will be missing a
>>>>>>>> needed frame
>>>>>>>> > at the top. The way the test was written it would detect this.
>>>>>>>> With your
>>>>>>>> > changes it will not. It would just revert to always matching
>>>>>>>> on 3 frames
>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed.
>>>>>>>>
>>>>>>>> How can I fix this bug if I have to check for 4 specific frames
>>>>>>>> but one (or more) may be missing - i.e how can I tell the
>>>>>>>> different between "Frame A was inlined" and "Frame A was skipped
>>>>>>>> by mistake" ??
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>
>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote:
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote:
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote:
>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote:
>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote:
>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote:
>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have concerns that this will hide some of the other
>>>>>>>>>>>>>>> bugs I've mentioned: JDK-8133749, JDK-8133747, and
>>>>>>>>>>>>>>> JDK-8133740. These bugs result in 1 or two frames
>>>>>>>>>>>>>>> appearing in the stacktrace that should be skipped.
>>>>>>>>>>>>>>> Notably NativeCallStack::NativeCallStack() and
>>>>>>>>>>>>>>> os::get_native_stack().
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The test still checks those are not present first:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 73 // We should never see either of these frames
>>>>>>>>>>>>>> because they are supposed to be skipped. */
>>>>>>>>>>>>>> 74
>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack");
>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack");
>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but missed it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the stack
>>>>>>>>>>>>>>> trace, but the test has specifically allowed for it for
>>>>>>>>>>>>>>> windows and solaris slowdebug builds. Although these
>>>>>>>>>>>>>>> builds should have honored the ALWAYSINLINE directive, it
>>>>>>>>>>>>>>> was deemed acceptable that it was not in slowdebug
>>>>>>>>>>>>>>> builds. However, I would not want to allow AllocateHeap()
>>>>>>>>>>>>>>> to appear in a product build, and best not to see it in
>>>>>>>>>>>>>>> fastdebug either.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a given
>>>>>>>>>>>>>> compiler chooses to inline something like AllocateHeap. I
>>>>>>>>>>>>>> don't think it is the job of this test to be checking for
>>>>>>>>>>>>>> something specific to the native compiler. The previous
>>>>>>>>>>>>>> handling of AllocateHeap seemed to be there simply because
>>>>>>>>>>>>>> it was the only way to deal with an optional frame - but
>>>>>>>>>>>>>> now that's handled generically.
>>>>>>>>>>>>> It's appearance means you effectively only have 3 frames to
>>>>>>>>>>>>> identity callsites instead of 4.
>>>>>>>>>>>>
>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and expected
>>>>>>>>>>>> 4 matches. The current bug is that one of those (new_entry)
>>>>>>>>>>>> could actually be inlined as well, resulting in only 3
>>>>>>>>>>>> matches. So that is what the revised test checks for: at
>>>>>>>>>>>> least 3 matches. Often there will be 4 matches.
>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was
>>>>>>>>>>> referring to how many frames NMT uses to identify the
>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() doesn't
>>>>>>>>>>> get inlined, it effectively is using 3. The test should
>>>>>>>>>>> detect when this happens so the NMT implementation can
>>>>>>>>>>> address the issue.
>>>>>>>>>>
>>>>>>>>>> You're right I don't understand this part as I don't know
>>>>>>>>>> how/what NMT detail is doing in this regard.
>>>>>>>>>
>>>>>>>>> An NMT callsite is simply the 4 most recent frames (afters some
>>>>>>>>> pruning) that led to the os:malloc() call. "4" is somewhat
>>>>>>>>> arbitrary as Thomas pointed out, and is controlled by
>>>>>>>>> NMT_TrackingStackDepth. Making NMT_TrackingStackDepth bigger
>>>>>>>>> means more refinement of the callsites (thus more callsites),
>>>>>>>>> but a clearer picture of what actually led to the os:malloc().
>>>>>>>>>
>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have a()
>>>>>>>>> calls b() calls c() calls d() calls os:malloc(), and foo() and
>>>>>>>>> bar() both call a(), the NMT detail output will not distinguish
>>>>>>>>> between these two calls paths to os:mallco(), and will consider
>>>>>>>>> both paths to be the same callsite. The 4 frames in the NMT
>>>>>>>>> detail output would always be a, b, c, and d. However, bump up
>>>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as two
>>>>>>>>> separate callsites, one with foo() as the bottom frame and one
>>>>>>>>> with bar() as the bottom frame, and both with a, b, c, and d as
>>>>>>>>> the other 4 frames.
>>>>>>>>>
>>>>>>>>> So my point is if AllocateHeap() is not inlined, then every
>>>>>>>>> allocation that is the result of doing a "new" of any CHeapObj
>>>>>>>>> subtype will have AllocateHeap() in its callsite, which
>>>>>>>>> effectively lowers they callsite refinement by 1.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hmmm but now I'm wondering why this trace:
>>>>>>>>>>>>
>>>>>>>>>>>> 50 public static String stackTraceAllocateHeap =
>>>>>>>>>>>> 51 ".*AllocateHeap.*\n" +
>>>>>>>>>>>> 52 ".*ModuleEntryTable.*new_entry.*\n" +
>>>>>>>>>>>>
>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it
>>>>>>>>>>>> getting inlined already when AllocateHeap was not? Even so
>>>>>>>>>>>> we still end up with 4 frames matching normally.
>>>>>>>>>>> I noticed that last night also and scratch my head over it
>>>>>>>>>>> for a while and then went to bed. The only explanation I
>>>>>>>>>>> could come up with is that allocate_new_entry() is getting
>>>>>>>>>>> inlined, and as a result (due to being a slowdebug build and
>>>>>>>>>>> doing minimal inlining) AllocateHeap() was not inlined.
>>>>>>>>>>>>
>>>>>>>>>>>>> If it does appear in a product build, a solution should be
>>>>>>>>>>>>> looked into to get rid of it. If the port owner decides it
>>>>>>>>>>>>> can't get rid of it (or is unwilling to), then an exception
>>>>>>>>>>>>> should be added to the test like was done for solaris and
>>>>>>>>>>>>> windows slowdebug builds.
>>>>>>>>>>>>
>>>>>>>>>>>> Are we specifically trying to test the compiler's ability to
>>>>>>>>>>>> inline that function and just happen to be using this test
>>>>>>>>>>>> to verify that? Doesn't seem like a suitable place to do
>>>>>>>>>>>> this - and why do we need to do it? The Visual Studio docs
>>>>>>>>>>>> state:
>>>>>>>>>>>>
>>>>>>>>>>>> "You cannot force the compiler to inline a particular
>>>>>>>>>>>> function, even with the __forceinline keyword."
>>>>>>>>>>>>
>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and
>>>>>>>>>>>> could change with any update to the compiler.
>>>>>>>>>>>>
>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline -
>>>>>>>>>>>> specifically -xinline only has an effect at –xO3 or higher.
>>>>>>>>>>>> Which likely explains why it is ignored in slowdebug. And
>>>>>>>>>>>> there are other cases where it won't honour the ALWAYSINLINE.
>>>>>>>>>>>>
>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we
>>>>>>>>>>>> want to ensure inlining when not optimising:
>>>>>>>>>>>>
>>>>>>>>>>>> "GCC does not inline any functions when not optimizing
>>>>>>>>>>>> unless you specify the ‘always_inline’ attribute for the
>>>>>>>>>>>> function, like this:
>>>>>>>>>>>>
>>>>>>>>>>>> /* Prototype. */
>>>>>>>>>>>> inline void foo (const char) __attribute__((always_inline));"
>>>>>>>>>>>>
>>>>>>>>>>>> and we don't write it that way.
>>>>>>>>>>>>
>>>>>>>>>>>> So if we're that concerned about release builds guaranteeing
>>>>>>>>>>>> to inline AllocateHeap then I think we need something a bit
>>>>>>>>>>>> more explicit than this test to determine that.
>>>>>>>>>>> With respect to the 3 methods/functions we don't want to see
>>>>>>>>>>> in the callsite stacktrace, NMT has made a number of
>>>>>>>>>>> assumptions on inlining. One of the things the test is doing
>>>>>>>>>>> is making sure those assumptions are correct. If incorrect,
>>>>>>>>>>> then you run into issues like I mentioned above where
>>>>>>>>>>> callsite backtraces effectively only have 3 unique frames
>>>>>>>>>>> rather than 4 (actually before some bug fixes it was often
>>>>>>>>>>> just 2 unique frames). So I think it's appropriate to have a
>>>>>>>>>>> test to make sure we are not seeing any of these 3
>>>>>>>>>>> methods/functions.
>>>>>>>>>>
>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can clearly
>>>>>>>>>> see what this inlining assumptions are that NMT makes? Are
>>>>>>>>>> they clearly documented?
>>>>>>>>>
>>>>>>>>> Not that I know of. I discovered them while looking at the
>>>>>>>>> various bugs that led to NativeCallStack::NativeCallStack() and
>>>>>>>>> os::get_native_stack() (and sometimes both) being in the
>>>>>>>>> callsite. Reviewing the bugs I referred to will give you an
>>>>>>>>> idea of where to look. One good place to look at
>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case code
>>>>>>>>> there that controls how many frames to skip based on on the
>>>>>>>>> platform and whether optimized or not. Also some comments there
>>>>>>>>> to help you out. I did a lot of bug fixing in this method.
>>>>>>>>>
>>>>>>>>> Looking at this code also reminds me of a reason to have the
>>>>>>>>> test continue to check for all 4 specific frames. If the frame
>>>>>>>>> skipping code skips an extra frame, then the callsite will be
>>>>>>>>> missing a needed frame at the top. The way the test was written
>>>>>>>>> it would detect this. With your changes it will not. It would
>>>>>>>>> just revert to always matching on 3 frames instead of 4, and
>>>>>>>>> the frame skipping bug would go unnoticed.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Now the test also has made inlining assumptions beyond what
>>>>>>>>>>> NMT has made, and that is really what this bug is about. In
>>>>>>>>>>> general I think your fix is fine in the way it relaxes which
>>>>>>>>>>> frames are actually found, but as Thomas points out, it
>>>>>>>>>>> suffers from not actually looking at a single stacktrace, but
>>>>>>>>>>> just looking for the specified frames somewhere in the output
>>>>>>>>>>> (and in the order specified.) You should probably address this.
>>>>>>>>>>
>>>>>>>>>> Right that was an error on my part. I thought the existing
>>>>>>>>>> MULTILINE pattern matching with .* would also find
>>>>>>>>>> non-sequential lines and so I was acting similarly. I will
>>>>>>>>>> re-think this.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Given the changes you made to allow more flexibly in
>>>>>>>>>>>>>>> which frames appear, I think you need to now also make
>>>>>>>>>>>>>>> sure the above 3 mentioned frames are not present, except
>>>>>>>>>>>>>>> for allowing AllocateHeap() in slowdebug builds.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote:
>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458
>>>>>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is
>>>>>>>>>>>>>>>> affected by the inlining decisions of the native
>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an
>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames to be
>>>>>>>>>>>>>>>> missing based on empirical observations. So to date we
>>>>>>>>>>>>>>>> have seen two frames that may or may not be inlined and
>>>>>>>>>>>>>>>> so we allow for 2 non-matching entries.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now it
>>>>>>>>>>>>>>>> is just an optional frame.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as
>>>>>>>>>>>>>>>> you intended?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>
More information about the hotspot-runtime-dev
mailing list