RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output
David Holmes
david.holmes at oracle.com
Mon Apr 8 13:15:04 UTC 2019
On 8/04/2019 10:35 pm, Zhengyu Gu wrote:
>> Zhengyu: can you please test this again on your platforms. Again I
>> can't test the alternate stack matching as I have no systems where it
>> fails (nor can I test slowedebug other than Linux).
>
> Still good.
Thanks!
David
> Thanks,
>
> -Zhengyu
>
>>
>> Thanks,
>> David
>> -----
>>
>> On 7/04/2019 5:10 pm, David Holmes wrote:
>>> Hi Chris,
>>>
>>> On 7/04/2019 4:51 pm, Chris Plummer wrote:
>>>> Hi David,
>>>>
>>>> On 4/6/19 11:06 PM, David Holmes wrote:
>>>>> On 6/04/2019 4:24 pm, Chris Plummer wrote:
>>>>>> On 4/5/19 9:13 PM, David Holmes wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> On 6/04/2019 3:09 am, Chris Plummer wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate?
>>>>>>>
>>>>>>> ?? That isn't part of any of the existing stacktraces.
>>>>>> See the following comment from Zhengyu in the CR:
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> That comment simply includes a fragment of a stack which happens to
>>>>> include JVM_DefineModule and makes no further mention of it. I
>>>>> don't recall anyone saying that we should now be including that
>>>>> frame in the check.
>>>>>
>>>>> Do you want the test extended to also check for that frame?
>>>> Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule
>>>> is the additional frame that now appears in the detail output for
>>>> that call chain. So yes, the test should include it. If the inlining
>>>> of ModuleEntryTable::new_entry() had always happened, then the test
>>>> would originally have checked for the stacktrace as it appears in
>>>> the CR comment.
>>>
>>> I see - to be clear you want to always check for 4 frames, so the
>>> additional frame is only checked for the alternate stack.
>>>
>>>>>>>
>>>>>>>> Since you've added the following:
>>>>>>>>
>>>>>>>> 103 if (!okToHaveAllocateHeap) {
>>>>>>>> 104 output.shouldNotContain("AllocateHeap");
>>>>>>>> 105 }
>>>>>>>
>>>>>>> I didn't add that - see old code line 80.
>>>>>> Ok, but my comment below still applies since this check is in place.
>>>>>>>
>>>>>>>> You can simplify the following:
>>>>>>>>
>>>>>>>> 123 if (okToHaveAllocateHeap) {
>>>>>>>> 124 expectedStackTrace = stackTraceAllocateHeap;
>>>>>>>> 125 if (stackTraceMatches(expectedStackTrace,
>>>>>>>> output)) {
>>>>>>>> 126 return;
>>>>>>>> 127 }
>>>>>>>> 128 } else {
>>>>>>>>
>>>>>>>> The is no need for the okToHaveAllocateHeap check here anymore.
>>>>>>>> Just check all 3 allowed stacktraces until one passes. This is a
>>>>>>>> slight improvement in flexibility in that it would no longer
>>>>>>>> require the slowdebug builds to match stackTraceAllocateHeap.
>>>>>>>> They could match any of the 3. You could then put all 3 allowed
>>>>>>>> stacktraces in an array and check them in a loop if you wish.
>>>>>>>
>>>>>>> The only change I have made (which might be obscured by the
>>>>>>> structure) is that if stackTraceDefault fails to match I then try
>>>>>>> stackTraceAlternate. The handling of okToHaveAllocateHeap is
>>>>>>> unchanged.
>>>>>>>
>>>>>>> By the same argument you made I think it best to only expect the
>>>>>>> AllocateHeap stack on those slowdebug platforms, so that we can
>>>>>>> notice when something changes - again I've mode no change in this
>>>>>>> regard.
>>>>>> Since line 104 already verified that AllocateHeap does not appear
>>>>>> except possibly in slow debug heaps, it is harmless to check all
>>>>>> builds against the stacktrace that includes AllocateHeap.
>>>>>
>>>>> "Harmless" but a waste of time checking for a stack that we know
>>>>> can't match. The current version was at your suggestion:
>>>>>
>>>>> "You would need to check for all 3, limiting the AllocateHeap() one
>>>>> to just being allowed on solaris and windows slowdebug as it is now."
>>>> That was before I realized there was already an explicit check for
>>>> AllocateHeap() to not be allowed except for slowdebug ones. Once I
>>>> realized that, it occurred to me that checking for all 3 stacktraces
>>>> in a loop would simplify the logic.
>>>>>
>>>>> Checking all three returns to my original version (modulo not
>>>>> removing the check for the AllocateHeap frame, and fixing the
>>>>> matching logic).
>>>> Your original version checked for a large number of permutations
>>>> that included any 3 of 5 specified frames, not checks for any of 3
>>>> specific stacktraces (of 4 frames each).
>>>
>>> That was never the intent and what I was referring to when I said
>>> "and fixing the matching logic".
>>>
>>>>>
>>>>>> Also, if a slowdebug platform were to change to no longer include
>>>>>> AllocateHeap, checking it against the other two stacktraces would
>>>>>> allow the test to continue to pass without modification.
>>>>>
>>>>> This is counter to your earlier argument that we should be using
>>>>> this test to specifically check for such changes in compiler
>>>>> behaviour and update the platform specific guards accordingly. If
>>>>> you allow it to go either way then we would never remove the guard
>>>>> even when it was no longer needed on any platform.
>>>> But this is one compiler inlining behavior change that is ok. If
>>>> AllocateHeap() suddenly starts being inlined by slowdebug builds,
>>>> that is actually a good thing, and we would end up modifying the
>>>> test to allow it. So why not allow it now?
>>>>>
>>>>>> For these two reasons I was suggesting just always check all 3
>>>>>> stacktraces until one passes. It would simplify the logic some.
>>>>>
>>>>> I'd need to change a number of other things make the main logic
>>>>> simpler (ie loop over all three stacks) but the error reporting
>>>>> part will be more awkward. And Thomas already complained about the
>>>>> number of times we scan the entire process output doing this
>>>>> matching, so this would make it worse - unless I completely change
>>>>> the way we do the matching, which then introduces more complexity
>>>>> and more likelihood of introducing new bugs.
>>>>>
>>>>> Let me know how you want to proceed.
>>>>
>>>> The loop idea was just to make the code simpler. If you feel it will
>>>> slow things down unacceptably, then I'm fine with the logic as-is in
>>>> v2, but you need to add JVM_DefineModule to the new stacktrace.
>>>
>>> Okay I intend to add the missing 4th frame, and print both potential
>>> stacks on failure, but otherwise leave at V2.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>>>
>>>>>>>> The following is no longer correct:
>>>>>>>>
>>>>>>>> 140 throw new RuntimeException("Expected stack trace
>>>>>>>> missing from output: " + expectedStackTrace);
>>>>>>>>
>>>>>>>> In your current approach, expectedStackTrace is just the last
>>>>>>>> stacktrace we tried. Since we may try more than one, maybe all
>>>>>>>> the ones that failed to match should be listed (or none listed
>>>>>>>> if just too messy).
>>>>>>>
>>>>>>> It reports the last failing stacktrace, out of a possible two.
>>>>>>> Perhaps I can print both ... you want something in the jtr file
>>>>>>> so that it can be triaged without having to go and look up the
>>>>>>> test code.
>>>>>> Yeah, just pointing out that only printing one stacktrace might
>>>>>> lead the .jtr reader down the wrong path.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 4/5/19 12:04 AM, David Holmes wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> Updated webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/
>>>>>>>>>
>>>>>>>>> Checks for alternate stack now. Added lots of comments and misc
>>>>>>>>> fixups.
>>>>>>>>>
>>>>>>>>> Zhengyu: please re-test (I can't test any slowdebug except
>>>>>>>>> linux-x64).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote:
>>>>>>>>>> Thinking about this a bit more, there is still the potential
>>>>>>>>>> for some confusion if this test fails again in the future due
>>>>>>>>>> to the top frame missing. Is it missing because it got inlined
>>>>>>>>>> or is it missing because the frame skipping code skipped an
>>>>>>>>>> extra frame? Hopefully whoever deals with it doesn't just
>>>>>>>>>> hastily add another valid stacktrace to the test but instead
>>>>>>>>>> investigates to make sure the issue is indeed that the method
>>>>>>>>>> got inlined.
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 4/4/19 10:56 PM, David Holmes wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>> Okay I will simply check for the third alternative.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote:
>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>
>>>>>>>>>>>> For the callsite that this test is checking for, right now
>>>>>>>>>>>> there appear to be 3 possible stacktraces: the "normal" one,
>>>>>>>>>>>> the one that includes AllocateHeap() on solaris and windows
>>>>>>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on
>>>>>>>>>>>> linux-x64. You would need to check for all 3, limiting the
>>>>>>>>>>>> AllocateHeap() one to just being allowed on solaris and
>>>>>>>>>>>> windows slowdebug as it is now. So basically this test needs
>>>>>>>>>>>> to cover all (allowable) stacktraces that we've seen for
>>>>>>>>>>>> this callsite, and be updated in the future as needed. Not
>>>>>>>>>>>> ideal, but I don't see a better solution. It's similar to
>>>>>>>>>>>> the situation described in JDK-8163899 which covered the
>>>>>>>>>>>> fragility of the NMT frame skipping code. In the end it was
>>>>>>>>>>>> decided it would be easier to just deal fix issues as they
>>>>>>>>>>>> came up rather then engineer a solution that wasn't as
>>>>>>>>>>>> fragile. I think this test falls in the same category.
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the explanation about the frame counting from
>>>>>>>>>>>>> os::malloc - now I get it. But I don't understand your
>>>>>>>>>>>>> final comment:
>>>>>>>>>>>>>
>>>>>>>>>>>>> > Looking at this code also reminds me of a reason to have
>>>>>>>>>>>>> the test
>>>>>>>>>>>>> > continue to check for all 4 specific frames. If the frame
>>>>>>>>>>>>> skipping code
>>>>>>>>>>>>> > skips an extra frame, then the callsite will be missing a
>>>>>>>>>>>>> needed frame
>>>>>>>>>>>>> > at the top. The way the test was written it would detect
>>>>>>>>>>>>> this. With your
>>>>>>>>>>>>> > changes it will not. It would just revert to always
>>>>>>>>>>>>> matching on 3 frames
>>>>>>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> How can I fix this bug if I have to check for 4 specific
>>>>>>>>>>>>> frames but one (or more) may be missing - i.e how can I
>>>>>>>>>>>>> tell the different between "Frame A was inlined" and "Frame
>>>>>>>>>>>>> A was skipped by mistake" ??
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote:
>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote:
>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote:
>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote:
>>>>>>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have concerns that this will hide some of the
>>>>>>>>>>>>>>>>>>>> other bugs I've mentioned: JDK-8133749, JDK-8133747,
>>>>>>>>>>>>>>>>>>>> and JDK-8133740. These bugs result in 1 or two
>>>>>>>>>>>>>>>>>>>> frames appearing in the stacktrace that should be
>>>>>>>>>>>>>>>>>>>> skipped. Notably NativeCallStack::NativeCallStack()
>>>>>>>>>>>>>>>>>>>> and os::get_native_stack().
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The test still checks those are not present first:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 73 // We should never see either of these
>>>>>>>>>>>>>>>>>>> frames because they are supposed to be skipped. */
>>>>>>>>>>>>>>>>>>> 74
>>>>>>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack");
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack");
>>>>>>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but
>>>>>>>>>>>>>>>>>> missed it.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the
>>>>>>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed
>>>>>>>>>>>>>>>>>>>> for it for windows and solaris slowdebug builds.
>>>>>>>>>>>>>>>>>>>> Although these builds should have honored the
>>>>>>>>>>>>>>>>>>>> ALWAYSINLINE directive, it was deemed acceptable
>>>>>>>>>>>>>>>>>>>> that it was not in slowdebug builds. However, I
>>>>>>>>>>>>>>>>>>>> would not want to allow AllocateHeap() to appear in
>>>>>>>>>>>>>>>>>>>> a product build, and best not to see it in fastdebug
>>>>>>>>>>>>>>>>>>>> either.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a
>>>>>>>>>>>>>>>>>>> given compiler chooses to inline something like
>>>>>>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this
>>>>>>>>>>>>>>>>>>> test to be checking for something specific to the
>>>>>>>>>>>>>>>>>>> native compiler. The previous handling of
>>>>>>>>>>>>>>>>>>> AllocateHeap seemed to be there simply because it was
>>>>>>>>>>>>>>>>>>> the only way to deal with an optional frame - but now
>>>>>>>>>>>>>>>>>>> that's handled generically.
>>>>>>>>>>>>>>>>>> It's appearance means you effectively only have 3
>>>>>>>>>>>>>>>>>> frames to identity callsites instead of 4.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and
>>>>>>>>>>>>>>>>> expected 4 matches. The current bug is that one of
>>>>>>>>>>>>>>>>> those (new_entry) could actually be inlined as well,
>>>>>>>>>>>>>>>>> resulting in only 3 matches. So that is what the
>>>>>>>>>>>>>>>>> revised test checks for: at least 3 matches. Often
>>>>>>>>>>>>>>>>> there will be 4 matches.
>>>>>>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was
>>>>>>>>>>>>>>>> referring to how many frames NMT uses to identify the
>>>>>>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap()
>>>>>>>>>>>>>>>> doesn't get inlined, it effectively is using 3. The test
>>>>>>>>>>>>>>>> should detect when this happens so the NMT
>>>>>>>>>>>>>>>> implementation can address the issue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You're right I don't understand this part as I don't know
>>>>>>>>>>>>>>> how/what NMT detail is doing in this regard.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> An NMT callsite is simply the 4 most recent frames (afters
>>>>>>>>>>>>>> some pruning) that led to the os:malloc() call. "4" is
>>>>>>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is
>>>>>>>>>>>>>> controlled by NMT_TrackingStackDepth. Making
>>>>>>>>>>>>>> NMT_TrackingStackDepth bigger means more refinement of the
>>>>>>>>>>>>>> callsites (thus more callsites), but a clearer picture of
>>>>>>>>>>>>>> what actually led to the os:malloc().
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have
>>>>>>>>>>>>>> a() calls b() calls c() calls d() calls os:malloc(), and
>>>>>>>>>>>>>> foo() and bar() both call a(), the NMT detail output will
>>>>>>>>>>>>>> not distinguish between these two calls paths to
>>>>>>>>>>>>>> os:mallco(), and will consider both paths to be the same
>>>>>>>>>>>>>> callsite. The 4 frames in the NMT detail output would
>>>>>>>>>>>>>> always be a, b, c, and d. However, bump up
>>>>>>>>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as
>>>>>>>>>>>>>> two separate callsites, one with foo() as the bottom frame
>>>>>>>>>>>>>> and one with bar() as the bottom frame, and both with a,
>>>>>>>>>>>>>> b, c, and d as the other 4 frames.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So my point is if AllocateHeap() is not inlined, then
>>>>>>>>>>>>>> every allocation that is the result of doing a "new" of
>>>>>>>>>>>>>> any CHeapObj subtype will have AllocateHeap() in its
>>>>>>>>>>>>>> callsite, which effectively lowers they callsite
>>>>>>>>>>>>>> refinement by 1.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hmmm but now I'm wondering why this trace:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 50 public static String stackTraceAllocateHeap =
>>>>>>>>>>>>>>>>> 51 ".*AllocateHeap.*\n" +
>>>>>>>>>>>>>>>>> 52 ".*ModuleEntryTable.*new_entry.*\n" +
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"?
>>>>>>>>>>>>>>>>> Was it getting inlined already when AllocateHeap was
>>>>>>>>>>>>>>>>> not? Even so we still end up with 4 frames matching
>>>>>>>>>>>>>>>>> normally.
>>>>>>>>>>>>>>>> I noticed that last night also and scratch my head over
>>>>>>>>>>>>>>>> it for a while and then went to bed. The only
>>>>>>>>>>>>>>>> explanation I could come up with is that
>>>>>>>>>>>>>>>> allocate_new_entry() is getting inlined, and as a result
>>>>>>>>>>>>>>>> (due to being a slowdebug build and doing minimal
>>>>>>>>>>>>>>>> inlining) AllocateHeap() was not inlined.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If it does appear in a product build, a solution
>>>>>>>>>>>>>>>>>> should be looked into to get rid of it. If the port
>>>>>>>>>>>>>>>>>> owner decides it can't get rid of it (or is unwilling
>>>>>>>>>>>>>>>>>> to), then an exception should be added to the test
>>>>>>>>>>>>>>>>>> like was done for solaris and windows slowdebug builds.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Are we specifically trying to test the compiler's
>>>>>>>>>>>>>>>>> ability to inline that function and just happen to be
>>>>>>>>>>>>>>>>> using this test to verify that? Doesn't seem like a
>>>>>>>>>>>>>>>>> suitable place to do this - and why do we need to do
>>>>>>>>>>>>>>>>> it? The Visual Studio docs state:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> "You cannot force the compiler to inline a particular
>>>>>>>>>>>>>>>>> function, even with the __forceinline keyword."
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds
>>>>>>>>>>>>>>>>> and could change with any update to the compiler.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline
>>>>>>>>>>>>>>>>> - specifically -xinline only has an effect at –xO3 or
>>>>>>>>>>>>>>>>> higher. Which likely explains why it is ignored in
>>>>>>>>>>>>>>>>> slowdebug. And there are other cases where it won't
>>>>>>>>>>>>>>>>> honour the ALWAYSINLINE.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if
>>>>>>>>>>>>>>>>> we want to ensure inlining when not optimising:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> "GCC does not inline any functions when not optimizing
>>>>>>>>>>>>>>>>> unless you specify the ‘always_inline’ attribute for
>>>>>>>>>>>>>>>>> the function, like this:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> /* Prototype. */
>>>>>>>>>>>>>>>>> inline void foo (const char)
>>>>>>>>>>>>>>>>> __attribute__((always_inline));"
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> and we don't write it that way.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So if we're that concerned about release builds
>>>>>>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we
>>>>>>>>>>>>>>>>> need something a bit more explicit than this test to
>>>>>>>>>>>>>>>>> determine that.
>>>>>>>>>>>>>>>> With respect to the 3 methods/functions we don't want to
>>>>>>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of
>>>>>>>>>>>>>>>> assumptions on inlining. One of the things the test is
>>>>>>>>>>>>>>>> doing is making sure those assumptions are correct. If
>>>>>>>>>>>>>>>> incorrect, then you run into issues like I mentioned
>>>>>>>>>>>>>>>> above where callsite backtraces effectively only have 3
>>>>>>>>>>>>>>>> unique frames rather than 4 (actually before some bug
>>>>>>>>>>>>>>>> fixes it was often just 2 unique frames). So I think
>>>>>>>>>>>>>>>> it's appropriate to have a test to make sure we are not
>>>>>>>>>>>>>>>> seeing any of these 3 methods/functions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can
>>>>>>>>>>>>>>> clearly see what this inlining assumptions are that NMT
>>>>>>>>>>>>>>> makes? Are they clearly documented?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Not that I know of. I discovered them while looking at the
>>>>>>>>>>>>>> various bugs that led to
>>>>>>>>>>>>>> NativeCallStack::NativeCallStack() and
>>>>>>>>>>>>>> os::get_native_stack() (and sometimes both) being in the
>>>>>>>>>>>>>> callsite. Reviewing the bugs I referred to will give you
>>>>>>>>>>>>>> an idea of where to look. One good place to look at
>>>>>>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case
>>>>>>>>>>>>>> code there that controls how many frames to skip based on
>>>>>>>>>>>>>> on the platform and whether optimized or not. Also some
>>>>>>>>>>>>>> comments there to help you out. I did a lot of bug fixing
>>>>>>>>>>>>>> in this method.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Looking at this code also reminds me of a reason to have
>>>>>>>>>>>>>> the test continue to check for all 4 specific frames. If
>>>>>>>>>>>>>> the frame skipping code skips an extra frame, then the
>>>>>>>>>>>>>> callsite will be missing a needed frame at the top. The
>>>>>>>>>>>>>> way the test was written it would detect this. With your
>>>>>>>>>>>>>> changes it will not. It would just revert to always
>>>>>>>>>>>>>> matching on 3 frames instead of 4, and the frame skipping
>>>>>>>>>>>>>> bug would go unnoticed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Now the test also has made inlining assumptions beyond
>>>>>>>>>>>>>>>> what NMT has made, and that is really what this bug is
>>>>>>>>>>>>>>>> about. In general I think your fix is fine in the way it
>>>>>>>>>>>>>>>> relaxes which frames are actually found, but as Thomas
>>>>>>>>>>>>>>>> points out, it suffers from not actually looking at a
>>>>>>>>>>>>>>>> single stacktrace, but just looking for the specified
>>>>>>>>>>>>>>>> frames somewhere in the output (and in the order
>>>>>>>>>>>>>>>> specified.) You should probably address this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Right that was an error on my part. I thought the
>>>>>>>>>>>>>>> existing MULTILINE pattern matching with .* would also
>>>>>>>>>>>>>>> find non-sequential lines and so I was acting similarly.
>>>>>>>>>>>>>>> I will re-think this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Given the changes you made to allow more flexibly in
>>>>>>>>>>>>>>>>>>>> which frames appear, I think you need to now also
>>>>>>>>>>>>>>>>>>>> make sure the above 3 mentioned frames are not
>>>>>>>>>>>>>>>>>>>> present, except for allowing AllocateHeap() in
>>>>>>>>>>>>>>>>>>>> slowdebug builds.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458
>>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is
>>>>>>>>>>>>>>>>>>>>> affected by the inlining decisions of the native
>>>>>>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an
>>>>>>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames
>>>>>>>>>>>>>>>>>>>>> to be missing based on empirical observations. So
>>>>>>>>>>>>>>>>>>>>> to date we have seen two frames that may or may not
>>>>>>>>>>>>>>>>>>>>> be inlined and so we allow for 2 non-matching entries.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as
>>>>>>>>>>>>>>>>>>>>> now it is just an optional frame.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test
>>>>>>>>>>>>>>>>>>>>> as you intended?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
More information about the hotspot-runtime-dev
mailing list