RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output
Zhengyu Gu
zgu at redhat.com
Mon Apr 8 12:35:49 UTC 2019
> Zhengyu: can you please test this again on your platforms. Again I can't
> test the alternate stack matching as I have no systems where it fails
> (nor can I test slowedebug other than Linux).
Still good.
Thanks,
-Zhengyu
>
> Thanks,
> David
> -----
>
> On 7/04/2019 5:10 pm, David Holmes wrote:
>> Hi Chris,
>>
>> On 7/04/2019 4:51 pm, Chris Plummer wrote:
>>> Hi David,
>>>
>>> On 4/6/19 11:06 PM, David Holmes wrote:
>>>> On 6/04/2019 4:24 pm, Chris Plummer wrote:
>>>>> On 4/5/19 9:13 PM, David Holmes wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> On 6/04/2019 3:09 am, Chris Plummer wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate?
>>>>>>
>>>>>> ?? That isn't part of any of the existing stacktraces.
>>>>> See the following comment from Zhengyu in the CR:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> That comment simply includes a fragment of a stack which happens to
>>>> include JVM_DefineModule and makes no further mention of it. I don't
>>>> recall anyone saying that we should now be including that frame in
>>>> the check.
>>>>
>>>> Do you want the test extended to also check for that frame?
>>> Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule
>>> is the additional frame that now appears in the detail output for
>>> that call chain. So yes, the test should include it. If the inlining
>>> of ModuleEntryTable::new_entry() had always happened, then the test
>>> would originally have checked for the stacktrace as it appears in the
>>> CR comment.
>>
>> I see - to be clear you want to always check for 4 frames, so the
>> additional frame is only checked for the alternate stack.
>>
>>>>>>
>>>>>>> Since you've added the following:
>>>>>>>
>>>>>>> 103 if (!okToHaveAllocateHeap) {
>>>>>>> 104 output.shouldNotContain("AllocateHeap");
>>>>>>> 105 }
>>>>>>
>>>>>> I didn't add that - see old code line 80.
>>>>> Ok, but my comment below still applies since this check is in place.
>>>>>>
>>>>>>> You can simplify the following:
>>>>>>>
>>>>>>> 123 if (okToHaveAllocateHeap) {
>>>>>>> 124 expectedStackTrace = stackTraceAllocateHeap;
>>>>>>> 125 if (stackTraceMatches(expectedStackTrace,
>>>>>>> output)) {
>>>>>>> 126 return;
>>>>>>> 127 }
>>>>>>> 128 } else {
>>>>>>>
>>>>>>> The is no need for the okToHaveAllocateHeap check here anymore.
>>>>>>> Just check all 3 allowed stacktraces until one passes. This is a
>>>>>>> slight improvement in flexibility in that it would no longer
>>>>>>> require the slowdebug builds to match stackTraceAllocateHeap.
>>>>>>> They could match any of the 3. You could then put all 3 allowed
>>>>>>> stacktraces in an array and check them in a loop if you wish.
>>>>>>
>>>>>> The only change I have made (which might be obscured by the
>>>>>> structure) is that if stackTraceDefault fails to match I then try
>>>>>> stackTraceAlternate. The handling of okToHaveAllocateHeap is
>>>>>> unchanged.
>>>>>>
>>>>>> By the same argument you made I think it best to only expect the
>>>>>> AllocateHeap stack on those slowdebug platforms, so that we can
>>>>>> notice when something changes - again I've mode no change in this
>>>>>> regard.
>>>>> Since line 104 already verified that AllocateHeap does not appear
>>>>> except possibly in slow debug heaps, it is harmless to check all
>>>>> builds against the stacktrace that includes AllocateHeap.
>>>>
>>>> "Harmless" but a waste of time checking for a stack that we know
>>>> can't match. The current version was at your suggestion:
>>>>
>>>> "You would need to check for all 3, limiting the AllocateHeap() one
>>>> to just being allowed on solaris and windows slowdebug as it is now."
>>> That was before I realized there was already an explicit check for
>>> AllocateHeap() to not be allowed except for slowdebug ones. Once I
>>> realized that, it occurred to me that checking for all 3 stacktraces
>>> in a loop would simplify the logic.
>>>>
>>>> Checking all three returns to my original version (modulo not
>>>> removing the check for the AllocateHeap frame, and fixing the
>>>> matching logic).
>>> Your original version checked for a large number of permutations that
>>> included any 3 of 5 specified frames, not checks for any of 3
>>> specific stacktraces (of 4 frames each).
>>
>> That was never the intent and what I was referring to when I said "and
>> fixing the matching logic".
>>
>>>>
>>>>> Also, if a slowdebug platform were to change to no longer include
>>>>> AllocateHeap, checking it against the other two stacktraces would
>>>>> allow the test to continue to pass without modification.
>>>>
>>>> This is counter to your earlier argument that we should be using
>>>> this test to specifically check for such changes in compiler
>>>> behaviour and update the platform specific guards accordingly. If
>>>> you allow it to go either way then we would never remove the guard
>>>> even when it was no longer needed on any platform.
>>> But this is one compiler inlining behavior change that is ok. If
>>> AllocateHeap() suddenly starts being inlined by slowdebug builds,
>>> that is actually a good thing, and we would end up modifying the test
>>> to allow it. So why not allow it now?
>>>>
>>>>> For these two reasons I was suggesting just always check all 3
>>>>> stacktraces until one passes. It would simplify the logic some.
>>>>
>>>> I'd need to change a number of other things make the main logic
>>>> simpler (ie loop over all three stacks) but the error reporting part
>>>> will be more awkward. And Thomas already complained about the number
>>>> of times we scan the entire process output doing this matching, so
>>>> this would make it worse - unless I completely change the way we do
>>>> the matching, which then introduces more complexity and more
>>>> likelihood of introducing new bugs.
>>>>
>>>> Let me know how you want to proceed.
>>>
>>> The loop idea was just to make the code simpler. If you feel it will
>>> slow things down unacceptably, then I'm fine with the logic as-is in
>>> v2, but you need to add JVM_DefineModule to the new stacktrace.
>>
>> Okay I intend to add the missing 4th frame, and print both potential
>> stacks on failure, but otherwise leave at V2.
>>
>> Thanks,
>> David
>> -----
>>
>>> thanks,
>>>
>>> Chris
>>>
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>>>
>>>>>>> The following is no longer correct:
>>>>>>>
>>>>>>> 140 throw new RuntimeException("Expected stack trace
>>>>>>> missing from output: " + expectedStackTrace);
>>>>>>>
>>>>>>> In your current approach, expectedStackTrace is just the last
>>>>>>> stacktrace we tried. Since we may try more than one, maybe all
>>>>>>> the ones that failed to match should be listed (or none listed if
>>>>>>> just too messy).
>>>>>>
>>>>>> It reports the last failing stacktrace, out of a possible two.
>>>>>> Perhaps I can print both ... you want something in the jtr file so
>>>>>> that it can be triaged without having to go and look up the test
>>>>>> code.
>>>>> Yeah, just pointing out that only printing one stacktrace might
>>>>> lead the .jtr reader down the wrong path.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 4/5/19 12:04 AM, David Holmes wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Updated webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/
>>>>>>>>
>>>>>>>> Checks for alternate stack now. Added lots of comments and misc
>>>>>>>> fixups.
>>>>>>>>
>>>>>>>> Zhengyu: please re-test (I can't test any slowdebug except
>>>>>>>> linux-x64).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote:
>>>>>>>>> Thinking about this a bit more, there is still the potential
>>>>>>>>> for some confusion if this test fails again in the future due
>>>>>>>>> to the top frame missing. Is it missing because it got inlined
>>>>>>>>> or is it missing because the frame skipping code skipped an
>>>>>>>>> extra frame? Hopefully whoever deals with it doesn't just
>>>>>>>>> hastily add another valid stacktrace to the test but instead
>>>>>>>>> investigates to make sure the issue is indeed that the method
>>>>>>>>> got inlined.
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 4/4/19 10:56 PM, David Holmes wrote:
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> Okay I will simply check for the third alternative.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote:
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> For the callsite that this test is checking for, right now
>>>>>>>>>>> there appear to be 3 possible stacktraces: the "normal" one,
>>>>>>>>>>> the one that includes AllocateHeap() on solaris and windows
>>>>>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on
>>>>>>>>>>> linux-x64. You would need to check for all 3, limiting the
>>>>>>>>>>> AllocateHeap() one to just being allowed on solaris and
>>>>>>>>>>> windows slowdebug as it is now. So basically this test needs
>>>>>>>>>>> to cover all (allowable) stacktraces that we've seen for this
>>>>>>>>>>> callsite, and be updated in the future as needed. Not ideal,
>>>>>>>>>>> but I don't see a better solution. It's similar to the
>>>>>>>>>>> situation described in JDK-8163899 which covered the
>>>>>>>>>>> fragility of the NMT frame skipping code. In the end it was
>>>>>>>>>>> decided it would be easier to just deal fix issues as they
>>>>>>>>>>> came up rather then engineer a solution that wasn't as
>>>>>>>>>>> fragile. I think this test falls in the same category.
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote:
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the explanation about the frame counting from
>>>>>>>>>>>> os::malloc - now I get it. But I don't understand your final
>>>>>>>>>>>> comment:
>>>>>>>>>>>>
>>>>>>>>>>>> > Looking at this code also reminds me of a reason to have
>>>>>>>>>>>> the test
>>>>>>>>>>>> > continue to check for all 4 specific frames. If the frame
>>>>>>>>>>>> skipping code
>>>>>>>>>>>> > skips an extra frame, then the callsite will be missing a
>>>>>>>>>>>> needed frame
>>>>>>>>>>>> > at the top. The way the test was written it would detect
>>>>>>>>>>>> this. With your
>>>>>>>>>>>> > changes it will not. It would just revert to always
>>>>>>>>>>>> matching on 3 frames
>>>>>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed.
>>>>>>>>>>>>
>>>>>>>>>>>> How can I fix this bug if I have to check for 4 specific
>>>>>>>>>>>> frames but one (or more) may be missing - i.e how can I tell
>>>>>>>>>>>> the different between "Frame A was inlined" and "Frame A was
>>>>>>>>>>>> skipped by mistake" ??
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote:
>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote:
>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote:
>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote:
>>>>>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote:
>>>>>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have concerns that this will hide some of the other
>>>>>>>>>>>>>>>>>>> bugs I've mentioned: JDK-8133749, JDK-8133747, and
>>>>>>>>>>>>>>>>>>> JDK-8133740. These bugs result in 1 or two frames
>>>>>>>>>>>>>>>>>>> appearing in the stacktrace that should be skipped.
>>>>>>>>>>>>>>>>>>> Notably NativeCallStack::NativeCallStack() and
>>>>>>>>>>>>>>>>>>> os::get_native_stack().
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The test still checks those are not present first:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 73 // We should never see either of these
>>>>>>>>>>>>>>>>>> frames because they are supposed to be skipped. */
>>>>>>>>>>>>>>>>>> 74
>>>>>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack");
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack");
>>>>>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but
>>>>>>>>>>>>>>>>> missed it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the
>>>>>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed
>>>>>>>>>>>>>>>>>>> for it for windows and solaris slowdebug builds.
>>>>>>>>>>>>>>>>>>> Although these builds should have honored the
>>>>>>>>>>>>>>>>>>> ALWAYSINLINE directive, it was deemed acceptable that
>>>>>>>>>>>>>>>>>>> it was not in slowdebug builds. However, I would not
>>>>>>>>>>>>>>>>>>> want to allow AllocateHeap() to appear in a product
>>>>>>>>>>>>>>>>>>> build, and best not to see it in fastdebug either.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a
>>>>>>>>>>>>>>>>>> given compiler chooses to inline something like
>>>>>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this test
>>>>>>>>>>>>>>>>>> to be checking for something specific to the native
>>>>>>>>>>>>>>>>>> compiler. The previous handling of AllocateHeap seemed
>>>>>>>>>>>>>>>>>> to be there simply because it was the only way to deal
>>>>>>>>>>>>>>>>>> with an optional frame - but now that's handled
>>>>>>>>>>>>>>>>>> generically.
>>>>>>>>>>>>>>>>> It's appearance means you effectively only have 3
>>>>>>>>>>>>>>>>> frames to identity callsites instead of 4.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and
>>>>>>>>>>>>>>>> expected 4 matches. The current bug is that one of those
>>>>>>>>>>>>>>>> (new_entry) could actually be inlined as well, resulting
>>>>>>>>>>>>>>>> in only 3 matches. So that is what the revised test
>>>>>>>>>>>>>>>> checks for: at least 3 matches. Often there will be 4
>>>>>>>>>>>>>>>> matches.
>>>>>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was
>>>>>>>>>>>>>>> referring to how many frames NMT uses to identify the
>>>>>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap()
>>>>>>>>>>>>>>> doesn't get inlined, it effectively is using 3. The test
>>>>>>>>>>>>>>> should detect when this happens so the NMT implementation
>>>>>>>>>>>>>>> can address the issue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You're right I don't understand this part as I don't know
>>>>>>>>>>>>>> how/what NMT detail is doing in this regard.
>>>>>>>>>>>>>
>>>>>>>>>>>>> An NMT callsite is simply the 4 most recent frames (afters
>>>>>>>>>>>>> some pruning) that led to the os:malloc() call. "4" is
>>>>>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is controlled
>>>>>>>>>>>>> by NMT_TrackingStackDepth. Making NMT_TrackingStackDepth
>>>>>>>>>>>>> bigger means more refinement of the callsites (thus more
>>>>>>>>>>>>> callsites), but a clearer picture of what actually led to
>>>>>>>>>>>>> the os:malloc().
>>>>>>>>>>>>>
>>>>>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have
>>>>>>>>>>>>> a() calls b() calls c() calls d() calls os:malloc(), and
>>>>>>>>>>>>> foo() and bar() both call a(), the NMT detail output will
>>>>>>>>>>>>> not distinguish between these two calls paths to
>>>>>>>>>>>>> os:mallco(), and will consider both paths to be the same
>>>>>>>>>>>>> callsite. The 4 frames in the NMT detail output would
>>>>>>>>>>>>> always be a, b, c, and d. However, bump up
>>>>>>>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as
>>>>>>>>>>>>> two separate callsites, one with foo() as the bottom frame
>>>>>>>>>>>>> and one with bar() as the bottom frame, and both with a, b,
>>>>>>>>>>>>> c, and d as the other 4 frames.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So my point is if AllocateHeap() is not inlined, then every
>>>>>>>>>>>>> allocation that is the result of doing a "new" of any
>>>>>>>>>>>>> CHeapObj subtype will have AllocateHeap() in its callsite,
>>>>>>>>>>>>> which effectively lowers they callsite refinement by 1.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hmmm but now I'm wondering why this trace:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 50 public static String stackTraceAllocateHeap =
>>>>>>>>>>>>>>>> 51 ".*AllocateHeap.*\n" +
>>>>>>>>>>>>>>>> 52 ".*ModuleEntryTable.*new_entry.*\n" +
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was
>>>>>>>>>>>>>>>> it getting inlined already when AllocateHeap was not?
>>>>>>>>>>>>>>>> Even so we still end up with 4 frames matching normally.
>>>>>>>>>>>>>>> I noticed that last night also and scratch my head over
>>>>>>>>>>>>>>> it for a while and then went to bed. The only explanation
>>>>>>>>>>>>>>> I could come up with is that allocate_new_entry() is
>>>>>>>>>>>>>>> getting inlined, and as a result (due to being a
>>>>>>>>>>>>>>> slowdebug build and doing minimal inlining)
>>>>>>>>>>>>>>> AllocateHeap() was not inlined.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If it does appear in a product build, a solution should
>>>>>>>>>>>>>>>>> be looked into to get rid of it. If the port owner
>>>>>>>>>>>>>>>>> decides it can't get rid of it (or is unwilling to),
>>>>>>>>>>>>>>>>> then an exception should be added to the test like was
>>>>>>>>>>>>>>>>> done for solaris and windows slowdebug builds.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Are we specifically trying to test the compiler's
>>>>>>>>>>>>>>>> ability to inline that function and just happen to be
>>>>>>>>>>>>>>>> using this test to verify that? Doesn't seem like a
>>>>>>>>>>>>>>>> suitable place to do this - and why do we need to do it?
>>>>>>>>>>>>>>>> The Visual Studio docs state:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "You cannot force the compiler to inline a particular
>>>>>>>>>>>>>>>> function, even with the __forceinline keyword."
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds
>>>>>>>>>>>>>>>> and could change with any update to the compiler.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline
>>>>>>>>>>>>>>>> - specifically -xinline only has an effect at –xO3 or
>>>>>>>>>>>>>>>> higher. Which likely explains why it is ignored in
>>>>>>>>>>>>>>>> slowdebug. And there are other cases where it won't
>>>>>>>>>>>>>>>> honour the ALWAYSINLINE.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we
>>>>>>>>>>>>>>>> want to ensure inlining when not optimising:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "GCC does not inline any functions when not optimizing
>>>>>>>>>>>>>>>> unless you specify the ‘always_inline’ attribute for the
>>>>>>>>>>>>>>>> function, like this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /* Prototype. */
>>>>>>>>>>>>>>>> inline void foo (const char)
>>>>>>>>>>>>>>>> __attribute__((always_inline));"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and we don't write it that way.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So if we're that concerned about release builds
>>>>>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we need
>>>>>>>>>>>>>>>> something a bit more explicit than this test to
>>>>>>>>>>>>>>>> determine that.
>>>>>>>>>>>>>>> With respect to the 3 methods/functions we don't want to
>>>>>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of
>>>>>>>>>>>>>>> assumptions on inlining. One of the things the test is
>>>>>>>>>>>>>>> doing is making sure those assumptions are correct. If
>>>>>>>>>>>>>>> incorrect, then you run into issues like I mentioned
>>>>>>>>>>>>>>> above where callsite backtraces effectively only have 3
>>>>>>>>>>>>>>> unique frames rather than 4 (actually before some bug
>>>>>>>>>>>>>>> fixes it was often just 2 unique frames). So I think it's
>>>>>>>>>>>>>>> appropriate to have a test to make sure we are not seeing
>>>>>>>>>>>>>>> any of these 3 methods/functions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can
>>>>>>>>>>>>>> clearly see what this inlining assumptions are that NMT
>>>>>>>>>>>>>> makes? Are they clearly documented?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Not that I know of. I discovered them while looking at the
>>>>>>>>>>>>> various bugs that led to NativeCallStack::NativeCallStack()
>>>>>>>>>>>>> and os::get_native_stack() (and sometimes both) being in
>>>>>>>>>>>>> the callsite. Reviewing the bugs I referred to will give
>>>>>>>>>>>>> you an idea of where to look. One good place to look at
>>>>>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case
>>>>>>>>>>>>> code there that controls how many frames to skip based on
>>>>>>>>>>>>> on the platform and whether optimized or not. Also some
>>>>>>>>>>>>> comments there to help you out. I did a lot of bug fixing
>>>>>>>>>>>>> in this method.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Looking at this code also reminds me of a reason to have
>>>>>>>>>>>>> the test continue to check for all 4 specific frames. If
>>>>>>>>>>>>> the frame skipping code skips an extra frame, then the
>>>>>>>>>>>>> callsite will be missing a needed frame at the top. The way
>>>>>>>>>>>>> the test was written it would detect this. With your
>>>>>>>>>>>>> changes it will not. It would just revert to always
>>>>>>>>>>>>> matching on 3 frames instead of 4, and the frame skipping
>>>>>>>>>>>>> bug would go unnoticed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Now the test also has made inlining assumptions beyond
>>>>>>>>>>>>>>> what NMT has made, and that is really what this bug is
>>>>>>>>>>>>>>> about. In general I think your fix is fine in the way it
>>>>>>>>>>>>>>> relaxes which frames are actually found, but as Thomas
>>>>>>>>>>>>>>> points out, it suffers from not actually looking at a
>>>>>>>>>>>>>>> single stacktrace, but just looking for the specified
>>>>>>>>>>>>>>> frames somewhere in the output (and in the order
>>>>>>>>>>>>>>> specified.) You should probably address this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Right that was an error on my part. I thought the existing
>>>>>>>>>>>>>> MULTILINE pattern matching with .* would also find
>>>>>>>>>>>>>> non-sequential lines and so I was acting similarly. I will
>>>>>>>>>>>>>> re-think this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Given the changes you made to allow more flexibly in
>>>>>>>>>>>>>>>>>>> which frames appear, I think you need to now also
>>>>>>>>>>>>>>>>>>> make sure the above 3 mentioned frames are not
>>>>>>>>>>>>>>>>>>> present, except for allowing AllocateHeap() in
>>>>>>>>>>>>>>>>>>> slowdebug builds.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458
>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is
>>>>>>>>>>>>>>>>>>>> affected by the inlining decisions of the native
>>>>>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an
>>>>>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames to
>>>>>>>>>>>>>>>>>>>> be missing based on empirical observations. So to
>>>>>>>>>>>>>>>>>>>> date we have seen two frames that may or may not be
>>>>>>>>>>>>>>>>>>>> inlined and so we allow for 2 non-matching entries.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now
>>>>>>>>>>>>>>>>>>>> it is just an optional frame.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test
>>>>>>>>>>>>>>>>>>>> as you intended?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
More information about the hotspot-runtime-dev
mailing list