RFR: JDK-8261520: JDK-8261302 breaks runtime/NMT/CheckForProperDetailStackTrace.java [v3]
Coleen Phillimore
coleenp at openjdk.java.net
Thu Feb 25 23:53:42 UTC 2021
On Wed, 24 Feb 2021 19:56:04 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> Since JDK-8261302, the test runtime/NMT/CheckForProperDetailStackTrace.java fails with
>> java.lang.RuntimeException: 'NativeCallStack::NativeCallStack' found in stdout
>>
>> --
>>
>> `NativeCallStack` contains a hash code. Before JDK-8261302, that hash code was calculated lazily in a non-inline hashcode getter. With JDK-8261302, the hash code calculation was moved into the `NativeCallStack` constructor and the getter was made inline.
>>
>> The `NativeCallStack` constructor fills itself via `os::get_native_stack()`. Before JDK-8261302, that call has been the last call in the constructor and hence had been sometimes optimized into a tail call. Whether or not its a tail call matters since it affects the number of stack frames the stack walker has to skip. Therefore, the constructor contains coding to predict tail-call-ness:
>>
>> #if (defined(_NMT_NOINLINE_) || defined(_WINDOWS) || !defined(_LP64) || defined(PPC64))
>> // Not a tail call.
>> toSkip++;
>> #if (defined(_NMT_NOINLINE_) && defined(BSD) && defined(_LP64))
>> // Mac OS X slowdebug builds have this odd behavior where NativeCallStack::NativeCallStack
>> // appears as two frames, so we need to skip an extra frame.
>> toSkip++;
>> #endif // Special-case for BSD.
>> #endif // Not a tail call.
>>
>> This prediction was now off since the hash code calculation happened at the end of the callstack. This causes the test error, since on some platforms (eg Linux x64) we now think we have a tail call when we don't, which means we do not skip enough frames, and the NMT output contains call frames like "NativeCallStack::NativeCallStack()", which trips the test.
>>
>> -----------
>>
>> Fix:
>>
>> This fix moves the hash code calculation completely out of NativeCallStack. There is no reason why NativeCallStack should have a hash code. It mainly exists as a convenience to place it in a hash map. The patch moves the hash code calculation up into MallocSiteTableEntry.
>>
>> This has the advantage of only having to pay for a hash code when you need it - in theory, one may use NativeCallStack in places other than NMT, where it is unnecessary.
>>
>> I considered other options:
>> - modify `os::get_native_stack()` to also calculate a hash in addition to capturing the stack, and return it in a caller provided variable. That would have left this call to be the tail call. However, it seemed less clean - we have two implementations of this function, as well as other, non-capturing, NativeCallStack constructors, which would have to be modified. It also would have made `os::get_native_stack()` less general purpose.
>> - Leave it as it is and just always skip frames: Seemed attractive, but I did not want to touch the tailcode-prediction-code and play whack-the-mole with platform specific test errors.
>>
>> ---------------
>>
>> Tests: GA, manual test, nightlies at SAP
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>
> - Merge
> - unsigned->unsigned int
> - Initial
Yes this makes sense. I use NativeCallStack sometimes for debugging things and don't need the hash code, so this is good.
-------------
Marked as reviewed by coleenp (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/2672
More information about the hotspot-dev
mailing list