RFR: 8339285: Test fails with assert(depth < max_critical_stack_depth) failed: can't have more than 10 critical frames [v2]

Maurizio Cimadamore mcimadamore at openjdk.org
Thu Sep 5 11:17:34 UTC 2024


> Scoped methods are critical methods in the FFM API where memory is accessed in a potentially unsafe way. When closing shared arenas, we look at threads in the middle of a scoped operation involving that arena, and if we find one, we make it fail (by installing an async handshake on that thread).
> 
> To find whether a thread is in a scoped method or not, we need a stack walk. For performance reasons, it is preferrable to have the stack walk to be bounded in size.
> 
> A test started picking up a JVM assertion where the stack of a scoped method (namely `ScopedMemoryAccess::isLoaded`) is too big. This is caused by the scoped method stack walk finding the thread using the scoped method in the middle of some JNI lookup (which is required as `isLoaded` eventually ends up in a native method). This condition seems to have been made easier by the fact that these [changes](https://git.openjdk.org/jdk/pull/19213).
> 
> This PR reverts the stack trace associated with JNI lookup to what it was before, by eliminating the extra frame with a bit of refactoring/cleanup. But this is not enough: the stress test introduced in this PR still fails, even when the stack associated with `ClassLoader::findNative` is restored.
> 
> To address this problem in full, I have resorted to `registerNatives` - that is, the native `isLoaded0`, `load0`, `unload0` and `force0` are pre-registered, when the static initializer of `MappedMemoryUtils` is ran. This means that we no longer need to run a JNI lookup in the middle of a scoped method call. This brings the stack back under control, and passes the stress test.
> 
> Of course there's more to do in this area - we should have a more direct test to check the stack associated with scoped methods (for instance, vector load/store operations are also potential suspects), in order to catch "suspicious refactoring" earlier in the process. For this reason I also filed a follow up i[ssue](https://bugs.openjdk.org/browse/JDK-8339551).

Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:

  Address review comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/20854/files
  - new: https://git.openjdk.org/jdk/pull/20854/files/5a4cd150..f02b51c1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20854&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20854&range=00-01

  Stats: 45 lines in 3 files changed: 0 ins; 12 del; 33 mod
  Patch: https://git.openjdk.org/jdk/pull/20854.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20854/head:pull/20854

PR: https://git.openjdk.org/jdk/pull/20854


More information about the security-dev mailing list