RFR: JDK-8283326: Implement SafeFetch statically

Thomas Stuefe stuefe at openjdk.java.net
Sun Apr 10 08:32:01 UTC 2022


Hi,

This is a reimplementation of SafeFetch to get rid of dynamic code generation. This is done, depending on platform, via static assembly, Posix setjmp/longjmp or SEH.

This is based on a proposal by @theRealAph [1] when discussing [JDK-8282475](https://bugs.openjdk.java.net/browse/JDK-8282475). I extended the proposal to do this for all platforms, not just MacOS, see discussion below.

---

Why we do this:

SafeFetch is special: it is a basic utility function that may be called before the VM is properly initialized, from a "foreign" thread not attached to the VM, in a broken environment... mainly this is due to its use during stack walking, error reporting, hex dumping, etc. SafeFetch has to be safe to call under any circumstances - or as safe as we can make it.

Today, SafeFetch is implemented via stub routines. That causes problems:

- MacOs aarch64 has mechanisms to prevent execution of arbitrary non-text memory. On that platform, invoking generated code requires some awkward gymnastics [2]. These workarounds require access to Thread::current, which renders SafeFetch non-functional if Thread::current is NULL (e.g. crash reporting in "foreign" native threads or in a corrupted VM). They also cost a bit of performance on every SafeFetch invocation. Note that more OSes may implement similar execution prevention mechanics in the future.

- There is a time window during VM initialization where the stubs are not generated yet and thus SafeFetch is not available.

Both points cause hs-err files to be compromised (e.g. missing call stacks) for crashes in native threads, when the VM is corrupted, or early during VM initialization. Since SafeFetch is such a basic tool, it causes more issues than that (e.g., on AIX it is used for verifying mprotect() itself). But hs-err files being compromised by secondary errors is the most annoying effect.

A minor point is that generating stubs takes a tiny bit of startup time. Not a big issue, but anything we can generate at build time instead of at runtime is good.

---

The patch:

- Removes the SafeFetch stub routine generation for all platforms
- For Linux and MacOS, it implements SafeFetch as cpu-specific static assembler routines. These routines live in `os_cpu/<os>_<cpu>/safefetch_<os>_<cpu>.S`.
- On Windows, SafeFetch is implemented via SEH. This is beautifully simple and does not need handwritten assembly at all. See `os/windows/safefetch_windows.hpp`.
- On Zero, SafeFetch uses a Posix compatible solution using `sigsetjmp(2)`. This is portable and robust, but a bit slow compared to the other methods. Note that this solution existed before, and is well tested. I just moved the code around and massaged it a bit.
- AIX uses now the same method as Zero. AIX "downgraded" in that sense from the stub routine solution, but since SAP gave up maintenance of AIX, I don't have easy access to AIX hardware. If SafeFetch speed becomes an issue, I am sure a solution based on static assembly can be implemented on AIX too.
- The SafeFetch gtests were greatly expanded.

Tests:

- The patch was tested manually by building and running gtests on:
  - MacOS (x64, aarch64)
  - Linux (aarch64, arm, ppcle, s390, x64, x86, riscv)
  - Windows (x64 and x86, but not arm, see below)
  - AIX
  - Zero (on Linux x64).
- GHAs
- The patch ran through SAP nightlies for several nights in a row

Note:
- I have no access to Windows arm, so I could not test it. I contacted @mo-beck  to verify that the patch works there.
- I have no access to BSD. I am not sure who maintains BSD, so I have no idea who to contact. However, I believe the MacOS solution should work for BSD too. If not, BSD can simply fall back to the Posix sigjmp solution like AIX and Zero.

It would be nice to have @RealLucy, @TheRealMDoerr and @theRealAph to okay s390, ppcle and aarch64 (and possibly arm32). Each assembly file is tiny, so it should not take long. @RealFYang and @backwaterred already tested the patch on RiscV and AIX (thanks).

Thanks, Thomas

[1] https://github.com/openjdk/jdk/pull/7727#issuecomment-1064980598
[2] https://github.com/openjdk/jdk/blob/4df67426ed02f18af0757897acb28b636a317a91/src/hotspot/share/runtime/safefetch.inline.hpp#L37-L41

-------------

Commit messages:
 - wip
 - static safefetch

Changes: https://git.openjdk.java.net/jdk/pull/7865/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7865&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8283326
  Stats: 1546 lines in 38 files changed: 955 ins; 550 del; 41 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7865.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7865/head:pull/7865

PR: https://git.openjdk.java.net/jdk/pull/7865


More information about the hotspot-dev mailing list