RFR: JDK-8283326: Implement SafeFetch statically
David Holmes
dholmes at openjdk.java.net
Mon Apr 11 06:27:48 UTC 2022
On Fri, 18 Mar 2022 08:52:46 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> Hi,
>
> This is a reimplementation of SafeFetch to get rid of dynamic code generation. This is done, depending on platform, via static assembly, Posix setjmp/longjmp or SEH.
>
> This is based on a proposal by @theRealAph [1] when discussing [JDK-8282475](https://bugs.openjdk.java.net/browse/JDK-8282475). I extended the proposal to do this for all platforms, not just MacOS, see discussion below.
>
> ---
>
> Why we do this:
>
> SafeFetch is special: it is a basic utility function that may be called before the VM is properly initialized, from a "foreign" thread not attached to the VM, in a broken environment... mainly this is due to its use during stack walking, error reporting, hex dumping, etc. SafeFetch has to be safe to call under any circumstances - or as safe as we can make it.
>
> Today, SafeFetch is implemented via stub routines. That causes problems:
>
> - MacOs aarch64 has mechanisms to prevent execution of arbitrary non-text memory. On that platform, invoking generated code requires some awkward gymnastics [2]. These workarounds require access to Thread::current, which renders SafeFetch non-functional if Thread::current is NULL (e.g. crash reporting in "foreign" native threads or in a corrupted VM). They also cost a bit of performance on every SafeFetch invocation. Note that more OSes may implement similar execution prevention mechanics in the future.
>
> - There is a time window during VM initialization where the stubs are not generated yet and thus SafeFetch is not available.
>
> Both points cause hs-err files to be compromised (e.g. missing call stacks) for crashes in native threads, when the VM is corrupted, or early during VM initialization. Since SafeFetch is such a basic tool, it causes more issues than that (e.g., on AIX it is used for verifying mprotect() itself). But hs-err files being compromised by secondary errors is the most annoying effect.
>
> A minor point is that generating stubs takes a tiny bit of startup time. Not a big issue, but anything we can generate at build time instead of at runtime is good.
>
> ---
>
> The patch:
>
> - Removes the SafeFetch stub routine generation for all platforms
> - For Linux and MacOS, it implements SafeFetch as cpu-specific static assembler routines. These routines live in `os_cpu/<os>_<cpu>/safefetch_<os>_<cpu>.S`.
> - On Windows, SafeFetch is implemented via SEH. This is beautifully simple and does not need handwritten assembly at all. See `os/windows/safefetch_windows.hpp`.
> - On Zero, SafeFetch uses a Posix compatible solution using `sigsetjmp(2)`. This is portable and robust, but a bit slow compared to the other methods. Note that this solution existed before, and is well tested. I just moved the code around and massaged it a bit.
> - AIX uses now the same method as Zero. AIX "downgraded" in that sense from the stub routine solution, but since SAP gave up maintenance of AIX, I don't have easy access to AIX hardware. If SafeFetch speed becomes an issue, I am sure a solution based on static assembly can be implemented on AIX too.
> - The SafeFetch gtests were greatly expanded.
>
> Tests:
>
> - The patch was tested manually by building and running gtests on:
> - MacOS (x64, aarch64)
> - Linux (aarch64, arm, ppcle, s390, x64, x86, riscv)
> - Windows (x64 and x86, but not arm, see below)
> - AIX
> - Zero (on Linux x64).
> - GHAs
> - The patch ran through SAP nightlies for several nights in a row
>
> Note:
> - I have no access to Windows arm, so I could not test it. I contacted @mo-beck to verify that the patch works there.
> - I have no access to BSD. I am not sure who maintains BSD, so I have no idea who to contact. However, I believe the MacOS solution should work for BSD too. If not, BSD can simply fall back to the Posix sigjmp solution like AIX and Zero.
>
> It would be nice to have @RealLucy, @TheRealMDoerr and @theRealAph to okay s390, ppcle and aarch64 (and possibly arm32). Each assembly file is tiny, so it should not take long. @RealFYang and @backwaterred already tested the patch on RiscV and AIX (thanks).
>
> Thanks, Thomas
>
> [1] https://github.com/openjdk/jdk/pull/7727#issuecomment-1064980598
> [2] https://github.com/openjdk/jdk/blob/4df67426ed02f18af0757897acb28b636a317a91/src/hotspot/share/runtime/safefetch.inline.hpp#L37-L41
Hi Thomas,
This looks really good! Definite approval in principle. I can't comment on the details of the .S files. A few nits and comments. Main is comment is that SAFEFETCH_METHOD_x defines don't seem to actually be needed ??
Thanks,
David
src/hotspot/os/posix/safefetch_sigjmp.cpp line 35:
> 33:
> 34: // For SafeFetch we need POSIX TLS and sigsetjmp/longjmp.
> 35: // (Note: We would prefer compiler level TLS but for some reason __thread does not
__thread cannot be safely used from a signal handler.
src/hotspot/os/posix/safefetch_sigjmp.cpp line 88:
> 86:
> 87: // Still here... All went well, adr was valid.
> 88: *result = n;
Nit: if we are paranoid shouldn't we reset TLS before we write the result?
src/hotspot/os/posix/safefetch_sigjmp.hpp line 27:
> 25:
> 26: #ifndef CPU_POSIX_SAFEFETCH_SIGJMP_HPP
> 27: #define CPU_POSIX_SAFEFETCH_SIGJMP_HPP
The include guard doesn't match the file name.
src/hotspot/os/posix/signals_posix.cpp line 603:
> 601: if (S390_ONLY(sig == SIGILL || sig == SIGFPE) NOT_S390(false)) {
> 602: pc = (address)info->si_addr;
> 603: } else if (ZERO_ONLY(true) NOT_ZERO(false)) {
So on most platforms this will become:
if (false) {
...
} else if (false) {
...
}
I would expect compilers to complain about that. ??
Update: I see this code already exists so obviously the compilers don't complain.
src/hotspot/os/posix/signals_posix.cpp line 614:
> 612: }
> 613:
> 614: #if defined(SAFEFETCH_METHOD_STATIC_ASSEMBLY) || defined(SAFEFETCH_METHOD_SIGSETJMP)
The guard seems unnecessary - all posix platforms need to handle safefetch here don't they?
src/hotspot/os/posix/vmError_posix.cpp line 79:
> 77:
> 78: #if defined(SAFEFETCH_METHOD_STATIC_ASSEMBLY) || defined(SAFEFETCH_METHOD_SIGSETJMP)
> 79: // Handle safefetch here too, to be able to use SafeFetc() inside the error handler
Typo: SafeFetc()
src/hotspot/os/windows/safefetch_windows.hpp line 46:
> 44:
> 45: inline int SafeFetch32(const int* adr, int errValue) { return SafeFetchXX<int>(adr, errValue); }
> 46: inline intptr_t SafeFetchN(const intptr_t* adr, intptr_t errValue) { return SafeFetchXX<intptr_t>(adr, errValue); }
Nit: these are hard to read on one long line, please use multi-lines.
src/hotspot/share/runtime/os.hpp line 302:
> 300: // platforms, usable as a compile-time constant.
> 301: static const size_t max_page_size_crossplatform = 64 * K; // ppc has 64 K pages.
> 302:
What is this for??
src/hotspot/share/runtime/safefetch.hpp line 32:
> 30: // Windows uses Structured Exception Handling
> 31: #include "safefetch_windows.hpp"
> 32: #define SAFEFETCH_METHOD_SEH
This doesn't seem to be needed.
src/hotspot/share/runtime/safefetch.hpp line 42:
> 40: // All other platforms use static assembly
> 41: #include "safefetch_static.hpp"
> 42: #define SAFEFETCH_METHOD_STATIC_ASSEMBLY
It seems to me that we don't actually need these SAFEFETCH_METHOD_x defines.
-------------
Changes requested by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7865
More information about the hotspot-dev
mailing list