RFR: JDK-8283326: Implement SafeFetch statically

David Holmes dholmes at openjdk.java.net
Mon Apr 11 06:27:48 UTC 2022


On Fri, 18 Mar 2022 08:52:46 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> Hi,
> 
> This is a reimplementation of SafeFetch to get rid of dynamic code generation. This is done, depending on platform, via static assembly, Posix setjmp/longjmp or SEH.
> 
> This is based on a proposal by @theRealAph [1] when discussing [JDK-8282475](https://bugs.openjdk.java.net/browse/JDK-8282475). I extended the proposal to do this for all platforms, not just MacOS, see discussion below.
> 
> ---
> 
> Why we do this:
> 
> SafeFetch is special: it is a basic utility function that may be called before the VM is properly initialized, from a "foreign" thread not attached to the VM, in a broken environment... mainly this is due to its use during stack walking, error reporting, hex dumping, etc. SafeFetch has to be safe to call under any circumstances - or as safe as we can make it.
> 
> Today, SafeFetch is implemented via stub routines. That causes problems:
> 
> - MacOs aarch64 has mechanisms to prevent execution of arbitrary non-text memory. On that platform, invoking generated code requires some awkward gymnastics [2]. These workarounds require access to Thread::current, which renders SafeFetch non-functional if Thread::current is NULL (e.g. crash reporting in "foreign" native threads or in a corrupted VM). They also cost a bit of performance on every SafeFetch invocation. Note that more OSes may implement similar execution prevention mechanics in the future.
> 
> - There is a time window during VM initialization where the stubs are not generated yet and thus SafeFetch is not available.
> 
> Both points cause hs-err files to be compromised (e.g. missing call stacks) for crashes in native threads, when the VM is corrupted, or early during VM initialization. Since SafeFetch is such a basic tool, it causes more issues than that (e.g., on AIX it is used for verifying mprotect() itself). But hs-err files being compromised by secondary errors is the most annoying effect.
> 
> A minor point is that generating stubs takes a tiny bit of startup time. Not a big issue, but anything we can generate at build time instead of at runtime is good.
> 
> ---
> 
> The patch:
> 
> - Removes the SafeFetch stub routine generation for all platforms
> - For Linux and MacOS, it implements SafeFetch as cpu-specific static assembler routines. These routines live in `os_cpu/<os>_<cpu>/safefetch_<os>_<cpu>.S`.
> - On Windows, SafeFetch is implemented via SEH. This is beautifully simple and does not need handwritten assembly at all. See `os/windows/safefetch_windows.hpp`.
> - On Zero, SafeFetch uses a Posix compatible solution using `sigsetjmp(2)`. This is portable and robust, but a bit slow compared to the other methods. Note that this solution existed before, and is well tested. I just moved the code around and massaged it a bit.
> - AIX uses now the same method as Zero. AIX "downgraded" in that sense from the stub routine solution, but since SAP gave up maintenance of AIX, I don't have easy access to AIX hardware. If SafeFetch speed becomes an issue, I am sure a solution based on static assembly can be implemented on AIX too.
> - The SafeFetch gtests were greatly expanded.
> 
> Tests:
> 
> - The patch was tested manually by building and running gtests on:
>   - MacOS (x64, aarch64)
>   - Linux (aarch64, arm, ppcle, s390, x64, x86, riscv)
>   - Windows (x64 and x86, but not arm, see below)
>   - AIX
>   - Zero (on Linux x64).
> - GHAs
> - The patch ran through SAP nightlies for several nights in a row
> 
> Note:
> - I have no access to Windows arm, so I could not test it. I contacted @mo-beck  to verify that the patch works there.
> - I have no access to BSD. I am not sure who maintains BSD, so I have no idea who to contact. However, I believe the MacOS solution should work for BSD too. If not, BSD can simply fall back to the Posix sigjmp solution like AIX and Zero.
> 
> It would be nice to have @RealLucy, @TheRealMDoerr and @theRealAph to okay s390, ppcle and aarch64 (and possibly arm32). Each assembly file is tiny, so it should not take long. @RealFYang and @backwaterred already tested the patch on RiscV and AIX (thanks).
> 
> Thanks, Thomas
> 
> [1] https://github.com/openjdk/jdk/pull/7727#issuecomment-1064980598
> [2] https://github.com/openjdk/jdk/blob/4df67426ed02f18af0757897acb28b636a317a91/src/hotspot/share/runtime/safefetch.inline.hpp#L37-L41

Hi Thomas,

This looks really good! Definite approval in principle. I can't comment on the details of the .S files. A few nits and comments. Main is comment is that SAFEFETCH_METHOD_x defines don't seem to actually be needed ??

Thanks,
David

src/hotspot/os/posix/safefetch_sigjmp.cpp line 35:

> 33: 
> 34: // For SafeFetch we need POSIX TLS and sigsetjmp/longjmp.
> 35: // (Note: We would prefer compiler level TLS but for some reason __thread does not

__thread cannot be safely used from a signal handler.

src/hotspot/os/posix/safefetch_sigjmp.cpp line 88:

> 86: 
> 87:   // Still here... All went well, adr was valid.
> 88:   *result = n;

Nit: if we are paranoid shouldn't we reset TLS before we write the result?

src/hotspot/os/posix/safefetch_sigjmp.hpp line 27:

> 25: 
> 26: #ifndef CPU_POSIX_SAFEFETCH_SIGJMP_HPP
> 27: #define CPU_POSIX_SAFEFETCH_SIGJMP_HPP

The include guard doesn't match the file name.

src/hotspot/os/posix/signals_posix.cpp line 603:

> 601:     if (S390_ONLY(sig == SIGILL || sig == SIGFPE) NOT_S390(false)) {
> 602:       pc = (address)info->si_addr;
> 603:     } else if (ZERO_ONLY(true) NOT_ZERO(false)) {

So on most platforms this will become:


if (false) {
...
} else if (false) {
...
}


I would expect compilers to complain about that. ??

Update: I see this code already exists so obviously the compilers don't complain.

src/hotspot/os/posix/signals_posix.cpp line 614:

> 612:   }
> 613: 
> 614: #if defined(SAFEFETCH_METHOD_STATIC_ASSEMBLY) || defined(SAFEFETCH_METHOD_SIGSETJMP)

The guard seems unnecessary - all posix platforms need to handle safefetch here don't they?

src/hotspot/os/posix/vmError_posix.cpp line 79:

> 77: 
> 78: #if defined(SAFEFETCH_METHOD_STATIC_ASSEMBLY) || defined(SAFEFETCH_METHOD_SIGSETJMP)
> 79:   // Handle safefetch here too, to be able to use SafeFetc() inside the error handler

Typo: SafeFetc()

src/hotspot/os/windows/safefetch_windows.hpp line 46:

> 44: 
> 45: inline int SafeFetch32(const int* adr, int errValue)                { return SafeFetchXX<int>(adr, errValue); }
> 46: inline intptr_t SafeFetchN(const intptr_t* adr, intptr_t errValue)  { return SafeFetchXX<intptr_t>(adr, errValue); }

Nit: these are hard to read on one long line, please use multi-lines.

src/hotspot/share/runtime/os.hpp line 302:

> 300:   // platforms, usable as a compile-time constant.
> 301:   static const size_t max_page_size_crossplatform = 64 * K; // ppc has 64 K pages.
> 302: 

What is this for??

src/hotspot/share/runtime/safefetch.hpp line 32:

> 30:   // Windows uses Structured Exception Handling
> 31:   #include "safefetch_windows.hpp"
> 32:   #define SAFEFETCH_METHOD_SEH

This doesn't seem to be needed.

src/hotspot/share/runtime/safefetch.hpp line 42:

> 40:   // All other platforms use static assembly
> 41:   #include "safefetch_static.hpp"
> 42:   #define SAFEFETCH_METHOD_STATIC_ASSEMBLY

It seems to me that we don't actually need these SAFEFETCH_METHOD_x defines.

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7865


More information about the hotspot-dev mailing list