RFR: 8283179: SA tests fail with "ERROR: catch_mach_exception_raise: Message doesn't denote a Unix soft signal." [v2]
Chris Plummer
cjplummer at openjdk.java.net
Tue Apr 19 04:01:16 UTC 2022
> During the SA attach process on macOS, SA installs an exception handler and expects to get an EXC_SOFTWARE exception for the expected SIGSTOP signal. On aarch64, sometimes it instead gets an EXC_BAD_INSTRUCTION exception. I found if I changed the code to just ignore this, it eventually gets the EXC_SOFTWARE exception and everything is fine.
>
> The EXC_BAD_INSTRUCTION is from the following code in the JIT and is expected behavior:
>
>
> bool NativeInstruction::is_sigill_zombie_not_entrant() {
> return uint_at(0) == 0xd4bbd5a1; // dcps1 #0xdead
> }
>
> void NativeIllegalInstruction::insert(address code_pos) {
> *(juint*)code_pos = 0xd4bbd5a1; // dcps1 #0xdead
> }
>
> void NativeJump::patch_verified_entry(address entry, address verified_entry, address dest) {
> ...
> if (Assembler::reachable_from_branch_at(verified_entry, dest)) {
> ...
> } else {
> // We use an illegal instruction for marking a method as
> // not_entrant or zombie.
> NativeIllegalInstruction::insert(verified_entry);
> }
>
>
> I fixed SA to retry each time it gets EXC_BAD_INSTRUCTION. I also did the same for EXC_BAD_ACCESS. We saw this once on x64, and it looks like it is for a trap-based NPE, given that the bad address that was provided as part of the exception data is for address 0x8.
>
> Although the fix is somewhat straight forward, the context in which it is run is not. SA does a `ptrace_attach()`. This seems to result in a SIGSTOP being generated. SA provides an exception handler called `catch_mach_exception_raise()`. It gets called for any pending exception when SA calls `mach_msg(&exc_msg.header, MACH_RCV_MSG, ...)` from `wait_for_exception()`. So `catch_mach_exception_raise()` should see the SIGSTOP, return KERN_SUCCESS (which means to ignore the exception), and then `mach_msg()` returns to `wait_for_exception()`, which checks for success of having received the SIGSTOP, and then allows SA to continue with its attach process (like suspending all threads).
>
> The details of attaching and exception handling are actually much uglier than that, and I don't pretend to even understand half of it. https://www.spaceflint.com/?p=150 is an interesting read if you are looking for more details.
Chris Plummer has updated the pull request incrementally with one additional commit since the last revision:
Minor comment fix.
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/8255/files
- new: https://git.openjdk.java.net/jdk/pull/8255/files/f170e847..4d36edce
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8255&range=01
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8255&range=00-01
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/8255.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/8255/head:pull/8255
PR: https://git.openjdk.java.net/jdk/pull/8255
More information about the serviceability-dev
mailing list