RFR(S) 8162369: PPC64: Wrong ucontext used after SIGTRAP while in HTM critical section

Gustavo Romero gromero at linux.vnet.ibm.com
Tue Aug 2 20:38:36 UTC 2016


Hi,

Could the following webrev be hosted and reviewed, please?

Webrev: http://81.de.7a9f.ip4.static.sl-reverse.com/8162369/v1
CR: https://bugs.openjdk.java.net/browse/JDK-8162369


Thomas, I've tested the proposed fix under two different (unexpected)
error conditions that can happen when in an HTM transaction:
a SIGSEGV and a SIGILL.

Since reporting the pc at tbegin+4 is misleading, in both cases the correct
context for error reporting is the second (transactional) one. Then:

1) a segmentation fault in the middle of a transaction is now
reported correctly, like this:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00003fff4c2dffac, pid=31046, tid=31047

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000000

Hence when we inspect the offending location indicated by the pc value in
the hs_err log we find the right root cause of the SIGSEGV:

(gdb) disas 0x00003fff4c2dffac-4,+8
Dump of assembler code from 0x3fff4c2dffa8 to 0x3fff4c2dffb0:
   0x00003fff4c2dffa8:	li      r15,0
   0x00003fff4c2dffac:	ld      r15,0(r15)
End of assembler dump.

Tested by injecting a load from memory address 0x0, according to
this patch: https://paste.fedoraproject.org/400032/
Full hs_err log: http://paste.fedoraproject.org/400035/raw/

2) an illegal instruction in the middle of a transaction is now
reported correctly as well, like this:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00003fff7c2dd930, pid=21140, tid=21141

siginfo: si_signo: 4 (SIGILL), si_code: 1 (ILL_ILLOPC), si_addr: 0x00003fff7c2dd930

Hence when we inspect the offending location indicated by the pc value we
find the right offending instruction:

(gdb) disas 0x00003fff7c2dd930-4,+8
Dump of assembler code from 0x3fff7c2dd92c to 0x3fff7c2dd934:
   0x00003fff7c2dd92c:	cmpwi   cr6,r0,1
   0x00003fff7c2dd930:	.long 0xea2f0013
End of assembler dump.

Tested by injecting an illegal instruction, according to this
patch: https://paste.fedoraproject.org/400080/
Full hs_err log: https://paste.fedoraproject.org/400073/raw/


Martin, although the detection using is_tbegin() is equivalent to checking
the uc_link pointer, I've decided to use the detection suggested by the
Linux kernel documentation, i.e. checking the uc_link pointer plus the MSR
TS bits. As you said, it's not an issue since neither uc_link nor the MSR
register is new, so this kind of check will compile fine on both PPC64
LE and BE.

Thank you for sponsoring the change.

No regression was observed. One additional test now passes:
Passed: compiler/rtm/cli/TestUseRTMForStackLocksOptionOnSupportedConfig.java

Thank you!


Best regards,
Gustavo


