[aarch64-port-dev ] Crashes while building aarch64 port using icedtea 2.5.1
Edward Nevill
edward.nevill at linaro.org
Thu Jul 24 09:41:24 UTC 2014
Hi Fridrich,
On Thu, 2014-07-24 at 07:46 +0200, Fridrich Strba wrote:
>#
># A fatal error has been detected by the Java Runtime Environment:
>#
># SIGSEGV (0xb) at pc=0x0000004001c074b4, pid=32573, tid=274904080880
OK, I have had a look at this and discussed it with our QEMU expert, and we think we may know what is going on.
Instructions: (pc=0x0000004001c074b4)
> 0x0000004001c07494: 20 f9 ff 58 42 4c 1d 12 5f 00 00 71 20 02 00 54
> 0x0000004001c074a4: 3f 0c 00 b9 fd 7b 44 a9 ff 43 01 91 e8 ec ff f0
> 0x0000004001c074b4: 1f 01 40 b9 c0 03 5f d6 e0 07 00 f9 08 00 80 92
> 0x0000004001c074c4: e8 03 00 f9 8e be fd 97 da ff ff 17 e0 07 00 f9
>
Disassembling the above hex gives us:-
0x411028 <tri>: ldr x0, 0x410f4c
0x41102c <tri+4>: and w2, w2, #0x7ffff8
0x411030 <tri+8>: cmp w2, #0x0
0x411034 <tri+12>: b.eq 0x411078
0x411038 <tri+16>: str wzr, [x1,#12]
0x41103c <tri+20>: ldp x29, x30, [sp,#64]
0x411040 <tri+24>: add sp, sp, #0x50
0x411044 <tri+28>: adrp x8, 0x1b0000
0x411048 <tri+32>: ldr wzr, [x8] ;;; <<< Fault address
0x41104c <tri+36>: ret
0x411050 <tri+40>: str x0, [sp,#8]
0x411054 <tri+44>: mov x8, #0xffffffffffffffff // #-1
0x411058 <tri+48>: str x8, [sp]
0x41105c <tri+52>: bl 0x380a94
0x411060 <tri+56>: b 0x410fc8
0x411064 <tri+60>: str x0, [sp,#8]
So it is faulting on the instruction "ldr wzr, [x8]". This is the test of the polling page. When OpenJDK wants to do a GC, it runs all threads to a safepoint, i.e. a point in the compiled code where the locations of all object references are known. It does this by read-protecting the polling page; the compiled code polls that page at well-known safepoints, such as (in the above case) the return from a method.
OpenJDK then traps the resulting SIGSEGV, and once all threads have reached a safepoint it is safe to do the GC.
However, in this case it seems that although the signal is being raised, it is not being caught correctly by OpenJDK.
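To make the mechanism concrete, here is a minimal, self-contained sketch of the polling-page idea in C. It is illustrative only and not HotSpot's actual code (the names polling_page and segv_handler are mine, and a real VM parks the thread at the safepoint rather than simply re-arming the page in the handler):

/* Sketch of the polling-page safepoint trick: the VM read-protects a
 * well-known page, compiled code does a dummy load from it at safepoints,
 * and the resulting SIGSEGV is recognised by its fault address. */
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *polling_page;
static size_t page_size;

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    char *addr = (char *)info->si_addr;
    if (addr >= (char *)polling_page &&
        addr < (char *)polling_page + page_size) {
        /* Safepoint poll trapped: a real VM would park this thread here.
         * For the sketch, just re-arm the page so the retried load succeeds.
         * (mprotect in a handler is fine on Linux for this demonstration.) */
        mprotect(polling_page, page_size, PROT_READ);
        return;
    }
    /* A genuine crash: restore default handling and re-raise. */
    signal(SIGSEGV, SIG_DFL);
    raise(SIGSEGV);
}

int main(void)
{
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    polling_page = mmap(NULL, page_size, PROT_READ,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* "Request a safepoint": read-protect the polling page... */
    mprotect(polling_page, page_size, PROT_NONE);

    /* ...and this is the poll the JIT emits (the "ldr wzr, [x8]" above). */
    *(volatile int *)polling_page;

    puts("safepoint poll trapped and handled");
    return 0;
}

The point is that the poll itself is just a dummy load; everything depends on the SIGSEGV being delivered to, and handled on, the thread that executed the load, which is exactly what appears to go wrong under user-mode emulation.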
The first question is whether you are running this in user emulation or system emulation. My QEMU expert tells me that in user emulation mode signals may not be delivered to the correct thread.
He has pointed me at the following QEMU patch, which may help but is not a fix.
Otherwise I am afraid the answer is to run in system emulation mode, or run on real HW.
All the best,
Edward Nevill
--- CUT HERE ---
From: Alexander Graf <agraf at suse.de>
Date: Tue, 10 Jul 2012 20:40:55 +0200
Subject: linux-user: Run multi-threaded code on a single core
Running multi-threaded code can easily expose some of the fundamental
breakages in QEMU's design. It's just not a well supported scenario.
So if we pin the whole process to a single host CPU, we guarantee that
we will never have concurrent memory access actually happen. We can still
get scheduled away at any time, so it's no complete guarantee, but apparently
it reduces the odds well enough to get my test cases to pass.
This gets Java 1.7 working for me again on my test box.
Signed-off-by: Alexander Graf <agraf at suse.de>
---
linux-user/syscall.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index d62e9e6..5295afb 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -4400,6 +4400,15 @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
if (nptl_flags & CLONE_SETTLS)
cpu_set_tls (new_env, newtls);
+ /* agraf: Pin ourselves to a single CPU when running multi-threaded.
+ This turned out to improve stability for me. */
+ {
+ cpu_set_t mask;
+ CPU_ZERO(&mask);
+ CPU_SET(0, &mask);
+ sched_setaffinity(0, sizeof(mask), &mask);
+ }
+
/* Grab a mutex so that thread setup appears atomic. */
pthread_mutex_lock(&clone_lock);
--- CUT HERE ---
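For reference, the affinity call the patch adds can be tried standalone. The program below is just my illustration of what the hunk does inside do_fork(), not part of the patch; outside syscall.c it needs _GNU_SOURCE and <sched.h> for the CPU_* macros and sched_setaffinity():

/* Standalone illustration of the pinning the patch adds: restrict the
 * whole process to host CPU 0 so its threads never run concurrently on
 * different host cores. Linux-specific. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);                 /* allow only host CPU 0 */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("pinned to CPU 0\n");
    return 0;
}

Pinning the whole process to one host CPU means the emulated threads are serialised by the host scheduler, which is why it reduces, but does not eliminate, the races the commit message describes.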