[aarch64-port-dev ] guarantee failures with large code cache sizes on jtreg test java/lang/invoke/LFCaching10/LFMultiThreadCachingTest.java
Edward Nevill
edward.nevill at gmail.com
Thu Dec 24 15:06:56 UTC 2015
Hi,
I am seeing intermittent guarantee failures on jdk jtreg test java/lang/invoke/LFCaching10/LFMultiThreadCachingTest.jtr.
The failure is
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (assembler_aarch64.hpp:218), pid=43991, tid=44418
# guarantee(chk == -1 || chk == 0) failed: Field too big for insn
#
The test is being run with -XX:ReservedCodeCacheSize=256m. The full command line is
/home/ed/images/jdk9-orig/bin/java -XX:-TieredCompilation -jar lib/jtreg.jar -vmoption:-XX:ReservedCodeCacheSize=256m -retain -nr -conc:8 -timeout:99 -othervm -jdk:/home/ed/images/jdk9-orig -v1 -a -ignore:quiet /home/ed/new_jdk9/dev/jdk_test/test/java/lang/invoke
I have trapped the failure in gdb. It occurs in pd_patch_instruction_size when trying to patch a BL instruction.
#8 0x000003ff7a7a360c in MacroAssembler::pd_patch_instruction_size (
branch=0x3ff691cf2d8 "\223\323\343\227\277:\003\325\213c\313\071\313\b",
target=0x3ff60a108a4 "\375{\277\251\375\003")
at /home/ed/new_jdk9/hs-comp/hotspot/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp:74
74 Instruction_aarch64::spatch(branch, 25, 0, offset);
(gdb) p offset
$17 = -35584653
(gdb)
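As a cross-check of the gdb value, the following minimal sketch recomputes the offset from the branch and target addresses in frame #8 and applies the same test that appears in the guarantee message. The helper names are made up for illustration and are not HotSpot functions; the 26-bit width is taken from the spatch(branch, 25, 0, offset) call shown above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: BL encodes a word offset, i.e. the byte distance
// between branch and target divided by 4.
// Assumes two's-complement conversion and arithmetic right shift of
// negative values, which is what mainstream compilers provide.
inline int64_t bl_word_offset(uint64_t branch, uint64_t target) {
  int64_t byte_distance = static_cast<int64_t>(target - branch);
  assert(byte_distance % 4 == 0);  // AArch64 instructions are 4-byte aligned
  return byte_distance >> 2;
}

// Hypothetical helper with the same shape as the failing guarantee:
// shift away everything but the sign bit of an nbits-wide field. A value
// that fits leaves only 0 (non-negative) or -1 (negative).
inline bool fits_signed_field(int64_t value, int nbits) {
  int64_t chk = value >> (nbits - 1);
  return chk == -1 || chk == 0;
}
```

With the addresses from frame #8 the byte distance is 142338612 (about 135.7MB), which gives the word offset of -35584653 printed by gdb. That is below -33554432, the minimum of a signed 26-bit field, so the check fails.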
Here is the backtrace from gdb
#8 0x000003ff7a7a360c in MacroAssembler::pd_patch_instruction_size (
branch=0x3ff691cf2d8 "\223\323\343\227\277:\003\325\213c\313\071\313\b",
target=0x3ff60a108a4 "\375{\277\251\375\003")
at /home/ed/new_jdk9/hs-comp/hotspot/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp:74
#9 0x000003ff7a33451c in MacroAssembler::pd_patch_instruction (
branch=0x3ff691cf2d8 "\223\323\343\227\277:\003\325\213c\313\071\313\b",
target=0x3ff60a108a4 "\375{\277\251\375\003")
at /home/ed/new_jdk9/hs-comp/hotspot/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp:565
#10 0x000003ff7a8ca9bc in Relocation::pd_set_call_destination (this=0x3fdc5fab3e8,
x=0x3ff60a108a4 "\375{\277\251\375\003")
at /home/ed/new_jdk9/hs-comp/hotspot/src/cpu/aarch64/vm/relocInfo_aarch64.cpp:85
#11 0x000003ff7a8c8650 in CallRelocation::fix_relocation_after_move (
this=0x3fdc5fab3e8, src=0x3fdc5fae0b0, dest=0x3fdc5fab490)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/code/relocInfo.cpp:549
#12 0x000003ff7a4736bc in CodeBuffer::relocate_code_to (this=0x3fdc5fae0b0,
dest=0x3fdc5fab490)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/asm/codeBuffer.cpp:812
#13 0x000003ff7a473be8 in CodeBuffer::expand (this=0x3fdc5fae0b0,
which_cs=0x3fdc5fae158, amount=64)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/asm/codeBuffer.cpp:942
#14 0x000003ff7a334404 in CodeSection::maybe_expand_to_ensure_remaining (
this=0x3fdc5fae158, amount=64)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/asm/codeBuffer.hpp:661
#15 0x000003ff7a33379c in AbstractAssembler::start_a_stub (this=0x3fdc5fab838,
required_space=64)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/asm/assembler.cpp:65
#16 0x000003ff7a7a54a0 in MacroAssembler::emit_trampoline_stub (this=0x3fdc5fab838,
insts_call_instruction_offset=976, dest=0x3ff609cf080 "\375{\277\251H\001")
at /home/ed/new_jdk9/hs-comp/hotspot/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp:704
#17 0x000003ff7a7a53c0 in MacroAssembler::trampoline_call (this=0x3fdc5fab838,
entry=..., cbuf=0x3fdc5fae0b0)
at /home/ed/new_jdk9/hs-comp/hotspot/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp:673
#18 0x000003ff7a2a1bd0 in CallStaticJavaDirectNode::emit (this=0x3fdac0024b0,
cbuf=..., ra_=0x3fdc5fabd30)
at /home/ed/new_jdk9/hs-comp/hotspot/src/cpu/aarch64/vm/aarch64.ad:4673
#19 0x000003ff7a85edc0 in Compile::fill_buffer (this=0x3fdc5fad870,
cb=0x3fdc5fae0b0, blk_starts=0x3fd40042520)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/opto/output.cpp:1380
#20 0x000003ff7a85b960 in Compile::Output (this=0x3fdc5fad870)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/opto/output.cpp:154
#21 0x000003ff7a4a6c88 in Compile::Code_Gen (this=0x3fdc5fad870)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/opto/compile.cpp:2407
#22 0x000003ff7a4a1fa8 in Compile::Compile (this=0x3fdc5fad870,
ci_env=0x3fdc5fae390, compiler=0x3ff746bc7d0, target=0x3fd981ff670, osr_bci=-1,
subsume_loads=true, do_escape_analysis=true, eliminate_boxing=true,
directive=0x3ff74680570)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/opto/compile.cpp:899
#23 0x000003ff7a3e7684 in C2Compiler::compile_method (this=0x3ff746bc7d0,
env=0x3fdc5fae390, target=0x3fd981ff670, entry_bci=-1, directive=0x3ff74680570)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/opto/c2compiler.cpp:106
#24 0x000003ff7a4b2ea8 in CompileBroker::invoke_compiler_on_method (
task=0x3fd640bbfd0)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/compiler/compileBroker.cpp:1814
#25 0x000003ff7a4b25d4 in CompileBroker::compiler_thread_loop ()
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/compiler/compileBroker.cpp:1564
#26 0x000003ff7a96a9b4 in compiler_thread_entry (thread=0x3ff746bf000,
__the_thread__=0x3ff746bf000)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/runtime/thread.cpp:3238
#27 0x000003ff7a9678f4 in JavaThread::thread_main_inner (this=0x3ff746bf000)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/runtime/thread.cpp:1723
#28 0x000003ff7a967830 in JavaThread::run (this=0x3ff746bf000)
at /home/ed/new_jdk9/hs-comp/hotspot/src/share/vm/runtime/thread.cpp:1703
#29 0x000003ff7a849614 in java_start (thread=0x3ff746bf000)
at /home/ed/new_jdk9/hs-comp/hotspot/src/os/linux/vm/os_linux.cpp:683
#30 0x000003ff7af07e2c in start_thread (arg=0x3fdc5faf1f0) at pthread_create.c:314
#31 0x000003ff7ae18c40 in clone ()
at ../ports/sysdeps/unix/sysv/linux/aarch64/nptl/../clone.S:96
Looking at frame #11 above we see
(gdb) list
544 // On some platforms, the reference is absolute (not self-relative).
545 // The enhanced use of pd_call_destination sorts this all out.
546 address orig_addr = old_addr_for(addr(), src, dest);
547 address callee = pd_call_destination(orig_addr);
548 // Reassert the callee address, this time in the new copy of the code.
549 pd_set_call_destination(callee);
550 }
551
552
553 //// pack/unpack methods
(gdb) p/x addr()
$18 = 0x3ff691cf2d8
(gdb) p/x orig_addr
$19 = 0x3ff6111ba58
(gdb) p/x callee
$20 = 0x3ff60a108a4
Looking at a section of code at both orig_addr and addr() and at the destination of the BL in each case we have
(gdb) x/10i orig_addr-20
0x3ff6111ba44: add x10, x14, w15, sxtw
0x3ff6111ba48: sxtw x2, w17
0x3ff6111ba4c: add x0, x10, #0x10
0x3ff6111ba50: cmp w17, w13
0x3ff6111ba54: b.lt 0x3ff6111bb24
0x3ff6111ba58: bl 0x3ff60a108a4
0x3ff6111ba5c: dmb ishst
0x3ff6111ba60: ldrsb w11, [x28,#728]
0x3ff6111ba64: cbnz w11, 0x3ff6111bb7c
0x3ff6111ba68: mov x10, x19
(gdb) x/10i 0x3ff60a108a4
0x3ff60a108a4: stp x29, x30, [sp,#-16]!
0x3ff60a108a8: mov x29, sp
0x3ff60a108ac: cmp x1, x0
0x3ff60a108b0: b.ls 0x3ff60a10808
0x3ff60a108b4: add x0, x0, x2, uxtx
0x3ff60a108b8: add x1, x1, x2, uxtx
0x3ff60a108bc: cmp x2, #0x10
0x3ff60a108c0: b.cc 0x3ff60a10914
0x3ff60a108c4: and x9, x0, #0xf
0x3ff60a108c8: cbz x9, 0x3ff60a1090c
(gdb) x/10i addr()-20
0x3ff691cf2c4: add x10, x14, w15, sxtw
0x3ff691cf2c8: sxtw x2, w17
0x3ff691cf2cc: add x0, x10, #0x10
0x3ff691cf2d0: cmp w17, w13
0x3ff691cf2d4: b.lt 0x3ff691cf3a4
0x3ff691cf2d8: bl 0x3ff68ac4124
0x3ff691cf2dc: dmb ishst
0x3ff691cf2e0: ldrsb w11, [x28,#728]
0x3ff691cf2e4: cbnz w11, 0x3ff691cf3fc
0x3ff691cf2e8: mov x10, x19
(gdb) x/10i 0x3ff68ac4124
0x3ff68ac4124: .inst 0x00000000 ; undefined
0x3ff68ac4128: .inst 0x00000000 ; undefined
0x3ff68ac412c: .inst 0x00000000 ; undefined
0x3ff68ac4130: .inst 0x00000000 ; undefined
0x3ff68ac4134: .inst 0x00000000 ; undefined
0x3ff68ac4138: .inst 0x00000000 ; undefined
0x3ff68ac413c: .inst 0x00000000 ; undefined
0x3ff68ac4140: .inst 0x00000000 ; undefined
0x3ff68ac4144: .inst 0x00000000 ; undefined
0x3ff68ac4148: .inst 0x00000000 ; undefined
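The stale destination 0x3ff68ac4124 can be reproduced by decoding the BL word itself. The sketch below assumes the instruction word is 0x97e3d393, read little-endian from the first four bytes (\223\323\343\227) of the branch string printed in frame #8. The function names are hypothetical and only model the architectural BL encoding (opcode in bits 31..26, signed imm26 word offset in bits 25..0).

```cpp
#include <cassert>
#include <cstdint>

// True if the top six bits match the BL opcode (0b100101).
inline bool is_bl(uint32_t insn) {
  return (insn >> 26) == 0x25;
}

// Signed byte displacement encoded in a BL instruction word.
inline int64_t bl_byte_displacement(uint32_t insn) {
  assert(is_bl(insn));
  int64_t imm26 = insn & 0x03FFFFFF;
  if (imm26 & 0x02000000) {
    imm26 -= 0x04000000;  // sign-extend from 26 bits
  }
  return imm26 * 4;       // the field counts 4-byte words
}

// Where a BL located at pc transfers control to.
inline uint64_t bl_target(uint64_t pc, uint32_t insn) {
  return pc + static_cast<uint64_t>(bl_byte_displacement(insn));
}
```

Because the displacement is pc-relative, the same word decodes to 0x3ff60a108a4 at orig_addr and to 0x3ff68ac4124 at addr(). So the BL in the new copy is simply the unpatched original, and its apparent destination has no meaning.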
It appears that we have a BL to another method, which is therefore outside the current codeblob. The codeblob is now being moved, and at its new address the call is out of range of a direct BL and requires a trampoline instead.
However the BL is not recognised as requiring a trampoline. Looking at frame #10
(gdb) down
#10 0x000003ff7a8ca9bc in Relocation::pd_set_call_destination (this=0x3fdc5fab3e8,
x=0x3ff60a108a4 "\375{\277\251\375\003")
at /home/ed/new_jdk9/hs-comp/hotspot/src/cpu/aarch64/vm/relocInfo_aarch64.cpp:85
85 MacroAssembler::pd_patch_instruction(addr(), x);
(gdb) list 81
76 assert(is_call(), "should be a call here");
77 if (NativeCall::is_call_at(addr())) {
78 address trampoline = nativeCall_at(addr())->get_trampoline();
79 if (trampoline) {
80 nativeCall_at(addr())->set_destination_mt_safe(x, /* assert_lock */false);
81 return;
82 }
83 }
84 assert(addr() != x, "call instruction in an infinite loop");
85 MacroAssembler::pd_patch_instruction(addr(), x);
'trampoline' is NULL here, even though is_call_at() returns true
(gdb) p NativeCall::is_call_at(addr())
$21 = true
(gdb) p nativeCall_at(addr())->get_trampoline()
$22 = (u_char *) 0x0
(gdb)
Looking at the source for get_trampoline()
CodeBlob *code = CodeCache::find_blob(call_addr);
assert(code != NULL, "Could not find the containing code blob");
address bl_destination
= MacroAssembler::pd_call_destination(call_addr);
if (code->content_contains(bl_destination) &&
is_NativeCallTrampolineStub_at(bl_destination))
return bl_destination;
This only tests for a trampoline if the BL destination is within the current code blob. As seen previously with the problems with adrp, it must not test for a trampoline outside the current code blob, because the BL could be pointing somewhere completely random. In this case it happens to be pointing to a block of .inst 0x00000000 words.
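To make the containment test concrete, here is a small stand-alone model of the lookup. BlobModel and get_trampoline_model are hypothetical stand-ins, not HotSpot types, and the blob bounds and stub address used with them are invented for illustration. The real is_NativeCallTrampolineStub_at inspects the instructions at the destination, which is modelled here as a set of known stub addresses.

```cpp
#include <cstdint>
#include <set>

// Hypothetical stand-in for a code blob: a content range plus the
// addresses of the trampoline stubs that live inside it.
struct BlobModel {
  uint64_t begin;             // first content address
  uint64_t end;               // one past the last content address
  std::set<uint64_t> stubs;   // trampoline stub addresses in this blob

  bool content_contains(uint64_t addr) const {
    return addr >= begin && addr < end;
  }
};

// Returns the trampoline address, or 0 when the BL destination is outside
// the blob or is not a stub. Destinations outside the blob are never
// inspected, matching the excerpt above.
inline uint64_t get_trampoline_model(const BlobModel& blob,
                                     uint64_t bl_destination) {
  if (blob.content_contains(bl_destination) &&
      blob.stubs.count(bl_destination) != 0) {
    return bl_destination;
  }
  return 0;
}
```

For a blob assumed to span 0x3ff691cf000 to 0x3ff691d0000, the stale destination 0x3ff68ac4124 lies outside it, so the model returns 0, which corresponds to the NULL that gdb printed for get_trampoline().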
The problem arises from the implementation of MacroAssembler::trampoline_call, where it does
if (Assembler::reachable_from_branch_at(pc(), entry.target())) {
bl(entry.target());
} else {
bl(pc());
}
Here, if the call reaches, it plants a direct BL. However, when the call subsequently fails to reach because the codeblob has been moved out of range of a BL, there is no way of finding the trampoline, because get_trampoline() will not look outside the current code blob.
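The emission-time decision can be modelled as below. This is only a sketch: within_bl_range assumes a reach of +/-128MB, which follows from the signed 26-bit word offset of BL, and is not claimed to match the exact implementation of Assembler::reachable_from_branch_at. Both function names are hypothetical.

```cpp
#include <cstdint>

// Assumed reach of a BL: signed 26-bit word offset, i.e. +/-128MB in bytes.
const int64_t kBlRangeBytes = 128LL * 1024 * 1024;

// Hypothetical model of the reachability test.
inline bool within_bl_range(uint64_t branch, uint64_t target) {
  int64_t distance = static_cast<int64_t>(target - branch);
  return distance >= -kBlRangeBytes && distance < kBlRangeBytes;
}

// Hypothetical model of the quoted if/else: a direct BL to the entry when
// it reaches, otherwise a BL to the call's own pc as a placeholder.
inline uint64_t planted_bl_target(uint64_t pc, uint64_t entry) {
  return within_bl_range(pc, entry) ? entry : pc;
}
```

With the addresses from this crash, a call emitted at 0x3ff6111ba58 gets a direct BL to 0x3ff60a108a4, while the same call at 0x3ff691cf2d8 is out of range. The choice made at the first address is therefore no longer valid after the move.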
The only possibility might be to always write it as bl(pc()) and rely on the final reloc to fix it up to either point to the trampoline or call direct. However, I think there may be a problem with this if the codeblob is moved more than once: the first move would relocate it using a direct BL, and the second could then move it out of range and fail to find the trampoline, as above.
Anyone got any ideas on how to fix this?
All the best, and Happy Christmas,
Ed.