RFA: Exhume the ARM port [PR icedtea/484, 323]
Xerxes Rånby
xerxes at zafena.se
Thu Oct 20 16:21:46 PDT 2011
2011-10-20 15:55, Andrew Haley wrote:
> This patch is in two parts. The first part simply undoes the commit
> that deleted the ARM assembler port. The second part patches the
> port to work with the Hard FP ABI and HotSpot 20.
Thank you, Andrew, for showing us that it is indeed possible for a polymath to maintain the
ARM assembler port!
Sorry for the long reply; it is not every day you review a 2000-line patch. ;)
>
> This is the first patch. As it simply undoes an earlier patch, I don't
> include a diff.
>
> 2011-09-15 Andrew Haley<aph at redhat.com>
>
> Reinstate the ARM assembler port. Back out this patch:
>
> 2011-07-11 Xerxes Rånby<xerxes at zafena.se>
>
> Removal of the ARM assembler port, unbreaks Zero and Shark builds.
OK, as long as the second part below gets updated and pushed at the same time!
>
> This second patch is the update. I'm sorry it's rather long, but
> anything much smaller crashes during bootstrap; you really need it
> all. It's still only JDK 6 at the moment. I'll look at implementing
> invokedynamic next, and there are some performance improvements on the
> way.
>
> OK for trunk?
My main concern with the second patch, as is, is that the pr323 update will not work in
combination with the HotSpot profiler. In order to make the profiler happy we need to
change the code to first set THREAD_LAST_JAVA_SP to 0 before THREAD_LAST_JAVA_FP is
updated.
If you look at the Zero file hotspot/src/cpu/zero/vm/javaFrameAnchor_zero.hpp, inside the
function void set(... you can see that Zero makes sure to set THREAD_LAST_JAVA_SP to
NULL before touching THREAD_LAST_JAVA_FP.
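To make the ordering I am asking for concrete, here is a minimal C++ sketch. It is not
the HotSpot source: FrameAnchor, walkable() and the probe callback are hypothetical names
used only for illustration, and the two fields merely stand in for THREAD_LAST_JAVA_SP
and THREAD_LAST_JAVA_FP. The assumption is that an asynchronous observer such as the
profiler only trusts FP while SP is non-zero.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the per-thread frame anchor (illustration only).
struct FrameAnchor {
    std::atomic<std::uintptr_t> last_java_sp{0};
    std::atomic<std::uintptr_t> last_java_fp{0};

    // An asynchronous observer may only trust last_java_fp when SP is non-zero.
    bool walkable() const {
        return last_java_sp.load(std::memory_order_acquire) != 0;
    }

    // Publish a new (sp, fp) pair. `probe` is called between the stores and
    // stands in for a profiler sample arriving at the worst possible moment.
    void set(std::uintptr_t sp, std::uintptr_t fp,
             const std::function<void(const FrameAnchor&)>& probe = nullptr) {
        last_java_sp.store(0, std::memory_order_release);   // 1. invalidate the anchor
        if (probe) probe(*this);
        last_java_fp.store(fp, std::memory_order_release);  // 2. now FP may change
        if (probe) probe(*this);
        last_java_sp.store(sp, std::memory_order_release);  // 3. publish SP last
    }
};
```

With this ordering a probe that fires between any two stores sees the anchor as not
walkable, so it never pairs a new FP with an old SP.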
Below I have included parts of Rob Savoye's attempt to restructure the code to make the
profiler happy, extracted from http://www.senecass.com/projects/OpenJDK-ARM/thumb2-090710.patch
(that patch does not include any HS20 fixes, since HS20 had not been released when it
was made). So where you see "... :" in my comments, it means the quoted block above could be
rewritten as the block that follows, to make the profiler happy.
If the patch works (runs and bootstraps) as is, even without fixing this latent profiler
bug, I am OK for it to go in after some rounds of testing on armel and armhf systems, so
that all your other nice fixes get some testing.
Comments inline below:
> Andrew.
> 2011-10-20 Andrew Haley<aph at redhat.com>
>
> * patches/arm.patch (CFLAGS): Enable the ARM assembler port.
The second patch contains the whole file for patches/arm.patch instead of the diff.
>
> PR icedtea/484:
> * arm_port/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S
> (fast_empty_entry, normal_entry_synchronized, normal_entry):
> Return 0, #deoptimized_frames.
Why is the Return 0 not needed for native_entry and native_entry_synchronized?
I would expect to find something like this:
--- a/ports/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S Fri Aug 06 20:08:18 2010 +0100
+++ b/ports/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S Tue Sep 07 16:21:29 2010 -0600
@@ -830,6 +830,7 @@
native_entry:
stmfd arm_sp!, {regset, lr}
bl fast_native_entry
+ mov r0, #0 @ No deoptimized frames pr484
ldmia sp!, {regset, pc}
ALIGN_CODE
@@ -1059,6 +1099,7 @@
str tmp1, [ip, #JNIHANDLEBLOCK_TOP]
str r1, [r5, #4]
str r5, [r9, #THREAD_JAVA_SP]
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
.fast_native_return_byte:
mov r0, r0, lsl #24
@@ -1067,6 +1108,7 @@
str tmp1, [ip, #JNIHANDLEBLOCK_TOP]
str r0, [r5, #-4]!
str r5, [r9, #THREAD_JAVA_SP]
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
.fast_native_return_char:
mov r0, r0, lsl #16
@@ -1075,6 +1117,7 @@
str tmp1, [ip, #JNIHANDLEBLOCK_TOP]
str r0, [r5, #-4]!
str r5, [r9, #THREAD_JAVA_SP]
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
.fast_native_return_bool:
ands r0, r0, #255
@@ -1083,6 +1126,7 @@
str tmp1, [ip, #JNIHANDLEBLOCK_TOP]
str r0, [r5, #-4]!
str r5, [r9, #THREAD_JAVA_SP]
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
.fast_native_return_obj:
cmp r0, #0
@@ -1091,6 +1135,7 @@
str lr, [r9, #THREAD_TOP_ZERO_FRAME]
str tmp1, [ip, #JNIHANDLEBLOCK_TOP]
str r5, [r9, #THREAD_JAVA_SP]
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
.fast_native_return_short:
mov r0, r0, lsl #16
@@ -1102,6 +1147,7 @@
str tmp1, [ip, #JNIHANDLEBLOCK_TOP]
.fast_native_exit:
str r5, [r9, #THREAD_JAVA_SP]
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
.fast_native_entry_throw_stack_overflow:
@@ -1337,6 +1385,7 @@
mov lr, pc
ldr pc, [tmp1]
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
#endif // THUMB2EE
> * arm_port/hotspot/src/cpu/zero/vm/thumb2.cpp (Thumb2_Return):
> Likewise.
Question: Can the Thumb2 JIT deoptimize a frame?
If it can, then it must return the number of deoptimized frames.
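For readers not familiar with PR 484, the convention under discussion can be modelled
roughly as follows. This is a toy C++ sketch under my own assumptions, not the Zero
sources: EntryFunc, plain_entry, deoptimizing_entry and frames_to_reinterpret are
hypothetical names. The only point is that the caller acts on the returned count, so an
entry that leaves a stale value in r0 would mislead it.

```cpp
#include <cassert>

// Hypothetical entry-point type: the return value is the number of
// deoptimized frames the caller still has to deal with.
using EntryFunc = int (*)();

// An entry that never deoptimizes must still return an explicit 0,
// which is the role of "mov r0, #0" in the assembler entries.
inline int plain_entry() { return 0; }

// Stand-in for an entry that reports one deoptimized frame.
inline int deoptimizing_entry() { return 1; }

// Toy caller: how many frames it would hand back to the interpreter.
inline int frames_to_reinterpret(EntryFunc entry) {
    int deoptimized_frames = entry();
    return deoptimized_frames > 0 ? deoptimized_frames : 0;
}
```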
> * arm_port/hotspot/src/cpu/zero/vm/bytecodes_arm.def
> (return_unsafe, ireturn_unsafe, lreturn_unsafe)
> (ireturn,areturn,freturn): Return deoptimized_frames = 0.
>
> PR icedtea/323:
> * arm_port/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S:
Please add a clear of THREAD_LAST_JAVA_SP before the stores to THREAD_LAST_JAVA_FP.
> Replace
> all stores to THREAD_LAST_JAVA_SP with stores to
> THREAD_LAST_JAVA_FP.
Perhaps mention the loads from THREAD_JAVA_SP here?
> Add stores to THREAD_LAST_JAVA_SP.
> * arm_port/hotspot/src/cpu/zero/vm/thumb2.cpp: Likewise.
>
> * Makefile.am (ICEDTEA_PATCHES): Add patches/arm-debug.patch.
> (stamps/ports.stamp): Use cp -l to link the ARM interpreter
> sources into the target dir rather than copying them.
> (EXTRA_DIST): Add arm_port.
These changes look fine. Look out for Makefile.am chunks that are already in
icedtea6 HEAD. More details in the patch below:
>
> Update to HS20:
> * arm_port/hotspot/src/cpu/zero/vm/thumb2.cpp: Replace all calls
> to fatal1() (which no longer exists) with calls to fatal().
> (Thumb2_is_zombie, Thumb2_pass2): invoke* and {get,put}* now take
> native, not Java, byte ordered indexes.
> (Thumb2_disass, Thumb2_codegen, Thumb2_tablegen):
> Bytecodes::special_length_at() takes different arguments from the
> previous version; adjust suitably.
> * arm_port/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S: Replace
> call to report_fatal_vararg(char const*, int, char const*, ...)
> (which no longer exists) with call to Helper_report_fatal.
Can you add a note about the fixed ISTATE_SELF_LINK reference?
> * arm_port/hotspot/src/cpu/zero/vm/bytecodes_arm.def (new):
> Renumber the fast bytecodes iload_0_iconst_N to iload_3_iload_N.
> * arm_port/hotspot/src/cpu/zero/vm/thumb2.cpp: Likewise.
> * patches/arm.patch: Likewise.
> Add *.S to the list of source files.
> * arm_port/hotspot/src/cpu/zero/vm/asm_helper.cpp (All #includes)
> : Move to new OpenJDK include file format.
> (Helper_report_fatal): New assember helper.
> (print_vm_offsets): Add THREAD_LAST_JAVA_FP.
Please mention the addition of SIZEOF_FFI_CIF to this file in the changelog.
>
> Hard FP port:
> * arm_port/hotspot/src/cpu/zero/vm/thumb2.cpp (Thumb2_Initialize):
> Add hard FP variants for the stubs that need it.
> * arm_port/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S
> (POPF0, POPF1, POPD0, POPD1,PUSHF0, PUSHD0): New macros.
> (.eabi_attribute): Use real names. Add declaration for hard
> FP ABI.
> (.fast_native_return_double, .fast_native_return_float, .fast_copy_double,
> .fast_copy_float): New.
> (fast_native_entry): use SIZEOF_FFI_CIF, not 24.
Nice! I have been wondering what this magic constant was referring to.
> Add logic to handle args in FP registers.
> (FIND_LOWEST_BIT, FIND_LOWEST_BIT_PAIR, COPY_DOUBLE, COPY_FLOAT): New macros.
> (.copy_float_table, .copy_double_table): New.
> * arm_port/hotspot/src/cpu/zero/vm/bytecodes_arm.def
> (frem, drem, f2i, f2l, d2i, d2l): Add hardfp args.
>
> * openjdk-ecj/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S
> (REWRITE_PAIRS): New macro.
> * openjdk-ecj/hotspot/src/cpu/zero/vm/bytecodes_arm.def: Use
> REWRITE_PAIRS to prevent rewriting pairs of bytecodes in the
> instruction stream.
>
> * patches/arm-debug.patch: New file.
>
> diff -r 886f06403d1c Makefile.am
> --- a/Makefile.am Wed Oct 19 12:37:40 2011 -0400
> +++ b/Makefile.am Thu Oct 20 09:35:07 2011 -0400
...
> @@ -650,7 +652,7 @@
> clean-icedtea-against-ecj clean-extract-ecj clean-generated clean-replace-hotspot \
> clean-rewriter clean-rewrite-rhino clean-rt clean-bootstrap-directory \
> clean-bootstrap-directory-ecj clean-bootstrap-directory-symlink \
> - clean-bootstrap-directory-symlink-ecj clean-fonts jtregcheck
> + clean-bootstrap-directory-symlink-ecj clean-fonts
> if [ -e bootstrap ]; then \
> rmdir bootstrap ; \
> fi
This chunk is already fixed in HEAD:
http://icedtea.classpath.org/hg/icedtea6/rev/6b8691d033ae27da89ac605fca75e15a0b426500
> @@ -1058,7 +1060,7 @@
> done
> rm -f stamps/ports.stamp
>
> -+stamps/generated.stamp: stamps/ports.stamp
> +stamps/generated.stamp: stamps/ports.stamp
> if [ ! -e $(GENERATED_BUILD_DIR) ]; then \
> cp -a $(abs_top_srcdir)/generated $(GENERATED_BUILD_DIR)&& \
> chmod -R ug+rwX $(GENERATED_BUILD_DIR) ; \
This chunk looks like a merge leftover.
> diff -r 886f06403d1c arm_port/hotspot/src/cpu/zero/vm/bytecodes_arm.def
> --- a/arm_port/hotspot/src/cpu/zero/vm/bytecodes_arm.def Wed Oct 19 12:37:40 2011 -0400
> +++ b/arm_port/hotspot/src/cpu/zero/vm/bytecodes_arm.def Thu Oct 20 09:35:07 2011 -0400
> @@ -265,35 +265,35 @@
> iload_iload = 0xe3, 4
> iload_iload_N = 0xe4, 3
>
> - at return_register_finalizer = 0xe5, 1
> -
> -iload_0_iconst_N = 0xe7, 2
...
> -iload_3_iload_N = 0xfd, 2
> -
> -#endif // FAST_BYTECODES
> -
> -return_register_finalizer = 0xe5, 1
> + at return_register_finalizer = 0xe7, 1
> +
> +iload_0_iconst_N = 0xe9, 2
...
> +iload_3_iload_N = 0xff, 2
> +
> +#endif
> +
> +return_register_finalizer = 0xe7, 1
Wow, we are lucky that there was enough room for all the fast bytecodes when the two new
internal bytecodes got added for HS20! :D
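The arithmetic behind that remark can be checked with a few lines of C++. The opcode
values are taken from the hunk above; the constant and function names are hypothetical
and exist only for this illustration.

```cpp
#include <cassert>

// Opcode values copied from the quoted bytecodes_arm.def hunk.
constexpr int kOldFirstFast = 0xe7;  // iload_0_iconst_N before the patch
constexpr int kOldLastFast  = 0xfd;  // iload_3_iload_N before the patch
constexpr int kNewFirstFast = 0xe9;  // iload_0_iconst_N after the patch
constexpr int kNewLastFast  = 0xff;  // iload_3_iload_N after the patch

// Every fast bytecode moves up by the same amount.
constexpr int kShift = kNewFirstFast - kOldFirstFast;
static_assert(kNewLastFast - kOldLastFast == kShift, "uniform renumbering");

// A bytecode has to fit in a single unsigned byte.
constexpr bool fits_in_byte(int opcode) { return opcode >= 0 && opcode <= 0xff; }
static_assert(fits_in_byte(kNewLastFast), "last fast bytecode still fits");

// Opcodes left unused above the last fast bytecode.
constexpr int kHeadroom = 0xff - kNewLastFast;
```

The shift is 2 and the headroom comes out as zero, so one more internal bytecode would
not have fit without dropping or repacking a fast bytecode.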
> diff -r 886f06403d1c arm_port/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S
> --- a/arm_port/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S Wed Oct 19 12:37:40 2011 -0400
> +++ b/arm_port/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S Thu Oct 20 09:35:07 2011 -0400
> @@ -872,6 +957,7 @@
> str r7, [r9, #ISTATE_LOCALS]
> str r10, [r9, #ISTATE_CONSTANTS]
> str r11, [r9, #ISTATE_METHOD]
> + str r9, [r9, #ISTATE_SELF_LINK]
>
> @ stmia r9, {r2, r5, r7, r10, r11}
> ldr r1, [r2, #THREAD_STACK_SIZE]
Nice, thank you for fixing this micro-optimization and making the ISTATE struct valid.
> @@ -886,12 +972,17 @@
> blt .fast_native_entry_throw_stack_overflow
> cmp r5, #0
> bne .fast_native_entry_got_handleraddr
I think this code is broken.
You need to add
str r5, [r9, #THREAD_LAST_JAVA_SP] @ r5 is zero at this point
here to clear THREAD_LAST_JAVA_SP before updating THREAD_LAST_JAVA_FP.
> + str r0, [r9, #THREAD_LAST_JAVA_FP]
> + ldr r0, [r9, #THREAD_JAVA_SP]
> str r0, [r9, #THREAD_LAST_JAVA_SP]
> mov r0, r9
> mov r1, r11
> bl _ZN18InterpreterRuntime19prepare_native_callEP10JavaThreadP13methodOopDesc
> ldr r11, [r9, #THREAD_TOP_ZERO_FRAME]
> ldr r1, [r9, #THREAD_PENDING_EXC]
> + str r5, [r9, #THREAD_LAST_JAVA_FP] @ r5 is zero at this point
> + str r5, [r9, #THREAD_LAST_JAVA_SP]
To be safe, can you reorder these two clears so that _SP gets cleared before _FP?
> + ldr r5, [r9, #THREAD_JAVA_SP]
> str r5, [r9, #THREAD_LAST_JAVA_SP]
> ldr r11, [r11, #-72 + ISTATE_METHOD]
> cmp r1, #0
> @@ -904,6 +995,8 @@
> ldr r3, [r9, #THREAD_TOP_ZERO_FRAME]
> mov r2, #0
> mov r0, r9
I think this code is broken.
You need to add
str r2, [r9, #THREAD_LAST_JAVA_SP] @ r2 is zero at this point
here.
> + str r3, [r9, #THREAD_LAST_JAVA_FP]
> + ldr r3, [r9, #THREAD_JAVA_SP]
> str r3, [r9, #THREAD_LAST_JAVA_SP]
> mov r3, r2
> mov r1, r11
> @@ -990,9 +1153,12 @@
> ldr r0, [r9, #THREAD_TOP_ZERO_FRAME]
> mov r2, #_thread_in_native
>
Add
mov ip, #0 @ ip is free to use at this point; it gets reloaded below.
str ip, [r9, #THREAD_LAST_JAVA_SP]
here.
> - str r0, [r9, #THREAD_LAST_JAVA_SP]
> + str r0, [r9, #THREAD_LAST_JAVA_FP]
> str r2, [r9, #THREAD_STATE]
>
> + ldr r2, [r9, #THREAD_JAVA_SP]
> + str r2, [r9, #THREAD_LAST_JAVA_SP]
> +
> ldr ip, [r11, #METHOD_NATIVEHANDLER]
> ldrh r11, [r11, #METHOD_SIZEOFPARAMETERS]
>
> @@ -1053,6 +1225,23 @@
> .word 0
> .word .fast_native_return_obj @ FFI_TYPE_POINTER == 14
>
> +#ifdef __ARM_PCS_VFP
> +.fast_native_return_double:
> + fsts s0, [r5, #-8]
> + str lr, [r9, #THREAD_TOP_ZERO_FRAME]
> + str tmp1, [ip, #JNIHANDLEBLOCK_TOP]
> + fsts s1, [r5, #-4]
> + add r5, #-8
> + str r5, [r9, #THREAD_JAVA_SP]
pr484 Return 0 here?
> + ldmfd arm_sp!, {fast_regset, pc}
> +.fast_native_return_float:
> + fsts s0, [r5, #-4]
> + str lr, [r9, #THREAD_TOP_ZERO_FRAME]
> + str tmp1, [ip, #JNIHANDLEBLOCK_TOP]
> + add r5, #-4
> + str r5, [r9, #THREAD_JAVA_SP]
pr484 Return 0 here?
> + ldmfd arm_sp!, {fast_regset, pc}
> +#endif
> .fast_native_return_dw:
> str r0, [r5, #-8]!
> str lr, [r9, #THREAD_TOP_ZERO_FRAME]
> @@ -1105,11 +1294,12 @@
> ldmfd arm_sp!, {fast_regset, pc}
>
> .fast_native_entry_throw_stack_overflow:
> - str r0, [r9, #THREAD_LAST_JAVA_SP]
> + str r0, [r9, #THREAD_LAST_JAVA_FP]
> mov r0, r9
> bl _ZN18InterpreterRuntime24throw_StackOverflowErrorEP10JavaThread
> mov r3, #0
> ldr r1, [r9, #THREAD_PENDING_EXC]
> + str r3, [r9, #THREAD_LAST_JAVA_FP]
> str r3, [r9, #THREAD_LAST_JAVA_SP]
> .fast_native_entry_exception:
> ldr r5, [r9, #THREAD_TOP_ZERO_FRAME]
> @@ -1955,8 +2224,12 @@
> str r3, [istate, #ISTATE_STACK_LIMIT]
> str r0, [r1, #4]
> ldr r1, [istate, #ISTATE_THREAD]
I think this part needs to be:
+@ set SP to zero before setting the FP
+ str r0, [r1, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str r3, [r1, #THREAD_LAST_JAVA_FP]
+@ get the stack pointer, use r3 as it gets reset below
+ ldr r3, [r1, #THREAD_JAVA_SP]
+@ set SP to the current top of stack
str r3, [r1, #THREAD_LAST_JAVA_SP]
instead of
> + @ I think this is right, but I do not know why monitorenter needs
> + @ to save the FP at this point.
> + ldr r3, [r1, #THREAD_JAVA_SP]
> + str r3, [r1, #THREAD_LAST_JAVA_SP]
> ldr r3, [r1, #THREAD_TOP_ZERO_FRAME]
> - str r3, [r1, #THREAD_LAST_JAVA_SP]
> + str r3, [r1, #THREAD_LAST_JAVA_FP]
> CACHE_STACK
> ldr sl, [istate, #ISTATE_STACK_BASE]
> ldr r3, [stack, #4]
> @@ -2347,15 +2622,19 @@
> bl build_frame
> mov tmp_vvv, r0
> ldr r3, [tmp1, #THREAD_TOP_ZERO_FRAME]
> - sub r0, r0, #72
> + sub r0, r0, #72 @ This mysterious constant is actually the offset of
> + @ the next frame field. Why is "ISTATE_NEXT_FRAME"
> + @ not used here?
> mov istate, r0
> str r3, [tmp_vvv, #0]
> adrl ip, dispatch_init_adcon
> + ldr r1, [tmp1, #THREAD_JAVA_SP]
> + str r1, [tmp1, #THREAD_LAST_JAVA_SP]
Likewise... I don't think it's valid to update SP here.
> ldm ip, {r0, r1}
> add r0, r0, ip
> str tmp_vvv, [tmp1, #THREAD_TOP_ZERO_FRAME]
> @ CACHE_JPC
> - str tmp_vvv, [tmp1, #THREAD_LAST_JAVA_SP]
> + str tmp_vvv, [tmp1, #THREAD_LAST_JAVA_FP]
> add dispatch, r1, r0
> ldr r0, [istate, #ISTATE_METHOD]
> ldr r3, [r0, #METHOD_ACCESSFLAGS]
How about rewriting this chunk with:
@@ -2355,7 +2416,16 @@
add r0, r0, ip
str tmp_vvv, [tmp1, #THREAD_TOP_ZERO_FRAME]
@ CACHE_JPC
- str tmp_vvv, [tmp1, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP
+ mov r2, #0
+ str r2, [tmp1, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str tmp_vvv, [tmp1, #THREAD_LAST_JAVA_FP]
+@ get the stack pointer, use r3 as it gets reset below
+ ldr r3, [tmp1, #THREAD_JAVA_SP]
+@ set SP to the current top of stack
+ str r3, [tmp1, #THREAD_LAST_JAVA_SP]
+
add dispatch, r1, r0
ldr r0, [istate, #ISTATE_METHOD]
ldr r3, [r0, #METHOD_ACCESSFLAGS]
> @@ -2528,10 +2808,13 @@
> ldr r2, [istate, #ISTATE_STACK_LIMIT]
> sub stack, stack, #4
>
> +
> + ldr r1, [ip, #THREAD_JAVA_SP]
> + str r1, [ip, #THREAD_LAST_JAVA_SP]
> ldr r1, [ip, #THREAD_TOP_ZERO_FRAME]
> add r2, r2, #4
> str r2, [ip, #THREAD_JAVA_SP]
> - str r1, [ip, #THREAD_LAST_JAVA_SP]
> + str r1, [ip, #THREAD_LAST_JAVA_FP]
SP needs to be set last.
> DISPATCH_START 5
> ldr r3, [ip, #4]
> DISPATCH_NEXT
How about rewriting this chunk with:
@@ -2531,7 +2602,15 @@
ldr r1, [ip, #THREAD_TOP_ZERO_FRAME]
add r2, r2, #4
str r2, [ip, #THREAD_JAVA_SP]
- str r1, [ip, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP
+ mov r0, #0
+@ set SP to zero before setting the FP
+ str r0, [ip, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str r1, [ip, #THREAD_LAST_JAVA_FP]
+@ set SP to the current top of stack
+ str r2, [ip, #THREAD_LAST_JAVA_SP]
+
DISPATCH_START 5
ldr r3, [ip, #4]
DISPATCH_NEXT
> @@ -2680,7 +2963,8 @@
> ldr r1, [ip, #THREAD_TOP_ZERO_FRAME]
> add r2, r2, #4
> str r2, [ip, #THREAD_JAVA_SP]
> - str r1, [ip, #THREAD_LAST_JAVA_SP]
> + str r1, [ip, #THREAD_LAST_JAVA_FP]
> + str r2, [ip, #THREAD_LAST_JAVA_SP]
> DISPATCH_START 3
> ldr r3, [ip, #4]
> DISPATCH_NEXT
> @@ -2717,6 +3001,7 @@
> ldr tmp1, [istate, #ISTATE_THREAD]
> str r3, [istate, #36]
> str r1, [tmp1, #THREAD_LAST_JAVA_SP]
> + str r1, [tmp1, #THREAD_LAST_JAVA_FP]
>
> add stack, stack, #4
> str stack, [tmp1, #THREAD_JAVA_SP]
How about rewriting this chunk with:
@@ -2680,7 +2761,14 @@
ldr r1, [ip, #THREAD_TOP_ZERO_FRAME]
add r2, r2, #4
str r2, [ip, #THREAD_JAVA_SP]
- str r1, [ip, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP to keep the profiler happy
+ mov r3, #0
+ str r3, [ip, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str r1, [ip, #THREAD_LAST_JAVA_FP]
+@ set SP to the current top of stack
+ str r2, [ip, #THREAD_LAST_JAVA_SP]
+
DISPATCH_START 3
ldr r3, [ip, #4]
DISPATCH_NEXT
> @@ -2842,6 +3127,8 @@
> ldr constpool, [constpool, #CONSTANTPOOL_CACHE]
> str ip, [tmp1, #THREAD_TOP_ZERO_FRAME]
> USEC ldr r3, [r10, #METHOD_INVOCATIONCOUNTER]
> + str ip, [tmp1, #THREAD_LAST_JAVA_FP]
> + ldr ip, [tmp1, #THREAD_JAVA_SP]
> str ip, [tmp1, #THREAD_LAST_JAVA_SP]
> DISPATCH_NEXT
> USEC ldr lr, [dispatch, #InterpreterInvocationLimit_Address-XXX]
How about this for this chunk:
@@ -2842,7 +2932,14 @@
ldr constpool, [constpool, #CONSTANTPOOL_CACHE]
str ip, [tmp1, #THREAD_TOP_ZERO_FRAME]
USEC ldr r3, [r10, #METHOD_INVOCATIONCOUNTER]
- str ip, [tmp1, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP to keep the profiler happy
+ str r1, [tmp1, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str ip, [tmp1, #THREAD_LAST_JAVA_FP]
+@ get the stack pointer, use r5 as it gets reset below
+ ldr r2, [tmp1, #THREAD_JAVA_SP]
+@ set SP to the current top of stack
+ str r2, [tmp1, #THREAD_LAST_JAVA_SP]
DISPATCH_NEXT
USEC ldr lr, [dispatch, #InterpreterInvocationLimit_Address-XXX]
USEC add r3, r3, #INVOCATIONCOUNTER_COUNTINCREMENT
> @@ -2944,15 +3233,19 @@
> @ ----------------------------------------------------------------------------------------
> stack_overflow_no_frame:
> mov r0, tmp1
> + ldr ip, [r0, #THREAD_JAVA_SP]
> + str ip, [r0, #THREAD_LAST_JAVA_SP]
> ldr ip, [r0, #THREAD_TOP_ZERO_FRAME]
> - str ip, [r0, #THREAD_LAST_JAVA_SP]
> + str ip, [r0, #THREAD_LAST_JAVA_FP]
> bl _ZN18InterpreterRuntime24throw_StackOverflowErrorEP10JavaThread
> ldmfd arm_sp!, {regset, pc}
>
> stack_overflow_before_frame:
> mov r0, tmp1
> + ldr ip, [r0, #THREAD_JAVA_SP]
> + str ip, [r0, #THREAD_LAST_JAVA_SP]
> ldr ip, [r0, #THREAD_TOP_ZERO_FRAME]
> - str ip, [r0, #THREAD_LAST_JAVA_SP]
> + str ip, [r0, #THREAD_LAST_JAVA_FP]
> bl _ZN18InterpreterRuntime24throw_StackOverflowErrorEP10JavaThread
> ldmfd arm_sp!, {fast_regset, pc}
>
... :
@@ -2939,21 +3037,41 @@
str r1, [tmp_xxx, #THREAD_JAVA_SP]
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
@ ----------------------------------------------------------------------------------------
stack_overflow_no_frame:
mov r0, tmp1
ldr ip, [r0, #THREAD_TOP_ZERO_FRAME]
- str ip, [r0, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP
+ mov r4, #0
+ str r4, [r0, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str ip, [r0, #THREAD_LAST_JAVA_FP]
+@ get the stack pointer
+ ldr r4, [r0, #THREAD_JAVA_SP]
+@ set SP to the current top of stack
+ str r4, [r0, #THREAD_LAST_JAVA_SP]
+
bl _ZN18InterpreterRuntime24throw_StackOverflowErrorEP10JavaThread
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {regset, pc}
stack_overflow_before_frame:
mov r0, tmp1
ldr ip, [r0, #THREAD_TOP_ZERO_FRAME]
- str ip, [r0, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP
+ mov r4, #0
+ str r4, [r0, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str ip, [r0, #THREAD_LAST_JAVA_FP]
+@ get the stack pointer
+ ldr r4, [r0, #THREAD_JAVA_SP]
+@ set SP to the current top of stack
+ str r4, [r0, #THREAD_LAST_JAVA_SP]
+
bl _ZN18InterpreterRuntime24throw_StackOverflowErrorEP10JavaThread
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
handle_exception_do_not_unlock:
> @@ -4915,7 +5211,8 @@
> ldr r1, [Rthread, #THREAD_TOP_ZERO_FRAME]
> add r2, r2, #4
> str r2, [Rthread, #THREAD_JAVA_SP]
> - str r1, [Rthread, #THREAD_LAST_JAVA_SP]
> + str r2, [Rthread, #THREAD_LAST_JAVA_SP]
> + str r1, [Rthread, #THREAD_LAST_JAVA_FP]
> ldr r3, [Rthread, #4]
> cmp r3, #0
> bne istub_exception
... :
@@ -4915,10 +5042,18 @@
ldr r1, [Rthread, #THREAD_TOP_ZERO_FRAME]
add r2, r2, #4
str r2, [Rthread, #THREAD_JAVA_SP]
- str r1, [Rthread, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP
+ mov r4, #0
+ str r4, [Rthread, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str r1, [Rthread, #THREAD_LAST_JAVA_FP]
+@ set SP to the current top of stack
+ str r2, [Rthread, #THREAD_LAST_JAVA_SP]
+
ldr r3, [Rthread, #4]
cmp r3, #0
bne istub_exception
+ mov r0, #0 @ No deoptimized frames, pr484
ldmia sp!, {ip, pc}
istub_methodInterface:
> @@ -5031,10 +5329,12 @@
> ldr stack, [Rthread, #THREAD_JAVA_SP]
> ldr r2, [istate, #ISTATE_STACK_LIMIT]
>
> + ldr r1, [Rthread, #THREAD_JAVA_SP]
> + str r1, [Rthread, #THREAD_LAST_JAVA_SP]
> ldr r1, [Rthread, #THREAD_TOP_ZERO_FRAME]
> add r2, r2, #4
> str r2, [Rthread, #THREAD_JAVA_SP]
> - str r1, [Rthread, #THREAD_LAST_JAVA_SP]
> + str r1, [Rthread, #THREAD_LAST_JAVA_FP]
> ldr r3, [Rthread, #4]
> cmp r3, #0
> bne istub_exception
... :
@@ -5034,10 +5170,18 @@
ldr r1, [Rthread, #THREAD_TOP_ZERO_FRAME]
add r2, r2, #4
str r2, [Rthread, #THREAD_JAVA_SP]
- str r1, [Rthread, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP
+ mov r3, #0
+ str r3, [Rthread, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str r1, [Rthread, #THREAD_LAST_JAVA_FP]
+@ set SP to the current top of stack
+ str r2, [Rthread, #THREAD_LAST_JAVA_SP]
+
ldr r3, [Rthread, #4]
cmp r3, #0
bne istub_exception
+ mov r0, #0 @ No deoptimized frames, pr484
ldmia sp!, {ip, pc}
ivstub_resolve:
> @@ -5100,7 +5400,8 @@
> ldr r1, [Rthread, #THREAD_TOP_ZERO_FRAME]
> add r2, r2, #4
> str r2, [Rthread, #THREAD_JAVA_SP]
> - str r1, [Rthread, #THREAD_LAST_JAVA_SP]
> + str r2, [Rthread, #THREAD_LAST_JAVA_SP]
> + str r1, [Rthread, #THREAD_LAST_JAVA_FP]
> ldr r3, [Rthread, #4]
> cmp r3, #0
> bne istub_exception
... :
@@ -5100,10 +5245,18 @@
ldr r1, [Rthread, #THREAD_TOP_ZERO_FRAME]
add r2, r2, #4
str r2, [Rthread, #THREAD_JAVA_SP]
- str r1, [Rthread, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP
+ mov r3, #0
+ str r3, [Rthread, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str r1, [Rthread, #THREAD_LAST_JAVA_FP]
+@ set SP to the current top of stack
+ str r2, [Rthread, #THREAD_LAST_JAVA_SP]
+
ldr r3, [Rthread, #4]
cmp r3, #0
bne istub_exception
+ mov r0, #0 @ No deoptimized frames, pr484
ldmia sp!, {ip, pc}
isstub_resolve:
> @@ -5172,7 +5473,8 @@
> ldr r1, [Rthread, #THREAD_TOP_ZERO_FRAME]
> add r2, r2, #4
> str r2, [Rthread, #THREAD_JAVA_SP]
> - str r1, [Rthread, #THREAD_LAST_JAVA_SP]
> + str r2, [Rthread, #THREAD_LAST_JAVA_SP]
> + str r1, [Rthread, #THREAD_LAST_JAVA_FP]
> ldr r3, [Rthread, #4]
> cmp r3, #0
> bne istub_exception
... :
@@ -5172,10 +5326,18 @@
ldr r1, [Rthread, #THREAD_TOP_ZERO_FRAME]
add r2, r2, #4
str r2, [Rthread, #THREAD_JAVA_SP]
- str r1, [Rthread, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP
+ mov r3, #0
+ str r3, [Rthread, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str r1, [Rthread, #THREAD_LAST_JAVA_FP]
+@ set SP to the current top of stack
+ str r2, [Rthread, #THREAD_LAST_JAVA_SP]
+
ldr r3, [Rthread, #4]
cmp r3, #0
bne istub_exception
+ mov r0, #0 @ No deoptimized frames, pr484
ldmia sp!, {ip, pc}
ispstub_resolve:
@@ -6264,8 +6426,18 @@
Thumb2_Stack_Overflow:
mov r0, r2
ldr ip, [r0, #THREAD_TOP_ZERO_FRAME]
- str ip, [r0, #THREAD_LAST_JAVA_SP]
+@ set SP to zero before setting the FP
+ mov r2, #0
+ str r2, [r0, #THREAD_LAST_JAVA_SP]
+@ set FP to the top zero frame
+ str ip, [r0, #THREAD_LAST_JAVA_FP]
+@ get the stack pointer
+ ldr r2, [r0, #THREAD_JAVA_SP]
+@ set SP to the current top of stack
+ str r2, [r0, #THREAD_LAST_JAVA_SP]
+
bl _ZN18InterpreterRuntime24throw_StackOverflowErrorEP10JavaThread
+ mov r0, #0 @ No deoptimized frames, pr484
ldmfd arm_sp!, {fast_regset, pc}
.global Thumb2_Exit_To_Interpreter
> diff -r 886f06403d1c arm_port/hotspot/src/cpu/zero/vm/thumb2.cpp
> --- a/arm_port/hotspot/src/cpu/zero/vm/thumb2.cpp Wed Oct 19 12:37:40 2011 -0400
> +++ b/arm_port/hotspot/src/cpu/zero/vm/thumb2.cpp Thu Oct 20 09:35:07 2011 -0400
> @@ -4383,6 +4391,12 @@
> str_imm(jinfo->codebuf, Rstack, Rthread, THREAD_JAVA_SP, 1, 0);
> Thumb2_Debug(jinfo, H_DEBUG_METHODEXIT);
> // enter_leave(jinfo->codebuf, 0);
> +
> + // deoptimized_frames = 0
> + // FIXME: This should be done in the slow entry, but only three
> + // words are allocated there for the instructions.
> + mov_imm(jinfo->codebuf, ARM_R0, 0);
> +
> ldm(jinfo->codebuf, C_REGSET + (1<<ARM_PC), ARM_SP, POP_FD, 1);
> }
>
> @@ -4657,6 +4671,8 @@
>
> add_imm(jinfo->codebuf, ARM_R3, Ristate, ISTATE_NEXT_FRAME);
> str_imm(jinfo->codebuf, ARM_R3, ARM_R2, THREAD_TOP_ZERO_FRAME, 1, 0);
> + str_imm(jinfo->codebuf, ARM_R3, ARM_R2, THREAD_LAST_JAVA_FP, 1, 0);
> + ldr_imm(jinfo->codebuf, ARM_R3, ARM_R2, THREAD_JAVA_SP, 1, 0);
> str_imm(jinfo->codebuf, ARM_R3, ARM_R2, THREAD_LAST_JAVA_SP, 1, 0);
>
> ldr_imm(jinfo->codebuf, ARM_R3, ARM_IP, CONSTANTPOOL_CACHE, 1, 0);
... :
@@ -4657,8 +4658,17 @@
add_imm(jinfo->codebuf, ARM_R3, Ristate, ISTATE_NEXT_FRAME);
str_imm(jinfo->codebuf, ARM_R3, ARM_R2, THREAD_TOP_ZERO_FRAME, 1, 0);
+
+ // set SP to zero before setting the FP
+ mov_imm(jinfo->codebuf, ARM_LR, 0);
+ str_imm(jinfo->codebuf, ARM_LR, ARM_R2, THREAD_LAST_JAVA_SP, 1, 0);
+ // set FP to the top zero frame
+ str_imm(jinfo->codebuf, ARM_R3, ARM_R2, THREAD_LAST_JAVA_FP, 1, 0);
+ // get the stack pointer, use R3 as it gets reset below
+ ldr_imm(jinfo->codebuf, ARM_R3, ARM_R2, THREAD_JAVA_SP, 1, 0);
+ // set SP to the current top of stack
str_imm(jinfo->codebuf, ARM_R3, ARM_R2, THREAD_LAST_JAVA_SP, 1, 0);
-
+
ldr_imm(jinfo->codebuf, ARM_R3, ARM_IP, CONSTANTPOOL_CACHE, 1, 0);
str_imm(jinfo->codebuf, ARM_R3, Ristate, ISTATE_CONSTANTS, 1, 0);
> @@ -5858,7 +5890,8 @@
> add_imm(jinfo->codebuf, ARM_R2, ARM_R2, 4);
> ldr_imm(jinfo->codebuf, ARM_R3, Rthread, THREAD_PENDING_EXC, 1, 0);
> str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_JAVA_SP, 1, 0);
> - str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
> + str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
> + str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_FP, 1, 0);
> cmp_imm(jinfo->codebuf, ARM_R3, 0);
> it(jinfo->codebuf, COND_NE, IT_MASK_T);
> bl(jinfo->codebuf, handlers[H_HANDLE_EXCEPTION_NO_REGS]);
... :
@@ -5858,7 +5870,15 @@
add_imm(jinfo->codebuf, ARM_R2, ARM_R2, 4);
ldr_imm(jinfo->codebuf, ARM_R3, Rthread, THREAD_PENDING_EXC, 1, 0);
str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_JAVA_SP, 1, 0);
- str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
+
+ // set SP to zero before setting the FP
+ mov_imm(jinfo->codebuf, ARM_LR, 0);
+ str_imm(jinfo->codebuf, ARM_LR, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
+ // set FP to the top zero frame
+ str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_FP, 1, 0);
+ // set SP to the current top of stack
+ str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
+
cmp_imm(jinfo->codebuf, ARM_R3, 0);
it(jinfo->codebuf, COND_NE, IT_MASK_T);
bl(jinfo->codebuf, handlers[H_HANDLE_EXCEPTION_NO_REGS]);
> @@ -5975,7 +6009,8 @@
> add_imm(jinfo->codebuf, ARM_R2, ARM_R2, 4);
> ldr_imm(jinfo->codebuf, ARM_R3, Rthread, THREAD_PENDING_EXC, 1, 0);
> str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_JAVA_SP, 1, 0);
> - str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
> + str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
> + str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_FP, 1, 0);
> cmp_imm(jinfo->codebuf, ARM_R3, 0);
> it(jinfo->codebuf, COND_NE, IT_MASK_T);
> bl(jinfo->codebuf, handlers[H_HANDLE_EXCEPTION_NO_REGS]);
... :
@@ -5974,8 +5995,17 @@
ldr_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_TOP_ZERO_FRAME, 1, 0);
add_imm(jinfo->codebuf, ARM_R2, ARM_R2, 4);
ldr_imm(jinfo->codebuf, ARM_R3, Rthread, THREAD_PENDING_EXC, 1, 0);
- str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_JAVA_SP, 1, 0);
- str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
+
+ // set SP to zero before setting the FP
+ mov_imm(jinfo->codebuf, ARM_LR, 0);
+ str_imm(jinfo->codebuf, ARM_LR, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
+ // set FP to the top zero frame
+ str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_FP, 1, 0);
+ // get the stack pointer, use LR
+ ldr_imm(jinfo->codebuf, ARM_LR, Rthread, THREAD_JAVA_SP, 1, 0);
+ // set SP to the current top of stack
+ str_imm(jinfo->codebuf, ARM_LR, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
+
cmp_imm(jinfo->codebuf, ARM_R3, 0);
it(jinfo->codebuf, COND_NE, IT_MASK_T);
bl(jinfo->codebuf, handlers[H_HANDLE_EXCEPTION_NO_REGS]);
> @@ -6014,7 +6050,8 @@
> add_imm(jinfo->codebuf, ARM_R2, ARM_R2, 4);
> ldr_imm(jinfo->codebuf, ARM_R3, Rthread, THREAD_PENDING_EXC, 1, 0);
> str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_JAVA_SP, 1, 0);
> - str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
> + str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_FP, 1, 0);
> + str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
> cmp_imm(jinfo->codebuf, ARM_R3, 0);
> it(jinfo->codebuf, COND_NE, IT_MASK_T);
> bl(jinfo->codebuf, handlers[H_HANDLE_EXCEPTION_NO_REGS]);
... :
@@ -6014,7 +6045,15 @@
add_imm(jinfo->codebuf, ARM_R2, ARM_R2, 4);
ldr_imm(jinfo->codebuf, ARM_R3, Rthread, THREAD_PENDING_EXC, 1, 0);
str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_JAVA_SP, 1, 0);
- str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
+
+ // set SP to zero before setting the FP, R7 appears to be unused
+ mov_imm(jinfo->codebuf, ARM_LR, 0);
+ str_imm(jinfo->codebuf, ARM_LR, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
+ // set FP to the top zero frame
+ str_imm(jinfo->codebuf, ARM_R1, Rthread, THREAD_LAST_JAVA_FP, 1, 0);
+ // set SP to the current top of stack
+ str_imm(jinfo->codebuf, ARM_R2, Rthread, THREAD_LAST_JAVA_SP, 1, 0);
+
cmp_imm(jinfo->codebuf, ARM_R3, 0);
it(jinfo->codebuf, COND_NE, IT_MASK_T);
bl(jinfo->codebuf, handlers[H_HANDLE_EXCEPTION_NO_REGS]);
The diff of patches/arm.patch is not a diff; it is the whole file.
Some comments inline:
> diff -r 886f06403d1c patches/arm.patch
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/patches/arm.patch Thu Oct 20 09:35:07 2011 -0400
> @@ -0,0 +1,232 @@
...
> +diff -Nru openjdk.orig/hotspot/src/cpu/zero/vm/bytecodes_zero.cpp openjdk/hotspot/src/cpu/zero/vm/bytecodes_zero.cpp
> +--- openjdk.orig/hotspot/src/cpu/zero/vm/bytecodes_zero.cpp 2010-05-28 11:10:30.000000000 +0100
> ++++ openjdk/hotspot/src/cpu/zero/vm/bytecodes_zero.cpp 2010-06-17 17:29:30.000000000 +0100
> +@@ -1,6 +1,7 @@
> + /*
> + * Copyright (c) 2003, Oracle and/or its affiliates. All rights reserved.
> + * Copyright 2007 Red Hat, Inc.
> ++ * Copyright 2009 Edward Nevill
> + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
> + *
> + * This code is free software; you can redistribute it and/or modify it
> +@@ -27,5 +28,54 @@
...
> ++ def(_iload_iload, "iload_iload", "bi_i",NULL, T_INT, 2, false, _iload);
> ++ def(_iload_iload_N, "ilaod_iload_N", "bi_", NULL, T_INT, 2, false, _iload);
I think there is a typo here: "ilaod_iload_N" should be "iload_iload_N".
> ++
> ++ def(_iload_0_iconst_N, "iload_0_iconst_N", "b_", NULL, T_INT, 2, false, _iload_0);
...
Cheers and have a great day!
Xerxes