[aarch64-port-dev ] System.arraycopy() intrinsics

Andrew Dinn adinn at redhat.com
Mon Sep 23 08:26:45 PDT 2013


I pulled down the recent arraycopy fixes after working out that the
problems I have been seeing in C2 are related to array copying. One
aspect of the fix is itself a problem.

> diff -r 423577eb8f6e -r 7c900775ce48 src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp
> --- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp	Fri Sep 13 18:22:52 2013 +0100
> +++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp	Thu Sep 19 11:18:32 2013 +0100
> . . .
> @@ -2488,7 +2489,13 @@
>    bool aligned = (flags & LIR_OpArrayCopy::unaligned) == 0;
>    const char *name;
>    address entry = StubRoutines::select_arraycopy_function(basic_type, aligned, disjoint, name, false);
> -  __ call_VM_leaf(entry, 3);
> +
> + CodeBlob *cb = CodeCache::find_blob(entry);
> + if (cb) {
> +   __ bl(RuntimeAddress(entry));
> + } else {
> +   __ call_VM_leaf(entry, 3);
> + }
> 
>    __ bind(*stub->continuation());
>  }

This is breaking C2 because it mixes up the calling conventions for the
arraycopy function returned by select_arraycopy_function -- sometimes
calling it as (generated) ARM code and sometimes as an x86-compiled C++
function (the combination of a bl call with a RuntimeAddress argument is
an error waiting to happen).

This happens because the C2 compiler manages the planting of runtime
calls in generic code and always asks the backend generator to plant an
x86 call to whatever address it decides is appropriate. So, when
select_arraycopy_function returns a pointer to generated ARM code, C2
ends up trying to call it as x86 code.
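To make the mismatch concrete, here is a minimal illustrative sketch (not HotSpot code -- the names `register_generated_stub`, `pick_call_per_site` and `pick_call_uniform` are hypothetical). It models the two approaches: the per-call-site find_blob() check from the quoted code above, versus giving every stub one uniform calling convention so the shared compiler code can always plant the same kind of call.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Which mechanism a call site uses to reach an entry point.
enum CallKind { ARM_BL, X86_RUNTIME_CALL };

// Stand-in for the code cache: the set of generated (ARM) stub entries.
static std::set<uintptr_t> generated_entries;

void register_generated_stub(uintptr_t entry) {
  generated_entries.insert(entry);
}

// The "hack": each backend call site inspects the address and picks a
// convention, mirroring the find_blob() check in the quoted patch.
CallKind pick_call_per_site(uintptr_t entry) {
  return generated_entries.count(entry) ? ARM_BL : X86_RUNTIME_CALL;
}

// The "proper fix": every stub is made callable through the one
// convention the shared (C2) code always uses, so no per-site check.
CallKind pick_call_uniform(uintptr_t /*entry*/) {
  return X86_RUNTIME_CALL;
}
```

The per-site version is fragile precisely because C2's generic code never performs the check: it always plants the uniform call, so any ARM-only entry breaks.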

A hack to fix this would be to add a find_blob call to the backend
generator (as per the code shown above) and have the backend plant the
required type of call. The proper fix is to make the generated ARM blobs
x86-callable so that they are reached via a kosher runtime call in both
C1 and C2.

That requires adding

   __ c_stub_prolog(3, 0, MacroAssembler::ret_type_void);

at the start of the StubGenerator methods generate_disjoint_copy(),
generate_conjoint_copy() and generate_checkcast_copy(). It also requires
changing the code above to remove the if/else clauses and make only the
call_VM_leaf call, plus reverting a similar call in c1_LIRAssembler (to
the generic copy routine) back to a call_VM_leaf call. This restores
both calls to match the equivalent x86 code.
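The role of the c_stub_prolog can be pictured with a small sketch (again illustrative only, not the HotSpot implementation -- `ArmArgs`, `stub_body` and `stub_with_prolog` are hypothetical names): a thin adapter on the front of a generated stub marshals an incoming C-convention call into whatever convention the stub body actually uses, so a single uniform call mechanism works for every entry point.

```cpp
// Hypothetical stub-body convention: arguments live in fixed "registers".
struct ArmArgs {
  long r0, r1, r2;
};

// The stub body only understands its own (ARM-register) convention.
long stub_body(ArmArgs a) {
  return a.r0 + a.r1 + a.r2;
}

// The "prolog": accepts a plain C call, shuffles the arguments into the
// stub-body convention, then falls through into the real stub code.
long stub_with_prolog(long src, long dst, long count) {
  ArmArgs a{src, dst, count};
  return stub_body(a);
}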

The patch below fixes this problem in C2 by making the stubs
x86-callable. With this patch, C2 now runs a full javac of Hello.java to
completion; javac also completes when using TieredCompilation.

I still need to rebuild a C1-only build and ensure that C1 can run javac
on HelloWorld on its own. If that works I will check in the patch below.


regards,


Andrew Dinn
-----------

diff -r fdbe037fccad src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp	Thu Sep 19 18:19:30 2013 +0100
+++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp	Mon Sep 23 16:09:16 2013 +0100
@@ -2379,7 +2379,7 @@
         __ load_klass(c_rarg4, dst);
        __ ldr(c_rarg4, Address(c_rarg4, ObjArrayKlass::element_klass_offset()));
        __ ldrw(c_rarg3, Address(c_rarg4, Klass::super_check_offset_offset()));
-        __ call(RuntimeAddress(copyfunc_addr));
+        __ call_VM_leaf(copyfunc_addr, 5);

 #ifndef PRODUCT
         if (PrintC1Statistics) {
@@ -2491,12 +2491,7 @@
   const char *name;
  address entry = StubRoutines::select_arraycopy_function(basic_type, aligned, disjoint, name, false);

- CodeBlob *cb = CodeCache::find_blob(entry);
- if (cb) {
-   __ bl(RuntimeAddress(entry));
- } else {
-   __ call_VM_leaf(entry, 3);
- }
+  __ call_VM_leaf(entry, 3);

   __ bind(*stub->continuation());
 }
diff -r fdbe037fccad src/cpu/aarch64/vm/stubGenerator_aarch64.cpp
--- a/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp	Thu Sep 19 18:19:30 2013 +0100
+++ b/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp	Mon Sep 23 16:09:16 2013 +0100
@@ -1120,6 +1120,9 @@
     __ align(CodeEntryAlignment);
     StubCodeMark mark(this, "StubRoutines", name);
     address start = __ pc();
+    // this buffer needs to be x86_callable
+    __ c_stub_prolog(3, 0, MacroAssembler::ret_type_void);
+    __ align(CodeEntryAlignment);
     if (entry != NULL) {
       *entry = __ pc();
       // caller can pass a 64-bit byte count here (from Unsafe.copyMemory)
@@ -1170,6 +1173,10 @@
     StubCodeMark mark(this, "StubRoutines", name);
     address start = __ pc();

+    // this buffer needs to be x86_callable
+    __ c_stub_prolog(3, 0, MacroAssembler::ret_type_void);
+    __ align(CodeEntryAlignment);
+
     __ cmp(d, s);
     __ br(Assembler::LS, nooverlap_target);

@@ -1484,6 +1491,7 @@
     __ align(CodeEntryAlignment);
     StubCodeMark mark(this, "StubRoutines", name);
     address start = __ pc();
+    __ c_stub_prolog(3, 0, MacroAssembler::ret_type_integral);

     __ enter(); // required for proper stackwalking of RuntimeStub frame




