From edward.nevill at linaro.org Thu May 1 14:07:49 2014
From: edward.nevill at linaro.org (Edward Nevill)
Date: Thu, 01 May 2014 15:07:49 +0100
Subject: [aarch64-port-dev ] RFR: Fix instruction size from 8 to 4
Message-ID: <1398953269.20704.8.camel@localhost.localdomain>

Hi,

I noticed while looking at the assembly output from C2 that the loops were not being aligned correctly as in this short code section

  0x0000007f8d170a84: mov w0, wzr
  0x0000007f8d170a88: nop                 ;*iload_1
                                          ; - dhry::execute at 9 (line 8)
<<<<< Incorrectly aligned here
  ;; B6: # B6 B7 <- B5 B6 Loop: B6-B6 inner Freq: 4.99998

  0x0000007f8d170a8c: add x12, x10, w11, sxtw #2
  0x0000007f8d170a90: ldr w12, [x12,#16]
  0x0000007f8d170a94: add w0, w0, w12     ;*iadd
                                          ; - dhry::execute at 15 (line 8)
  0x0000007f8d170a98: add w11, w11, #0x1  ;*iinc
                                          ; - dhry::execute at 17 (line 7)
  0x0000007f8d170a9c: cmp w11, w1
  0x0000007f8d170aa0: b.lt 0x0000007f8d170a8c

The reason for this is that instruction_size is incorrectly defined in nativeInst_aarch64.hpp. It was defined as BytesPerWord which is of course 8. I have changed this to '4'.

The code works out the no. of nops to emit as padding/instruction_size which explains why it emitted too few nops in the above.

I have applied the patch below and the code is now generated as

  0x0000007f88d83cc4: mov w0, wzr
  0x0000007f88d83cc8: nop
  0x0000007f88d83ccc: nop                 ;*iload_1
                                          ; - dhry::execute at 9 (line 8)
  ;; B6: # B6 B7 <- B5 B6 Loop: B6-B6 inner Freq: 4.99998

  0x0000007f88d83cd0: add x12, x10, w11, sxtw #2
  0x0000007f88d83cd4: ldr w12, [x12,#16]
  0x0000007f88d83cd8: add w0, w0, w12     ;*iadd
                                          ; - dhry::execute at 15 (line 8)
  0x0000007f88d83cdc: add w11, w11, #0x1  ;*iinc
                                          ; - dhry::execute at 17 (line 7)
  0x0000007f88d83ce0: cmp w11, w1
  0x0000007f88d83ce4: b.lt 0x0000007f88d83cd0

OK to push?
Ed.

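The arithmetic behind the missing nop can be sketched as below. This is a standalone illustration, not HotSpot code: `padding_to_align` and `loop_head_after_nops` are hypothetical helper names, and the 16-byte loop alignment is an assumption inferred from the addresses in the two listings (the corrected loop head sits at ...cd0).

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of one AArch64 instruction, and therefore of one nop.
constexpr std::uint64_t kAArch64InsnBytes = 4;

// Bytes needed to move `pc` up to the next multiple of `alignment`.
// `alignment` must be a power of two (16 is assumed in the usage note).
constexpr std::uint64_t padding_to_align(std::uint64_t pc, std::uint64_t alignment) {
    return (alignment - (pc & (alignment - 1))) & (alignment - 1);
}

// Where the loop head ends up if the number of nops is computed as
// padding / instruction_size, which is the calculation the email describes.
// Each emitted nop still occupies 4 bytes, whatever instruction_size says.
constexpr std::uint64_t loop_head_after_nops(std::uint64_t pc,
                                             std::uint64_t alignment,
                                             std::uint64_t instruction_size) {
    return pc + (padding_to_align(pc, alignment) / instruction_size) * kAArch64InsnBytes;
}
```

With the first nop at 0x7f8d170a88 there are 8 bytes of padding to fill. Dividing by 8 gives one nop and a loop head at ...a8c, as in the first listing; dividing by 4 gives two nops and an aligned loop head, as in the second.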
--- CUT HERE --- # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1398952656 -3600 # Thu May 01 14:57:36 2014 +0100 # Node ID f67f9b1b52ae8b1778dacb49df641bb5b6e48da1 # Parent 9d641fdeea4d1772617b097fc231dcff8e4aa634 Fix instruction size from 8 to 4 diff -r 9d641fdeea4d -r f67f9b1b52ae src/cpu/aarch64/vm/nativeInst_aarch64.hpp --- a/src/cpu/aarch64/vm/nativeInst_aarch64.hpp Tue Apr 29 14:58:56 2014 +0100 +++ b/src/cpu/aarch64/vm/nativeInst_aarch64.hpp Thu May 01 14:57:36 2014 +0100 @@ -54,7 +54,7 @@ class NativeInstruction VALUE_OBJ_CLASS_SPEC { friend class Relocation; public: - enum { instruction_size = BytesPerWord }; + enum { instruction_size = 4 }; inline bool is_nop(); bool is_dtrace_trap(); inline bool is_call(); --- CUT HERE --- From aph at redhat.com Thu May 1 15:06:54 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 01 May 2014 16:06:54 +0100 Subject: [aarch64-port-dev ] RFR: Fix instruction size from 8 to 4 In-Reply-To: <1398953269.20704.8.camel@localhost.localdomain> References: <1398953269.20704.8.camel@localhost.localdomain> Message-ID: <5362630E.1070201@redhat.com> On 05/01/2014 03:07 PM, Edward Nevill wrote: > OK to push? Yes, thanks. Andrew. From ed at camswl.com Thu May 1 17:57:46 2014 From: ed at camswl.com (ed at camswl.com) Date: Thu, 01 May 2014 17:57:46 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Fix instruction size from 8 to 4 Message-ID: <201405011757.s41Hvl8v014483@aojmv0008> Changeset: f67f9b1b52ae Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-01 14:57 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/f67f9b1b52ae Fix instruction size from 8 to 4 ! 
src/cpu/aarch64/vm/nativeInst_aarch64.hpp From openjdk-testing at linaro.org Fri May 2 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Fri, 2 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140502130028.1248F1F71A@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/122/summary.html =============================================================================== client-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/16 pass: 429; fail: 5; error: 4 Build 1: aarch64/2014/apr/17 pass: 429; fail: 5; error: 4 Build 2: aarch64/2014/apr/24 pass: 418; fail: 5; error: 15 Build 3: aarch64/2014/apr/25 pass: 421; fail: 5; error: 12 Build 4: aarch64/2014/apr/26 pass: 421; fail: 5; error: 12 Build 5: aarch64/2014/apr/30 pass: 422; fail: 5; error: 11 Build 6: aarch64/2014/may/02 pass: 431; fail: 5; error: 2 ------------------------------------------------------------------------------- =============================================================================== client-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/16 pass: 2,936; error: 36 Build 1: aarch64/2014/apr/17 pass: 2,937; error: 35 Build 2: aarch64/2014/apr/24 pass: 2,916; error: 56 Build 3: aarch64/2014/apr/25 pass: 2,917; error: 55 Build 4: aarch64/2014/apr/26 pass: 2,917; error: 55 Build 5: aarch64/2014/apr/30 pass: 2,917; error: 55 Build 6: aarch64/2014/may/02 pass: 2,941; error: 31 ------------------------------------------------------------------------------- =============================================================================== 
client-release/jdk =============================================================================== Build 0: aarch64/2014/apr/16 pass: 5,231; fail: 156; error: 63 Build 1: aarch64/2014/apr/17 pass: 5,231; fail: 157; error: 62 Build 2: aarch64/2014/apr/24 pass: 4,886; fail: 474; error: 90 Build 3: aarch64/2014/apr/25 pass: 4,905; fail: 474; error: 71 Build 4: aarch64/2014/apr/26 pass: 4,905; fail: 472; error: 72 Build 5: aarch64/2014/apr/30 pass: 4,904; fail: 473; error: 72 Build 6: aarch64/2014/may/02 pass: 5,272; fail: 131; error: 46 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From openjdk-testing at linaro.org Fri May 2 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Fri, 2 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140502130028.354581F62D@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/122/summary.html =============================================================================== server-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/16 pass: 433; fail: 1; error: 4 Build 1: aarch64/2014/apr/17 pass: 433; fail: 1; error: 4 Build 2: aarch64/2014/apr/24 pass: 416; fail: 1; error: 21 Build 3: aarch64/2014/apr/25 pass: 425; fail: 1; error: 12 Build 4: aarch64/2014/apr/26 pass: 423; fail: 1; error: 14 Build 5: aarch64/2014/apr/30 pass: 421; fail: 2; error: 15 Build 6: aarch64/2014/may/02 pass: 434; fail: 2; error: 2 ------------------------------------------------------------------------------- =============================================================================== server-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/16 pass: 2,934; error: 38 Build 1: aarch64/2014/apr/17 pass: 2,933; error: 39 Build 2: aarch64/2014/apr/24 pass: 2,894; error: 78 Build 3: aarch64/2014/apr/25 pass: 2,911; error: 61 Build 4: aarch64/2014/apr/26 pass: 2,912; error: 60 Build 5: aarch64/2014/apr/30 pass: 2,915; error: 57 Build 6: aarch64/2014/may/02 pass: 2,939; error: 33 ------------------------------------------------------------------------------- =============================================================================== server-release/jdk =============================================================================== Build 0: aarch64/2014/apr/16 pass: 5,242; fail: 134; error: 74 Build 1: aarch64/2014/apr/17 pass: 5,230; fail: 146; error: 74 Build 2: aarch64/2014/apr/24 pass: 4,699; fail: 472; error: 279 Build 3: aarch64/2014/apr/25 pass: 4,725; fail: 472; error: 253 Build 4: aarch64/2014/apr/26 pass: 4,723; fail: 475; error: 251 Build 5: aarch64/2014/apr/30 pass: 4,725; fail: 
470; error: 254
Build 6: aarch64/2014/may/02 pass: 5,285; fail: 124; error: 40
-------------------------------------------------------------------------------

Previous results can be found here:

http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html

From ed at camswl.com Wed May 7 16:13:08 2014
From: ed at camswl.com (Edward Nevill)
Date: Wed, 07 May 2014 17:13:08 +0100
Subject: [aarch64-port-dev ] Improvements to safepoint polling
Message-ID: <1399479188.1986.22.camel@mint>

Hi,

The following patch makes some improvements to safepoint polling.

The existing code uses adrp to get the address of the polling page in the case of a return poll as in the following output from C2:-

  0x00007f953905eb0c: adrp xscratch1, 0x00007f9543e57000 ; {poll_return}
  0x00007f953905eb10: ldr wzr, [xscratch1,#256]          ; {poll_return}
  0x00007f953905eb14: ret

However for an inline poll it generates the following:-

  0x00007f953905eb30: movz x12, #0x7000
  0x00007f953905eb34: movk x12, #0x43e5, lsl #16
  0x00007f953905eb38: movk x12, #0x7f95, lsl #32 ; OopMap{[0]=Oop off=156}
                                                 ;*goto
                                                 ; - MethodAtom::execute at 46 (line 35)
  0x00007f953905eb3c: ldr wzr, [x12]             ;*goto
                                                 ; - MethodAtom::execute at 46 (line 35)
                                                 ; {poll}

Note also that for the poll_return case it uses the SafepointPollOffset of 256, whereas for the inline poll it does not and uses an offset of 0.

The patch below causes it to generate adrp in all cases.

I have set SafepointPollOffset to 0 as there is no reason for it to be 256. Having it set to 0 in all cases saves trashing a cache line.

Also I have removed references to SafepointPollOffset in aarch64.ad since SafepointPollOffset is a C1 concept (defined in c1_globals). No other arch references SafepointPollOffset in C2.

The patch involves some trickiness in the relocation code because the adrp may be subject to CSE / Loop invariance hoisting and therefore may be separated from the ldr.

OK to push?
Ed.

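To make the movz/movk versus adrp comparison concrete, here is a small standalone sketch of the address arithmetic, using the addresses from the listings above. The helper names (`mov_imm48`, `split_for_adrp`, `adrp_reachable`, `adrp_result`) are hypothetical and do not exist in HotSpot; the sketch only models what the instructions compute, not how the assembler encodes them.

```cpp
#include <cassert>
#include <cstdint>

// What movz + movk(lsl #16) + movk(lsl #32) build: a 48-bit constant
// assembled from three 16-bit chunks, one instruction per chunk.
constexpr std::uint64_t mov_imm48(std::uint16_t lo, std::uint16_t mid, std::uint16_t hi) {
    return static_cast<std::uint64_t>(lo) |
           (static_cast<std::uint64_t>(mid) << 16) |
           (static_cast<std::uint64_t>(hi) << 32);
}

// adrp encodes a signed distance in 4 KiB pages from the page of the
// instruction; the low 12 bits of the target are not covered by adrp.
struct AdrpParts {
    std::int64_t  page_delta; // target page number minus pc page number
    std::uint64_t low12;      // bits adrp cannot express
};

constexpr AdrpParts split_for_adrp(std::uint64_t pc, std::uint64_t target) {
    return { static_cast<std::int64_t>(target >> 12) - static_cast<std::int64_t>(pc >> 12),
             target & 0xfff };
}

// The adrp immediate is 21 bits, signed, so the target page must lie
// within +/- 2^20 pages (+/- 4 GiB) of the instruction.
constexpr bool adrp_reachable(std::int64_t page_delta) {
    return page_delta >= -(1LL << 20) && page_delta < (1LL << 20);
}

// The value adrp leaves in the destination register.
constexpr std::uint64_t adrp_result(std::uint64_t pc, std::int64_t page_delta) {
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(pc >> 12) + page_delta) << 12;
}
```

Because the polling page is page aligned, `low12` is zero and a single adrp yields the whole address, which is why one instruction can replace the three-instruction movz/movk sequence as long as the page is within adrp range of the code.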
--- CUT HERE --- # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1399477316 -3600 # Wed May 07 16:41:56 2014 +0100 # Node ID 8a569467b81b8589fb4daa87b6d4292bafc206f9 # Parent f67f9b1b52ae8b1778dacb49df641bb5b6e48da1 Improvements to safepoint polling diff -r f67f9b1b52ae -r 8a569467b81b src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Thu May 01 14:57:36 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64.ad Wed May 07 16:41:56 2014 +0100 @@ -1011,7 +1011,7 @@ } if (do_polling() && C->is_method_compilation()) { - address polling_page(os::get_polling_page() + (SafepointPollOffset % os::vm_page_size())); + address polling_page(os::get_polling_page()); __ read_polling_page(rscratch1, polling_page, relocInfo::poll_return_type); } } @@ -2557,6 +2557,15 @@ __ mov(dst_reg, (u_int64_t)1); %} + enc_class aarch64_enc_mov_poll_page(iRegP dst, immPollPage src) %{ + MacroAssembler _masm(&cbuf); + address page = (address)$src$$constant; + Register dst_reg = as_Register($dst$$reg); + unsigned long off; + __ adrp(dst_reg, Address(page, relocInfo::poll_type), off); + assert(off == 0, "assumed offset == 0"); + %} + enc_class aarch64_enc_mov_n(iRegN dst, immN src) %{ MacroAssembler _masm(&cbuf); Register dst_reg = as_Register($dst$$reg); @@ -3726,6 +3735,17 @@ interface(CONST_INTER); %} +// Polling Page Pointer Immediate +operand immPollPage() +%{ + predicate((address)n->get_ptr() == os::get_polling_page()); + match(ConP); + + op_cost(0); + format %{ %} + interface(CONST_INTER); +%} + // Pointer Immediate Minus One // this is used when we want to write the current PC to the thread anchor operand immP_M1() @@ -5058,6 +5078,20 @@ ins_pipe(pipe_class_default); %} +// Load Poll Page Constant + +instruct loadConPollPage(iRegPNoSp dst, immPollPage con) +%{ + match(Set dst con); + + ins_cost(INSN_COST); + format %{ "adr $dst, $con\t# Poll Page Ptr" %} + + ins_encode(aarch64_enc_mov_poll_page(dst, con)); + + ins_pipe(pipe_class_default); +%} + // Load 
Narrow Pointer Constant instruct loadConN(iRegNNoSp dst, immN con) @@ -10815,7 +10849,7 @@ match(SafePoint poll); format %{ - "ldrw zr, [rscratch1, $poll]\t# Safepoint: poll for GC" + "ldrw zr, [$poll]\t# Safepoint: poll for GC" %} ins_encode %{ __ read_polling_page(as_Register($poll$$reg), relocInfo::poll_type); diff -r f67f9b1b52ae -r 8a569467b81b src/cpu/aarch64/vm/c1_globals_aarch64.hpp --- a/src/cpu/aarch64/vm/c1_globals_aarch64.hpp Thu May 01 14:57:36 2014 +0100 +++ b/src/cpu/aarch64/vm/c1_globals_aarch64.hpp Wed May 07 16:41:56 2014 +0100 @@ -74,6 +74,6 @@ define_pd_global(bool, CSEArrayLength, false); define_pd_global(bool, TwoOperandLIRForm, false ); -define_pd_global(intx, SafepointPollOffset, 256 ); +define_pd_global(intx, SafepointPollOffset, 0 ); #endif // CPU_AARCH64_VM_C1_GLOBALS_AARCH64_HPP diff -r f67f9b1b52ae -r 8a569467b81b src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Thu May 01 14:57:36 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Wed May 07 16:41:56 2014 +0100 @@ -94,7 +94,9 @@ offset = adr_page - pc_page; unsigned insn2 = ((unsigned*)branch)[1]; - if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { + if ((address)target == os::get_polling_page()) { + assert(offset_lo == 0, "offset must be 0 for polling page"); + } else if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { // Load/store register (unsigned immediate) unsigned size = Instruction_aarch64::extract(insn2, 31, 30); Instruction_aarch64::patch(branch + sizeof (unsigned), @@ -180,7 +182,9 @@ uint64_t target_page = ((uint64_t)insn_addr) + offset; target_page &= ((uint64_t)-1) << shift; unsigned insn2 = ((unsigned*)insn_addr)[1]; - if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { + if ((address)target_page == os::get_polling_page()) { + return (address)target_page; + } else if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { // Load/store register (unsigned immediate) 
       unsigned int byte_offset = Instruction_aarch64::extract(insn2, 21, 10);
       unsigned int size = Instruction_aarch64::extract(insn2, 31, 30);
--- CUT HERE ---

From aph at redhat.com Fri May 9 08:39:17 2014
From: aph at redhat.com (Andrew Haley)
Date: Fri, 09 May 2014 09:39:17 +0100
Subject: [aarch64-port-dev ] Improvements to safepoint polling
In-Reply-To: <1399479188.1986.22.camel@mint>
References: <1399479188.1986.22.camel@mint>
Message-ID: <536C9435.4050106@redhat.com>

On 05/07/2014 05:13 PM, Edward Nevill wrote:
> OK to push?

OK, thanks.

Andrew.

From ed at camswl.com Fri May 9 08:50:39 2014
From: ed at camswl.com (ed at camswl.com)
Date: Fri, 09 May 2014 08:50:39 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Improvements to safepoint polling
Message-ID: <201405090850.s498oek7028672@aojmv0008>

Changeset: 8a569467b81b
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-05-07 16:41 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/8a569467b81b

Improvements to safepoint polling

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/c1_globals_aarch64.hpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp

From edward.nevill at linaro.org Fri May 9 13:51:24 2014
From: edward.nevill at linaro.org (Edward Nevill)
Date: Fri, 09 May 2014 14:51:24 +0100
Subject: [aarch64-port-dev ] RFR: Optimise C2 entry point verification
Message-ID: <1399643484.8672.11.camel@localhost.localdomain>

Hi,

The following patch makes a small optimisation to the entry point verification for C2.

The patch replaces

  0x00007fc5c111ebe0: ldr wscratch1, [x1,#8] ; {no_reloc}
  0x00007fc5c111ebe4: lsl xscratch1, xscratch1, #3
  0x00007fc5c111ebe8: cmp xscratch2, xscratch1
  0x00007fc5c111ebec: b.eq 0x00007fc5c111ebf4
  0x00007fc5c111ebf0: b 0x00007fc5c10cdb20 ; {runtime_call}
  0x00007fc5c111ebf4: nop
  0x00007fc5c111ebf8: nop
  0x00007fc5c111ebfc: nop
[Verified Entry Point]
  ...

with the following 0x00007f077111ebe0: ldr wscratch1, [x1,#8] ; {no_reloc} 0x00007f077111ebe4: cmp xscratch2, xscratch1, lsl #3 0x00007f077111ebe8: b.eq 0x00007f077111ebf0 0x00007f077111ebec: b 0x00007f07710cdb20 ; {runtime_call} [Verified Entry Point] ... Should we also change CodeEntryAlignment which is currently set to 32? I can see very little point in the code entry alignment being less than a cache line. OK? Ed. --- CUT HERE --- exporting patch: # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1399642797 -3600 # Fri May 09 14:39:57 2014 +0100 # Node ID 1b7ea58b2cf7b5e2dcbdbdecd387abd8bfa30176 # Parent f67f9b1b52ae8b1778dacb49df641bb5b6e48da1 Optimise C2 entry point verification diff -r f67f9b1b52ae -r 1b7ea58b2cf7 src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Thu May 01 14:57:36 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64.ad Fri May 09 14:39:57 2014 +0100 @@ -1421,9 +1421,8 @@ MacroAssembler _masm(&cbuf); // no need to worry about 4-byte of br alignment on AArch64 - __ load_klass(rscratch1, j_rarg0); + __ cmp_klass(j_rarg0, rscratch2, rscratch1); Label skip; - __ cmp(rscratch2, rscratch1); // TODO // can we avoid this skip and still use a reloc? 
__ br(Assembler::EQ, skip); diff -r f67f9b1b52ae -r 1b7ea58b2cf7 src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Thu May 01 14:57:36 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Fri May 09 14:39:57 2014 +0100 @@ -2076,6 +2076,20 @@ } } +void MacroAssembler::cmp_klass(Register oop, Register trial_klass, Register tmp) { + if (UseCompressedClassPointers) { + ldrw(tmp, Address(oop, oopDesc::klass_offset_in_bytes())); + if (Universe::narrow_klass_base() == NULL) { + cmp(trial_klass, tmp, LSL, Universe::narrow_klass_shift()); + return; + } + decode_klass_not_null(tmp); + } else { + ldr(tmp, Address(oop, oopDesc::klass_offset_in_bytes())); + } + cmp(trial_klass, tmp); +} + void MacroAssembler::load_prototype_header(Register dst, Register src) { load_klass(dst, src); ldr(dst, Address(dst, Klass::prototype_header_offset())); diff -r f67f9b1b52ae -r 1b7ea58b2cf7 src/cpu/aarch64/vm/macroAssembler_aarch64.hpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Thu May 01 14:57:36 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Fri May 09 14:39:57 2014 +0100 @@ -703,6 +703,7 @@ // oop manipulations void load_klass(Register dst, Register src); void store_klass(Register dst, Register src); + void cmp_klass(Register oop, Register trial_klass, Register tmp); void load_heap_oop(Register dst, Address src); --- CUT HERE --- From openjdk-testing at linaro.org Sat May 10 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Sat, 10 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140510130031.F1E841F63C@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/130/summary.html =============================================================================== client-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/16 pass: 429; fail: 5; error: 4 Build 1: aarch64/2014/apr/17 pass: 429; fail: 5; error: 4 Build 2: aarch64/2014/apr/24 pass: 418; fail: 5; error: 15 Build 3: aarch64/2014/apr/25 pass: 421; fail: 5; error: 12 Build 4: aarch64/2014/apr/26 pass: 421; fail: 5; error: 12 Build 5: aarch64/2014/apr/30 pass: 422; fail: 5; error: 11 Build 6: aarch64/2014/may/02 pass: 431; fail: 5; error: 2 Build 7: aarch64/2014/may/10 pass: 421; fail: 5; error: 12 ------------------------------------------------------------------------------- =============================================================================== client-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/16 pass: 2,936; error: 36 Build 1: aarch64/2014/apr/17 pass: 2,937; error: 35 Build 2: aarch64/2014/apr/24 pass: 2,916; error: 56 Build 3: aarch64/2014/apr/25 pass: 2,917; error: 55 Build 4: aarch64/2014/apr/26 pass: 2,917; error: 55 Build 5: aarch64/2014/apr/30 pass: 2,917; error: 55 Build 6: aarch64/2014/may/02 pass: 2,941; error: 31 Build 7: aarch64/2014/may/10 pass: 2,917; error: 55 ------------------------------------------------------------------------------- =============================================================================== client-release/jdk =============================================================================== Build 0: aarch64/2014/apr/16 pass: 5,231; fail: 156; error: 63 Build 1: aarch64/2014/apr/17 pass: 5,231; fail: 157; error: 62 Build 2: aarch64/2014/apr/24 pass: 4,886; fail: 474; error: 90 Build 3: aarch64/2014/apr/25 pass: 4,905; fail: 474; error: 71 Build 
4: aarch64/2014/apr/26 pass: 4,905; fail: 472; error: 72 Build 5: aarch64/2014/apr/30 pass: 4,904; fail: 473; error: 72 Build 6: aarch64/2014/may/02 pass: 5,272; fail: 131; error: 46 Build 7: aarch64/2014/may/10 pass: 4,906; fail: 472; error: 71 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From openjdk-testing at linaro.org Sat May 10 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Sat, 10 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140510130031.E25331FAE1@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/130/summary.html =============================================================================== server-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/16 pass: 433; fail: 1; error: 4 Build 1: aarch64/2014/apr/17 pass: 433; fail: 1; error: 4 Build 2: aarch64/2014/apr/24 pass: 416; fail: 1; error: 21 Build 3: aarch64/2014/apr/25 pass: 425; fail: 1; error: 12 Build 4: aarch64/2014/apr/26 pass: 423; fail: 1; error: 14 Build 5: aarch64/2014/apr/30 pass: 421; fail: 2; error: 15 Build 6: aarch64/2014/may/02 pass: 434; fail: 2; error: 2 Build 7: aarch64/2014/may/10 pass: 421; fail: 2; error: 15 ------------------------------------------------------------------------------- =============================================================================== server-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/16 pass: 2,934; error: 38 Build 1: aarch64/2014/apr/17 
pass: 2,933; error: 39 Build 2: aarch64/2014/apr/24 pass: 2,894; error: 78 Build 3: aarch64/2014/apr/25 pass: 2,911; error: 61 Build 4: aarch64/2014/apr/26 pass: 2,912; error: 60 Build 5: aarch64/2014/apr/30 pass: 2,915; error: 57 Build 6: aarch64/2014/may/02 pass: 2,939; error: 33 Build 7: aarch64/2014/may/10 pass: 2,911; error: 61 ------------------------------------------------------------------------------- =============================================================================== server-release/jdk =============================================================================== Build 0: aarch64/2014/apr/16 pass: 5,242; fail: 134; error: 74 Build 1: aarch64/2014/apr/17 pass: 5,230; fail: 146; error: 74 Build 2: aarch64/2014/apr/24 pass: 4,699; fail: 472; error: 279 Build 3: aarch64/2014/apr/25 pass: 4,725; fail: 472; error: 253 Build 4: aarch64/2014/apr/26 pass: 4,723; fail: 475; error: 251 Build 5: aarch64/2014/apr/30 pass: 4,725; fail: 470; error: 254 Build 6: aarch64/2014/may/02 pass: 5,285; fail: 124; error: 40 Build 7: aarch64/2014/may/10 pass: 4,725; fail: 470; error: 254 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From aph at redhat.com Mon May 12 12:10:37 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 12 May 2014 13:10:37 +0100 Subject: [aarch64-port-dev ] RFR: Minor optimisation for divide by 2 In-Reply-To: <1398763004.20174.7.camel@localhost.localdomain> References: <1398763004.20174.7.camel@localhost.localdomain> Message-ID: <5370BA3D.7040605@redhat.com> Hi, On 04/29/2014 10:16 AM, Edward Nevill wrote: > C2 currently generates > > mov rdst, rsrc, asr #31 > mov rdst, rdst, lsr #31 > add rdst, rsrc, rdst > mov rdst, rdst, asr #1 > > for divide by 2. I get asr rdst, rsrc, #31 add rtmp, rsrc, rdst, lsr #31 asr rdst, rtmp, #1 from C2. Which is not as nice as yours, but less worrying. 
> The following patch reduces this to
>
> add rdst, rsrc, rsrc, lsr #31
> mov rdst, rdst, asr #1
>
> I know this is very minor, but it offends me:-)
>
> OK?

Why is there no long version of this?

Andrew.

From ed at camswl.com Mon May 12 12:42:15 2014
From: ed at camswl.com (ed at camswl.com)
Date: Mon, 12 May 2014 12:42:15 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: 2 new changesets
Message-ID: <201405121242.s4CCgISV011579@aojmv0008>

Changeset: 99180a14ca07
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-05-12 13:39 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/99180a14ca07

Optimise C2 entry point verification

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp

Changeset: 6523308f9626
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-05-12 13:41 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/6523308f9626

Make code entry alignment 64 for C2

! src/cpu/aarch64/vm/globals_aarch64.hpp

From ed at camswl.com Mon May 12 13:21:05 2014
From: ed at camswl.com (Edward Nevill)
Date: Mon, 12 May 2014 14:21:05 +0100
Subject: [aarch64-port-dev ] RFR: Optimise addressing of card table byte map base
Message-ID: <1399900865.21025.9.camel@mint>

Hi,

The following patch optimises the addressing of the card table byte map from

  movz x, #N
  movk x, #N << 16
  movk x, #N << 32

to

  adrp x, #N

The implementation is identical to the previous optimisation of addressing the safepoint polling page.

The card table byte map is always page aligned (and this is asserted in the code below).

Regards,
Ed.

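The page-alignment requirement, and what the byte map base is then used for, can be sketched as follows. This is an illustration under stated assumptions rather than HotSpot code: the helper names are hypothetical, the 512-byte card size (shift of 9) is assumed here rather than taken from the patch, and the base address in the usage note is made up.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kAdrpPageSize = 4096; // adrp always works in 4 KiB units
constexpr unsigned      kCardShift    = 9;    // assumption: 512-byte cards

// adrp alone can materialise an address only if its low 12 bits are zero;
// this is the property the patch asserts with "off == 0".
constexpr bool is_page_aligned(std::uint64_t addr) {
    return (addr & (kAdrpPageSize - 1)) == 0;
}

// Once the base is in a register, the card for a heap address is found by
// indexing the byte map with the address shifted right by the card shift.
constexpr std::uint64_t card_address(std::uint64_t byte_map_base, std::uint64_t heap_addr) {
    return byte_map_base + (heap_addr >> kCardShift);
}
```

For a made-up base of 0x7f0012345000, `is_page_aligned` holds and a single adrp suffices; a base such as 0x7f0012345008 would leave 8 bytes that adrp cannot express and would need a follow-up add.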
--- CUT HERE --- # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1399900167 -3600 # Mon May 12 14:09:27 2014 +0100 # Node ID 1a6e4b95d2689fc31c1715bf8bec786a7d66161f # Parent 6523308f9626004171794372e5577a0f6939b4df Optimise addressing of card table byte map base diff -r 6523308f9626 -r 1a6e4b95d268 src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Mon May 12 13:41:43 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64.ad Mon May 12 14:09:27 2014 +0100 @@ -2565,6 +2565,15 @@ assert(off == 0, "assumed offset == 0"); %} + enc_class aarch64_enc_mov_byte_map_base(iRegP dst, immByteMapBase src) %{ + MacroAssembler _masm(&cbuf); + address page = (address)$src$$constant; + Register dst_reg = as_Register($dst$$reg); + unsigned long off; + __ adrp(dst_reg, Address(page, relocInfo::poll_type), off); + assert(off == 0, "assumed offset == 0"); + %} + enc_class aarch64_enc_mov_n(iRegN dst, immN src) %{ MacroAssembler _masm(&cbuf); Register dst_reg = as_Register($dst$$reg); @@ -3745,6 +3754,19 @@ interface(CONST_INTER); %} +// Card Table Byte Map Base +operand immByteMapBase() +%{ + // Get base of card map + predicate((jbyte*)n->get_ptr() == + ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base); + match(ConP); + + op_cost(0); + format %{ %} + interface(CONST_INTER); +%} + // Pointer Immediate Minus One // this is used when we want to write the current PC to the thread anchor operand immP_M1() @@ -5091,6 +5113,20 @@ ins_pipe(pipe_class_default); %} +// Load Byte Map Base Constant + +instruct loadByteMapBase(iRegPNoSp dst, immByteMapBase con) +%{ + match(Set dst con); + + ins_cost(INSN_COST); + format %{ "adr $dst, $con\t# Byte Map Base" %} + + ins_encode(aarch64_enc_mov_byte_map_base(dst, con)); + + ins_pipe(pipe_class_default); +%} + // Load Narrow Pointer Constant instruct loadConN(iRegNNoSp dst, immN con) diff -r 6523308f9626 -r 1a6e4b95d268 src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- 
a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 13:41:43 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 14:09:27 2014 +0100 @@ -94,7 +94,9 @@ offset = adr_page - pc_page; unsigned insn2 = ((unsigned*)branch)[1]; - if ((address)target == os::get_polling_page()) { + if ((jbyte *)target == + ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || + (address)target == os::get_polling_page()) { assert(offset_lo == 0, "offset must be 0 for polling page"); } else if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { // Load/store register (unsigned immediate) @@ -182,7 +184,9 @@ uint64_t target_page = ((uint64_t)insn_addr) + offset; target_page &= ((uint64_t)-1) << shift; unsigned insn2 = ((unsigned*)insn_addr)[1]; - if ((address)target_page == os::get_polling_page()) { + if ((jbyte *)target_page == + ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || + (address)target_page == os::get_polling_page()) { return (address)target_page; } else if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { // Load/store register (unsigned immediate) diff -r 6523308f9626 -r 1a6e4b95d268 src/cpu/aarch64/vm/relocInfo_aarch64.cpp --- a/src/cpu/aarch64/vm/relocInfo_aarch64.cpp Mon May 12 13:41:43 2014 +0100 +++ b/src/cpu/aarch64/vm/relocInfo_aarch64.cpp Mon May 12 14:09:27 2014 +0100 @@ -62,16 +62,7 @@ // fprintf(stderr, "Try to fix poll reloc at %p to %p\n", addr(), dest); if (NativeInstruction::maybe_cpool_ref(addr())) { address old_addr = old_addr_for(addr(), src, dest); - if (! os::is_poll_address(pd_call_destination(old_addr))) { - fprintf(stderr, " bollocks!\n"); - old_addr = old_addr_for(addr(), src, dest); - } MacroAssembler::pd_patch_instruction(addr(), pd_call_destination(old_addr)); - if (! 
os::is_poll_address(pd_call_destination(addr()))) { - fprintf(stderr, " result at %p is %p\n", addr(), pd_call_destination(addr())); - MacroAssembler::pd_patch_instruction(addr(), pd_call_destination(old_addr)); - } - } else { } } --- CUT HERE --- From aph at redhat.com Mon May 12 13:29:35 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 12 May 2014 14:29:35 +0100 Subject: [aarch64-port-dev ] RFR: Optimise addressing of card table byte map base In-Reply-To: <1399900865.21025.9.camel@mint> References: <1399900865.21025.9.camel@mint> Message-ID: <5370CCBF.9000006@redhat.com> On 05/12/2014 02:21 PM, Edward Nevill wrote: > Hi, > > The following patch optimises the addressing of the card table byte map from > > movz x, #N > movk x, #N << 16 > movk x, #N << 32 > > > to > > adrp x, #N > > The implementation is identical to the previous optimisation of addressing the safepoint polling page. > > The card table byte map is always page aligned (and this is asserted in the code below). Comments inline. > > --- CUT HERE --- > # HG changeset patch > # User Edward Nevill edward.nevill at linaro.org > # Date 1399900167 -3600 > # Mon May 12 14:09:27 2014 +0100 > # Node ID 1a6e4b95d2689fc31c1715bf8bec786a7d66161f > # Parent 6523308f9626004171794372e5577a0f6939b4df > Optimise addressing of card table byte map base > > diff -r 6523308f9626 -r 1a6e4b95d268 src/cpu/aarch64/vm/aarch64.ad > --- a/src/cpu/aarch64/vm/aarch64.ad Mon May 12 13:41:43 2014 +0100 > +++ b/src/cpu/aarch64/vm/aarch64.ad Mon May 12 14:09:27 2014 +0100 > @@ -2565,6 +2565,15 @@ > assert(off == 0, "assumed offset == 0"); > %} > > + enc_class aarch64_enc_mov_byte_map_base(iRegP dst, immByteMapBase src) %{ > + MacroAssembler _masm(&cbuf); > + address page = (address)$src$$constant; > + Register dst_reg = as_Register($dst$$reg); > + unsigned long off; > + __ adrp(dst_reg, Address(page, relocInfo::poll_type), off); Is this reloc type correct? It should be an external address, no? What is the cost of this insn? 
> + assert(off == 0, "assumed offset == 0"); > + %} > + > diff -r 6523308f9626 -r 1a6e4b95d268 src/cpu/aarch64/vm/macroAssembler_aarch64.cpp > --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 13:41:43 2014 +0100 > +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 14:09:27 2014 +0100 > @@ -94,7 +94,9 @@ > offset = adr_page - pc_page; > > unsigned insn2 = ((unsigned*)branch)[1]; > - if ((address)target == os::get_polling_page()) { > + if ((jbyte *)target == > + ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || > + (address)target == os::get_polling_page()) { > assert(offset_lo == 0, "offset must be 0 for polling page"); > } else if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { > // Load/store register (unsigned immediate) > @@ -182,7 +184,9 @@ > uint64_t target_page = ((uint64_t)insn_addr) + offset; > target_page &= ((uint64_t)-1) << shift; > unsigned insn2 = ((unsigned*)insn_addr)[1]; > - if ((address)target_page == os::get_polling_page()) { > + if ((jbyte *)target_page == > + ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || > + (address)target_page == os::get_polling_page()) { > return (address)target_page; > } else if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { > // Load/store register (unsigned immediate) I'm getting rather unhappy about these special cases in the reloc code. Andrew. From ed at camswl.com Mon May 12 14:45:10 2014 From: ed at camswl.com (Edward Nevill) Date: Mon, 12 May 2014 15:45:10 +0100 Subject: [aarch64-port-dev ] RFR: Minor optimisation for divide by 2 In-Reply-To: <5370BA3D.7040605@redhat.com> References: <1398763004.20174.7.camel@localhost.localdomain> <5370BA3D.7040605@redhat.com> Message-ID: <1399905910.21025.15.camel@mint> On Mon, 2014-05-12 at 13:10 +0100, Andrew Haley wrote: > Why is there no long version of this? Here is the patch for the long case, Regards, Ed. 
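[Editor's note] The long divide-by-2 patch that follows relies on a sign-bit rounding identity: for a signed 64-bit x, truncating division by 2 equals (x + s) >> 1, where s is the sign bit of x taken as 0 or 1. The two-shift form (x >> 63) >>> 63 is just that sign bit, which is why it can be folded into a single LSR #63. Below is a minimal C++ sketch of the arithmetic only. The function names are hypothetical and are not part of the patch, and the code assumes that >> on a negative signed value is an arithmetic shift (guaranteed from C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cstdint>

// Sign bit of x as 0 or 1, computed with one logical shift (what "lsr dst, src, #63" yields).
static inline int64_t sign_bit_one_shift(int64_t x) {
    return static_cast<int64_t>(static_cast<uint64_t>(x) >> 63);
}

// The same value computed the long way: arithmetic shift by 63, then logical shift by 63.
// Assumes >> on a negative int64_t is an arithmetic shift.
static inline int64_t sign_bit_two_shifts(int64_t x) {
    return static_cast<int64_t>(static_cast<uint64_t>(x >> 63) >> 63);
}

// Truncating divide by 2 without a divide instruction:
// add the sign bit (rounds negative values toward zero), then shift right by one.
// The addition cannot overflow: the sign bit is 1 only when x is negative.
static inline int64_t div2_round(int64_t x) {
    return (x + sign_bit_one_shift(x)) >> 1;
}
```

For example, div2_round(-7) adds 1 to get -6 and shifts to -3, which matches -7 / 2 in C++ and Java.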
--- CUT HERE --- # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1399905423 -3600 # Mon May 12 15:37:03 2014 +0100 # Node ID 5fb4ac5c2400d120e9ea1de2d3c6a218b71ea9b2 # Parent 6523308f9626004171794372e5577a0f6939b4df Optimise long divide by 2 diff -r 6523308f9626 -r 5fb4ac5c2400 src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Mon May 12 13:41:43 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64.ad Mon May 12 15:37:03 2014 +0100 @@ -3464,6 +3464,16 @@ interface(CONST_INTER); %} +operand immL_63() +%{ + predicate(n->get_int() == 63); + match(ConI); + + op_cost(0); + format %{ %} + interface(CONST_INTER); +%} + operand immL_255() %{ predicate(n->get_int() == 255); @@ -7353,6 +7363,30 @@ ins_pipe(pipe_class_default); %} +instruct signExtractL(iRegLNoSp dst, iRegL src, immL_63 div1, immL_63 div2) %{ + match(Set dst (URShiftL (RShiftL src div1) div2)); + ins_cost(INSN_COST); + format %{ "lsr $dst, $src, $div1" %} + ins_encode %{ + __ lsr(as_Register($dst$$reg), as_Register($src$$reg), 63); + %} + ins_pipe(pipe_class_default); +%} + +instruct div2RoundL(iRegLNoSp dst, iRegL src, immL_63 div1, immL_63 div2) %{ + match(Set dst (AddL src (URShiftL (RShiftL src div1) div2))); + ins_cost(INSN_COST); + format %{ "add $dst, $src, $div1" %} + + ins_encode %{ + __ add(as_Register($dst$$reg), + as_Register($src$$reg), + as_Register($src$$reg), + Assembler::LSR, 63); + %} + ins_pipe(pipe_class_default); +%} + // Integer Remainder instruct modI(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{ --- CUT HERE --- From aph at redhat.com Mon May 12 16:33:10 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 12 May 2014 17:33:10 +0100 Subject: [aarch64-port-dev ] Optimize pushes and pops Message-ID: <5370F7C6.4050701@redhat.com> This patch converts the old sequences of pre- and post-incremented stores and loads into stores and loads without writeback. This avoids pipeline stalls with address dependencies. 
After this patch we get: 0x00007fffd10d1ff8: stp x1, x2, [sp,#-48]! 0x00007fffd10d1ffc: stp x3, x4, [sp,#16] 0x00007fffd10d2000: stp x5, x6, [sp,#32] and 0x00007fffd10d202c: ldp xscratch1, xmethod, [sp],#16 0x00007fffd10d2030: ldp x1, x2, [sp] 0x00007fffd10d2034: ldp x3, x4, [sp,#16] 0x00007fffd10d2038: ldp x5, x6, [sp,#32] 0x00007fffd10d203c: add sp, sp, #0x30 Andrew. # HG changeset patch # User aph # Date 1399908399 -3600 # Mon May 12 16:26:39 2014 +0100 # Node ID 3852a506a19bb79e0a77d8474978f09484fc3fed # Parent ac30fdebd5f5811d768d493d58d40852cff0886c Tidy up stack frame handling. diff -r ac30fdebd5f5 -r 3852a506a19b src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Mon May 12 14:34:00 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64.ad Mon May 12 16:26:39 2014 +0100 @@ -911,19 +911,16 @@ if (C->need_stack_bang(framesize)) __ generate_stack_overflow_check(framesize); - // push lr and rfp to create a frame - __ stp(rfp, lr, Address(__ pre(sp, -2 * wordSize))); - - // allow for already pushed values - framesize -= 2 * wordSize; - - if (framesize) { - if (Assembler::operand_valid_for_add_sub_immediate(framesize)) { - __ sub(sp, sp, framesize); - } else { - __ mov(rscratch1, framesize); - __ sub(sp, sp, rscratch1); - } + if (framesize == 0) { + // Is this even possible? 
+ __ stp(rfp, lr, Address(__ pre(sp, -2 * wordSize))); + } else if (framesize < (1 << 12)) { + __ sub(sp, sp, framesize); + __ stp(rfp, lr, Address(sp, framesize - 2 * wordSize)); + } else { + __ stp(rfp, lr, Address(__ pre(sp, -2 * wordSize))); + __ mov(rscratch1, framesize - 2 * wordSize); + __ sub(sp, sp, rscratch1); } if (NotifySimulator) { @@ -993,19 +990,17 @@ MacroAssembler _masm(&cbuf); int framesize = C->frame_slots() << LogBytesPerInt; - framesize -= 2 * wordSize; - - if (framesize) { - if (Assembler::operand_valid_for_add_sub_immediate(framesize)) { - __ add(sp, sp, framesize); - } else { - __ mov(rscratch1, framesize); - __ add(sp, sp, rscratch1); - } + if (framesize == 0) { + __ ldp(rfp, lr, Address(__ post(sp, 2 * wordSize))); + } else if (framesize < (1 << 12)) { + __ ldp(rfp, lr, Address(sp, framesize - 2 * wordSize)); + __ add(sp, sp, framesize); + } else { + __ mov(rscratch1, framesize - 2 * wordSize); + __ add(sp, sp, rscratch1); + __ ldp(rfp, lr, Address(__ post(sp, 2 * wordSize))); } - __ ldp(rfp, lr, Address(__ post(sp, 2 * wordSize))); - if (NotifySimulator) { __ notify(Assembler::method_reentry); } diff -r ac30fdebd5f5 -r 3852a506a19b src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 14:34:00 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 16:26:39 2014 +0100 @@ -1665,7 +1665,7 @@ // Scan bitset to accumulate register pairs unsigned char regs[32]; - unsigned count = 0; + int count = 0; for (int reg = 0; reg <= 30; reg++) { if (1 & bitset) regs[count++] = reg; @@ -1674,11 +1674,16 @@ regs[count++] = zr->encoding_nocheck(); count &= ~1; // Only push an even nuber of regs - for (int i = count - 2; i >= 0; i-= 2) { - stp(as_Register(regs[i]), as_Register(regs[i+1]), - Address(pre(stack, -2 * wordSize))); + if (count) { + stp(as_Register(regs[0]), as_Register(regs[1]), + Address(pre(stack, -count * wordSize))); words_pushed += 2; } + for (int i = 2; i < count; i 
+= 2) { + stp(as_Register(regs[i]), as_Register(regs[i+1]), + Address(stack, i * wordSize)); + words_pushed += 2; + } return words_pushed; } @@ -1688,7 +1693,7 @@ // Scan bitset to accumulate register pairs unsigned char regs[32]; - unsigned count = 0; + int count = 0; for (int reg = 0; reg <= 30; reg++) { if (1 & bitset) regs[count++] = reg; @@ -1697,10 +1702,19 @@ regs[count++] = zr->encoding_nocheck(); count &= ~1; - for (unsigned i = 0; i < count; i+= 2) { - ldp(as_Register(regs[i]), as_Register(regs[i+1]), - Address(post(stack, 2 * wordSize))); - words_pushed += 2; + if (count <= 4) { + for (int i = 0; i < count; i+= 2) { + ldp(as_Register(regs[i]), as_Register(regs[i+1]), + Address(post(stack, 2 * wordSize))); + words_pushed += 2; + } + } else { + for (int i = 0; i < count; i+= 2) { + ldp(as_Register(regs[i]), as_Register(regs[i+1]), + Address(stack, i * wordSize)); + words_pushed += 2; + } + add(stack, stack, words_pushed * wordSize); } return words_pushed; From aph at redhat.com Mon May 12 16:34:33 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 12 May 2014 17:34:33 +0100 Subject: [aarch64-port-dev ] Correct opto assembly for sign-extened operands Message-ID: <5370F819.3000400@redhat.com> Does what it says. # HG changeset patch # User aph # Date 1399901640 -3600 # Mon May 12 14:34:00 2014 +0100 # Node ID ac30fdebd5f5811d768d493d58d40852cff0886c # Parent 8a569467b81b8589fb4daa87b6d4292bafc206f9 Fix opto assembly for shifts. 
diff -r 8a569467b81b -r ac30fdebd5f5 src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Wed May 07 16:41:56 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64.ad Mon May 12 14:34:00 2014 +0100 @@ -6988,7 +6988,7 @@ match(Set dst (AddP src1 (ConvI2L src2))); ins_cost(INSN_COST); - format %{ "add $dst, $src1, $src2\t# ptr" %} + format %{ "add $dst, $src1, $src2, sxtw\t# ptr" %} ins_encode %{ __ add(as_Register($dst$$reg), @@ -7331,7 +7331,7 @@ instruct div2Round(iRegINoSp dst, iRegI src, immI_31 div1, immI_31 div2) %{ match(Set dst (AddI src (URShiftI (RShiftI src div1) div2))); ins_cost(INSN_COST); - format %{ "addw $dst, $src, $div1" %} + format %{ "addw $dst, $src, LSR $div1" %} ins_encode %{ __ addw(as_Register($dst$$reg), @@ -8957,7 +8957,7 @@ %{ match(Set dst (AddL src1 (ConvI2L src2))); ins_cost(INSN_COST); - format %{ "add $dst, $src1, $src2" %} + format %{ "add $dst, $src1, $src2, sxtw" %} ins_encode %{ __ add(as_Register($dst$$reg), as_Register($src1$$reg), @@ -8970,7 +8970,7 @@ %{ match(Set dst (SubL src1 (ConvI2L src2))); ins_cost(INSN_COST); - format %{ "sub $dst, $src1, $src2" %} + format %{ "sub $dst, $src1, $src2, sxtw" %} ins_encode %{ __ sub(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9036,7 +9036,7 @@ %{ match(Set dst (AddL src1 (RShiftL (LShiftL src2 lshift) rshift))); ins_cost(INSN_COST); - format %{ "add $dst, $src1, sxtw $src2" %} + format %{ "add $dst, $src1, sxtw $src2, sxtw" %} ins_encode %{ __ add(as_Register($dst$$reg), as_Register($src1$$reg), From ed at camswl.com Mon May 12 16:47:07 2014 From: ed at camswl.com (Edward Nevill) Date: Mon, 12 May 2014 17:47:07 +0100 Subject: [aarch64-port-dev ] RFR: Optimise addressing of card table byte map base In-Reply-To: <5370CCBF.9000006@redhat.com> References: <1399900865.21025.9.camel@mint> <5370CCBF.9000006@redhat.com> Message-ID: <1399913227.21025.34.camel@mint> On Mon, 2014-05-12 at 14:29 +0100, Andrew Haley wrote: > On 05/12/2014 02:21 PM, Edward Nevill wrote: > > + enc_class 
aarch64_enc_mov_byte_map_base(iRegP dst, immByteMapBase src) %{ > > + MacroAssembler _masm(&cbuf); > > + address page = (address)$src$$constant; > > + Register dst_reg = as_Register($dst$$reg); > > + unsigned long off; > > + __ adrp(dst_reg, Address(page, relocInfo::poll_type), off); > > Is this reloc type correct? It should be an external address, no? Yes. An ExternalAddress is correct. Thanks for spotting this. Modified as follows:- --- a/src/cpu/aarch64/vm/aarch64.ad Mon May 12 14:09:27 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64.ad Mon May 12 17:24:33 2014 +0100 @@ -2570,7 +2570,7 @@ address page = (address)$src$$constant; Register dst_reg = as_Register($dst$$reg); unsigned long off; - __ adrp(dst_reg, Address(page, relocInfo::poll_type), off); + __ adrp(dst_reg, ExternalAddress(page), off); assert(off == 0, "assumed offset == 0"); %} > What is the cost of this insn? 1 * INSN_COST for a single data processing insn. This is specified in loadByteMapBase. instruct loadByteMapBase(iRegPNoSp dst, immByteMapBase con) %{ match(Set dst con); ins_cost(INSN_COST); format %{ "adr $dst, $con\t# Byte Map Base" %} ins_encode(aarch64_enc_mov_byte_map_base(dst, con)); ins_pipe(pipe_class_default); %} > > diff -r 6523308f9626 -r 1a6e4b95d268 src/cpu/aarch64/vm/macroAssembler_aarch64.cpp > > --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 13:41:43 2014 +0100 > > +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 14:09:27 2014 +0100 > > @@ -94,7 +94,9 @@ > > offset = adr_page - pc_page; > > > > unsigned insn2 = ((unsigned*)branch)[1]; > > - if ((address)target == os::get_polling_page()) { > > + if ((jbyte *)target == > > + ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || > > + (address)target == os::get_polling_page()) { > > assert(offset_lo == 0, "offset must be 0 for polling page"); > > } else if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { > > // Load/store register (unsigned immediate) > > @@ -182,7 +184,9 @@ 
> > uint64_t target_page = ((uint64_t)insn_addr) + offset; > > target_page &= ((uint64_t)-1) << shift; > > unsigned insn2 = ((unsigned*)insn_addr)[1]; > > - if ((address)target_page == os::get_polling_page()) { > > + if ((jbyte *)target_page == > > + ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || > > + (address)target_page == os::get_polling_page()) { > > return (address)target_page; > > } else if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001) { > > // Load/store register (unsigned immediate) > > I'm getting rather unhappy about these special cases in the reloc code. Understood. All of the special case code is really only there to support the assert()/ShouldNotReachHere() logic (IE. What it does is ensures that if we find an adrp on its own (IE not followed by a ldr/str/add instruction) then we assert that it is trying to address either the polling page, or the byte map base page (we also of course assert that the offset is 0)). It all becomes much clearer (and shorter!) if rewritten as below. Does this ease your concerns? Ed. diff -r 1a6e4b95d268 src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 14:09:27 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 12 17:40:30 2014 +0100 @@ -94,6 +94,7 @@ offset = adr_page - pc_page; unsigned insn2 = ((unsigned*)branch)[1]; +#if 0 if ((jbyte *)target == ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || (address)target == os::get_polling_page()) { @@ -135,6 +136,44 @@ } else { ShouldNotReachHere(); } +#else + // We handle 3 types of PC relative addressing + // 1 - adrp Rx, target_page + // ldr/str Ry, [Rx, #offset_in_page] + // 2 - adrp Rx, target_page + // add Ry, Rx, #offset_in_page + // 3 - adrp Rx, target_page (page aligned reloc, offset == 0) + // In the first 2 cases we must check that Rx is the same in the adrp and the + // subsequent ldr/str or add instruction. 
Otherwise we could accidentally end + // up treating a type 3 relocation as a type 1 or 2 just because it happened + // to be followed by a random unrelated ldr/str or add instruction. + // + // In the case of a type 3 relocation, we know that these are only generated + // for the safepoint polling page, or for the card type byte map base so we + // assert as much and of course that the offset is 0. + // + if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001 && + Instruction_aarch64::extract(insn, 4, 0) == + Instruction_aarch64::extract(insn2, 9, 5)) { + // Load/store register (unsigned immediate) + unsigned size = Instruction_aarch64::extract(insn2, 31, 30); + Instruction_aarch64::patch(branch + sizeof (unsigned), + 21, 10, offset_lo >> size); + guarantee(((dest >> size) << size) == dest, "misaligned target"); + } else if (Instruction_aarch64::extract(insn2, 31, 22) == 0b1001000100 && + Instruction_aarch64::extract(insn, 4, 0) == + Instruction_aarch64::extract(insn2, 4, 0)) { + // add (immediate) + Instruction_aarch64::patch(branch + sizeof (unsigned), + 21, 10, offset_lo); + } else { + assert((jbyte *)target == + ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || + (address)target == os::get_polling_page(), + "adrp must be polling page or byte map base"); + assert(offset_lo == 0, "offset must be 0 for polling page or byte map base"); + } +#endif } int offset_lo = offset & 3; offset >>= 2; @@ -184,6 +223,7 @@ uint64_t target_page = ((uint64_t)insn_addr) + offset; target_page &= ((uint64_t)-1) << shift; unsigned insn2 = ((unsigned*)insn_addr)[1]; +#if 0 if ((jbyte *)target_page == ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || (address)target_page == os::get_polling_page()) { @@ -196,6 +236,40 @@ } else { ShouldNotReachHere(); } +#else + // Return the target address for the following sequences + // 1 - adrp Rx, target_page + // ldr/str Ry, [Rx, #offset_in_page] + // [ 2 - adrp Rx, target_page ] Not handled + 
// [ add Ry, Rx, #offset_in_page ] + // 3 - adrp Rx, target_page (page aligned reloc, offset == 0) + // + // In the case of type 1 we check that the register is the same and + // return the target_page + the offset within the page. + // + // Otherwise we assume it is a page aligned relocation and return + // the target page only. The only cases this is generated is for + // the safepoint polling page or for the card table byte map base so + // we assert as much. + // + // Note: Strangely, we do not handle 'type 2' relocation (adrp followed + // by add) which is handled in pd_patch_instruction above. + // + if (Instruction_aarch64::extract(insn2, 29, 24) == 0b111001 && + Instruction_aarch64::extract(insn, 4, 0) == + Instruction_aarch64::extract(insn2, 9, 5)) { + // Load/store register (unsigned immediate) + unsigned int byte_offset = Instruction_aarch64::extract(insn2, 21, 10); + unsigned int size = Instruction_aarch64::extract(insn2, 31, 30); + return address(target_page + (byte_offset << size)); + } else { + assert((jbyte *)target_page == + ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || + (address)target_page == os::get_polling_page(), + "adrp must be polling page or byte map base"); + return (address)target_page; + } +#endif } else { ShouldNotReachHere(); } From aph at redhat.com Mon May 12 18:42:56 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 12 May 2014 19:42:56 +0100 Subject: [aarch64-port-dev ] Optimize pushes and pops In-Reply-To: <5370F7C6.4050701@redhat.com> References: <5370F7C6.4050701@redhat.com> Message-ID: <53711630.8000500@redhat.com> On 05/12/2014 05:33 PM, Andrew Haley wrote: > 0x00007fffd10d1ff8: stp x1, x2, [sp,#-48]! 
> 0x00007fffd10d1ffc: stp x3, x4, [sp,#16] > 0x00007fffd10d2000: stp x5, x6, [sp,#32] > > and > > 0x00007fffd10d202c: ldp xscratch1, xmethod, [sp],#16 > 0x00007fffd10d2030: ldp x1, x2, [sp] > 0x00007fffd10d2034: ldp x3, x4, [sp,#16] > 0x00007fffd10d2038: ldp x5, x6, [sp,#32] > 0x00007fffd10d203c: add sp, sp, #0x30 Oops. That should be: 0x00007fffd10d2030: ldp x1, x2, [sp] 0x00007fffd10d2034: ldp x3, x4, [sp,#16] 0x00007fffd10d2038: ldp x5, x6, [sp,#32] 0x00007fffd10d203c: add sp, sp, #0x30 Andrew. From ed at camswl.com Mon May 12 19:18:19 2014 From: ed at camswl.com (Edward Nevill) Date: Mon, 12 May 2014 20:18:19 +0100 Subject: [aarch64-port-dev ] Optimize pushes and pops In-Reply-To: <53711630.8000500@redhat.com> References: <5370F7C6.4050701@redhat.com> <53711630.8000500@redhat.com> Message-ID: <1399922299.21025.42.camel@mint> On Mon, 2014-05-12 at 19:42 +0100, Andrew Haley wrote: > On 05/12/2014 05:33 PM, Andrew Haley wrote: > 0x00007fffd10d2030: ldp x1, x2, [sp] > 0x00007fffd10d2034: ldp x3, x4, [sp,#16] > 0x00007fffd10d2038: ldp x5, x6, [sp,#32] > 0x00007fffd10d203c: add sp, sp, #0x30 > > Andrew. > What about ldp x3, x4, [sp, #16] ldp x5, x6, [sp, #32] ldp x1, x2, [sp], #48 Is this better or worse? Do we need to keep a count of words_pushed? This must == count at the end. If in doubt, assert(words_pushed == count, ...) and return count. See below:- Regards, Ed. 
--- CUT HERE --- int MacroAssembler::push(unsigned int bitset, Register stack) { // need to push all registers including original sp int words_pushed = 0; // Scan bitset to accumulate register pairs unsigned char regs[32]; int count = 0; for (int reg = 0; reg <= 30; reg++) { if (1 & bitset) regs[count++] = reg; bitset >>= 1; } regs[count++] = zr->encoding_nocheck(); count &= ~1; // Only push an even nuber of regs if (count) { stp(as_Register(regs[0]), as_Register(regs[1]), Address(pre(stack, -count * wordSize))); words_pushed += 2; } for (int i = 2; i < count; i += 2) { stp(as_Register(regs[i]), as_Register(regs[i+1]), Address(stack, i * wordSize)); words_pushed += 2; } assert(words_pushed == count, "oops, pushed != count"); return count; } int MacroAssembler::pop(unsigned int bitset, Register stack) { int words_pushed = 0; // Scan bitset to accumulate register pairs unsigned char regs[32]; int count = 0; for (int reg = 0; reg <= 30; reg++) { if (1 & bitset) regs[count++] = reg; bitset >>= 1; } regs[count++] = zr->encoding_nocheck(); count &= ~1; for (int i = 2; i < count; i += 2) { ldp(as_Register(regs[i]), as_Register(regs[i+1]), Address(stack, i * wordSize)); words_pushed += 2; } if (count) { ldp(as_Register(regs[0]), as_Register(regs[1]), Address(post(stack, count * wordSize))); words_pushed += 2; } assert(words_pushed == count, "oops, pushed != count"); return count; } --- CUT HERE --- From aph at redhat.com Mon May 12 20:45:40 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 12 May 2014 21:45:40 +0100 Subject: [aarch64-port-dev ] Optimize pushes and pops In-Reply-To: <1399922299.21025.42.camel@mint> References: <5370F7C6.4050701@redhat.com> <53711630.8000500@redhat.com> <1399922299.21025.42.camel@mint> Message-ID: <537132F4.30008@redhat.com> On 05/12/2014 08:18 PM, Edward Nevill wrote: > On Mon, 2014-05-12 at 19:42 +0100, Andrew Haley wrote: >> On 05/12/2014 05:33 PM, Andrew Haley wrote: >> 0x00007fffd10d2030: ldp x1, x2, [sp] >> 0x00007fffd10d2034: ldp 
x3, x4, [sp,#16] >> 0x00007fffd10d2038: ldp x5, x6, [sp,#32] >> 0x00007fffd10d203c: add sp, sp, #0x30 >> >> Andrew. >> > > What about > > ldp x3, x4, [sp, #16] > ldp x5, x6, [sp, #32] > ldp x1, x2, [sp], #48 > > Is this better or worse? Looks better. I don't know if it'll be any faster, but it is shorter. > Do we need to keep a count of words_pushed? This must == count at the end. OK, ta. Andrew. From ed at camswl.com Tue May 13 14:16:07 2014 From: ed at camswl.com (Edward Nevill) Date: Tue, 13 May 2014 15:16:07 +0100 Subject: [aarch64-port-dev ] Stop spurious O_BUFLEN warnings Message-ID: <1399990567.7713.7.camel@localhost.localdomain> Hi, The following patch stops spurious warnings of the form OpenJDK 64-Bit Server VM warning: increase O_BUFLEN in ostream.hpp -- output truncated when invoked with -XX:+PrintFlagsFinal These are generated by the following code in ostream.cpp } else if (vsnprintf(buffer, buflen, format, ap) >= 0) { result = buffer; result_len = strlen(result); } else { DEBUG_ONLY(warning("increase O_BUFLEN in ostream.hpp -- output truncated");) result = buffer; result_len = buflen - 1; buffer[result_len] = 0; } So, if vsnprintf returns a negative value it prints this warning. This happens because it is passed a bad format string from globals.cpp. Regards, Ed. 
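[Editor's note] A small C++ sketch of the return-value behaviour discussed above, assuming a C99-conforming snprintf. With a well-formed format such as "%-20s" and a one-character argument, the call reports 20 characters of left-justified padding. When the buffer is too small, it still returns the length it would have needed (a non-negative number), so a negative return signals a formatting error, not truncation. The helper names are hypothetical and are not HotSpot code.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Format a single space left-justified in a 20-column field, the way
// print("%-20s", " ") would. Returns an empty string on a formatting error.
std::string pad20() {
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "%-20s", " ");
    return n < 0 ? std::string() : std::string(buf, static_cast<std::size_t>(n));
}

// Return snprintf's result for the same format into a caller-supplied buffer.
// Under C99 rules this is the length the full output needs, even when
// cap is too small and the stored text is cut short.
int formatted_length(char* buf, std::size_t cap) {
    return std::snprintf(buf, cap, "%-20s", " ");
}
```

With an 8-byte buffer, formatted_length still reports 20 while only 7 characters and the terminator are stored.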
--- CUT HERE --- # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1399990511 -3600 # Tue May 13 15:15:11 2014 +0100 # Node ID 0ca397cbac958ef0f2ec94405d8c6b72c2526902 # Parent 6523308f9626004171794372e5577a0f6939b4df Stop spurious O_BUFLEN warnings diff -r 6523308f9626 -r 0ca397cbac95 src/share/vm/runtime/globals.cpp --- a/src/share/vm/runtime/globals.cpp Mon May 12 13:41:43 2014 +0100 +++ b/src/share/vm/runtime/globals.cpp Tue May 13 15:15:11 2014 +0100 @@ -295,7 +295,7 @@ else st->print("%-16s", ""); } - st->print("%-20"); + st->print("%-20s", " "); print_kind(st); if (withComments) { --- CUT HERE --- From ed at camswl.com Tue May 13 15:02:01 2014 From: ed at camswl.com (ed at camswl.com) Date: Tue, 13 May 2014 15:02:01 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Stop spurious O_BUFLEN warnings Message-ID: <201405131502.s4DF23RO021426@aojmv0008> Changeset: 0ca397cbac95 Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-13 15:15 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/0ca397cbac95 Stop spurious O_BUFLEN warnings ! src/share/vm/runtime/globals.cpp From ed at camswl.com Tue May 13 15:10:34 2014 From: ed at camswl.com (ed at camswl.com) Date: Tue, 13 May 2014 15:10:34 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Optimise long divide by 2 Message-ID: <201405131510.s4DFAZNI022914@aojmv0008> Changeset: 1fcabae0e46f Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-13 16:09 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/1fcabae0e46f Optimise long divide by 2 ! src/cpu/aarch64/vm/aarch64.ad From aph at redhat.com Tue May 13 15:54:49 2014 From: aph at redhat.com (Andrew Haley) Date: Tue, 13 May 2014 16:54:49 +0100 Subject: [aarch64-port-dev ] Some small cleanups Message-ID: <53724049.6050102@redhat.com> Remove dead code and replace obscure constants with RegSet expressions. Andrew. 
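[Editor's note] The RegSet cleanup in the changesets that follow replaces a raw mask (0x401fffff) with set expressions. The sketch below is a simplified stand-in, not the HotSpot class. MiniRegSet and its use of plain integer register numbers are assumptions made for illustration. It shows that "registers 0 to 20 plus register 30" (lr is r30 on AArch64) is exactly that mask.

```cpp
#include <cstdint>

// Hypothetical, simplified register set: bit n set means register n is a member.
class MiniRegSet {
    uint32_t _bits;
    explicit MiniRegSet(uint32_t bits) : _bits(bits) {}
public:
    MiniRegSet() : _bits(0) {}

    // Set containing a single register number (0..31).
    static MiniRegSet of(int reg) { return MiniRegSet(uint32_t(1) << reg); }

    // Set containing registers first..last inclusive; assumes 0 <= first <= last <= 31.
    static MiniRegSet range(int first, int last) {
        uint32_t upto_last = (last == 31) ? ~uint32_t(0)
                                          : ((uint32_t(1) << (last + 1)) - 1);
        uint32_t below_first = (uint32_t(1) << first) - 1;
        return MiniRegSet(upto_last & ~below_first);
    }

    // Union, difference, and in-place union, mirroring operator+, operator- and operator+=.
    MiniRegSet operator+(MiniRegSet o) const { return MiniRegSet(_bits | o._bits); }
    MiniRegSet operator-(MiniRegSet o) const { return MiniRegSet(_bits & ~o._bits); }
    MiniRegSet& operator+=(MiniRegSet o) { *this = *this + o; return *this; }

    uint32_t bits() const { return _bits; }
};
```

In this sketch, MiniRegSet::range(0, 20) + MiniRegSet::of(30) has the bit pattern 0x401fffff, and subtracting one more register clears just that bit.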
# HG changeset patch # User aph # Date 1399994902 -3600 # Tue May 13 16:28:22 2014 +0100 # Node ID a1b63a9c0d1f98a50494405ef5d781630a0aa690 # Parent 92cd832e8f785bfbaaed282ad2a7e9f403232a63 Add RegSet::operator+=. diff -r 92cd832e8f78 -r a1b63a9c0d1f src/cpu/aarch64/vm/register_aarch64.hpp --- a/src/cpu/aarch64/vm/register_aarch64.hpp Tue May 13 15:57:30 2014 +0100 +++ b/src/cpu/aarch64/vm/register_aarch64.hpp Tue May 13 16:28:22 2014 +0100 @@ -244,16 +244,21 @@ RegSet(Register r1) : _bitset(r1->bit()) { } - RegSet operator+(RegSet aSet) const { + RegSet operator+(const RegSet aSet) const { RegSet result(_bitset | aSet._bitset); return result; } - RegSet operator-(RegSet aSet) const { + RegSet operator-(const RegSet aSet) const { RegSet result(_bitset & ~aSet._bitset); return result; } + RegSet &operator+=(const RegSet aSet) { + *this = *this + aSet; + return *this; + } + static RegSet of(Register r1) { return RegSet(r1); } # HG changeset patch # User aph # Date 1399996165 -3600 # Tue May 13 16:49:25 2014 +0100 # Node ID 4d1f5e7d102cc5a96572091644457cd28c594537 # Parent a1b63a9c0d1f98a50494405ef5d781630a0aa690 Tidy up register usage in push/pop instructions. diff -r a1b63a9c0d1f -r 4d1f5e7d102c src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Tue May 13 16:28:22 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Tue May 13 16:49:25 2014 +0100 @@ -989,14 +989,15 @@ assert(sub_klass != r2, "killed reg"); // killed by lea(r2, &pst_counter) // Get super_klass value into r0 (even if it was in r5 or r2). 
- bool pushed_r0 = false, pushed_r2 = !IS_A_TEMP(r2), pushed_r5 = !IS_A_TEMP(r5); + RegSet pushed_registers; + if (!IS_A_TEMP(r2)) pushed_registers += r2; + if (!IS_A_TEMP(r5)) pushed_registers += r5; if (super_klass != r0 || UseCompressedOops) { - if (!IS_A_TEMP(r0)) - pushed_r0 = true; + if (!IS_A_TEMP(r0)) pushed_registers += r0; } - push(r0->bit(pushed_r0) | r2->bit(pushed_r2) | r5->bit(pushed_r5), sp); + push(pushed_registers, sp); #ifndef PRODUCT mov(rscratch2, (address)&SharedRuntime::_partial_subtype_ctr); @@ -1019,7 +1020,7 @@ repne_scan(r5, r0, r2, rscratch1); // Unspill the temp. registers: - pop(r0->bit(pushed_r0) | r2->bit(pushed_r2) | r5->bit(pushed_r5), sp); + pop(pushed_registers, sp); br(Assembler::NE, *L_failure); diff -r a1b63a9c0d1f -r 4d1f5e7d102c src/os_cpu/linux_aarch64/vm/assembler_linux_aarch64.cpp --- a/src/os_cpu/linux_aarch64/vm/assembler_linux_aarch64.cpp Tue May 13 16:28:22 2014 +0100 +++ b/src/os_cpu/linux_aarch64/vm/assembler_linux_aarch64.cpp Tue May 13 16:49:25 2014 +0100 @@ -22,20 +22,12 @@ * */ -#ifndef OS_CPU_LINUX_AARCH64_VM_ASSEMBLER_LINUX_AARCH64_HPP -#define OS_CPU_LINUX_AARCH64_VM_ASSEMBLER_LINUX_AARCH64_HPP - #include "precompiled.hpp" #include "asm/macroAssembler.hpp" #include "asm/macroAssembler.inline.hpp" #include "runtime/os.hpp" #include "runtime/threadLocalStorage.hpp" -#if 0 -void MacroAssembler::int3() { - fixme() -} -#endif // get_thread can be called anywhere inside generated code so we need // to save whatever non-callee save context might get clobbered by the @@ -47,7 +39,7 @@ // void * pthread_getspecific(pthread_key_t key); // Save all call-clobbered regs except dst, plus r19 and r20. 
- unsigned int saved_regs = 0x401fffff & ~(1 << dst->encoding()); + RegSet saved_regs = RegSet::range(r0, r20) + lr - dst; push(saved_regs, sp); mov(c_rarg0, ThreadLocalStorage::thread_index()); mov(r19, CAST_FROM_FN_PTR(address, pthread_getspecific)); @@ -59,4 +51,3 @@ pop(saved_regs, sp); } -#endif // OS_CPU_LINUX_AARCH64_VM_ASSEMBLER_LINUX_AARCH64_HPP From aph at redhat.com Tue May 13 16:08:19 2014 From: aph at redhat.com (aph at redhat.com) Date: Tue, 13 May 2014 16:08:19 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: 8 new changesets Message-ID: <201405131608.s4DG8Qld002221@aojmv0008> Changeset: ac30fdebd5f5 Author: aph Date: 2014-05-12 14:34 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/ac30fdebd5f5 Fix opto assembly for shifts. ! src/cpu/aarch64/vm/aarch64.ad Changeset: 3852a506a19b Author: aph Date: 2014-05-12 16:26 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/3852a506a19b Tidy up stack frame handling. ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: 92cd832e8f78 Author: aph Date: 2014-05-13 15:57 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/92cd832e8f78 Improve code generation for pop(), as suggested by Edward Nevill. ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: a1b63a9c0d1f Author: aph Date: 2014-05-13 16:28 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/a1b63a9c0d1f Add RegSet::operator+=. ! src/cpu/aarch64/vm/register_aarch64.hpp Changeset: 4d1f5e7d102c Author: aph Date: 2014-05-13 16:49 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/4d1f5e7d102c Tidy up register usage in push/pop instructions. ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/os_cpu/linux_aarch64/vm/assembler_linux_aarch64.cpp Changeset: 202a78c1caef Author: aph Date: 2014-05-12 11:28 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/202a78c1caef Merge ! src/cpu/aarch64/vm/aarch64.ad ! 
src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: a7c6a42da087 Author: aph Date: 2014-05-13 11:51 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/a7c6a42da087 Merge ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: e7b46e8cc544 Author: aph Date: 2014-05-13 17:06 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/e7b46e8cc544 Merge ! src/cpu/aarch64/vm/aarch64.ad From ed at camswl.com Tue May 13 19:23:09 2014 From: ed at camswl.com (ed at camswl.com) Date: Tue, 13 May 2014 19:23:09 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Optimise addressing of card table byte map base Message-ID: <201405131923.s4DJNACW001131@aojmv0008> Changeset: 639009aad87b Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-13 20:22 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/639009aad87b Optimise addressing of card table byte map base ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! 
src/cpu/aarch64/vm/relocInfo_aarch64.cpp From aph at redhat.com Wed May 14 08:43:57 2014 From: aph at redhat.com (Andrew Haley) Date: Wed, 14 May 2014 09:43:57 +0100 Subject: [aarch64-port-dev ] Stop spurious O_BUFLEN warnings In-Reply-To: <1399990567.7713.7.camel@localhost.localdomain> References: <1399990567.7713.7.camel@localhost.localdomain> Message-ID: <53732CCD.4040307@redhat.com> Hi, On 05/13/2014 03:16 PM, Edward Nevill wrote: > The following patch stops spurious warnings of the form > > OpenJDK 64-Bit Server VM warning: increase O_BUFLEN in ostream.hpp -- output truncated > > when invoked with -XX:+PrintFlagsFinal > > These are generated by the following code in ostream.cpp > > } else if (vsnprintf(buffer, buflen, format, ap) >= 0) { > result = buffer; > result_len = strlen(result); > } else { > DEBUG_ONLY(warning("increase O_BUFLEN in ostream.hpp -- output truncated");) > result = buffer; > result_len = buflen - 1; > buffer[result_len] = 0; > } > > > So, if vsnprintf returns a negative value it prints this warning. This happens because it is passed a bad format string from globals.cpp. This is shared code, nothing to do with the AArch64 port. It is not an appropriate local change. It might be suitable for jdk8u. Andrew. From ed at camswl.com Wed May 14 09:06:03 2014 From: ed at camswl.com (Edward Nevill) Date: Wed, 14 May 2014 10:06:03 +0100 Subject: [aarch64-port-dev ] Stop spurious O_BUFLEN warnings In-Reply-To: <53732CCD.4040307@redhat.com> References: <1399990567.7713.7.camel@localhost.localdomain> <53732CCD.4040307@redhat.com> Message-ID: <1400058363.32387.7.camel@localhost.localdomain> On Wed, 2014-05-14 at 09:43 +0100, Andrew Haley wrote: > Hi, > > On 05/13/2014 03:16 PM, Edward Nevill wrote: > > > > > > > So, if vsnprintf returns a negative value it prints this warning. This happens because it is passed a bad format string from globals.cpp. > > This is shared code, nothing to do with the AArch64 port. It is not > an appropriate local change. It might be suitable for jdk8u. OK. 
I will back it out, Regards, Ed. From openjdk-testing at linaro.org Wed May 14 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Wed, 14 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140514130032.BD5281FBDC@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/134/summary.html =============================================================================== server-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/16 pass: 433; fail: 1; error: 4 Build 1: aarch64/2014/apr/17 pass: 433; fail: 1; error: 4 Build 2: aarch64/2014/apr/24 pass: 416; fail: 1; error: 21 Build 3: aarch64/2014/apr/25 pass: 425; fail: 1; error: 12 Build 4: aarch64/2014/apr/26 pass: 423; fail: 1; error: 14 Build 5: aarch64/2014/apr/30 pass: 421; fail: 2; error: 15 Build 6: aarch64/2014/may/02 pass: 434; fail: 2; error: 2 Build 7: aarch64/2014/may/10 pass: 421; fail: 2; error: 15 Build 8: aarch64/2014/may/13 pass: 427; fail: 2; error: 9 ------------------------------------------------------------------------------- =============================================================================== server-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/16 pass: 2,934; error: 38 Build 1: aarch64/2014/apr/17 pass: 2,933; error: 39 Build 2: aarch64/2014/apr/24 pass: 2,894; error: 78 Build 3: aarch64/2014/apr/25 pass: 2,911; error: 61 Build 4: aarch64/2014/apr/26 pass: 2,912; error: 60 Build 5: aarch64/2014/apr/30 pass: 2,915; error: 57 Build 6: aarch64/2014/may/02 pass: 2,939; error: 33 Build 7: aarch64/2014/may/10 pass: 2,911; error: 61 
Build 8: aarch64/2014/may/13 pass: 2,910; error: 62 Build 9: aarch64/2014/may/14 pass: 2,859; error: 113 ------------------------------------------------------------------------------- =============================================================================== server-release/jdk =============================================================================== Build 0: aarch64/2014/apr/16 pass: 5,242; fail: 134; error: 74 Build 1: aarch64/2014/apr/17 pass: 5,230; fail: 146; error: 74 Build 2: aarch64/2014/apr/24 pass: 4,699; fail: 472; error: 279 Build 3: aarch64/2014/apr/25 pass: 4,725; fail: 472; error: 253 Build 4: aarch64/2014/apr/26 pass: 4,723; fail: 475; error: 251 Build 5: aarch64/2014/apr/30 pass: 4,725; fail: 470; error: 254 Build 6: aarch64/2014/may/02 pass: 5,285; fail: 124; error: 40 Build 7: aarch64/2014/may/10 pass: 4,725; fail: 470; error: 254 Build 8: aarch64/2014/may/13 pass: 4,723; fail: 470; error: 256 Build 9: aarch64/2014/may/14 pass: 4,674; fail: 482; error: 293 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From openjdk-testing at linaro.org Wed May 14 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Wed, 14 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140514130032.3BE9A1FBDD@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/134/summary.html =============================================================================== client-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/16 pass: 429; fail: 5; error: 4 Build 1: aarch64/2014/apr/17 pass: 429; fail: 5; error: 4 Build 2: aarch64/2014/apr/24 pass: 418; fail: 5; error: 15 Build 3: aarch64/2014/apr/25 pass: 421; fail: 5; error: 12 Build 4: aarch64/2014/apr/26 pass: 421; fail: 5; error: 12 Build 5: aarch64/2014/apr/30 pass: 422; fail: 5; error: 11 Build 6: aarch64/2014/may/02 pass: 431; fail: 5; error: 2 Build 7: aarch64/2014/may/10 pass: 421; fail: 5; error: 12 Build 8: aarch64/2014/may/13 pass: 422; fail: 5; error: 11 Build 9: aarch64/2014/may/14 pass: 418; fail: 5; error: 15 ------------------------------------------------------------------------------- =============================================================================== client-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/16 pass: 2,936; error: 36 Build 1: aarch64/2014/apr/17 pass: 2,937; error: 35 Build 2: aarch64/2014/apr/24 pass: 2,916; error: 56 Build 3: aarch64/2014/apr/25 pass: 2,917; error: 55 Build 4: aarch64/2014/apr/26 pass: 2,917; error: 55 Build 5: aarch64/2014/apr/30 pass: 2,917; error: 55 Build 6: aarch64/2014/may/02 pass: 2,941; error: 31 Build 7: aarch64/2014/may/10 pass: 2,917; error: 55 Build 8: aarch64/2014/may/13 pass: 2,921; error: 51 Build 9: aarch64/2014/may/14 pass: 2,908; error: 64 ------------------------------------------------------------------------------- =============================================================================== client-release/jdk =============================================================================== Build 0: aarch64/2014/apr/16 pass: 
5,231; fail: 156; error: 63 Build 1: aarch64/2014/apr/17 pass: 5,231; fail: 157; error: 62 Build 2: aarch64/2014/apr/24 pass: 4,886; fail: 474; error: 90 Build 3: aarch64/2014/apr/25 pass: 4,905; fail: 474; error: 71 Build 4: aarch64/2014/apr/26 pass: 4,905; fail: 472; error: 72 Build 5: aarch64/2014/apr/30 pass: 4,904; fail: 473; error: 72 Build 6: aarch64/2014/may/02 pass: 5,272; fail: 131; error: 46 Build 7: aarch64/2014/may/10 pass: 4,906; fail: 472; error: 71 Build 8: aarch64/2014/may/13 pass: 4,904; fail: 473; error: 72 Build 9: aarch64/2014/may/14 pass: 4,862; fail: 474; error: 113 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From ed at camswl.com Wed May 14 14:43:57 2014 From: ed at camswl.com (ed at camswl.com) Date: Wed, 14 May 2014 14:43:57 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Backout 6713:0ca397cbac95 Message-ID: <201405141443.s4EEhwls001188@aojmv0008> Changeset: 9d3bc0f40cce Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-14 15:43 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/9d3bc0f40cce Backout 6713:0ca397cbac95 ! src/share/vm/runtime/globals.cpp From openjdk-testing at linaro.org Thu May 15 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Thu, 15 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140515130032.1BE021F5CB@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/135/summary.html =============================================================================== server-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/16 pass: 433; fail: 1; error: 4 Build 1: aarch64/2014/apr/17 pass: 433; fail: 1; error: 4 Build 2: aarch64/2014/apr/24 pass: 416; fail: 1; error: 21 Build 3: aarch64/2014/apr/25 pass: 425; fail: 1; error: 12 Build 4: aarch64/2014/apr/26 pass: 423; fail: 1; error: 14 Build 5: aarch64/2014/apr/30 pass: 421; fail: 2; error: 15 Build 6: aarch64/2014/may/02 pass: 434; fail: 2; error: 2 Build 7: aarch64/2014/may/10 pass: 421; fail: 2; error: 15 Build 8: aarch64/2014/may/13 pass: 427; fail: 2; error: 9 Build 9: aarch64/2014/may/15 pass: 409; fail: 2; error: 27 ------------------------------------------------------------------------------- =============================================================================== server-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/17 pass: 2,933; error: 39 Build 1: aarch64/2014/apr/24 pass: 2,894; error: 78 Build 2: aarch64/2014/apr/25 pass: 2,911; error: 61 Build 3: aarch64/2014/apr/26 pass: 2,912; error: 60 Build 4: aarch64/2014/apr/30 pass: 2,915; error: 57 Build 5: aarch64/2014/may/02 pass: 2,939; error: 33 Build 6: aarch64/2014/may/10 pass: 2,911; error: 61 Build 7: aarch64/2014/may/13 pass: 2,910; error: 62 Build 8: aarch64/2014/may/14 pass: 2,859; error: 113 Build 9: aarch64/2014/may/15 pass: 2,894; error: 78 ------------------------------------------------------------------------------- =============================================================================== server-release/jdk =============================================================================== Build 0: aarch64/2014/apr/17 pass: 
5,230; fail: 146; error: 74 Build 1: aarch64/2014/apr/24 pass: 4,699; fail: 472; error: 279 Build 2: aarch64/2014/apr/25 pass: 4,725; fail: 472; error: 253 Build 3: aarch64/2014/apr/26 pass: 4,723; fail: 475; error: 251 Build 4: aarch64/2014/apr/30 pass: 4,725; fail: 470; error: 254 Build 5: aarch64/2014/may/02 pass: 5,285; fail: 124; error: 40 Build 6: aarch64/2014/may/10 pass: 4,725; fail: 470; error: 254 Build 7: aarch64/2014/may/13 pass: 4,723; fail: 470; error: 256 Build 8: aarch64/2014/may/14 pass: 4,674; fail: 482; error: 293 Build 9: aarch64/2014/may/15 pass: 4,705; fail: 473; error: 271 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From openjdk-testing at linaro.org Thu May 15 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Thu, 15 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140515130032.0D5F91FBFC@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/135/summary.html =============================================================================== client-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/17 pass: 429; fail: 5; error: 4 Build 1: aarch64/2014/apr/24 pass: 418; fail: 5; error: 15 Build 2: aarch64/2014/apr/25 pass: 421; fail: 5; error: 12 Build 3: aarch64/2014/apr/26 pass: 421; fail: 5; error: 12 Build 4: aarch64/2014/apr/30 pass: 422; fail: 5; error: 11 Build 5: aarch64/2014/may/02 pass: 431; fail: 5; error: 2 Build 6: aarch64/2014/may/10 pass: 421; fail: 5; error: 12 Build 7: aarch64/2014/may/13 pass: 422; fail: 5; error: 11 Build 8: aarch64/2014/may/14 pass: 418; fail: 5; error: 15 Build 9: aarch64/2014/may/15 pass: 418; fail: 5; error: 15 ------------------------------------------------------------------------------- =============================================================================== client-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/17 pass: 2,937; error: 35 Build 1: aarch64/2014/apr/24 pass: 2,916; error: 56 Build 2: aarch64/2014/apr/25 pass: 2,917; error: 55 Build 3: aarch64/2014/apr/26 pass: 2,917; error: 55 Build 4: aarch64/2014/apr/30 pass: 2,917; error: 55 Build 5: aarch64/2014/may/02 pass: 2,941; error: 31 Build 6: aarch64/2014/may/10 pass: 2,917; error: 55 Build 7: aarch64/2014/may/13 pass: 2,921; error: 51 Build 8: aarch64/2014/may/14 pass: 2,908; error: 64 Build 9: aarch64/2014/may/15 pass: 2,915; error: 57 ------------------------------------------------------------------------------- =============================================================================== client-release/jdk =============================================================================== Build 0: aarch64/2014/apr/17 pass: 
5,231; fail: 157; error: 62 Build 1: aarch64/2014/apr/24 pass: 4,886; fail: 474; error: 90 Build 2: aarch64/2014/apr/25 pass: 4,905; fail: 474; error: 71 Build 3: aarch64/2014/apr/26 pass: 4,905; fail: 472; error: 72 Build 4: aarch64/2014/apr/30 pass: 4,904; fail: 473; error: 72 Build 5: aarch64/2014/may/02 pass: 5,272; fail: 131; error: 46 Build 6: aarch64/2014/may/10 pass: 4,906; fail: 472; error: 71 Build 7: aarch64/2014/may/13 pass: 4,904; fail: 473; error: 72 Build 8: aarch64/2014/may/14 pass: 4,862; fail: 474; error: 113 Build 9: aarch64/2014/may/15 pass: 4,890; fail: 473; error: 86 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From aph at redhat.com Thu May 15 13:21:22 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 15 May 2014 14:21:22 +0100 Subject: [aarch64-port-dev ] Correct OptoAssembly for prologs and epilogs Message-ID: <5374BF52.2020103@redhat.com> I changed the assembly, but not the OptoAssembly. Andrew. # HG changeset patch # User aph # Date 1400156137 14400 # Thu May 15 08:15:37 2014 -0400 # Node ID b8ec31c74e2d7f6789b290bb360b6c0b13fc2052 # Parent a2e9ac7b3434982b06faef8df4a78da78b374dad Correct OptoAssembly for prologs and epilogs. diff -r a2e9ac7b3434 -r b8ec31c74e2d src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Thu May 15 07:37:12 2014 -0400 +++ b/src/cpu/aarch64/vm/aarch64.ad Thu May 15 08:15:37 2014 -0400 @@ -861,38 +861,21 @@ Compile* C = ra_->C; int framesize = C->frame_slots() << LogBytesPerInt; - assert((framesize & (StackAlignmentInBytes-1)) == 0, "frame size not aligned"); - if (C->need_stack_bang(framesize)) { + + if (C->need_stack_bang(framesize)) st->print("# stack bang size=%d\n\t", framesize); + + if (framesize == 0) { + // Is this even possible? 
+ st->print("stp lr, rfp, [sp, #%d]!", -(2 * wordSize)); + } else if (framesize < (1 << 12)) { + st->print("sub sp, sp, #%d\n\t", framesize); + st->print("stp rfp, lr, [sp, #%d]", framesize - 2 * wordSize); + } else { + st->print("stp lr, rfp, [sp, #%d]!\n\t", -(2 * wordSize)); + st->print("mov rscratch1, #%d\n\t", framesize - 2 * wordSize); + st->print("sub sp, sp, rscratch1"); } - - if (framesize > 0) { - st->print("# create frame %d\n\t", framesize); - } - - st->print("stp lr, rfp, [sp, #%d]!\n\t", -(2 * wordSize)); - - // allow for pushing ret address and rfp - - framesize -= (2 * wordSize); - - if (framesize) { - if (Assembler::operand_valid_for_add_sub_immediate(framesize)) { - st->print("sub sp, sp, #%d", framesize); - } else { - st->print("mov rscratch1, #%d\n\t", framesize); - st->print("sub sp, sp, rscratch1"); - } - } - - if (NotifySimulator) { - st->print("\n\t# notify(Assembler::method_entry)"); - } - - if (VerifyStackAtCalls) { - st->print("\n\t# VerifyStackAtCalls Unimplemented!"); - } - st->cr(); } #endif @@ -960,27 +943,22 @@ int framesize = C->frame_slots() << LogBytesPerInt; st->print("# pop frame %d\n\t",framesize); - framesize -= 2 * wordSize; - - if (framesize) { - if (Assembler::operand_valid_for_add_sub_immediate(framesize)) { - st->print("add sp, sp, #%d", framesize); - } else { - st->print("mov rscratch1, #%d\n\t", framesize); - st->print("add sp, sp, rscratch1\n\t"); - } + + if (framesize == 0) { + st->print("ldp lr, rfp, [sp],#%d\n\t", (2 * wordSize)); + } else if (framesize < (1 << 12)) { + st->print("ldp lr, rfp, [sp,#%d]\n\t", framesize - 2 * wordSize); + st->print("add sp, sp, #%d\n\t", framesize); + } else { + st->print("mov rscratch1, #%d\n\t", framesize - 2 * wordSize); + st->print("add sp, sp, rscratch1\n\t"); + st->print("ldp lr, rfp, [sp],#%d\n\t", (2 * wordSize)); } - st->print("# remove frame\n\t"); - st->print("ldp lr, rfp, [sp],#%d\n\t", (2 * wordSize)); - - if (NotifySimulator) { - st->print("notify method_reentry\n\t"); - 
} if (do_polling() && C->is_method_compilation()) { st->print("# touch polling page\n\t"); st->print("mov rscratch1, #0x%x\n\t", os::get_polling_page()); - st->print("ldr zr, [rscratch1]\n\t"); + st->print("ldr zr, [rscratch1]"); } } #endif From aph at redhat.com Thu May 15 13:24:31 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 15 May 2014 14:24:31 +0100 Subject: [aarch64-port-dev ] Correct costs for operations with shifts Message-ID: <5374C00F.8050108@redhat.com> It can be more expensive to do, say, an addition with a shift than a plain addition. This patch makes such operations more expensive than simple operations, but less expensive than the two separate instructions. Andrew. # HG changeset patch # User aph # Date 1400153832 14400 # Thu May 15 07:37:12 2014 -0400 # Node ID a2e9ac7b3434982b06faef8df4a78da78b374dad # Parent 9d3bc0f40cce83038e5a0d5fc8a51389530538d9 Correct costs for operations with shifts. diff -r 9d3bc0f40cce -r a2e9ac7b3434 src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64.ad Thu May 15 07:37:12 2014 -0400 @@ -7042,7 +7042,7 @@ instruct addP_reg_reg_lsl(iRegPNoSp dst, iRegP src1, iRegL src2, immIScale scale) %{ match(Set dst (AddP src1 (LShiftL src2 scale))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "add $dst, $src1, $src2, LShiftL $scale\t# ptr" %} ins_encode %{ @@ -7057,7 +7057,7 @@ instruct addP_reg_reg_ext_shift(iRegPNoSp dst, iRegP src1, iRegIorL2I src2, immIScale scale) %{ match(Set dst (AddP src1 (LShiftL (ConvI2L src2) scale))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "add $dst, $src1, $src2, I2L $scale\t# ptr" %} ins_encode %{ @@ -7784,7 +7784,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (AndI src1 (XorI(URShiftI src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "bicw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7802,7 +7802,7 @@ 
iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (AndL src1 (XorL(URShiftL src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "bic $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7820,7 +7820,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (AndI src1 (XorI(RShiftI src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "bicw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7838,7 +7838,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (AndL src1 (XorL(RShiftL src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "bic $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7856,7 +7856,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (AndI src1 (XorI(LShiftI src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "bicw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -7874,7 +7874,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (AndL src1 (XorL(LShiftL src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "bic $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -7892,7 +7892,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (XorI src4 (XorI(URShiftI src2 src3) src1))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eonw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7910,7 +7910,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (XorL src4 (XorL(URShiftL src2 src3) src1))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eon $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7928,7 +7928,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (XorI src4 (XorI(RShiftI src2 src3) src1))); - ins_cost(INSN_COST); + 
ins_cost(1.9 * INSN_COST); format %{ "eonw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7946,7 +7946,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (XorL src4 (XorL(RShiftL src2 src3) src1))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eon $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7964,7 +7964,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (XorI src4 (XorI(LShiftI src2 src3) src1))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eonw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -7982,7 +7982,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (XorL src4 (XorL(LShiftL src2 src3) src1))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eon $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8000,7 +8000,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (OrI src1 (XorI(URShiftI src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "ornw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8018,7 +8018,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (OrL src1 (XorL(URShiftL src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "orn $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8036,7 +8036,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (OrI src1 (XorI(RShiftI src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "ornw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8054,7 +8054,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (OrL src1 (XorL(RShiftL src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "orn $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8072,7 +8072,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, 
rFlagsReg cr) %{ match(Set dst (OrI src1 (XorI(LShiftI src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "ornw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8090,7 +8090,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (OrL src1 (XorL(LShiftL src2 src3) src4))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "orn $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8109,7 +8109,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndI src1 (URShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "andw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8128,7 +8128,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndL src1 (URShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "andr $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8147,7 +8147,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndI src1 (RShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "andw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8166,7 +8166,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndL src1 (RShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "andr $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8185,7 +8185,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndI src1 (LShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "andw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8204,7 +8204,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndL src1 (LShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "andr $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8223,7 +8223,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorI src1 (URShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eorw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8242,7 +8242,7 @@ immI src3, rFlagsReg cr) 
%{ match(Set dst (XorL src1 (URShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eor $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8261,7 +8261,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorI src1 (RShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eorw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8280,7 +8280,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorL src1 (RShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eor $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8299,7 +8299,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorI src1 (LShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eorw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8318,7 +8318,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorL src1 (LShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "eor $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8337,7 +8337,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrI src1 (URShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "orrw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8356,7 +8356,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrL src1 (URShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "orr $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8375,7 +8375,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrI src1 (RShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "orrw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8394,7 +8394,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrL src1 (RShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "orr $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8413,7 +8413,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrI src1 (LShiftI src2 src3))); - ins_cost(INSN_COST); + 
ins_cost(1.9 * INSN_COST); format %{ "orrw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8432,7 +8432,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrL src1 (LShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "orr $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8451,7 +8451,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddI src1 (URShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "addw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8470,7 +8470,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddL src1 (URShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "add $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8489,7 +8489,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddI src1 (RShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "addw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8508,7 +8508,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddL src1 (RShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "add $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8527,7 +8527,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddI src1 (LShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "addw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8546,7 +8546,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddL src1 (LShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "add $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8565,7 +8565,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubI src1 (URShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "subw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8584,7 +8584,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubL src1 (URShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "sub $dst, $src1, $src2, LSR $src3" 
%} ins_encode %{ @@ -8603,7 +8603,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubI src1 (RShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "subw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8622,7 +8622,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubL src1 (RShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "sub $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8641,7 +8641,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubI src1 (LShiftI src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "subw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8660,7 +8660,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubL src1 (LShiftL src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "sub $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -9021,7 +9021,7 @@ %{ match(Set dst (AddL src1 (ConvI2L src2))); ins_cost(INSN_COST); - format %{ "add $dst, $src1, $src2, sxtw" %} + format %{ "add $dst, $src1, sxtw $src2" %} ins_encode %{ __ add(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9034,7 +9034,7 @@ %{ match(Set dst (SubL src1 (ConvI2L src2))); ins_cost(INSN_COST); - format %{ "sub $dst, $src1, $src2, sxtw" %} + format %{ "sub $dst, $src1, sxtw $src2" %} ins_encode %{ __ sub(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9100,7 +9100,7 @@ %{ match(Set dst (AddL src1 (RShiftL (LShiftL src2 lshift) rshift))); ins_cost(INSN_COST); - format %{ "add $dst, $src1, sxtw $src2, sxtw" %} + format %{ "add $dst, $src1, sxtw $src2" %} ins_encode %{ __ add(as_Register($dst$$reg), as_Register($src1$$reg), diff -r 9d3bc0f40cce -r a2e9ac7b3434 src/cpu/aarch64/vm/aarch64_ad.m4 --- a/src/cpu/aarch64/vm/aarch64_ad.m4 Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64_ad.m4 Thu May 15 07:37:12 2014 -0400 @@ -7,7 +7,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst ($2$1 src1 ($4$1 src2 src3))); - ins_cost(INSN_COST); + ins_cost(1.9 * 
INSN_COST); format %{ "$3 $dst, $src1, $src2, $5 $src3" %} ins_encode %{ @@ -52,7 +52,7 @@ ifelse($2,Xor, match(Set dst ($2$1 src4 (Xor$1($4$1 src2 src3) src1)));, match(Set dst ($2$1 src1 (Xor$1($4$1 src2 src3) src4)));) - ins_cost(INSN_COST); + ins_cost(1.9 * INSN_COST); format %{ "$3 $dst, $src1, $src2, $5 $src3" %} ins_encode %{ @@ -278,7 +278,7 @@ %{ match(Set dst ($3$2 src1 (ConvI2L src2))); ins_cost(INSN_COST); - format %{ "$4 $dst, $src1, $6 $src2" %} + format %{ "$4 $dst, $src1, $5 $src2" %} ins_encode %{ __ $4(as_Register($dst$$reg), as_Register($src1$$reg), From aph at redhat.com Fri May 16 15:44:30 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 16 May 2014 15:44:30 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/hotspot: 2 new changesets Message-ID: <201405161544.s4GFiWOZ008648@aojmv0008> Changeset: 9608b7cabf9e Author: aph Date: 2014-05-16 09:06 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/9608b7cabf9e Merge from aarch64-port. ! agent/src/os/linux/LinuxDebuggerLocal.c ! agent/src/os/linux/libproc.h ! make/defs.make ! make/linux/makefiles/buildtree.make ! make/linux/makefiles/defs.make ! 
make/linux/makefiles/gcc.make + src/cpu/aarch64/vm/aarch64.ad + src/cpu/aarch64/vm/aarch64Test.cpp + src/cpu/aarch64/vm/aarch64_ad.m4 + src/cpu/aarch64/vm/aarch64_call.cpp + src/cpu/aarch64/vm/aarch64_linkage.S + src/cpu/aarch64/vm/ad_encode.m4 + src/cpu/aarch64/vm/assembler_aarch64.cpp + src/cpu/aarch64/vm/assembler_aarch64.hpp + src/cpu/aarch64/vm/assembler_aarch64.inline.hpp + src/cpu/aarch64/vm/bytecodeInterpreter_aarch64.cpp + src/cpu/aarch64/vm/bytecodeInterpreter_aarch64.hpp + src/cpu/aarch64/vm/bytecodeInterpreter_aarch64.inline.hpp + src/cpu/aarch64/vm/bytecodes_aarch64.cpp + src/cpu/aarch64/vm/bytecodes_aarch64.hpp + src/cpu/aarch64/vm/bytes_aarch64.hpp + src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp + src/cpu/aarch64/vm/c1_Defs_aarch64.hpp + src/cpu/aarch64/vm/c1_FpuStackSim_aarch64.cpp + src/cpu/aarch64/vm/c1_FpuStackSim_aarch64.hpp + src/cpu/aarch64/vm/c1_FrameMap_aarch64.cpp + src/cpu/aarch64/vm/c1_FrameMap_aarch64.hpp + src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp + src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.hpp + src/cpu/aarch64/vm/c1_LIRGenerator_aarch64.cpp + src/cpu/aarch64/vm/c1_LinearScan_aarch64.cpp + src/cpu/aarch64/vm/c1_LinearScan_aarch64.hpp + src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.cpp + src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp + src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp + src/cpu/aarch64/vm/c1_globals_aarch64.hpp + src/cpu/aarch64/vm/c2_globals_aarch64.hpp + src/cpu/aarch64/vm/c2_init_aarch64.cpp + src/cpu/aarch64/vm/codeBuffer_aarch64.hpp + src/cpu/aarch64/vm/compiledIC_aarch64.cpp + src/cpu/aarch64/vm/copy_aarch64.hpp + src/cpu/aarch64/vm/cppInterpreterGenerator_aarch64.hpp + src/cpu/aarch64/vm/cpustate_aarch64.hpp + src/cpu/aarch64/vm/debug_aarch64.cpp + src/cpu/aarch64/vm/decode_aarch64.hpp + src/cpu/aarch64/vm/depChecker_aarch64.cpp + src/cpu/aarch64/vm/depChecker_aarch64.hpp + src/cpu/aarch64/vm/disassembler_aarch64.hpp + src/cpu/aarch64/vm/frame_aarch64.cpp + src/cpu/aarch64/vm/frame_aarch64.hpp + 
src/cpu/aarch64/vm/frame_aarch64.inline.hpp + src/cpu/aarch64/vm/globalDefinitions_aarch64.hpp + src/cpu/aarch64/vm/globals_aarch64.hpp + src/cpu/aarch64/vm/icBuffer_aarch64.cpp + src/cpu/aarch64/vm/icache_aarch64.cpp + src/cpu/aarch64/vm/icache_aarch64.hpp + src/cpu/aarch64/vm/immediate_aarch64.cpp + src/cpu/aarch64/vm/immediate_aarch64.hpp + src/cpu/aarch64/vm/interp_masm_aarch64.cpp + src/cpu/aarch64/vm/interp_masm_aarch64.hpp + src/cpu/aarch64/vm/interpreterGenerator_aarch64.hpp + src/cpu/aarch64/vm/interpreterRT_aarch64.cpp + src/cpu/aarch64/vm/interpreterRT_aarch64.hpp + src/cpu/aarch64/vm/interpreter_aarch64.cpp + src/cpu/aarch64/vm/interpreter_aarch64.hpp + src/cpu/aarch64/vm/javaFrameAnchor_aarch64.hpp + src/cpu/aarch64/vm/jniFastGetField_aarch64.cpp + src/cpu/aarch64/vm/jniTypes_aarch64.hpp + src/cpu/aarch64/vm/jni_aarch64.h + src/cpu/aarch64/vm/macroAssembler_aarch64.cpp + src/cpu/aarch64/vm/macroAssembler_aarch64.hpp + src/cpu/aarch64/vm/macroAssembler_aarch64.inline.hpp + src/cpu/aarch64/vm/metaspaceShared_aarch64.cpp + src/cpu/aarch64/vm/methodHandles_aarch64.cpp + src/cpu/aarch64/vm/methodHandles_aarch64.hpp + src/cpu/aarch64/vm/nativeInst_aarch64.cpp + src/cpu/aarch64/vm/nativeInst_aarch64.hpp + src/cpu/aarch64/vm/registerMap_aarch64.hpp + src/cpu/aarch64/vm/register_aarch64.cpp + src/cpu/aarch64/vm/register_aarch64.hpp + src/cpu/aarch64/vm/register_definitions_aarch64.cpp + src/cpu/aarch64/vm/relocInfo_aarch64.cpp + src/cpu/aarch64/vm/relocInfo_aarch64.hpp + src/cpu/aarch64/vm/runtime_aarch64.cpp + src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp + src/cpu/aarch64/vm/stubGenerator_aarch64.cpp + src/cpu/aarch64/vm/stubRoutines_aarch64.cpp + src/cpu/aarch64/vm/stubRoutines_aarch64.hpp + src/cpu/aarch64/vm/templateInterpreterGenerator_aarch64.hpp + src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp + src/cpu/aarch64/vm/templateInterpreter_aarch64.hpp + src/cpu/aarch64/vm/templateTable_aarch64.cpp + src/cpu/aarch64/vm/templateTable_aarch64.hpp + 
src/cpu/aarch64/vm/vmStructs_aarch64.hpp + src/cpu/aarch64/vm/vm_version_aarch64.cpp + src/cpu/aarch64/vm/vm_version_aarch64.hpp + src/cpu/aarch64/vm/vmreg_aarch64.cpp + src/cpu/aarch64/vm/vmreg_aarch64.hpp + src/cpu/aarch64/vm/vmreg_aarch64.inline.hpp + src/cpu/aarch64/vm/vtableStubs_aarch64.cpp ! src/os/linux/vm/os_linux.cpp ! src/os/linux/vm/thread_linux.inline.hpp + src/os_cpu/linux_aarch64/vm/assembler_linux_aarch64.cpp + src/os_cpu/linux_aarch64/vm/atomic_linux_aarch64.inline.hpp + src/os_cpu/linux_aarch64/vm/bytes_linux_aarch64.inline.hpp + src/os_cpu/linux_aarch64/vm/copy_linux_aarch64.inline.hpp + src/os_cpu/linux_aarch64/vm/globals_linux_aarch64.hpp + src/os_cpu/linux_aarch64/vm/linux_aarch64.S + src/os_cpu/linux_aarch64/vm/linux_aarch64.ad + src/os_cpu/linux_aarch64/vm/orderAccess_linux_aarch64.inline.hpp + src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp + src/os_cpu/linux_aarch64/vm/os_linux_aarch64.hpp + src/os_cpu/linux_aarch64/vm/os_linux_aarch64.inline.hpp + src/os_cpu/linux_aarch64/vm/prefetch_linux_aarch64.inline.hpp + src/os_cpu/linux_aarch64/vm/threadLS_linux_aarch64.cpp + src/os_cpu/linux_aarch64/vm/threadLS_linux_aarch64.hpp + src/os_cpu/linux_aarch64/vm/thread_linux_aarch64.cpp + src/os_cpu/linux_aarch64/vm/thread_linux_aarch64.hpp + src/os_cpu/linux_aarch64/vm/vmStructs_linux_aarch64.hpp + src/os_cpu/linux_aarch64/vm/vm_version_linux_aarch64.cpp ! src/share/vm/adlc/main.cpp ! src/share/vm/asm/assembler.hpp ! src/share/vm/asm/assembler.inline.hpp ! src/share/vm/asm/codeBuffer.hpp ! src/share/vm/asm/macroAssembler.hpp ! src/share/vm/asm/macroAssembler.inline.hpp ! src/share/vm/asm/register.hpp ! src/share/vm/c1/c1_Canonicalizer.cpp ! src/share/vm/c1/c1_Defs.hpp ! src/share/vm/c1/c1_FpuStackSim.hpp ! src/share/vm/c1/c1_FrameMap.cpp ! src/share/vm/c1/c1_FrameMap.hpp ! src/share/vm/c1/c1_LIR.cpp ! src/share/vm/c1/c1_LIR.hpp ! src/share/vm/c1/c1_LIRAssembler.cpp ! src/share/vm/c1/c1_LIRAssembler.hpp ! src/share/vm/c1/c1_LinearScan.cpp ! 
src/share/vm/c1/c1_LinearScan.hpp ! src/share/vm/c1/c1_MacroAssembler.hpp ! src/share/vm/c1/c1_Runtime1.cpp ! src/share/vm/c1/c1_Runtime1.hpp ! src/share/vm/c1/c1_globals.hpp ! src/share/vm/classfile/bytecodeAssembler.cpp ! src/share/vm/classfile/classFileStream.hpp ! src/share/vm/classfile/stackMapTable.hpp ! src/share/vm/classfile/verifier.cpp ! src/share/vm/code/codeBlob.cpp ! src/share/vm/code/compiledIC.hpp ! src/share/vm/code/relocInfo.hpp ! src/share/vm/code/vmreg.hpp ! src/share/vm/compiler/disassembler.cpp ! src/share/vm/compiler/disassembler.hpp ! src/share/vm/interpreter/abstractInterpreter.hpp ! src/share/vm/interpreter/bytecode.hpp ! src/share/vm/interpreter/bytecodeInterpreter.cpp ! src/share/vm/interpreter/bytecodeInterpreter.hpp ! src/share/vm/interpreter/bytecodeInterpreter.inline.hpp ! src/share/vm/interpreter/bytecodeStream.hpp ! src/share/vm/interpreter/bytecodes.cpp ! src/share/vm/interpreter/bytecodes.hpp ! src/share/vm/interpreter/cppInterpreter.hpp ! src/share/vm/interpreter/cppInterpreterGenerator.hpp ! src/share/vm/interpreter/interpreter.hpp ! src/share/vm/interpreter/interpreterGenerator.hpp ! src/share/vm/interpreter/interpreterRuntime.cpp ! src/share/vm/interpreter/interpreterRuntime.hpp ! src/share/vm/interpreter/templateInterpreter.hpp ! src/share/vm/interpreter/templateInterpreterGenerator.hpp ! src/share/vm/interpreter/templateTable.hpp ! src/share/vm/oops/constantPool.hpp ! src/share/vm/oops/oop.inline.hpp ! src/share/vm/oops/typeArrayOop.hpp ! src/share/vm/opto/buildOopMap.cpp ! src/share/vm/opto/c2_globals.hpp ! src/share/vm/opto/c2compiler.cpp ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/gcm.cpp ! src/share/vm/opto/generateOptoStub.cpp ! src/share/vm/opto/lcm.cpp ! src/share/vm/opto/locknode.hpp ! src/share/vm/opto/matcher.cpp ! src/share/vm/opto/output.hpp ! src/share/vm/opto/regmask.cpp ! src/share/vm/opto/regmask.hpp ! src/share/vm/opto/runtime.cpp ! src/share/vm/prims/jniCheck.cpp ! src/share/vm/prims/jni_md.h ! 
src/share/vm/prims/jvmtiClassFileReconstituter.cpp ! src/share/vm/prims/methodHandles.hpp ! src/share/vm/runtime/atomic.inline.hpp ! src/share/vm/runtime/deoptimization.cpp ! src/share/vm/runtime/dtraceJSDT.hpp ! src/share/vm/runtime/frame.cpp ! src/share/vm/runtime/frame.hpp ! src/share/vm/runtime/frame.inline.hpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/icache.hpp ! src/share/vm/runtime/java.cpp ! src/share/vm/runtime/javaCalls.hpp ! src/share/vm/runtime/javaFrameAnchor.hpp ! src/share/vm/runtime/os.hpp ! src/share/vm/runtime/registerMap.hpp ! src/share/vm/runtime/relocator.hpp ! src/share/vm/runtime/safepoint.cpp ! src/share/vm/runtime/sharedRuntime.cpp ! src/share/vm/runtime/stackValueCollection.cpp ! src/share/vm/runtime/statSampler.cpp ! src/share/vm/runtime/stubRoutines.hpp ! src/share/vm/runtime/thread.hpp ! src/share/vm/runtime/threadLocalStorage.hpp ! src/share/vm/runtime/vmStructs.cpp ! src/share/vm/runtime/vm_version.cpp ! src/share/vm/utilities/copy.hpp ! src/share/vm/utilities/globalDefinitions.hpp ! src/share/vm/utilities/macros.hpp ! src/share/vm/utilities/taskqueue.hpp Changeset: f1eaccaeed61 Author: aph Date: 2014-05-16 11:42 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/f1eaccaeed61 Disable biased locking ! src/share/vm/runtime/globals.hpp From aph at redhat.com Fri May 16 15:44:47 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 16 May 2014 15:44:47 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9: 2 new changesets Message-ID: <201405161544.s4GFilgx008710@aojmv0008> Changeset: 6affd637d719 Author: aph Date: 2014-05-16 09:02 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/rev/6affd637d719 Merge from aarch64-port. ! common/autoconf/build-aux/config.sub ! common/autoconf/flags.m4 ! common/autoconf/jdk-options.m4 ! 
common/autoconf/platform.m4 Changeset: 7adc2f18b573 Author: aph Date: 2014-05-16 11:41 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/rev/7adc2f18b573 Merge from aarch64-port. ! common/autoconf/flags.m4 ! common/autoconf/generated-configure.sh From aph at redhat.com Fri May 16 15:44:56 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 16 May 2014 15:44:56 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/jdk: Merge from aarch64-port. Message-ID: <201405161545.s4GFj339008791@aojmv0008> Changeset: 4dfa2ea6a6e7 Author: aph Date: 2014-05-01 13:41 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/jdk/rev/4dfa2ea6a6e7 Merge from aarch64-port. ! make/lib/SoundLibraries.gmk ! src/share/native/com/sun/media/sound/SoundDefs.h From aph at redhat.com Fri May 16 15:50:44 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 16 May 2014 16:50:44 +0100 Subject: [aarch64-port-dev ] JDK9 Message-ID: <537633D4.7080802@redhat.com> http://hg.openjdk.java.net/aarch64-port/jdk9 is now available. It is based on jdk9-b10. It has all the AArch64 code, and I've tried very hard to make the changes to shared code as minimal as possible. The only substantive difference to shared code is that biased locking is disabled by default because it doesn't work. It is slightly out of date with respect to AArch64's JDK8: we should start merging patches from there. All merges from JDK8 are pre-approved except changes to shared code, which require approval from me. This will become the staging repository for moving this port upstream. Andrew. From aph at redhat.com Mon May 19 11:22:06 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 19 May 2014 12:22:06 +0100 Subject: [aarch64-port-dev ] Correct memory barriers Message-ID: <5379E95E.8000802@redhat.com> We weren't placing memory barriers correctly around locks; fixed thusly. Also, I've taken the opportunity to tidy up the memory barrier code in the assembler. Andrew. 
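The assembler tidy-up rests on a small C++ point: OR-ing two enumerators of an unscoped enum yields an int, which does not convert back to the enum implicitly, so every call site needed an explicit Membar_mask_bits(...) cast. A minimal sketch of the overload approach follows. The bit values, the `last` field, the recording membar() and emit_acquire() are hypothetical stand-ins for illustration only; the real encodings live in assembler_aarch64.hpp.

```cpp
#include <cassert>

struct Assembler {
  // Hypothetical bit values, chosen only so that each mask is distinct.
  enum Membar_mask_bits {
    LoadLoad   = 1 << 0,
    LoadStore  = 1 << 1,
    StoreLoad  = 1 << 2,
    StoreStore = 1 << 3
  };

  Membar_mask_bits last = LoadLoad;  // last mask passed to membar()

  // Stand-in for the real emitter: it only records the requested mask.
  void membar(Membar_mask_bits order) { last = order; }
};

// Without this overload, LoadLoad | LoadStore has type int, and
// membar(LoadLoad | LoadStore) does not compile in C++.
inline Assembler::Membar_mask_bits operator|(Assembler::Membar_mask_bits a,
                                             Assembler::Membar_mask_bits b) {
  // Converting to unsigned first selects the built-in operator|,
  // so the overload does not call itself.
  return Assembler::Membar_mask_bits(unsigned(a) | unsigned(b));
}

// Usage: an acquire-style barrier with no cast at the call site.
inline void emit_acquire(Assembler& masm) {
  masm.membar(Assembler::LoadLoad | Assembler::LoadStore);
  assert(masm.last == Assembler::Membar_mask_bits(3));
}
```

Keeping the overload next to the enum means the cast is written once, and call sites read as plain combinations of ordering constraints.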
# HG changeset patch # User aph # Date 1400498360 14400 # Mon May 19 07:19:20 2014 -0400 # Node ID a1363bc1be1dbe8a3bbe72732c30a27487753ed6 # Parent f1eaccaeed618cb852a652c18d343e7f1beffef6 Correct and tidy up memory barriers. diff -r f1eaccaeed61 -r a1363bc1be1d src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Fri May 16 11:42:50 2014 -0400 +++ b/src/cpu/aarch64/vm/aarch64.ad Mon May 19 07:19:20 2014 -0400 @@ -3143,11 +3143,10 @@ __ ldr(disp_hdr, Address(tmp, ObjectMonitor::cxq_offset_in_bytes())); __ orr(rscratch1, rscratch1, disp_hdr); // Will be 0 if both are 0. __ cmp(rscratch1, zr); - __ br(Assembler::NE, cont); + __ cbnz(rscratch1, cont); // need a release store here __ lea(tmp, Address(tmp, ObjectMonitor::owner_offset_in_bytes())); - __ mov(zr, rscratch1); - __ stlr(rscratch1, tmp); + __ stlr(rscratch1, tmp); // rscratch1 is zero } __ bind(cont); @@ -5818,14 +5817,14 @@ // ============================================================================ // MemBar Instruction -instruct membar_acquire() %{ +instruct load_fence() %{ match(LoadFence); ins_cost(VOLATILE_REF_COST); format %{ "membar_acquire" %} ins_encode %{ - __ membar(Assembler::Membar_mask_bits(Assembler::LoadLoad|Assembler::LoadStore)); + __ membar(Assembler::LoadLoad|Assembler::LoadStore); %} ins_pipe(pipe_class_memory); %} @@ -5834,7 +5833,7 @@ match(MemBarAcquire); ins_cost(0); - format %{ " -- \t// redundant MEMBAR-acquire - empty" %} + format %{ "membar_acquire (elided)" %} ins_encode %{ __ block_comment("membar_acquire (elided)"); @@ -5845,12 +5844,12 @@ instruct membar_acquire_lock() %{ match(MemBarAcquireLock); - ins_cost(0); - - format %{ " -- \t// redundant MEMBAR-acquire - empty (acquire as part of CAS in prior FastLock)" %} - - ins_encode %{ - __ block_comment("membar_acquire_lock (elided)"); + ins_cost(VOLATILE_REF_COST); + + format %{ "membar_acquire_lock" %} + + ins_encode %{ + __ membar(Assembler::LoadLoad|Assembler::LoadStore); %} ins_pipe(pipe_class_memory); @@ 
-5863,7 +5862,7 @@ format %{ "store_fence" %} ins_encode %{ - __ membar(Assembler::AnyAny); + __ membar(Assembler::StoreLoad|Assembler::StoreStore); %} ins_pipe(pipe_class_memory); %} @@ -5872,7 +5871,7 @@ match(MemBarRelease); ins_cost(0); - format %{ "membar_release" %} + format %{ "membar_release (elided)" %} ins_encode %{ __ block_comment("membar_release (elided)"); @@ -5894,10 +5893,12 @@ instruct membar_release_lock() %{ match(MemBarReleaseLock); - ins_cost(0); - - ins_encode %{ - __ block_comment("membar_release_lock (elided)"); + ins_cost(VOLATILE_REF_COST); + + format %{ "membar_release_lock" %} + + ins_encode %{ + __ membar(Assembler::StoreLoad|Assembler::StoreStore); %} ins_pipe(pipe_class_memory); @@ -5916,36 +5917,6 @@ ins_pipe(pipe_class_memory); %} -// This optimization is wrong on PPC and may be wrong on AArch64. The -// following pattern is not supported: -// MemBarVolatile -// ^ ^ -// | | -// CtrlProj MemProj -// ^ ^ -// | | -// | Load -// | -// MemBarVolatile -// -// The first MemBarVolatile could get optimized out! According to -// Vladimir, this pattern can not occur on Oracle platforms. -// However, it does occur on PPC64 (because of membars in -// inline_unsafe_load_store). 
-// -// instruct unnecessary_membar_volatile() %{ -// match(MemBarVolatile); -// predicate(Matcher::post_store_load_barrier(n)); -// ins_cost(0); - -// format %{ "!MEMBAR-volatile (unnecessary so empty encoding)" %} -// ins_encode %{ -// __ block_comment("unnecessary_membar_volatile"); -// __ membar(Assembler::AnyAny); -// %} -// ins_pipe(pipe_class_empty); -// %} - // ============================================================================ // Cast/Convert Instructions diff -r f1eaccaeed61 -r a1363bc1be1d src/cpu/aarch64/vm/assembler_aarch64.hpp --- a/src/cpu/aarch64/vm/assembler_aarch64.hpp Fri May 16 11:42:50 2014 -0400 +++ b/src/cpu/aarch64/vm/assembler_aarch64.hpp Mon May 19 07:19:20 2014 -0400 @@ -1992,6 +1992,11 @@ void emit_data64(jlong data, RelocationHolder const& rspec, int format = 0); }; +inline Assembler::Membar_mask_bits operator|(Assembler::Membar_mask_bits a, + Assembler::Membar_mask_bits b) { + return Assembler::Membar_mask_bits(unsigned(a)|unsigned(b)); +} + Instruction_aarch64::~Instruction_aarch64() { assem->emit(); } @@ -2003,8 +2008,6 @@ return Assembler::Condition(int(cond) ^ 1); } -// extra stuff needed to compile -// not sure which of these methods are really necessary class BiasedLockingCounters; extern "C" void das(uint64_t start, int len); diff -r f1eaccaeed61 -r a1363bc1be1d src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp --- a/src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp Fri May 16 11:42:50 2014 -0400 +++ b/src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp Mon May 19 07:19:20 2014 -0400 @@ -1257,7 +1257,7 @@ assert((int)CardTableModRefBS::dirty_card_val() == 0, "must be 0"); - __ membar(Assembler::Membar_mask_bits(Assembler::StoreLoad)); + __ membar(Assembler::StoreLoad); __ ldrb(rscratch1, Address(card_addr, offset)); __ cbzw(rscratch1, done); diff -r f1eaccaeed61 -r a1363bc1be1d src/cpu/aarch64/vm/globalDefinitions_aarch64.hpp --- a/src/cpu/aarch64/vm/globalDefinitions_aarch64.hpp Fri May 16 11:42:50 2014 -0400 +++ 
b/src/cpu/aarch64/vm/globalDefinitions_aarch64.hpp Mon May 19 07:19:20 2014 -0400 @@ -37,7 +37,4 @@ #define SUPPORTS_NATIVE_CX8 -// AArch64 is NOT multiple-copy-atomic. -#define CPU_NOT_MULTIPLE_COPY_ATOMIC - #endif // CPU_AARCH64_VM_GLOBALDEFINITIONS_AARCH64_HPP diff -r f1eaccaeed61 -r a1363bc1be1d src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Fri May 16 11:42:50 2014 -0400 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 19 07:19:20 2014 -0400 @@ -2520,7 +2520,7 @@ assert((int)CardTableModRefBS::dirty_card_val() == 0, "must be 0"); - membar(Assembler::Membar_mask_bits(Assembler::StoreLoad)); + membar(Assembler::StoreLoad); ldrb(tmp2, Address(card_addr, offset)); cbzw(tmp2, done); diff -r f1eaccaeed61 -r a1363bc1be1d src/cpu/aarch64/vm/templateTable_aarch64.cpp --- a/src/cpu/aarch64/vm/templateTable_aarch64.cpp Fri May 16 11:42:50 2014 -0400 +++ b/src/cpu/aarch64/vm/templateTable_aarch64.cpp Mon May 19 07:19:20 2014 -0400 @@ -2405,8 +2405,7 @@ __ bind(Done); // It's really not worth bothering to check whether this field // really is volatile in the slow case. 
- __ membar(MacroAssembler::Membar_mask_bits(MacroAssembler::LoadLoad | - MacroAssembler::LoadStore)); + __ membar(MacroAssembler::LoadLoad | MacroAssembler::LoadStore); } @@ -2498,7 +2497,7 @@ { Label notVolatile; __ tbz(r5, ConstantPoolCacheEntry::is_volatile_shift, notVolatile); - __ membar(MacroAssembler::Membar_mask_bits(MacroAssembler::StoreStore)); + __ membar(MacroAssembler::StoreStore); __ bind(notVolatile); } @@ -2645,7 +2644,7 @@ { Label notVolatile; __ tbz(r5, ConstantPoolCacheEntry::is_volatile_shift, notVolatile); - __ membar(MacroAssembler::Membar_mask_bits(MacroAssembler::StoreLoad)); + __ membar(MacroAssembler::StoreLoad); __ bind(notVolatile); } } @@ -2734,7 +2733,7 @@ { Label notVolatile; __ tbz(r3, ConstantPoolCacheEntry::is_volatile_shift, notVolatile); - __ membar(MacroAssembler::Membar_mask_bits(MacroAssembler::StoreStore)); + __ membar(MacroAssembler::StoreStore); __ bind(notVolatile); } @@ -2778,7 +2777,7 @@ { Label notVolatile; __ tbz(r3, ConstantPoolCacheEntry::is_volatile_shift, notVolatile); - __ membar(MacroAssembler::Membar_mask_bits(MacroAssembler::StoreLoad)); + __ membar(MacroAssembler::StoreLoad); __ bind(notVolatile); } } @@ -2855,8 +2854,7 @@ { Label notVolatile; __ tbz(r3, ConstantPoolCacheEntry::is_volatile_shift, notVolatile); - __ membar(MacroAssembler::Membar_mask_bits(MacroAssembler::LoadLoad | - MacroAssembler::LoadStore)); + __ membar(MacroAssembler::LoadLoad | MacroAssembler::LoadStore); __ bind(notVolatile); } } From aph at redhat.com Mon May 19 11:22:15 2014 From: aph at redhat.com (aph at redhat.com) Date: Mon, 19 May 2014 11:22:15 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/hotspot: Correct and tidy up memory barriers. Message-ID: <201405191122.s4JBMGi9003348@aojmv0008> Changeset: a1363bc1be1d Author: aph Date: 2014-05-19 07:19 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/a1363bc1be1d Correct and tidy up memory barriers. ! src/cpu/aarch64/vm/aarch64.ad ! 
src/cpu/aarch64/vm/assembler_aarch64.hpp ! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp ! src/cpu/aarch64/vm/globalDefinitions_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/templateTable_aarch64.cpp From edward.nevill at linaro.org Mon May 19 12:37:58 2014 From: edward.nevill at linaro.org (Edward Nevill) Date: Mon, 19 May 2014 13:37:58 +0100 Subject: [aarch64-port-dev ] RFR: JDK9: Add missing aarch64 platform files Message-ID: <1400503078.9093.4.camel@localhost.localdomain> Hi, The following aarch64 platform specific files seem to be missing in the jdk9 tree. hotspot/make/linux/makefiles/aarch64.make hotspot/make/linux/platform_aarch64 jdk/src/solaris/bin/aarch64/jvm.cfg It does not build natively, or x-compiled without them. Ok to push? Ed. --- CUT HERE --- # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1400502685 -3600 # Mon May 19 13:31:25 2014 +0100 # Node ID ccec36f8fb8c943c471fe83b755506d27cc9e1e0 # Parent a1363bc1be1dbe8a3bbe72732c30a27487753ed6 Add missing aarch64 platform files diff -r a1363bc1be1d -r ccec36f8fb8c make/linux/makefiles/aarch64.make --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/make/linux/makefiles/aarch64.make Mon May 19 13:31:25 2014 +0100 @@ -0,0 +1,38 @@ +# +# Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved. +# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. +# +# This code is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License version 2 only, as +# published by the Free Software Foundation. +# +# This code is distributed in the hope that it will be useful, but WITHOUT +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +# version 2 for more details (a copy is included in the LICENSE file that +# accompanied this code). 
+# +# You should have received a copy of the GNU General Public License version +# 2 along with this work; if not, write to the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. +# +# Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA +# or visit www.oracle.com if you need additional information or have any +# questions. +# +# + +# The copied fdlibm routines in sharedRuntimeTrig.o must not be optimized +OPT_CFLAGS/sharedRuntimeTrig.o = $(OPT_CFLAGS/NOOPT) +# The copied fdlibm routines in sharedRuntimeTrans.o must not be optimized +OPT_CFLAGS/sharedRuntimeTrans.o = $(OPT_CFLAGS/NOOPT) +# Must also specify if CPU is little endian +CFLAGS += -DVM_LITTLE_ENDIAN + +ifeq ($(BUILTIN_SIM), true) +CFLAGS += -DBUILTIN_SIM -DALLOW_OPERATOR_NEW_USAGE +endif + +# CFLAGS += -D_LP64=1 + +OPT_CFLAGS/compactingPermGenGen.o = -O1 diff -r a1363bc1be1d -r ccec36f8fb8c make/linux/platform_aarch64 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/make/linux/platform_aarch64 Mon May 19 13:31:25 2014 +0100 @@ -0,0 +1,15 @@ +os_family = linux + +arch = aarch64 + +arch_model = aarch64 + +os_arch = linux_aarch64 + +os_arch_model = linux_aarch64 + +lib_arch = aarch64 + +compiler = gcc + +sysdefs = -DLINUX -D_GNU_SOURCE -DAARCH64 --- CUT HERE --- --- CUT HERE --- # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1400502736 -3600 # Mon May 19 13:32:16 2014 +0100 # Node ID eb1b4daa7e45e3caa7d8032d2fe2d2fd437337a3 # Parent 4dfa2ea6a6e7bfe71d0a0a23a88da70cf7373619 Add missing jvm.cfg diff -r 4dfa2ea6a6e7 -r eb1b4daa7e45 src/solaris/bin/aarch64/jvm.cfg --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/solaris/bin/aarch64/jvm.cfg Mon May 19 13:32:16 2014 +0100 @@ -0,0 +1,38 @@ +# +# +# +# Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. +# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
+# +# This code is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License version 2 only, as +# published by the Free Software Foundation. Oracle designates this +# particular file as subject to the "Classpath" exception as provided +# by Oracle in the LICENSE file that accompanied this code. +# +# This code is distributed in the hope that it will be useful, but WITHOUT +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +# version 2 for more details (a copy is included in the LICENSE file that +# accompanied this code). +# +# You should have received a copy of the GNU General Public License version +# 2 along with this work; if not, write to the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. +# +# Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA +# or visit www.oracle.com if you need additional information or have any +# questions. +# +# +# List of JVMs that can be used as an option to java, javac, etc. +# Order is important -- first in this list is the default JVM. +# NOTE that this both this file and its format are UNSUPPORTED and +# WILL GO AWAY in a future release. +# +# You may also select a JVM in an arbitrary location with the +# "-XXaltjvm=" option, but that too is unsupported +# and may not be available in a future release. 
+# +-server KNOWN +-client IGNORE --- CUT HERE --- From aph at redhat.com Mon May 19 13:04:13 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 19 May 2014 14:04:13 +0100 Subject: [aarch64-port-dev ] RFR: JDK9: Add missing aarch64 platform files In-Reply-To: <1400503078.9093.4.camel@localhost.localdomain> References: <1400503078.9093.4.camel@localhost.localdomain> Message-ID: <537A014D.3060208@redhat.com> On 05/19/2014 01:37 PM, Edward Nevill wrote: > The following aarch64 platform specific files seem to be missing in the jdk9 tree. > > hotspot/make/linux/makefiles/aarch64.make > hotspot/make/linux/platform_aarch64 > jdk/src/solaris/bin/aarch64/jvm.cfg > > It does not build natively, or x-compiled without them. > > Ok to push? Please let me do it: they're all in my tree. Andrew. From aph at redhat.com Mon May 19 13:07:35 2014 From: aph at redhat.com (aph at redhat.com) Date: Mon, 19 May 2014 13:07:35 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/jdk: Merge from aarch64-port. Message-ID: <201405191307.s4JD7hL2019647@aojmv0008> Changeset: 390d54dd8434 Author: aph Date: 2014-05-19 09:07 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/jdk/rev/390d54dd8434 Merge from aarch64-port. + src/solaris/bin/aarch64/jvm.cfg From aph at redhat.com Mon May 19 13:08:33 2014 From: aph at redhat.com (aph at redhat.com) Date: Mon, 19 May 2014 13:08:33 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/hotspot: Merge from aarch64-port. Message-ID: <201405191308.s4JD8YTT019830@aojmv0008> Changeset: fe8023ec3b4a Author: aph Date: 2014-05-19 09:08 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/fe8023ec3b4a Merge from aarch64-port. 
+ make/linux/makefiles/aarch64.make + make/linux/platform_aarch64 From edward.nevill at linaro.org Mon May 19 15:23:08 2014 From: edward.nevill at linaro.org (Edward Nevill) Date: Mon, 19 May 2014 16:23:08 +0100 Subject: [aarch64-port-dev ] RFR: JDK9: Fix biased locking Message-ID: <1400512988.25209.4.camel@localhost.localdomain> Hi, The following patch fixes biased locking and re-enables it. I have tested it with jtreg/hotspot, jtreg/langtools and SPECjvm2008. This patch modifies shared code (it re-enables biased locking). OK to push? Ed. --- CUT HERE --- # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1400512495 -3600 # Mon May 19 16:14:55 2014 +0100 # Node ID ddbcca4965f29762ced6755229642c9f4694536e # Parent a1363bc1be1dbe8a3bbe72732c30a27487753ed6 Fix biased locking and re-enable diff -r a1363bc1be1d -r ddbcca4965f2 src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 19 07:19:20 2014 -0400 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon May 19 16:14:55 2014 +0100 @@ -414,7 +414,7 @@ Label here; load_prototype_header(tmp_reg, obj_reg); orr(tmp_reg, rthread, tmp_reg); - cmpxchgptr(tmp_reg, swap_reg, obj_reg, rscratch1, here, slow_case); + cmpxchgptr(swap_reg, tmp_reg, obj_reg, rscratch1, here, slow_case); // If the biasing toward our thread failed, then another thread // succeeded in biasing it toward itself and we need to revoke that // bias. The revocation will occur in the runtime in the slow case. 
@@ -441,7 +441,7 @@ { Label here, nope; load_prototype_header(tmp_reg, obj_reg); - cmpxchgptr(tmp_reg, swap_reg, obj_reg, rscratch1, here, &nope); + cmpxchgptr(swap_reg, tmp_reg, obj_reg, rscratch1, here, &nope); bind(here); // Fall through to the normal CAS-based lock, because no matter what diff -r a1363bc1be1d -r ddbcca4965f2 src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp --- a/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Mon May 19 07:19:20 2014 -0400 +++ b/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Mon May 19 16:14:55 2014 +0100 @@ -1818,7 +1818,7 @@ __ ldr(obj_reg, Address(oop_handle_reg, 0)); if (UseBiasedLocking) { - // __ biased_locking_enter(lock_reg, obj_reg, swap_reg, rscratch1, false, lock_done, &slow_path_lock); + __ biased_locking_enter(lock_reg, obj_reg, swap_reg, rscratch2, false, lock_done, &slow_path_lock); } // Load (object->mark() | 1) into swap_reg %r0 diff -r a1363bc1be1d -r ddbcca4965f2 src/share/vm/runtime/globals.hpp --- a/src/share/vm/runtime/globals.hpp Mon May 19 07:19:20 2014 -0400 +++ b/src/share/vm/runtime/globals.hpp Mon May 19 16:14:55 2014 +0100 @@ -1257,7 +1257,7 @@ product(bool, RestrictContended, true, \ "Restrict @Contended to trusted classes") \ \ - product(bool, UseBiasedLocking, false, \ + product(bool, UseBiasedLocking, true, \ "Enable biased locking in JVM") \ \ product(intx, BiasedLockingStartupDelay, 4000, \ --- CUT HERE --- From aph at redhat.com Mon May 19 15:28:08 2014 From: aph at redhat.com (Andrew Haley) Date: Mon, 19 May 2014 16:28:08 +0100 Subject: [aarch64-port-dev ] RFR: JDK9: Fix biased locking In-Reply-To: <1400512988.25209.4.camel@localhost.localdomain> References: <1400512988.25209.4.camel@localhost.localdomain> Message-ID: <537A2308.6010904@redhat.com> On 05/19/2014 04:23 PM, Edward Nevill wrote: > OK to push? Yes, certainly, thanks. Andrew. 
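The core of the biased locking fix is the operand order of the compare-and-swap. A minimal sketch in portable C++ follows, assuming (this is an assumption, not something stated in the thread) that cmpxchgptr takes the expected old value first and the new value second, that swap_reg holds the expected mark word, and that tmp_reg holds the value to install. The names cas_ptr, try_bias and the constants are hypothetical stand-ins, not HotSpot code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for cmpxchgptr(oldv, newv, addr, ...):
// stores newv into *addr only if *addr currently equals oldv.
// Returns true on success.
inline bool cas_ptr(std::uintptr_t oldv, std::uintptr_t newv,
                    std::atomic<std::uintptr_t>* addr) {
  return addr->compare_exchange_strong(oldv, newv);
}

// Illustrative values only; this is not the real mark word layout.
constexpr std::uintptr_t kUnbiasedMark = 0x5;     // role of swap_reg
constexpr std::uintptr_t kThreadBits   = 0x1000;  // role of rthread

// Try to bias the object toward "our" thread. With swapped == true the
// operands are passed the wrong way round, as in the pre-patch code.
inline bool try_bias(std::atomic<std::uintptr_t>* mark_word, bool swapped) {
  assert(mark_word != nullptr);
  std::uintptr_t expected = kUnbiasedMark;                // role of swap_reg
  std::uintptr_t desired  = kUnbiasedMark | kThreadBits;  // role of tmp_reg
  return swapped ? cas_ptr(desired, expected, mark_word)
                 : cas_ptr(expected, desired, mark_word);
}
```

With the operands swapped, the comparison is made against the value that was meant to be stored, so under these assumptions the exchange fails whenever the object is still unbiased and the fast path is never taken.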
From ed at camswl.com Mon May 19 15:38:53 2014 From: ed at camswl.com (ed at camswl.com) Date: Mon, 19 May 2014 15:38:53 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/hotspot: Fix biased locking and re-enable Message-ID: <201405191538.s4JFcsgu014815@aojmv0008> Changeset: fa02c3b67b5b Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-19 16:38 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/fa02c3b67b5b Fix biased locking and re-enable ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp ! src/share/vm/runtime/globals.hpp From ed at camswl.com Tue May 20 12:30:10 2014 From: ed at camswl.com (Edward Nevill) Date: Tue, 20 May 2014 13:30:10 +0100 Subject: [aarch64-port-dev ] JDK9: Merge in recent changes from JDK8 tree Message-ID: <1400589010.27686.7.camel@mint> Hi, The following changesets merge the JDK9 hotspot tree to the same state as the JDK8 tip. I have tested the resultant tree with jtreg hotspot and langtools. None of the changesets involve changes to shared code. Regards, Ed. --- CUT HERE --- changeset: 6285:b432f481e62c user: aph date: Wed Apr 23 09:26:04 2014 -0400 files: src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp src/cpu/aarch64/vm/register_aarch64.hpp description: Add a constructor as a conversion from Register - RegSet. Use it. 
changeset: 6286:212b2ca3ad1c user: Edward Nevill edward.nevill at linaro.org date: Tue Apr 29 14:58:56 2014 +0100 files: src/cpu/aarch64/vm/aarch64.ad description: Minor optimisation for divide by 2 changeset: 6287:4c9a7428f67b user: Edward Nevill edward.nevill at linaro.org date: Thu May 01 14:57:36 2014 +0100 files: src/cpu/aarch64/vm/nativeInst_aarch64.hpp description: Fix instruction size from 8 to 4 changeset: 6288:1ca541a15aff user: Edward Nevill edward.nevill at linaro.org date: Wed May 07 16:41:56 2014 +0100 files: src/cpu/aarch64/vm/aarch64.ad src/cpu/aarch64/vm/c1_globals_aarch64.hpp src/cpu/aarch64/vm/macroAssembler_aarch64.cpp description: Improvements to safepoint polling changeset: 6289:5b18c3dc4c4a user: Edward Nevill edward.nevill at linaro.org date: Mon May 12 13:39:41 2014 +0100 files: src/cpu/aarch64/vm/aarch64.ad src/cpu/aarch64/vm/macroAssembler_aarch64.cpp src/cpu/aarch64/vm/macroAssembler_aarch64.hpp description: Optimise C2 entry point verification changeset: 6290:777a522742fa user: Edward Nevill edward.nevill at linaro.org date: Mon May 12 13:41:43 2014 +0100 files: src/cpu/aarch64/vm/globals_aarch64.hpp description: Make code entry alignment 64 for C2 changeset: 6291:6cd24413811b user: Edward Nevill edward.nevill at linaro.org date: Tue May 13 16:09:08 2014 +0100 files: src/cpu/aarch64/vm/aarch64.ad description: Optimise long divide by 2 changeset: 6292:8a1d95a0fb86 user: aph date: Mon May 12 14:34:00 2014 +0100 files: src/cpu/aarch64/vm/aarch64.ad description: Fix opto assembly for shifts. changeset: 6293:aa29549aca08 user: aph date: Mon May 12 16:26:39 2014 +0100 files: src/cpu/aarch64/vm/aarch64.ad src/cpu/aarch64/vm/macroAssembler_aarch64.cpp description: Tidy up stack frame handling. changeset: 6294:27d284544a1a user: aph date: Tue May 13 15:57:30 2014 +0100 files: src/cpu/aarch64/vm/macroAssembler_aarch64.cpp description: Improve code generation for pop(), as suggested by Edward Nevill. 
changeset: 6295:5dc01b82a254
user: aph
date: Tue May 13 16:28:22 2014 +0100
files: src/cpu/aarch64/vm/register_aarch64.hpp
description:
Add RegSet::operator+=.

changeset: 6296:f28c0648a91a
user: aph
date: Tue May 13 16:49:25 2014 +0100
files: src/cpu/aarch64/vm/macroAssembler_aarch64.cpp src/os_cpu/linux_aarch64/vm/assembler_linux_aarch64.cpp
description:
Tidy up register usage in push/pop instructions.

changeset: 6297:05fca26b84c9
tag: tip
user: Edward Nevill edward.nevill at linaro.org
date: Tue May 13 20:22:36 2014 +0100
files: src/cpu/aarch64/vm/aarch64.ad src/cpu/aarch64/vm/macroAssembler_aarch64.cpp src/cpu/aarch64/vm/relocInfo_aarch64.cpp
description:
Optimise addressing of card table byte map base

--- CUT HERE ---

From ed at camswl.com Wed May 21 12:44:24 2014
From: ed at camswl.com (ed at camswl.com)
Date: Wed, 21 May 2014 12:44:24 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/hotspot: 13 new changesets
Message-ID: <201405211244.s4LCidp4013458@aojmv0008>

Changeset: b432f481e62c
Author: aph
Date: 2014-04-23 09:26 -0400
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/b432f481e62c

Add a constructor as a conversion from Register - RegSet. Use it.

! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp
! src/cpu/aarch64/vm/register_aarch64.hpp

Changeset: 212b2ca3ad1c
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-04-29 14:58 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/212b2ca3ad1c

Minor optimisation for divide by 2

! src/cpu/aarch64/vm/aarch64.ad

Changeset: 4c9a7428f67b
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-05-01 14:57 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/4c9a7428f67b

Fix instruction size from 8 to 4

! src/cpu/aarch64/vm/nativeInst_aarch64.hpp

Changeset: 1ca541a15aff
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-05-07 16:41 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/1ca541a15aff

Improvements to safepoint polling

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/c1_globals_aarch64.hpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp

Changeset: 5b18c3dc4c4a
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-05-12 13:39 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/5b18c3dc4c4a

Optimise C2 entry point verification

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp

Changeset: 777a522742fa
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-05-12 13:41 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/777a522742fa

Make code entry alignment 64 for C2

! src/cpu/aarch64/vm/globals_aarch64.hpp

Changeset: 6cd24413811b
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-05-13 16:09 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/6cd24413811b

Optimise long divide by 2

! src/cpu/aarch64/vm/aarch64.ad

Changeset: 8a1d95a0fb86
Author: aph
Date: 2014-05-12 14:34 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/8a1d95a0fb86

Fix opto assembly for shifts.

! src/cpu/aarch64/vm/aarch64.ad

Changeset: aa29549aca08
Author: aph
Date: 2014-05-12 16:26 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/aa29549aca08

Tidy up stack frame handling.

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp

Changeset: 27d284544a1a
Author: aph
Date: 2014-05-13 15:57 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/27d284544a1a

Improve code generation for pop(), as suggested by Edward Nevill.

! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp

Changeset: 5dc01b82a254
Author: aph
Date: 2014-05-13 16:28 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/5dc01b82a254

Add RegSet::operator+=.

! src/cpu/aarch64/vm/register_aarch64.hpp

Changeset: f28c0648a91a
Author: aph
Date: 2014-05-13 16:49 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/f28c0648a91a

Tidy up register usage in push/pop instructions.

! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/os_cpu/linux_aarch64/vm/assembler_linux_aarch64.cpp

Changeset: 05fca26b84c9
Author: Edward Nevill edward.nevill at linaro.org
Date: 2014-05-13 20:22 +0100
URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/05fca26b84c9

Optimise addressing of card table byte map base

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/relocInfo_aarch64.cpp

From aph at redhat.com Thu May 22 13:32:23 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 22 May 2014 14:32:23 +0100
Subject: [aarch64-port-dev ] Use inner shareable versions of memory barriers
Message-ID: <537DFC67.9020606@redhat.com>

Inner shareable seems to be the preferred version of memory barriers
on this architecture.

Andrew.

# HG changeset patch
# User aph
# Date 1400691624 14400
# Wed May 21 13:00:24 2014 -0400
# Node ID 5f4d7f52afc875fab4af0c68c3b657a2e8bd7283
# Parent fe8023ec3b4a44a66daa1b8208f44d1cddb723d4
Use inner shareable versions of memory barriers

diff -r fe8023ec3b4a -r 5f4d7f52afc8 src/cpu/aarch64/vm/assembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/assembler_aarch64.hpp Mon May 19 09:08:08 2014 -0400
+++ b/src/cpu/aarch64/vm/assembler_aarch64.hpp Wed May 21 13:00:24 2014 -0400
@@ -1001,13 +1001,13 @@
   // A more convenient access to dmb for our purposes
   enum Membar_mask_bits {
-    StoreStore = ST,
-    LoadStore = LD,
-    LoadLoad = LD,
-    // We can use ISH for a full barrier because the ARM ARM says
-    // "This architecture assumes that all Processing Elements that
-    // use the same operating system or hypervisor are in the same
-    // Inner Shareable shareability domain."
+    // We can use ISH for a barrier because the ARM ARM says "This
+    // architecture assumes that all Processing Elements that use the
+    // same operating system or hypervisor are in the same Inner
+    // Shareable shareability domain."
+    StoreStore = ISHST,
+    LoadStore = ISHLD,
+    LoadLoad = ISHLD,
     StoreLoad = ISH,
     AnyAny = ISH
   };

From aph at redhat.com Thu May 22 13:41:47 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 22 May 2014 14:41:47 +0100
Subject: [aarch64-port-dev ] Improve code generation for volatile operations and other barriers
Message-ID: <537DFE9B.6040004@redhat.com>

Memory barriers have been by far the hardest thing to get right on
this port, and have been changed several times. The problem is that
we want to use load acquire and store release instructions, but the
compilers prefer barriers. We cannot simply emit acq/rel instructions
and elide the barriers because sometimes barriers have no associated
load or store (e.g. Unsafe fences) so we must emit them.

I have solved this problem differently in C1 and C2. In C1 I have
given up trying to use ld.acq and st.rel, and emit conventional stores
and loads instead, along with separate barrier instructions. In C2
I've written some logic to walk the ideal graph from the barrier
(forwards or backwards) and find the associated load or store. If
it's ordered (i.e. it will generate an acq/rel instruction) I elide
the barrier.

I also fixed a couple of thinkos.

Andrew.

# HG changeset patch
# User aph
# Date 1400757693 14400
# Thu May 22 07:21:33 2014 -0400
# Node ID 0be4629243a868f0d4375b5cb8aff77b25b134b3
# Parent 5f4d7f52afc875fab4af0c68c3b657a2e8bd7283
Improve code generation for volatile operations and other barriers.

diff -r 5f4d7f52afc8 -r 0be4629243a8 src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Wed May 21 13:00:24 2014 -0400 +++ b/src/cpu/aarch64/vm/aarch64.ad Thu May 22 07:21:33 2014 -0400 @@ -753,53 +753,43 @@ } }; - // Returns true if Node n is followed by a MemBar node that - // will do an acquire. If so, this node must not do the acquire - // operation. - bool followed_by_acquire(const Node *n); + bool followed_by_ordered_store(const Node *barrier); + bool preceded_by_ordered_load(const Node *barrier); + %} source %{ -// Optimize load-acquire. -// -// Check if acquire is unnecessary due to following operation that does -// acquire anyways. -// Walk the pattern: -// -// n: Load.acq -// | -// MemBarAcquire -// | | -// Proj(ctrl) Proj(mem) -// | | -// MemBarRelease/Volatile -// -bool followed_by_acquire(const Node *load) { - if (!load->is_Load()) + // AArch64 has load acquire and store release instructions which we + // use for ordered memory accesses, e.g. for volatiles. The ideal + // graph generator also inserts memory barriers around volatile + // accesses, and we don't want to generate both barriers and acq/rel + // instructions. So, when we emit a MemBarAcquire we look back in + // the ideal graph for an ordered load and only emit the barrier if + // we don't find one. + +bool preceded_by_ordered_load(const Node *barrier) { + Node *x = barrier->lookup(TypeFunc::Parms); + + if (! x) return false; - // Find MemBarAcquire. - const Node *mba = NULL; - for (DUIterator_Fast imax, i = load->fast_outs(imax); i < imax; i++) { - const Node *out = load->fast_out(i); - if (out->Opcode() == Op_MemBarAcquire) { - if (out->in(0) == load) continue; // Skip control edge, membar should be found via precedence edge. - mba = out; - break; - } - } - if (!mba) return false; - - // Find following MemBar node. + if (x->is_DecodeNarrowPtr()) + x = x->in(1); + + if (x->is_Load()) + return ! 
x->as_Load()->is_unordered(); + + return false; +} + +bool followed_by_ordered_store(const Node *barrier) { + + // Find following mem node. // - // The following node must be reachable by control AND memory - // edge to assure no other operations are in between the two nodes. - // - // So first get the Proj node, mem_proj, to use it to iterate forward. Node *mem_proj = NULL; - for (DUIterator_Fast imax, i = mba->fast_outs(imax); i < imax; i++) { - mem_proj = mba->fast_out(i); // Throw out-of-bounds if proj not found + for (DUIterator_Fast imax, i = barrier->fast_outs(imax); i < imax; i++) { + mem_proj = barrier->fast_out(i); // Throw out-of-bounds if proj not found assert(mem_proj->is_Proj(), "only projections here"); ProjNode *proj = mem_proj->as_Proj(); if (proj->_con == TypeFunc::Memory && @@ -808,22 +798,11 @@ } assert(mem_proj->as_Proj()->_con == TypeFunc::Memory, "Graph broken"); - // Search MemBar behind Proj. If there are other memory operations - // behind the Proj we lost. + // Search behind Proj. for (DUIterator_Fast jmax, j = mem_proj->fast_outs(jmax); j < jmax; j++) { Node *x = mem_proj->fast_out(j); - // Proj might have an edge to a store or load node which precedes the membar. - if (x->is_Mem()) return false; - - int xop = x->Opcode(); - if (xop == Op_MemBarVolatile) { - // Make sure we're not missing Call/Phi/MergeMem by checking - // control edges. The control edge must directly lead back - // to the MemBarAcquire - Node *ctrl_proj = x->in(0); - if (ctrl_proj->is_Proj() && ctrl_proj->in(0) == mba) { - return true; - } + if (x->is_Store() && ! 
x->as_Store()->is_unordered()) { + return true; } } @@ -2352,13 +2331,12 @@ } Label retry_load, done; __ bind(retry_load); - __ ldaxr(rscratch1, addr_reg); + __ ldar(rscratch1, addr_reg); __ cmp(rscratch1, old_reg); __ br(Assembler::NE, done); __ stlxr(rscratch1, new_reg, addr_reg); __ cbnzw(rscratch1, retry_load); __ bind(done); - __ membar(__ AnyAny); %} enc_class aarch64_enc_cmpxchgw(memory mem, iRegINoSp oldval, iRegINoSp newval) %{ @@ -2392,13 +2370,12 @@ } Label retry_load, done; __ bind(retry_load); - __ ldaxrw(rscratch1, addr_reg); + __ ldarw(rscratch1, addr_reg); __ cmpw(rscratch1, old_reg); __ br(Assembler::NE, done); __ stlxrw(rscratch1, new_reg, addr_reg); __ cbnzw(rscratch1, retry_load); __ bind(done); - __ membar(__ AnyAny); %} // auxiliary used for CompareAndSwapX to set result register @@ -4748,7 +4725,7 @@ instruct loadB(iRegINoSp dst, memory mem) %{ match(Set dst (LoadB mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrsbw $dst, $mem\t# byte" %} @@ -4762,7 +4739,7 @@ instruct loadB2L(iRegLNoSp dst, memory mem) %{ match(Set dst (ConvI2L (LoadB mem))); - predicate(n->in(1)->as_Load()->is_unordered() || followed_by_acquire(n->in(1))); + predicate(n->in(1)->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrsb $dst, $mem\t# byte" %} @@ -4776,7 +4753,7 @@ instruct loadUB(iRegINoSp dst, memory mem) %{ match(Set dst (LoadUB mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrbw $dst, $mem\t# byte" %} @@ -4790,7 +4767,7 @@ instruct loadUB2L(iRegLNoSp dst, memory mem) %{ match(Set dst (ConvI2L (LoadUB mem))); - predicate(n->in(1)->as_Load()->is_unordered() || followed_by_acquire(n->in(1))); + predicate(n->in(1)->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrb $dst, $mem\t# byte" %} @@ -4804,7 +4781,7 @@ 
instruct loadS(iRegINoSp dst, memory mem) %{ match(Set dst (LoadS mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrshw $dst, $mem\t# short" %} @@ -4818,7 +4795,7 @@ instruct loadS2L(iRegLNoSp dst, memory mem) %{ match(Set dst (ConvI2L (LoadS mem))); - predicate(n->in(1)->as_Load()->is_unordered() || followed_by_acquire(n->in(1))); + predicate(n->in(1)->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrsh $dst, $mem\t# short" %} @@ -4832,7 +4809,7 @@ instruct loadUS(iRegINoSp dst, memory mem) %{ match(Set dst (LoadUS mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrh $dst, $mem\t# short" %} @@ -4846,7 +4823,7 @@ instruct loadUS2L(iRegLNoSp dst, memory mem) %{ match(Set dst (ConvI2L (LoadUS mem))); - predicate(n->in(1)->as_Load()->is_unordered() || followed_by_acquire(n->in(1))); + predicate(n->in(1)->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrh $dst, $mem\t# short" %} @@ -4860,7 +4837,7 @@ instruct loadI(iRegINoSp dst, memory mem) %{ match(Set dst (LoadI mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $mem\t# int" %} @@ -4874,7 +4851,7 @@ instruct loadI2L(iRegLNoSp dst, memory mem) %{ match(Set dst (ConvI2L (LoadI mem))); - predicate(n->in(1)->as_Load()->is_unordered() || followed_by_acquire(n->in(1))); + predicate(n->in(1)->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrsw $dst, $mem\t# int" %} @@ -4888,8 +4865,7 @@ instruct loadUI2L(iRegLNoSp dst, memory mem, immL_32bits mask) %{ match(Set dst (AndL (ConvI2L (LoadI mem)) mask)); - predicate(n->in(1)->in(1)->as_Load()->is_unordered() - || followed_by_acquire(n->in(1)->in(1))); + 
predicate(n->in(1)->in(1)->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $mem\t# int" %} @@ -4903,7 +4879,7 @@ instruct loadL(iRegLNoSp dst, memory mem) %{ match(Set dst (LoadL mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldr $dst, $mem\t# int" %} @@ -4930,7 +4906,7 @@ instruct loadP(iRegPNoSp dst, memory mem) %{ match(Set dst (LoadP mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldr $dst, $mem\t# ptr" %} @@ -4944,7 +4920,7 @@ instruct loadN(iRegNNoSp dst, memory mem) %{ match(Set dst (LoadN mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $mem\t# compressed ptr" %} @@ -4958,7 +4934,7 @@ instruct loadKlass(iRegPNoSp dst, memory mem) %{ match(Set dst (LoadKlass mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldr $dst, $mem\t# class" %} @@ -4972,7 +4948,7 @@ instruct loadNKlass(iRegNNoSp dst, memory mem) %{ match(Set dst (LoadNKlass mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $mem\t# compressed class ptr" %} @@ -4986,7 +4962,7 @@ instruct loadF(vRegF dst, memory mem) %{ match(Set dst (LoadF mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * INSN_COST); format %{ "ldrs $dst, $mem\t# float" %} @@ -5000,7 +4976,7 @@ instruct loadD(vRegD dst, memory mem) %{ match(Set dst (LoadD mem)); - predicate(n->as_Load()->is_unordered() || followed_by_acquire(n)); + predicate(n->as_Load()->is_unordered()); ins_cost(4 * 
INSN_COST); format %{ "ldrd $dst, $mem\t# double" %} @@ -5821,7 +5797,7 @@ match(LoadFence); ins_cost(VOLATILE_REF_COST); - format %{ "membar_acquire" %} + format %{ "load_fence" %} ins_encode %{ __ membar(Assembler::LoadLoad|Assembler::LoadStore); @@ -5830,6 +5806,7 @@ %} instruct unnecessary_membar_acquire() %{ + predicate(preceded_by_ordered_load(n)); match(MemBarAcquire); ins_cost(0); @@ -5842,6 +5819,20 @@ ins_pipe(pipe_class_memory); %} +instruct membar_acquire() %{ + match(MemBarAcquire); + ins_cost(VOLATILE_REF_COST); + + format %{ "membar_acquire" %} + + ins_encode %{ + __ membar(Assembler::LoadLoad|Assembler::LoadStore); + %} + + ins_pipe(pipe_class_memory); +%} + + instruct membar_acquire_lock() %{ match(MemBarAcquireLock); ins_cost(VOLATILE_REF_COST); @@ -5862,19 +5853,32 @@ format %{ "store_fence" %} ins_encode %{ - __ membar(Assembler::StoreLoad|Assembler::StoreStore); + __ membar(Assembler::LoadStore|Assembler::StoreStore); + %} + ins_pipe(pipe_class_memory); +%} + +instruct unnecessary_membar_release() %{ + match(MemBarRelease); + predicate(followed_by_ordered_store(n)); + ins_cost(0); + + format %{ "membar_release (elided)" %} + + ins_encode %{ + __ block_comment("membar_release (elided)"); %} ins_pipe(pipe_class_memory); %} instruct membar_release() %{ match(MemBarRelease); - ins_cost(0); - - format %{ "membar_release (elided)" %} - - ins_encode %{ - __ block_comment("membar_release (elided)"); + ins_cost(VOLATILE_REF_COST); + + format %{ "membar_release" %} + + ins_encode %{ + __ membar(Assembler::LoadStore|Assembler::StoreStore); %} ins_pipe(pipe_class_memory); %} @@ -5898,7 +5902,7 @@ format %{ "membar_release_lock" %} ins_encode %{ - __ membar(Assembler::StoreLoad|Assembler::StoreStore); + __ membar(Assembler::LoadStore|Assembler::StoreStore); %} ins_pipe(pipe_class_memory); From aph at redhat.com Thu May 22 13:44:00 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 22 May 2014 14:44:00 +0100 Subject: [aarch64-port-dev ] Improve code 
generation for volatile operations and other barriers In-Reply-To: <537DFE9B.6040004@redhat.com> References: <537DFE9B.6040004@redhat.com> Message-ID: <537DFF20.2070409@redhat.com> This is the C1 part of the patch. All attempts to generate ld.acq and st.rel instructions are gone. Andrew. # HG changeset patch # User aph # Date 1400765080 14400 # Thu May 22 09:24:40 2014 -0400 # Node ID 78eff3c05f51ce9232950278d4da868d42500779 # Parent 0be4629243a868f0d4375b5cb8aff77b25b134b3 Use explicit barrier instructions in C1. diff -r 0be4629243a8 -r 78eff3c05f51 src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Thu May 22 07:21:33 2014 -0400 +++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Thu May 22 09:24:40 2014 -0400 @@ -177,10 +177,6 @@ return result; } -static bool is_reg(LIR_Opr op) { - return op->is_double_cpu() | op->is_single_cpu(); -} - Address LIR_Assembler::as_Address(LIR_Address* addr, Register tmp) { Register base = addr->base()->as_pointer_register(); LIR_Opr opr = addr->index(); @@ -2730,148 +2726,12 @@ } void LIR_Assembler::volatile_move_op(LIR_Opr src, LIR_Opr dest, BasicType type, CodeEmitInfo* info) { - if (dest->is_address()) { - LIR_Address* to_addr = dest->as_address_ptr(); - Register compressed_src = noreg; - if (is_reg(src)) { - compressed_src = as_reg(src); - if (type == T_ARRAY || type == T_OBJECT) { - __ verify_oop(src->as_register()); - if (UseCompressedOops) { - compressed_src = rscratch2; - __ mov(compressed_src, src->as_register()); - __ encode_heap_oop(compressed_src); - } - } - } else if (src->is_single_fpu()) { - __ fmovs(rscratch2, src->as_float_reg()); - src = FrameMap::rscratch2_opr, type = T_INT; - } else if (src->is_double_fpu()) { - __ fmovd(rscratch2, src->as_double_reg()); - src = FrameMap::rscratch2_long_opr, type = T_LONG; - } - - if (dest->is_double_cpu()) - __ lea(rscratch1, as_Address(to_addr)); - else - __ lea(rscratch1, as_Address_lo(to_addr)); - - int null_check_here = 
code_offset(); - switch (type) { - case T_ARRAY: // fall through - case T_OBJECT: // fall through - if (UseCompressedOops) { - __ stlrw(compressed_src, rscratch1); - } else { - __ stlr(compressed_src, rscratch1); - } - break; - case T_METADATA: - // We get here to store a method pointer to the stack to pass to - // a dtrace runtime call. This can't work on 64 bit with - // compressed klass ptrs: T_METADATA can be a compressed klass - // ptr or a 64 bit method pointer. - LP64_ONLY(ShouldNotReachHere()); - __ stlr(src->as_register(), rscratch1); - break; - case T_ADDRESS: - __ stlr(src->as_register(), rscratch1); - break; - case T_INT: - __ stlrw(src->as_register(), rscratch1); - break; - - case T_LONG: { - __ stlr(src->as_register_lo(), rscratch1); - break; - } - - case T_BYTE: // fall through - case T_BOOLEAN: { - __ stlrb(src->as_register(), rscratch1); - break; - } - - case T_CHAR: // fall through - case T_SHORT: - __ stlrh(src->as_register(), rscratch1); - break; - - default: - ShouldNotReachHere(); - } - if (info != NULL) { - add_debug_info_for_null_check(null_check_here, info); - } - } else if (src->is_address()) { - LIR_Address* from_addr = src->as_address_ptr(); - - if (src->is_double_cpu()) - __ lea(rscratch1, as_Address(from_addr)); - else - __ lea(rscratch1, as_Address_lo(from_addr)); - - int null_check_here = code_offset(); - switch (type) { - case T_ARRAY: // fall through - case T_OBJECT: // fall through - if (UseCompressedOops) { - __ ldarw(dest->as_register(), rscratch1); - } else { - __ ldar(dest->as_register(), rscratch1); - } - break; - case T_ADDRESS: - __ ldar(dest->as_register(), rscratch1); - break; - case T_INT: - __ ldarw(dest->as_register(), rscratch1); - break; - case T_LONG: { - __ ldar(dest->as_register_lo(), rscratch1); - break; - } - - case T_BYTE: // fall through - case T_BOOLEAN: { - __ ldarb(dest->as_register(), rscratch1); - break; - } - - case T_CHAR: // fall through - case T_SHORT: - __ ldarh(dest->as_register(), rscratch1); - 
break; - - case T_FLOAT: - __ ldarw(rscratch2, rscratch1); - __ fmovs(dest->as_float_reg(), rscratch2); - break; - - case T_DOUBLE: - __ ldar(rscratch2, rscratch1); - __ fmovd(dest->as_double_reg(), rscratch2); - break; - - default: - ShouldNotReachHere(); - } - if (info != NULL) { - add_debug_info_for_null_check(null_check_here, info); - } - - if (type == T_ARRAY || type == T_OBJECT) { - if (UseCompressedOops) { - __ decode_heap_oop(dest->as_register()); - } - __ verify_oop(dest->as_register()); - } else if (type == T_ADDRESS && from_addr->disp() == oopDesc::klass_offset_in_bytes()) { - if (UseCompressedClassPointers) { - __ decode_klass_not_null(dest->as_register()); - } - } - } else + if (dest->is_address() || src->is_address()) { + move_op(src, dest, type, lir_patch_none, info, + /*pop_fpu_stack*/false, /*unaligned*/false, /*wide*/false); + } else { ShouldNotReachHere(); + } } #ifdef ASSERT @@ -2925,17 +2785,18 @@ } void LIR_Assembler::membar_acquire() { - __ block_comment("membar_acquire"); + __ membar(Assembler::LoadLoad|Assembler::LoadStore); } void LIR_Assembler::membar_release() { - __ block_comment("membar_release"); + __ membar(Assembler::LoadStore|Assembler::StoreStore); } -void LIR_Assembler::membar_loadload() { Unimplemented(); } +void LIR_Assembler::membar_loadload() { + __ membar(Assembler::LoadLoad); +} void LIR_Assembler::membar_storestore() { - COMMENT("membar_storestore"); __ membar(MacroAssembler::StoreStore); } From edward.nevill at linaro.org Fri May 23 10:02:22 2014 From: edward.nevill at linaro.org (Edward Nevill) Date: Fri, 23 May 2014 11:02:22 +0100 Subject: [aarch64-port-dev ] RFR: JDK8: Add support for CRC32 intrinsic Message-ID: <1400839342.14801.20.camel@localhost.localdomain> Hi, The following patch adds support for CRC32 intrinsic. The patch is a non neon patch. IE. It uses only the base aarch64 instruction set. Patch for neon to follow. 
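
For readers unfamiliar with table-driven CRC32, the byte-at-a-time update that this kind of intrinsic is built on can be sketched in plain C++ roughly as follows. This is only an illustration under stated assumptions, not the HotSpot code: the names make_crc_table and crc32_bytes are hypothetical, the table is built locally from the usual reflected polynomial 0xEDB88320 rather than taken from StubRoutines::crc_table_addr(), and a single 256-entry table is used instead of the four tables in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Build the 256-entry lookup table for the reflected CRC-32 polynomial.
// (Hypothetical helper; the real port reads a prebuilt table.)
static std::array<uint32_t, 256> make_crc_table() {
  std::array<uint32_t, 256> table{};
  for (uint32_t i = 0; i < 256; i++) {
    uint32_t c = i;
    for (int k = 0; k < 8; k++) {
      c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
    }
    table[i] = c;
  }
  return table;
}

// Fold one byte into the (already inverted) CRC state:
//   val = crc_table[(val ^ crc) & 0xFF];  crc = val ^ (crc >> 8);
static uint32_t update_byte_crc32(const std::array<uint32_t, 256>& table,
                                  uint32_t crc, uint8_t val) {
  return table[(val ^ crc) & 0xFF] ^ (crc >> 8);
}

// Byte-at-a-time CRC over a buffer. The state is inverted on entry and
// on exit, so the result of one call can be passed as crc to the next.
static uint32_t crc32_bytes(uint32_t crc, const uint8_t* buf, size_t len) {
  static const std::array<uint32_t, 256> table = make_crc_table();
  assert(buf != nullptr || len == 0);
  crc = ~crc;
  for (size_t i = 0; i < len; i++) {
    crc = update_byte_crc32(table, crc, buf[i]);
  }
  return ~crc;
}
```

Calling crc32_bytes(0, buf, 9) on the nine ASCII bytes "123456789" should give 0xCBF43926, the standard CRC-32 check value. A multi-table variant processes several bytes per step by looking each byte up in its own table and XORing the results together.
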
Even without neon it gets 4.5 x improvement on my test case http://people.linaro.org/~edward.nevill/crc32/CRCTest.java As the patch is quite big (38K) I have put a copy of the patch @ http://people.linaro.org/~edward.nevill/crc32/crc32.patch which may be easier to apply if anyone wishes to try this out. The algorithm uses 4 x tables and handles 16 bytes (1 ldp worth) per iteration. I experimented doing 32 bytes per loop but I could not measure the difference so I left it at 16. There are also algorithms that use 8 tables (Google slice by 8) but I think the returns from this over the simpler by 4 algorithm are minimal. The guts of the algorithm are in kernel_crc32 if anyone is interested! All the best, Ed. --- CUT HERE --- # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1400838435 -3600 # Fri May 23 10:47:15 2014 +0100 # Node ID 60fac40265fcc44faa834831777859561f2aa1c2 # Parent 9d3bc0f40cce83038e5a0d5fc8a51389530538d9 Add support for CRC32 intrinsic diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Fri May 23 10:47:15 2014 +0100 @@ -2647,7 +2647,21 @@ } void LIR_Assembler::emit_updatecrc32(LIR_OpUpdateCRC32* op) { - fatal("CRC32 intrinsic is not implemented on this platform"); + assert(op->crc()->is_single_cpu(), "crc must be register"); + assert(op->val()->is_single_cpu(), "byte value must be register"); + assert(op->result_opr()->is_single_cpu(), "result must be register"); + Register crc = op->crc()->as_register(); + Register val = op->val()->as_register(); + Register res = op->result_opr()->as_register(); + + assert_different_registers(val, crc, res); + unsigned long offset; + __ adrp(res, ExternalAddress(StubRoutines::crc_table_addr()), offset); + if (offset) __ add(res, res, offset); + + __ ornw(crc, zr, crc); // ~crc + __ update_byte_crc32(crc, val, res); + __ 
ornw(res, zr, crc); // ~crc } void LIR_Assembler::emit_profile_type(LIR_OpProfileType* op) { diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/c1_LIRGenerator_aarch64.cpp --- a/src/cpu/aarch64/vm/c1_LIRGenerator_aarch64.cpp Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/c1_LIRGenerator_aarch64.cpp Fri May 23 10:47:15 2014 +0100 @@ -958,7 +958,81 @@ } void LIRGenerator::do_update_CRC32(Intrinsic* x) { - fatal("CRC32 intrinsic is not implemented on this platform"); + assert(UseCRC32Intrinsics, "why are we here?"); + // Make all state_for calls early since they can emit code + LIR_Opr result = rlock_result(x); + int flags = 0; + switch (x->id()) { + case vmIntrinsics::_updateCRC32: { + LIRItem crc(x->argument_at(0), this); + LIRItem val(x->argument_at(1), this); + // val is destroyed by update_crc32 + val.set_destroys_register(); + crc.load_item(); + val.load_item(); + __ update_crc32(crc.result(), val.result(), result); + break; + } + case vmIntrinsics::_updateBytesCRC32: + case vmIntrinsics::_updateByteBufferCRC32: { + bool is_updateBytes = (x->id() == vmIntrinsics::_updateBytesCRC32); + + LIRItem crc(x->argument_at(0), this); + LIRItem buf(x->argument_at(1), this); + LIRItem off(x->argument_at(2), this); + LIRItem len(x->argument_at(3), this); + buf.load_item(); + off.load_nonconstant(); + + LIR_Opr index = off.result(); + int offset = is_updateBytes ? 
arrayOopDesc::base_offset_in_bytes(T_BYTE) : 0; + if(off.result()->is_constant()) { + index = LIR_OprFact::illegalOpr; + offset += off.result()->as_jint(); + } + LIR_Opr base_op = buf.result(); + + if (index->is_valid()) { + LIR_Opr tmp = new_register(T_LONG); + __ convert(Bytecodes::_i2l, index, tmp); + index = tmp; + } + + if (offset) { + LIR_Opr tmp = new_pointer_register(); + __ add(base_op, LIR_OprFact::intConst(offset), tmp); + base_op = tmp; + offset = 0; + } + + LIR_Address* a = new LIR_Address(base_op, + index, + LIR_Address::times_1, + offset, + T_BYTE); + BasicTypeList signature(3); + signature.append(T_INT); + signature.append(T_ADDRESS); + signature.append(T_INT); + CallingConvention* cc = frame_map()->c_calling_convention(&signature); + const LIR_Opr result_reg = result_register_for(x->type()); + + LIR_Opr addr = new_pointer_register(); + __ leal(LIR_OprFact::address(a), addr); + + crc.load_item_force(cc->at(0)); + __ move(addr, cc->at(1)); + len.load_item_force(cc->at(2)); + + __ call_runtime_leaf(StubRoutines::updateBytesCRC32(), getThreadTemp(), result_reg, cc->args()); + __ move(result_reg, result); + + break; + } + default: { + ShouldNotReachHere(); + } + } } // _i2l, _i2f, _i2d, _l2i, _l2f, _l2d, _f2i, _f2l, _f2d, _d2i, _d2l, _d2f diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/interpreterGenerator_aarch64.hpp --- a/src/cpu/aarch64/vm/interpreterGenerator_aarch64.hpp Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/interpreterGenerator_aarch64.hpp Fri May 23 10:47:15 2014 +0100 @@ -46,6 +46,8 @@ address generate_empty_entry(void); address generate_accessor_entry(void); address generate_Reference_get_entry(); + address generate_CRC32_update_entry(); + address generate_CRC32_updateBytes_entry(AbstractInterpreter::MethodKind kind); void lock_method(void); void generate_stack_overflow_check(void); diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Wed 
May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Fri May 23 10:47:15 2014 +0100 @@ -126,6 +126,7 @@ } else { assert((jbyte *)target == ((CardTableModRefBS*)(Universe::heap()->barrier_set()))->byte_map_base || + target == StubRoutines::crc_table_addr() || (address)target == os::get_polling_page(), "adrp must be polling page or byte map base"); assert(offset_lo == 0, "offset must be 0 for polling page or byte map base"); @@ -2045,6 +2046,114 @@ pop(0x3fffffff, sp); // integer registers except lr & sp } +/** + * Emits code to update CRC-32 with a byte value according to constants in table + * + * @param [in,out]crc Register containing the crc. + * @param [in]val Register containing the byte to fold into the CRC. + * @param [in]table Register containing the table of crc constants. + * + * uint32_t crc; + * val = crc_table[(val ^ crc) & 0xFF]; + * crc = val ^ (crc >> 8); + * + */ +void MacroAssembler::update_byte_crc32(Register crc, Register val, Register table) { + eor(val, val, crc); + andr(val, val, 0xff); + ldrw(val, Address(table, val, Address::lsl(2))); + eor(crc, val, crc, Assembler::LSR, 8); +} + +/** + * Emits code to update CRC-32 with a 32-bit value according to tables 0 to 3 + * + * @param [in,out]crc Register containing the crc. + * @param [in]v Register containing the 32-bit to fold into the CRC. + * @param [in]table0 Register containing table 0 of crc constants. + * @param [in]table1 Register containing table 1 of crc constants. + * @param [in]table2 Register containing table 2 of crc constants. + * @param [in]table3 Register containing table 3 of crc constants. + * + * uint32_t crc; + * v = crc ^ v + * crc = table3[v&0xff]^table2[(v>>8)&0xff]^table1[(v>>16)&0xff]^table0[v>>24] + * + */ +void MacroAssembler::update_word_crc32(Register crc, Register v, Register tmp, + Register table0, Register table1, Register table2, Register table3, + bool upper) { + eor(v, crc, v, upper ? LSR:LSL, upper ? 
32:0); + uxtb(tmp, v); + ldrw(crc, Address(table3, tmp, Address::lsl(2))); + ubfx(tmp, v, 8, 8); + ldrw(tmp, Address(table2, tmp, Address::lsl(2))); + eor(crc, crc, tmp); + ubfx(tmp, v, 16, 8); + ldrw(tmp, Address(table1, tmp, Address::lsl(2))); + eor(crc, crc, tmp); + ubfx(tmp, v, 24, 8); + ldrw(tmp, Address(table0, tmp, Address::lsl(2))); + eor(crc, crc, tmp); +} + +/** + * @param crc register containing existing CRC (32-bit) + * @param buf register pointing to input byte buffer (byte*) + * @param len register containing number of bytes + * @param table register that will contain address of CRC table + * @param tmp scratch register + */ +void MacroAssembler::kernel_crc32(Register crc, Register buf, Register len, + Register table0, Register table1, Register table2, Register table3, + Register tmp, Register tmp2, Register tmp3) { + Label L_by16_loop, L_by4, L_by4_loop, L_by1, L_by1_loop, L_exit; + unsigned long offset; + ornw(crc, zr, crc); + adrp(table0, ExternalAddress(StubRoutines::crc_table_addr()), offset); + if (offset) add(table0, table0, offset); + add(table1, table0, 1*256*sizeof(juint)); + add(table2, table0, 2*256*sizeof(juint)); + add(table3, table0, 3*256*sizeof(juint)); + subs(len, len, 16); + br(Assembler::GE, L_by16_loop); + adds(len, len, 16-4); + br(Assembler::GE, L_by4_loop); + adds(len, len, 4); + br(Assembler::GT, L_by1_loop); + b(L_exit); + + BIND(L_by4_loop); + ldrw(tmp, Address(post(buf, 4))); + update_word_crc32(crc, tmp, tmp2, table0, table1, table2, table3); + subs(len, len, 4); + br(Assembler::GE, L_by4_loop); + adds(len, len, 4); + br(Assembler::LE, L_exit); + BIND(L_by1_loop); + subs(len, len, 1); + ldrb(tmp, Address(post(buf, 1))); + update_byte_crc32(crc, tmp, table0); + br(Assembler::GT, L_by1_loop); + b(L_exit); + + align(CodeEntryAlignment); + BIND(L_by16_loop); + subs(len, len, 16); + ldp(tmp, tmp3, Address(post(buf, 16))); + update_word_crc32(crc, tmp, tmp2, table0, table1, table2, table3, false); + update_word_crc32(crc, tmp, 
tmp2, table0, table1, table2, table3, true); + update_word_crc32(crc, tmp3, tmp2, table0, table1, table2, table3, false); + update_word_crc32(crc, tmp3, tmp2, table0, table1, table2, table3, true); + br(Assembler::GE, L_by16_loop); + adds(len, len, 16-4); + br(Assembler::GE, L_by4_loop); + adds(len, len, 4); + br(Assembler::GT, L_by1_loop); + BIND(L_exit); + ornw(crc, zr, crc); +} + SkipIfEqual::SkipIfEqual( MacroAssembler* masm, const bool* flag_addr, bool value) { _masm = masm; diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/macroAssembler_aarch64.hpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Fri May 23 10:47:15 2014 +0100 @@ -1226,6 +1226,11 @@ void verified_entry(int framesize, bool stack_bang, bool fp_mode_24b); #endif + // CRC32 code for java.util.zip.CRC32::updateBytes() instrinsic. + void kernel_crc32(Register crc, Register buf, Register len, + Register table0, Register table1, Register table2, Register table3, + Register tmp, Register tmp2, Register tmp3); + #undef VIRTUAL // Stack push and pop individual 64 bit registers @@ -1367,6 +1372,12 @@ // Used by aarch64.ad to control code generation static bool use_acq_rel_for_volatile_fields(); + + // CRC32 code for java.util.zip.CRC32::updateBytes() instrinsic. 
+ void update_byte_crc32(Register crc, Register val, Register table); + void update_word_crc32(Register crc, Register v, Register tmp, + Register table0, Register table1, Register table2, Register table3, + bool upper = false); }; // Used by aarch64.ad to control code generation diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/stubGenerator_aarch64.cpp --- a/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Fri May 23 10:47:15 2014 +0100 @@ -1944,6 +1944,46 @@ } #endif + /** + * Arguments: + * + * Inputs: + * c_rarg0 - int crc + * c_rarg1 - byte* buf + * c_rarg2 - int length + * + * Ouput: + * rax - int crc result + */ + address generate_updateBytesCRC32() { + assert(UseCRC32Intrinsics, "what are we doing here?"); + + __ align(CodeEntryAlignment); + StubCodeMark mark(this, "StubRoutines", "updateBytesCRC32"); + + address start = __ pc(); + + const Register crc = c_rarg0; // crc + const Register buf = c_rarg1; // source java byte array address + const Register len = c_rarg2; // length + const Register table0 = c_rarg3; // crc_table address + const Register table1 = c_rarg4; + const Register table2 = c_rarg5; + const Register table3 = c_rarg6; + const Register tmp3 = c_rarg7; + + BLOCK_COMMENT("Entry:"); + __ enter(); // required for proper stackwalking of RuntimeStub frame + + __ kernel_crc32(crc, buf, len, + table0, table1, table2, table3, rscratch1, rscratch2, tmp3); + + __ leave(); // required for proper stackwalking of RuntimeStub frame + __ ret(lr); + + return start; + } + #undef __ #define __ masm-> @@ -2113,8 +2153,8 @@ generate_handler_for_unsafe_access(); // platform dependent - StubRoutines::x86::_get_previous_fp_entry = generate_get_previous_fp(); - StubRoutines::x86::_get_previous_sp_entry = generate_get_previous_sp(); + StubRoutines::aarch64::_get_previous_fp_entry = generate_get_previous_fp(); + StubRoutines::aarch64::_get_previous_sp_entry = generate_get_previous_sp(); 
// Build this early so it's available for the interpreter. StubRoutines::_throw_StackOverflowError_entry = @@ -2122,6 +2162,11 @@ CAST_FROM_FN_PTR(address, SharedRuntime:: throw_StackOverflowError)); + if (UseCRC32Intrinsics) { + // set table address before stub generation which use it + StubRoutines::_crc_table_adr = (address)StubRoutines::aarch64::_crc_table; + StubRoutines::_updateBytesCRC32 = generate_updateBytesCRC32(); + } } void generate_all() { diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/stubRoutines_aarch64.cpp --- a/src/cpu/aarch64/vm/stubRoutines_aarch64.cpp Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/stubRoutines_aarch64.cpp Fri May 23 10:47:15 2014 +0100 @@ -33,14 +33,237 @@ // Implementation of the platform-specific part of StubRoutines - for // a description of how to extend it, see the stubRoutines.hpp file. -address StubRoutines::x86::_get_previous_fp_entry = NULL; -address StubRoutines::x86::_get_previous_sp_entry = NULL; +address StubRoutines::aarch64::_get_previous_fp_entry = NULL; +address StubRoutines::aarch64::_get_previous_sp_entry = NULL; -address StubRoutines::x86::_f2i_fixup = NULL; -address StubRoutines::x86::_f2l_fixup = NULL; -address StubRoutines::x86::_d2i_fixup = NULL; -address StubRoutines::x86::_d2l_fixup = NULL; -address StubRoutines::x86::_float_sign_mask = NULL; -address StubRoutines::x86::_float_sign_flip = NULL; -address StubRoutines::x86::_double_sign_mask = NULL; -address StubRoutines::x86::_double_sign_flip = NULL; +address StubRoutines::aarch64::_f2i_fixup = NULL; +address StubRoutines::aarch64::_f2l_fixup = NULL; +address StubRoutines::aarch64::_d2i_fixup = NULL; +address StubRoutines::aarch64::_d2l_fixup = NULL; +address StubRoutines::aarch64::_float_sign_mask = NULL; +address StubRoutines::aarch64::_float_sign_flip = NULL; +address StubRoutines::aarch64::_double_sign_mask = NULL; +address StubRoutines::aarch64::_double_sign_flip = NULL; + +/** + * crc_table[] from 
jdk/src/share/native/java/util/zip/zlib-1.2.5/crc32.h + */ +juint StubRoutines::aarch64::_crc_table[] + __attribute__ ((aligned(4096))) = +{ + // Table 0 + 0x00000000UL, 0x77073096UL, 0xee0e612cUL, 0x990951baUL, 0x076dc419UL, + 0x706af48fUL, 0xe963a535UL, 0x9e6495a3UL, 0x0edb8832UL, 0x79dcb8a4UL, + 0xe0d5e91eUL, 0x97d2d988UL, 0x09b64c2bUL, 0x7eb17cbdUL, 0xe7b82d07UL, + 0x90bf1d91UL, 0x1db71064UL, 0x6ab020f2UL, 0xf3b97148UL, 0x84be41deUL, + 0x1adad47dUL, 0x6ddde4ebUL, 0xf4d4b551UL, 0x83d385c7UL, 0x136c9856UL, + 0x646ba8c0UL, 0xfd62f97aUL, 0x8a65c9ecUL, 0x14015c4fUL, 0x63066cd9UL, + 0xfa0f3d63UL, 0x8d080df5UL, 0x3b6e20c8UL, 0x4c69105eUL, 0xd56041e4UL, + 0xa2677172UL, 0x3c03e4d1UL, 0x4b04d447UL, 0xd20d85fdUL, 0xa50ab56bUL, + 0x35b5a8faUL, 0x42b2986cUL, 0xdbbbc9d6UL, 0xacbcf940UL, 0x32d86ce3UL, + 0x45df5c75UL, 0xdcd60dcfUL, 0xabd13d59UL, 0x26d930acUL, 0x51de003aUL, + 0xc8d75180UL, 0xbfd06116UL, 0x21b4f4b5UL, 0x56b3c423UL, 0xcfba9599UL, + 0xb8bda50fUL, 0x2802b89eUL, 0x5f058808UL, 0xc60cd9b2UL, 0xb10be924UL, + 0x2f6f7c87UL, 0x58684c11UL, 0xc1611dabUL, 0xb6662d3dUL, 0x76dc4190UL, + 0x01db7106UL, 0x98d220bcUL, 0xefd5102aUL, 0x71b18589UL, 0x06b6b51fUL, + 0x9fbfe4a5UL, 0xe8b8d433UL, 0x7807c9a2UL, 0x0f00f934UL, 0x9609a88eUL, + 0xe10e9818UL, 0x7f6a0dbbUL, 0x086d3d2dUL, 0x91646c97UL, 0xe6635c01UL, + 0x6b6b51f4UL, 0x1c6c6162UL, 0x856530d8UL, 0xf262004eUL, 0x6c0695edUL, + 0x1b01a57bUL, 0x8208f4c1UL, 0xf50fc457UL, 0x65b0d9c6UL, 0x12b7e950UL, + 0x8bbeb8eaUL, 0xfcb9887cUL, 0x62dd1ddfUL, 0x15da2d49UL, 0x8cd37cf3UL, + 0xfbd44c65UL, 0x4db26158UL, 0x3ab551ceUL, 0xa3bc0074UL, 0xd4bb30e2UL, + 0x4adfa541UL, 0x3dd895d7UL, 0xa4d1c46dUL, 0xd3d6f4fbUL, 0x4369e96aUL, + 0x346ed9fcUL, 0xad678846UL, 0xda60b8d0UL, 0x44042d73UL, 0x33031de5UL, + 0xaa0a4c5fUL, 0xdd0d7cc9UL, 0x5005713cUL, 0x270241aaUL, 0xbe0b1010UL, + 0xc90c2086UL, 0x5768b525UL, 0x206f85b3UL, 0xb966d409UL, 0xce61e49fUL, + 0x5edef90eUL, 0x29d9c998UL, 0xb0d09822UL, 0xc7d7a8b4UL, 0x59b33d17UL, + 0x2eb40d81UL, 0xb7bd5c3bUL, 0xc0ba6cadUL, 
0xedb88320UL, 0x9abfb3b6UL, + 0x03b6e20cUL, 0x74b1d29aUL, 0xead54739UL, 0x9dd277afUL, 0x04db2615UL, + 0x73dc1683UL, 0xe3630b12UL, 0x94643b84UL, 0x0d6d6a3eUL, 0x7a6a5aa8UL, + 0xe40ecf0bUL, 0x9309ff9dUL, 0x0a00ae27UL, 0x7d079eb1UL, 0xf00f9344UL, + 0x8708a3d2UL, 0x1e01f268UL, 0x6906c2feUL, 0xf762575dUL, 0x806567cbUL, + 0x196c3671UL, 0x6e6b06e7UL, 0xfed41b76UL, 0x89d32be0UL, 0x10da7a5aUL, + 0x67dd4accUL, 0xf9b9df6fUL, 0x8ebeeff9UL, 0x17b7be43UL, 0x60b08ed5UL, + 0xd6d6a3e8UL, 0xa1d1937eUL, 0x38d8c2c4UL, 0x4fdff252UL, 0xd1bb67f1UL, + 0xa6bc5767UL, 0x3fb506ddUL, 0x48b2364bUL, 0xd80d2bdaUL, 0xaf0a1b4cUL, + 0x36034af6UL, 0x41047a60UL, 0xdf60efc3UL, 0xa867df55UL, 0x316e8eefUL, + 0x4669be79UL, 0xcb61b38cUL, 0xbc66831aUL, 0x256fd2a0UL, 0x5268e236UL, + 0xcc0c7795UL, 0xbb0b4703UL, 0x220216b9UL, 0x5505262fUL, 0xc5ba3bbeUL, + 0xb2bd0b28UL, 0x2bb45a92UL, 0x5cb36a04UL, 0xc2d7ffa7UL, 0xb5d0cf31UL, + 0x2cd99e8bUL, 0x5bdeae1dUL, 0x9b64c2b0UL, 0xec63f226UL, 0x756aa39cUL, + 0x026d930aUL, 0x9c0906a9UL, 0xeb0e363fUL, 0x72076785UL, 0x05005713UL, + 0x95bf4a82UL, 0xe2b87a14UL, 0x7bb12baeUL, 0x0cb61b38UL, 0x92d28e9bUL, + 0xe5d5be0dUL, 0x7cdcefb7UL, 0x0bdbdf21UL, 0x86d3d2d4UL, 0xf1d4e242UL, + 0x68ddb3f8UL, 0x1fda836eUL, 0x81be16cdUL, 0xf6b9265bUL, 0x6fb077e1UL, + 0x18b74777UL, 0x88085ae6UL, 0xff0f6a70UL, 0x66063bcaUL, 0x11010b5cUL, + 0x8f659effUL, 0xf862ae69UL, 0x616bffd3UL, 0x166ccf45UL, 0xa00ae278UL, + 0xd70dd2eeUL, 0x4e048354UL, 0x3903b3c2UL, 0xa7672661UL, 0xd06016f7UL, + 0x4969474dUL, 0x3e6e77dbUL, 0xaed16a4aUL, 0xd9d65adcUL, 0x40df0b66UL, + 0x37d83bf0UL, 0xa9bcae53UL, 0xdebb9ec5UL, 0x47b2cf7fUL, 0x30b5ffe9UL, + 0xbdbdf21cUL, 0xcabac28aUL, 0x53b39330UL, 0x24b4a3a6UL, 0xbad03605UL, + 0xcdd70693UL, 0x54de5729UL, 0x23d967bfUL, 0xb3667a2eUL, 0xc4614ab8UL, + 0x5d681b02UL, 0x2a6f2b94UL, 0xb40bbe37UL, 0xc30c8ea1UL, 0x5a05df1bUL, + 0x2d02ef8dUL, + + // Table 1 + 0x00000000UL, 0x191b3141UL, 0x32366282UL, 0x2b2d53c3UL, 0x646cc504UL, + 0x7d77f445UL, 0x565aa786UL, 0x4f4196c7UL, 0xc8d98a08UL, 
0xd1c2bb49UL, + 0xfaefe88aUL, 0xe3f4d9cbUL, 0xacb54f0cUL, 0xb5ae7e4dUL, 0x9e832d8eUL, + 0x87981ccfUL, 0x4ac21251UL, 0x53d92310UL, 0x78f470d3UL, 0x61ef4192UL, + 0x2eaed755UL, 0x37b5e614UL, 0x1c98b5d7UL, 0x05838496UL, 0x821b9859UL, + 0x9b00a918UL, 0xb02dfadbUL, 0xa936cb9aUL, 0xe6775d5dUL, 0xff6c6c1cUL, + 0xd4413fdfUL, 0xcd5a0e9eUL, 0x958424a2UL, 0x8c9f15e3UL, 0xa7b24620UL, + 0xbea97761UL, 0xf1e8e1a6UL, 0xe8f3d0e7UL, 0xc3de8324UL, 0xdac5b265UL, + 0x5d5daeaaUL, 0x44469febUL, 0x6f6bcc28UL, 0x7670fd69UL, 0x39316baeUL, + 0x202a5aefUL, 0x0b07092cUL, 0x121c386dUL, 0xdf4636f3UL, 0xc65d07b2UL, + 0xed705471UL, 0xf46b6530UL, 0xbb2af3f7UL, 0xa231c2b6UL, 0x891c9175UL, + 0x9007a034UL, 0x179fbcfbUL, 0x0e848dbaUL, 0x25a9de79UL, 0x3cb2ef38UL, + 0x73f379ffUL, 0x6ae848beUL, 0x41c51b7dUL, 0x58de2a3cUL, 0xf0794f05UL, + 0xe9627e44UL, 0xc24f2d87UL, 0xdb541cc6UL, 0x94158a01UL, 0x8d0ebb40UL, + 0xa623e883UL, 0xbf38d9c2UL, 0x38a0c50dUL, 0x21bbf44cUL, 0x0a96a78fUL, + 0x138d96ceUL, 0x5ccc0009UL, 0x45d73148UL, 0x6efa628bUL, 0x77e153caUL, + 0xbabb5d54UL, 0xa3a06c15UL, 0x888d3fd6UL, 0x91960e97UL, 0xded79850UL, + 0xc7cca911UL, 0xece1fad2UL, 0xf5facb93UL, 0x7262d75cUL, 0x6b79e61dUL, + 0x4054b5deUL, 0x594f849fUL, 0x160e1258UL, 0x0f152319UL, 0x243870daUL, + 0x3d23419bUL, 0x65fd6ba7UL, 0x7ce65ae6UL, 0x57cb0925UL, 0x4ed03864UL, + 0x0191aea3UL, 0x188a9fe2UL, 0x33a7cc21UL, 0x2abcfd60UL, 0xad24e1afUL, + 0xb43fd0eeUL, 0x9f12832dUL, 0x8609b26cUL, 0xc94824abUL, 0xd05315eaUL, + 0xfb7e4629UL, 0xe2657768UL, 0x2f3f79f6UL, 0x362448b7UL, 0x1d091b74UL, + 0x04122a35UL, 0x4b53bcf2UL, 0x52488db3UL, 0x7965de70UL, 0x607eef31UL, + 0xe7e6f3feUL, 0xfefdc2bfUL, 0xd5d0917cUL, 0xcccba03dUL, 0x838a36faUL, + 0x9a9107bbUL, 0xb1bc5478UL, 0xa8a76539UL, 0x3b83984bUL, 0x2298a90aUL, + 0x09b5fac9UL, 0x10aecb88UL, 0x5fef5d4fUL, 0x46f46c0eUL, 0x6dd93fcdUL, + 0x74c20e8cUL, 0xf35a1243UL, 0xea412302UL, 0xc16c70c1UL, 0xd8774180UL, + 0x9736d747UL, 0x8e2de606UL, 0xa500b5c5UL, 0xbc1b8484UL, 0x71418a1aUL, + 0x685abb5bUL, 0x4377e898UL, 
0x5a6cd9d9UL, 0x152d4f1eUL, 0x0c367e5fUL, + 0x271b2d9cUL, 0x3e001cddUL, 0xb9980012UL, 0xa0833153UL, 0x8bae6290UL, + 0x92b553d1UL, 0xddf4c516UL, 0xc4eff457UL, 0xefc2a794UL, 0xf6d996d5UL, + 0xae07bce9UL, 0xb71c8da8UL, 0x9c31de6bUL, 0x852aef2aUL, 0xca6b79edUL, + 0xd37048acUL, 0xf85d1b6fUL, 0xe1462a2eUL, 0x66de36e1UL, 0x7fc507a0UL, + 0x54e85463UL, 0x4df36522UL, 0x02b2f3e5UL, 0x1ba9c2a4UL, 0x30849167UL, + 0x299fa026UL, 0xe4c5aeb8UL, 0xfdde9ff9UL, 0xd6f3cc3aUL, 0xcfe8fd7bUL, + 0x80a96bbcUL, 0x99b25afdUL, 0xb29f093eUL, 0xab84387fUL, 0x2c1c24b0UL, + 0x350715f1UL, 0x1e2a4632UL, 0x07317773UL, 0x4870e1b4UL, 0x516bd0f5UL, + 0x7a468336UL, 0x635db277UL, 0xcbfad74eUL, 0xd2e1e60fUL, 0xf9ccb5ccUL, + 0xe0d7848dUL, 0xaf96124aUL, 0xb68d230bUL, 0x9da070c8UL, 0x84bb4189UL, + 0x03235d46UL, 0x1a386c07UL, 0x31153fc4UL, 0x280e0e85UL, 0x674f9842UL, + 0x7e54a903UL, 0x5579fac0UL, 0x4c62cb81UL, 0x8138c51fUL, 0x9823f45eUL, + 0xb30ea79dUL, 0xaa1596dcUL, 0xe554001bUL, 0xfc4f315aUL, 0xd7626299UL, + 0xce7953d8UL, 0x49e14f17UL, 0x50fa7e56UL, 0x7bd72d95UL, 0x62cc1cd4UL, + 0x2d8d8a13UL, 0x3496bb52UL, 0x1fbbe891UL, 0x06a0d9d0UL, 0x5e7ef3ecUL, + 0x4765c2adUL, 0x6c48916eUL, 0x7553a02fUL, 0x3a1236e8UL, 0x230907a9UL, + 0x0824546aUL, 0x113f652bUL, 0x96a779e4UL, 0x8fbc48a5UL, 0xa4911b66UL, + 0xbd8a2a27UL, 0xf2cbbce0UL, 0xebd08da1UL, 0xc0fdde62UL, 0xd9e6ef23UL, + 0x14bce1bdUL, 0x0da7d0fcUL, 0x268a833fUL, 0x3f91b27eUL, 0x70d024b9UL, + 0x69cb15f8UL, 0x42e6463bUL, 0x5bfd777aUL, 0xdc656bb5UL, 0xc57e5af4UL, + 0xee530937UL, 0xf7483876UL, 0xb809aeb1UL, 0xa1129ff0UL, 0x8a3fcc33UL, + 0x9324fd72UL, + + // Table 2 + 0x00000000UL, 0x01c26a37UL, 0x0384d46eUL, 0x0246be59UL, 0x0709a8dcUL, + 0x06cbc2ebUL, 0x048d7cb2UL, 0x054f1685UL, 0x0e1351b8UL, 0x0fd13b8fUL, + 0x0d9785d6UL, 0x0c55efe1UL, 0x091af964UL, 0x08d89353UL, 0x0a9e2d0aUL, + 0x0b5c473dUL, 0x1c26a370UL, 0x1de4c947UL, 0x1fa2771eUL, 0x1e601d29UL, + 0x1b2f0bacUL, 0x1aed619bUL, 0x18abdfc2UL, 0x1969b5f5UL, 0x1235f2c8UL, + 0x13f798ffUL, 0x11b126a6UL, 0x10734c91UL, 
0x153c5a14UL, 0x14fe3023UL, + 0x16b88e7aUL, 0x177ae44dUL, 0x384d46e0UL, 0x398f2cd7UL, 0x3bc9928eUL, + 0x3a0bf8b9UL, 0x3f44ee3cUL, 0x3e86840bUL, 0x3cc03a52UL, 0x3d025065UL, + 0x365e1758UL, 0x379c7d6fUL, 0x35dac336UL, 0x3418a901UL, 0x3157bf84UL, + 0x3095d5b3UL, 0x32d36beaUL, 0x331101ddUL, 0x246be590UL, 0x25a98fa7UL, + 0x27ef31feUL, 0x262d5bc9UL, 0x23624d4cUL, 0x22a0277bUL, 0x20e69922UL, + 0x2124f315UL, 0x2a78b428UL, 0x2bbade1fUL, 0x29fc6046UL, 0x283e0a71UL, + 0x2d711cf4UL, 0x2cb376c3UL, 0x2ef5c89aUL, 0x2f37a2adUL, 0x709a8dc0UL, + 0x7158e7f7UL, 0x731e59aeUL, 0x72dc3399UL, 0x7793251cUL, 0x76514f2bUL, + 0x7417f172UL, 0x75d59b45UL, 0x7e89dc78UL, 0x7f4bb64fUL, 0x7d0d0816UL, + 0x7ccf6221UL, 0x798074a4UL, 0x78421e93UL, 0x7a04a0caUL, 0x7bc6cafdUL, + 0x6cbc2eb0UL, 0x6d7e4487UL, 0x6f38fadeUL, 0x6efa90e9UL, 0x6bb5866cUL, + 0x6a77ec5bUL, 0x68315202UL, 0x69f33835UL, 0x62af7f08UL, 0x636d153fUL, + 0x612bab66UL, 0x60e9c151UL, 0x65a6d7d4UL, 0x6464bde3UL, 0x662203baUL, + 0x67e0698dUL, 0x48d7cb20UL, 0x4915a117UL, 0x4b531f4eUL, 0x4a917579UL, + 0x4fde63fcUL, 0x4e1c09cbUL, 0x4c5ab792UL, 0x4d98dda5UL, 0x46c49a98UL, + 0x4706f0afUL, 0x45404ef6UL, 0x448224c1UL, 0x41cd3244UL, 0x400f5873UL, + 0x4249e62aUL, 0x438b8c1dUL, 0x54f16850UL, 0x55330267UL, 0x5775bc3eUL, + 0x56b7d609UL, 0x53f8c08cUL, 0x523aaabbUL, 0x507c14e2UL, 0x51be7ed5UL, + 0x5ae239e8UL, 0x5b2053dfUL, 0x5966ed86UL, 0x58a487b1UL, 0x5deb9134UL, + 0x5c29fb03UL, 0x5e6f455aUL, 0x5fad2f6dUL, 0xe1351b80UL, 0xe0f771b7UL, + 0xe2b1cfeeUL, 0xe373a5d9UL, 0xe63cb35cUL, 0xe7fed96bUL, 0xe5b86732UL, + 0xe47a0d05UL, 0xef264a38UL, 0xeee4200fUL, 0xeca29e56UL, 0xed60f461UL, + 0xe82fe2e4UL, 0xe9ed88d3UL, 0xebab368aUL, 0xea695cbdUL, 0xfd13b8f0UL, + 0xfcd1d2c7UL, 0xfe976c9eUL, 0xff5506a9UL, 0xfa1a102cUL, 0xfbd87a1bUL, + 0xf99ec442UL, 0xf85cae75UL, 0xf300e948UL, 0xf2c2837fUL, 0xf0843d26UL, + 0xf1465711UL, 0xf4094194UL, 0xf5cb2ba3UL, 0xf78d95faUL, 0xf64fffcdUL, + 0xd9785d60UL, 0xd8ba3757UL, 0xdafc890eUL, 0xdb3ee339UL, 0xde71f5bcUL, + 0xdfb39f8bUL, 
0xddf521d2UL, 0xdc374be5UL, 0xd76b0cd8UL, 0xd6a966efUL, + 0xd4efd8b6UL, 0xd52db281UL, 0xd062a404UL, 0xd1a0ce33UL, 0xd3e6706aUL, + 0xd2241a5dUL, 0xc55efe10UL, 0xc49c9427UL, 0xc6da2a7eUL, 0xc7184049UL, + 0xc25756ccUL, 0xc3953cfbUL, 0xc1d382a2UL, 0xc011e895UL, 0xcb4dafa8UL, + 0xca8fc59fUL, 0xc8c97bc6UL, 0xc90b11f1UL, 0xcc440774UL, 0xcd866d43UL, + 0xcfc0d31aUL, 0xce02b92dUL, 0x91af9640UL, 0x906dfc77UL, 0x922b422eUL, + 0x93e92819UL, 0x96a63e9cUL, 0x976454abUL, 0x9522eaf2UL, 0x94e080c5UL, + 0x9fbcc7f8UL, 0x9e7eadcfUL, 0x9c381396UL, 0x9dfa79a1UL, 0x98b56f24UL, + 0x99770513UL, 0x9b31bb4aUL, 0x9af3d17dUL, 0x8d893530UL, 0x8c4b5f07UL, + 0x8e0de15eUL, 0x8fcf8b69UL, 0x8a809decUL, 0x8b42f7dbUL, 0x89044982UL, + 0x88c623b5UL, 0x839a6488UL, 0x82580ebfUL, 0x801eb0e6UL, 0x81dcdad1UL, + 0x8493cc54UL, 0x8551a663UL, 0x8717183aUL, 0x86d5720dUL, 0xa9e2d0a0UL, + 0xa820ba97UL, 0xaa6604ceUL, 0xaba46ef9UL, 0xaeeb787cUL, 0xaf29124bUL, + 0xad6fac12UL, 0xacadc625UL, 0xa7f18118UL, 0xa633eb2fUL, 0xa4755576UL, + 0xa5b73f41UL, 0xa0f829c4UL, 0xa13a43f3UL, 0xa37cfdaaUL, 0xa2be979dUL, + 0xb5c473d0UL, 0xb40619e7UL, 0xb640a7beUL, 0xb782cd89UL, 0xb2cddb0cUL, + 0xb30fb13bUL, 0xb1490f62UL, 0xb08b6555UL, 0xbbd72268UL, 0xba15485fUL, + 0xb853f606UL, 0xb9919c31UL, 0xbcde8ab4UL, 0xbd1ce083UL, 0xbf5a5edaUL, + 0xbe9834edUL, + + // Table 3 + 0x00000000UL, 0xb8bc6765UL, 0xaa09c88bUL, 0x12b5afeeUL, 0x8f629757UL, + 0x37def032UL, 0x256b5fdcUL, 0x9dd738b9UL, 0xc5b428efUL, 0x7d084f8aUL, + 0x6fbde064UL, 0xd7018701UL, 0x4ad6bfb8UL, 0xf26ad8ddUL, 0xe0df7733UL, + 0x58631056UL, 0x5019579fUL, 0xe8a530faUL, 0xfa109f14UL, 0x42acf871UL, + 0xdf7bc0c8UL, 0x67c7a7adUL, 0x75720843UL, 0xcdce6f26UL, 0x95ad7f70UL, + 0x2d111815UL, 0x3fa4b7fbUL, 0x8718d09eUL, 0x1acfe827UL, 0xa2738f42UL, + 0xb0c620acUL, 0x087a47c9UL, 0xa032af3eUL, 0x188ec85bUL, 0x0a3b67b5UL, + 0xb28700d0UL, 0x2f503869UL, 0x97ec5f0cUL, 0x8559f0e2UL, 0x3de59787UL, + 0x658687d1UL, 0xdd3ae0b4UL, 0xcf8f4f5aUL, 0x7733283fUL, 0xeae41086UL, + 0x525877e3UL, 0x40edd80dUL, 
0xf851bf68UL, 0xf02bf8a1UL, 0x48979fc4UL, + 0x5a22302aUL, 0xe29e574fUL, 0x7f496ff6UL, 0xc7f50893UL, 0xd540a77dUL, + 0x6dfcc018UL, 0x359fd04eUL, 0x8d23b72bUL, 0x9f9618c5UL, 0x272a7fa0UL, + 0xbafd4719UL, 0x0241207cUL, 0x10f48f92UL, 0xa848e8f7UL, 0x9b14583dUL, + 0x23a83f58UL, 0x311d90b6UL, 0x89a1f7d3UL, 0x1476cf6aUL, 0xaccaa80fUL, + 0xbe7f07e1UL, 0x06c36084UL, 0x5ea070d2UL, 0xe61c17b7UL, 0xf4a9b859UL, + 0x4c15df3cUL, 0xd1c2e785UL, 0x697e80e0UL, 0x7bcb2f0eUL, 0xc377486bUL, + 0xcb0d0fa2UL, 0x73b168c7UL, 0x6104c729UL, 0xd9b8a04cUL, 0x446f98f5UL, + 0xfcd3ff90UL, 0xee66507eUL, 0x56da371bUL, 0x0eb9274dUL, 0xb6054028UL, + 0xa4b0efc6UL, 0x1c0c88a3UL, 0x81dbb01aUL, 0x3967d77fUL, 0x2bd27891UL, + 0x936e1ff4UL, 0x3b26f703UL, 0x839a9066UL, 0x912f3f88UL, 0x299358edUL, + 0xb4446054UL, 0x0cf80731UL, 0x1e4da8dfUL, 0xa6f1cfbaUL, 0xfe92dfecUL, + 0x462eb889UL, 0x549b1767UL, 0xec277002UL, 0x71f048bbUL, 0xc94c2fdeUL, + 0xdbf98030UL, 0x6345e755UL, 0x6b3fa09cUL, 0xd383c7f9UL, 0xc1366817UL, + 0x798a0f72UL, 0xe45d37cbUL, 0x5ce150aeUL, 0x4e54ff40UL, 0xf6e89825UL, + 0xae8b8873UL, 0x1637ef16UL, 0x048240f8UL, 0xbc3e279dUL, 0x21e91f24UL, + 0x99557841UL, 0x8be0d7afUL, 0x335cb0caUL, 0xed59b63bUL, 0x55e5d15eUL, + 0x47507eb0UL, 0xffec19d5UL, 0x623b216cUL, 0xda874609UL, 0xc832e9e7UL, + 0x708e8e82UL, 0x28ed9ed4UL, 0x9051f9b1UL, 0x82e4565fUL, 0x3a58313aUL, + 0xa78f0983UL, 0x1f336ee6UL, 0x0d86c108UL, 0xb53aa66dUL, 0xbd40e1a4UL, + 0x05fc86c1UL, 0x1749292fUL, 0xaff54e4aUL, 0x322276f3UL, 0x8a9e1196UL, + 0x982bbe78UL, 0x2097d91dUL, 0x78f4c94bUL, 0xc048ae2eUL, 0xd2fd01c0UL, + 0x6a4166a5UL, 0xf7965e1cUL, 0x4f2a3979UL, 0x5d9f9697UL, 0xe523f1f2UL, + 0x4d6b1905UL, 0xf5d77e60UL, 0xe762d18eUL, 0x5fdeb6ebUL, 0xc2098e52UL, + 0x7ab5e937UL, 0x680046d9UL, 0xd0bc21bcUL, 0x88df31eaUL, 0x3063568fUL, + 0x22d6f961UL, 0x9a6a9e04UL, 0x07bda6bdUL, 0xbf01c1d8UL, 0xadb46e36UL, + 0x15080953UL, 0x1d724e9aUL, 0xa5ce29ffUL, 0xb77b8611UL, 0x0fc7e174UL, + 0x9210d9cdUL, 0x2aacbea8UL, 0x38191146UL, 0x80a57623UL, 0xd8c66675UL, + 
0x607a0110UL, 0x72cfaefeUL, 0xca73c99bUL, 0x57a4f122UL, 0xef189647UL, + 0xfdad39a9UL, 0x45115eccUL, 0x764dee06UL, 0xcef18963UL, 0xdc44268dUL, + 0x64f841e8UL, 0xf92f7951UL, 0x41931e34UL, 0x5326b1daUL, 0xeb9ad6bfUL, + 0xb3f9c6e9UL, 0x0b45a18cUL, 0x19f00e62UL, 0xa14c6907UL, 0x3c9b51beUL, + 0x842736dbUL, 0x96929935UL, 0x2e2efe50UL, 0x2654b999UL, 0x9ee8defcUL, + 0x8c5d7112UL, 0x34e11677UL, 0xa9362eceUL, 0x118a49abUL, 0x033fe645UL, + 0xbb838120UL, 0xe3e09176UL, 0x5b5cf613UL, 0x49e959fdUL, 0xf1553e98UL, + 0x6c820621UL, 0xd43e6144UL, 0xc68bceaaUL, 0x7e37a9cfUL, 0xd67f4138UL, + 0x6ec3265dUL, 0x7c7689b3UL, 0xc4caeed6UL, 0x591dd66fUL, 0xe1a1b10aUL, + 0xf3141ee4UL, 0x4ba87981UL, 0x13cb69d7UL, 0xab770eb2UL, 0xb9c2a15cUL, + 0x017ec639UL, 0x9ca9fe80UL, 0x241599e5UL, 0x36a0360bUL, 0x8e1c516eUL, + 0x866616a7UL, 0x3eda71c2UL, 0x2c6fde2cUL, 0x94d3b949UL, 0x090481f0UL, + 0xb1b8e695UL, 0xa30d497bUL, 0x1bb12e1eUL, 0x43d23e48UL, 0xfb6e592dUL, + 0xe9dbf6c3UL, 0x516791a6UL, 0xccb0a91fUL, 0x740cce7aUL, 0x66b96194UL, + 0xde0506f1UL +}; diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/stubRoutines_aarch64.hpp --- a/src/cpu/aarch64/vm/stubRoutines_aarch64.hpp Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/stubRoutines_aarch64.hpp Fri May 23 10:47:15 2014 +0100 @@ -24,8 +24,8 @@ * */ -#ifndef CPU_AARCH64_VM_STUBROUTINES_AARCH64_64_HPP -#define CPU_AARCH64_VM_STUBROUTINES_AARCH64_64_HPP +#ifndef CPU_AARCH64_VM_STUBROUTINES_AARCH64_HPP +#define CPU_AARCH64_VM_STUBROUTINES_AARCH64_HPP // This file holds the platform specific parts of the StubRoutines // definition. 
See stubRoutines.hpp for a description on how to @@ -45,7 +45,7 @@ code_size2 = 22000 // simply increase if too small (assembler will crash if too small) }; -class x86 { +class aarch64 { friend class StubGenerator; private: @@ -113,6 +113,10 @@ { return _double_sign_flip; } + + private: + static juint _crc_table[]; + }; -#endif // CPU_AARCH64_VM_STUBROUTINES_AARCH64_64_HPP +#endif // CPU_AARCH64_VM_STUBROUTINES_AARCH64_HPP diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp --- a/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Fri May 23 10:47:15 2014 +0100 @@ -673,6 +673,126 @@ return NULL; } +/** + * Method entry for static native methods: + * int java.util.zip.CRC32.update(int crc, int b) + */ +address InterpreterGenerator::generate_CRC32_update_entry() { + if (UseCRC32Intrinsics) { + address entry = __ pc(); + + // rmethod: Method* + // r13: senderSP must preserved for slow path, set SP to it on fast path + // esp: args + + Label slow_path; + // If we need a safepoint check, generate full interpreter entry. + ExternalAddress state(SafepointSynchronize::address_of_state()); + unsigned long offset; + __ adrp(rscratch1, ExternalAddress(SafepointSynchronize::address_of_state()), offset); + __ ldrw(rscratch1, Address(rscratch1, offset)); + assert(SafepointSynchronize::_not_synchronized == 0, "rewrite this code"); + __ cbnz(rscratch1, slow_path); + + // We don't generate local frame and don't align stack because + // we call stub code and there is no safepoint on this path. 
+ + // Load parameters + const Register crc = c_rarg0; // crc + const Register val = c_rarg1; // source java byte value + const Register tbl = c_rarg2; // scratch + + // Arguments are reversed on java expression stack + __ ldrw(val, Address(esp, 0)); // byte value + __ ldrw(crc, Address(esp, wordSize)); // Initial CRC + + __ adrp(tbl, ExternalAddress(StubRoutines::crc_table_addr()), offset); + __ add(tbl, tbl, offset); + + __ ornw(crc, zr, crc); // ~crc + __ update_byte_crc32(crc, val, tbl); + __ ornw(crc, zr, crc); // ~crc + + // result in c_rarg0 + + // _areturn + // __ mov(sp, r13); // set sp to sender sp + __ ret(lr); + + // generate a vanilla native entry as the slow path + __ bind(slow_path); + + (void) generate_native_entry(false); + + return entry; + } + return generate_native_entry(false); +} + +/** + * Method entry for static native methods: + * int java.util.zip.CRC32.updateBytes(int crc, byte[] b, int off, int len) + * int java.util.zip.CRC32.updateByteBuffer(int crc, long buf, int off, int len) + */ +address InterpreterGenerator::generate_CRC32_updateBytes_entry(AbstractInterpreter::MethodKind kind) { + if (UseCRC32Intrinsics) { + address entry = __ pc(); + + // rbx,: Method* + // r13: senderSP must preserved for slow path, set SP to it on fast path + + Label slow_path; + // If we need a safepoint check, generate full interpreter entry. + ExternalAddress state(SafepointSynchronize::address_of_state()); + unsigned long offset; + __ adrp(rscratch1, ExternalAddress(SafepointSynchronize::address_of_state()), offset); + __ ldrw(rscratch1, Address(rscratch1, offset)); + assert(SafepointSynchronize::_not_synchronized == 0, "rewrite this code"); + __ cbnz(rscratch1, slow_path); + + // We don't generate local frame and don't align stack because + // we call stub code and there is no safepoint on this path. 
+ + // Load parameters + const Register crc = c_rarg0; // crc + const Register buf = c_rarg1; // source java byte array address + const Register len = c_rarg2; // length + const Register off = len; // offset (never overlaps with 'len') + + // Arguments are reversed on java expression stack + // Calculate address of start element + if (kind == Interpreter::java_util_zip_CRC32_updateByteBuffer) { + __ ldr(buf, Address(esp, 2*wordSize)); // long buf + __ ldrw(off, Address(esp, wordSize)); // offset + __ add(buf, buf, off); // + offset + __ ldrw(crc, Address(esp, 4*wordSize)); // Initial CRC + } else { + __ ldr(buf, Address(esp, 2*wordSize)); // byte[] array + __ add(buf, buf, arrayOopDesc::base_offset_in_bytes(T_BYTE)); // + header size + __ ldrw(off, Address(esp, wordSize)); // offset + __ add(buf, buf, off); // + offset + __ ldrw(crc, Address(esp, 3*wordSize)); // Initial CRC + } + // Can now load 'len' since we're finished with 'off' + __ ldrw(len, Address(esp, 0x0)); // Length + + __ mov(rscratch1, lr); // saved by call_VM_leaf + __ super_call_VM_leaf(CAST_FROM_FN_PTR(address, StubRoutines::updateBytesCRC32()), crc, buf, len); + + // _areturn + // __ mov(sp, r13); // set sp to sender sp + __ ret(rscratch1); + + // generate a vanilla native entry as the slow path + __ bind(slow_path); + + (void) generate_native_entry(false); + + return entry; + } + return generate_native_entry(false); +} + void InterpreterGenerator::bang_stack_shadow_pages(bool native_call) { // Bang each page in the shadow zone. 
We can't assume it's been done for // an interpreter frame with greater than a page of locals, so each page @@ -1373,6 +1493,12 @@ case Interpreter::java_lang_math_exp : entry_point = ((InterpreterGenerator*) this)->generate_math_entry(kind); break; case Interpreter::java_lang_ref_reference_get : entry_point = ((InterpreterGenerator*)this)->generate_Reference_get_entry(); break; + case Interpreter::java_util_zip_CRC32_update + : entry_point = ((InterpreterGenerator*)this)->generate_CRC32_update_entry(); break; + case Interpreter::java_util_zip_CRC32_updateBytes + : // fall thru + case Interpreter::java_util_zip_CRC32_updateByteBuffer + : entry_point = ((InterpreterGenerator*)this)->generate_CRC32_updateBytes_entry(kind); break; default : ShouldNotReachHere(); break; } diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/vm_version_aarch64.cpp --- a/src/cpu/aarch64/vm/vm_version_aarch64.cpp Wed May 14 15:43:50 2014 +0100 +++ b/src/cpu/aarch64/vm/vm_version_aarch64.cpp Fri May 23 10:47:15 2014 +0100 @@ -91,6 +91,10 @@ FLAG_SET_DEFAULT(PrefetchScanIntervalInBytes, 256); FLAG_SET_DEFAULT(PrefetchFieldsAhead, 256); FLAG_SET_DEFAULT(PrefetchCopyIntervalInBytes, 256); + + if (FLAG_IS_DEFAULT(UseCRC32Intrinsics)) { + UseCRC32Intrinsics = true; + } } void VM_Version::initialize() { --- CUT HERE --- From aph at redhat.com Fri May 23 14:52:41 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 23 May 2014 14:52:41 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: 2 new changesets Message-ID: <201405231452.s4NEqiQE018605@aojmv0008> Changeset: a2e9ac7b3434 Author: aph Date: 2014-05-15 07:37 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/a2e9ac7b3434 Correct costs for operations with shifts. ! src/cpu/aarch64/vm/aarch64.ad ! 
src/cpu/aarch64/vm/aarch64_ad.m4 Changeset: b8ec31c74e2d Author: aph Date: 2014-05-15 08:15 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/b8ec31c74e2d Correct OptoAssembly for prologs and epilogs. ! src/cpu/aarch64/vm/aarch64.ad From aph at redhat.com Fri May 23 14:53:01 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 23 May 2014 14:53:01 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/hotspot: 6 new changesets Message-ID: <201405231453.s4NEr8W5018699@aojmv0008> Changeset: 5f4d7f52afc8 Author: aph Date: 2014-05-21 13:00 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/5f4d7f52afc8 Use inner shareable versions of memory barriers ! src/cpu/aarch64/vm/assembler_aarch64.hpp Changeset: 0be4629243a8 Author: aph Date: 2014-05-22 07:21 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/0be4629243a8 Improve code generation for volatile operations and other barriers. ! src/cpu/aarch64/vm/aarch64.ad Changeset: 78eff3c05f51 Author: aph Date: 2014-05-22 09:24 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/78eff3c05f51 Use explicit barrier instructions in C1. ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Changeset: 525031f598d2 Author: aph Date: 2014-05-22 09:25 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/525031f598d2 Merge ! src/cpu/aarch64/vm/aarch64.ad Changeset: ccdf2711f13c Author: aph Date: 2014-05-22 12:29 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/ccdf2711f13c Back out incorrectly applied Patch 6285. ! src/cpu/aarch64/vm/aarch64.ad Changeset: 56bac4c05e96 Author: aph Date: 2014-05-22 12:31 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/56bac4c05e96 Improve code generation for volatile operations and other barriers. ! 
src/cpu/aarch64/vm/aarch64.ad

From aph at redhat.com Sat May 24 10:39:35 2014
From: aph at redhat.com (Andrew Haley)
Date: Sat, 24 May 2014 11:39:35 +0100
Subject: [aarch64-port-dev ] RFR: JDK8: Add support for CRC32 intrinsic
In-Reply-To: <1400839342.14801.20.camel@localhost.localdomain>
References: <1400839342.14801.20.camel@localhost.localdomain>
Message-ID: <538076E7.4010804@redhat.com>

Hi,

On 05/23/2014 11:02 AM, Edward Nevill wrote:
> The following patch adds support for CRC32 intrinsic.
>
> The patch is a non neon patch. IE. It uses only the base aarch64 instruction set. Patch for neon to follow.
>
> Even without neon it gets 4.5 x improvement on my test case
>
> http://people.linaro.org/~edward.nevill/crc32/CRCTest.java
>
> As the patch is quite big (38K) I have put a copy of the patch @
>
> http://people.linaro.org/~edward.nevill/crc32/crc32.patch
>
> which may be easier to apply if anyone wishes to try this out.
>
> The algorithm uses 4 x tables and handles 16 bytes (1 ldp worth) per iteration. I experimented doing 32 bytes per loop but I could not measure the difference so I left it at 16. There are also algorithms that use 8 tables (Google slice by 8) but I think the returns from this over the simpler by 4 algorithm are minimal.

Basically OK, some comments inline.

I wonder about the decision to generate interpreter and C1 versions of intrinsics like this. I would have thought that the advantage of C1 code is small, but if we had a client-only VM it would make sense.

It's still OK to commit, though.

Andrew.
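For readers following the thread, the "4 x tables" scheme described in the quoted message can be sketched in portable C++ roughly as below. This is only an illustration and not the code from the patch: the names make_tables, update_byte, update_word, crc32_by4 and crc32_bytewise are made up for this sketch, the tables are generated at start-up using the zlib construction instead of being embedded as constants, and the loop consumes 4 bytes per step where the stub unrolls four such steps to consume 16 bytes per ldp.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: four 256-entry tables, laid out like _crc_table[].
using Tables = std::array<std::array<uint32_t, 256>, 4>;

// Build table 0 from the reflected CRC-32 polynomial, then derive tables 1..3
// from it the way zlib's make_crc_table() does for little-endian slicing.
static Tables make_tables() {
    Tables t{};
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 1u) ? (0xedb88320u ^ (c >> 1)) : (c >> 1);
        t[0][n] = c;
    }
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = t[0][n];
        for (int k = 1; k < 4; k++) {
            c = t[0][c & 0xffu] ^ (c >> 8);
            t[k][n] = c;
        }
    }
    assert(t[0][128] == 0xedb88320u);  // sanity check against table 0 in the patch
    return t;
}

// Same recurrence as the update_byte_crc32 doc comment:
//   val = crc_table[(val ^ crc) & 0xFF]; crc = val ^ (crc >> 8);
static uint32_t update_byte(const Tables& t, uint32_t crc, uint8_t b) {
    return t[0][(crc ^ b) & 0xffu] ^ (crc >> 8);
}

// Same recurrence as the update_word_crc32 doc comment:
//   v = crc ^ v;
//   crc = table3[v&0xff]^table2[(v>>8)&0xff]^table1[(v>>16)&0xff]^table0[v>>24]
static uint32_t update_word(const Tables& t, uint32_t crc, uint32_t w) {
    const uint32_t v = crc ^ w;
    return t[3][v & 0xffu] ^ t[2][(v >> 8) & 0xffu] ^
           t[1][(v >> 16) & 0xffu] ^ t[0][v >> 24];
}

// Slice-by-4 CRC-32: words first, then the 0..3 trailing bytes.
uint32_t crc32_by4(uint32_t crc, const uint8_t* buf, size_t len) {
    static const Tables t = make_tables();
    crc = ~crc;                               // the stub does this with ornw
    while (len >= 4) {
        // Assemble a little-endian word byte by byte so the sketch does not
        // depend on host endianness or alignment.
        const uint32_t w = static_cast<uint32_t>(buf[0]) |
                           (static_cast<uint32_t>(buf[1]) << 8) |
                           (static_cast<uint32_t>(buf[2]) << 16) |
                           (static_cast<uint32_t>(buf[3]) << 24);
        crc = update_word(t, crc, w);
        buf += 4;
        len -= 4;
    }
    while (len-- > 0)
        crc = update_byte(t, crc, *buf++);
    return ~crc;
}

// Plain one-byte-at-a-time reference, useful for cross-checking the by-4 path.
uint32_t crc32_bytewise(uint32_t crc, const uint8_t* buf, size_t len) {
    static const Tables t = make_tables();
    crc = ~crc;
    while (len-- > 0)
        crc = update_byte(t, crc, *buf++);
    return ~crc;
}
```

Calling crc32_by4(0, data, n) and crc32_bytewise(0, data, n) on the same buffer should give identical results; the by-4 version simply replaces four dependent table lookups with four independent ones per 32-bit word, which is where most of the speed-up over the byte loop is expected to come from.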
> +#endif // CPU_AARCH64_VM_STUBROUTINES_AARCH64_HPP > diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp > --- a/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Wed May 14 15:43:50 2014 +0100 > +++ b/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Fri May 23 10:47:15 2014 +0100 > @@ -673,6 +673,126 @@ > return NULL; > } > > +/** > + * Method entry for static native methods: > + * int java.util.zip.CRC32.update(int crc, int b) > + */ > +address InterpreterGenerator::generate_CRC32_update_entry() { > + if (UseCRC32Intrinsics) { > + address entry = __ pc(); > + > + // rmethod: Method* > + // r13: senderSP must preserved for slow path, set SP to it on fast path This comment is wrong. SP does not get set to r13 on fast path. It must be preserved for slow path, that's true. > + // esp: args > + > + Label slow_path; > + // If we need a safepoint check, generate full interpreter entry. > + ExternalAddress state(SafepointSynchronize::address_of_state()); > + unsigned long offset; > + __ adrp(rscratch1, ExternalAddress(SafepointSynchronize::address_of_state()), offset); > + __ ldrw(rscratch1, Address(rscratch1, offset)); > + assert(SafepointSynchronize::_not_synchronized == 0, "rewrite this code"); > + __ cbnz(rscratch1, slow_path); > + > + // We don't generate local frame and don't align stack because > + // we call stub code and there is no safepoint on this path. 
> + > + // Load parameters > + const Register crc = c_rarg0; // crc > + const Register val = c_rarg1; // source java byte value > + const Register tbl = c_rarg2; // scratch > + > + // Arguments are reversed on java expression stack > + __ ldrw(val, Address(esp, 0)); // byte value > + __ ldrw(crc, Address(esp, wordSize)); // Initial CRC > + > + __ adrp(tbl, ExternalAddress(StubRoutines::crc_table_addr()), offset); > + __ add(tbl, tbl, offset); > + > + __ ornw(crc, zr, crc); // ~crc > + __ update_byte_crc32(crc, val, tbl); > + __ ornw(crc, zr, crc); // ~crc > + > + // result in c_rarg0 > + > + // _areturn > + // __ mov(sp, r13); // set sp to sender sp Please remove these two commented lines. > + __ ret(lr); > + > + // generate a vanilla native entry as the slow path > + __ bind(slow_path); > + > + (void) generate_native_entry(false); > + > + return entry; > + } > + return generate_native_entry(false); > +} > + > +/** > + * Method entry for static native methods: > + * int java.util.zip.CRC32.updateBytes(int crc, byte[] b, int off, int len) > + * int java.util.zip.CRC32.updateByteBuffer(int crc, long buf, int off, int len) > + */ > +address InterpreterGenerator::generate_CRC32_updateBytes_entry(AbstractInterpreter::MethodKind kind) { > + if (UseCRC32Intrinsics) { > + address entry = __ pc(); > + > + // rbx,: Method* > + // r13: senderSP must preserved for slow path, set SP to it on fast path The comments are wrong. > + > + Label slow_path; > + // If we need a safepoint check, generate full interpreter entry. 
> + ExternalAddress state(SafepointSynchronize::address_of_state()); > + unsigned long offset; > + __ adrp(rscratch1, ExternalAddress(SafepointSynchronize::address_of_state()), offset); > + __ ldrw(rscratch1, Address(rscratch1, offset)); > + assert(SafepointSynchronize::_not_synchronized == 0, "rewrite this code"); > + __ cbnz(rscratch1, slow_path); > + > + // We don't generate local frame and don't align stack because > + // we call stub code and there is no safepoint on this path. > + > + // Load parameters > + const Register crc = c_rarg0; // crc > + const Register buf = c_rarg1; // source java byte array address > + const Register len = c_rarg2; // length > + const Register off = len; // offset (never overlaps with 'len') > + > + // Arguments are reversed on java expression stack > + // Calculate address of start element > + if (kind == Interpreter::java_util_zip_CRC32_updateByteBuffer) { > + __ ldr(buf, Address(esp, 2*wordSize)); // long buf > + __ ldrw(off, Address(esp, wordSize)); // offset > + __ add(buf, buf, off); // + offset > + __ ldrw(crc, Address(esp, 4*wordSize)); // Initial CRC > + } else { > + __ ldr(buf, Address(esp, 2*wordSize)); // byte[] array > + __ add(buf, buf, arrayOopDesc::base_offset_in_bytes(T_BYTE)); // + header size > + __ ldrw(off, Address(esp, wordSize)); // offset > + __ add(buf, buf, off); // + offset > + __ ldrw(crc, Address(esp, 3*wordSize)); // Initial CRC > + } > + // Can now load 'len' since we're finished with 'off' > + __ ldrw(len, Address(esp, 0x0)); // Length > + > + __ mov(rscratch1, lr); // saved by call_VM_leaf > + __ super_call_VM_leaf(CAST_FROM_FN_PTR(address, StubRoutines::updateBytesCRC32()), crc, buf, len); > + > + // _areturn > + // __ mov(sp, r13); // set sp to sender sp > + __ ret(rscratch1); > + > + // generate a vanilla native entry as the slow path > + __ bind(slow_path); > + > + (void) generate_native_entry(false); > + > + return entry; > + } > + return generate_native_entry(false); > +} > + > void 
InterpreterGenerator::bang_stack_shadow_pages(bool native_call) { > // Bang each page in the shadow zone. We can't assume it's been done for > // an interpreter frame with greater than a page of locals, so each page > @@ -1373,6 +1493,12 @@ > case Interpreter::java_lang_math_exp : entry_point = ((InterpreterGenerator*) this)->generate_math_entry(kind); break; > case Interpreter::java_lang_ref_reference_get > : entry_point = ((InterpreterGenerator*)this)->generate_Reference_get_entry(); break; > + case Interpreter::java_util_zip_CRC32_update > + : entry_point = ((InterpreterGenerator*)this)->generate_CRC32_update_entry(); break; > + case Interpreter::java_util_zip_CRC32_updateBytes > + : // fall thru > + case Interpreter::java_util_zip_CRC32_updateByteBuffer > + : entry_point = ((InterpreterGenerator*)this)->generate_CRC32_updateBytes_entry(kind); break; > default : ShouldNotReachHere(); break; > } > > diff -r 9d3bc0f40cce -r 60fac40265fc src/cpu/aarch64/vm/vm_version_aarch64.cpp > --- a/src/cpu/aarch64/vm/vm_version_aarch64.cpp Wed May 14 15:43:50 2014 +0100 > +++ b/src/cpu/aarch64/vm/vm_version_aarch64.cpp Fri May 23 10:47:15 2014 +0100 > @@ -91,6 +91,10 @@ > FLAG_SET_DEFAULT(PrefetchScanIntervalInBytes, 256); > FLAG_SET_DEFAULT(PrefetchFieldsAhead, 256); > FLAG_SET_DEFAULT(PrefetchCopyIntervalInBytes, 256); > + > + if (FLAG_IS_DEFAULT(UseCRC32Intrinsics)) { > + UseCRC32Intrinsics = true; > + } > } > > void VM_Version::initialize() { > --- CUT HERE --- > > From openjdk-testing at linaro.org Sat May 24 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Sat, 24 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140524130032.4B1FA1FCB4@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/144/summary.html =============================================================================== server-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/17 pass: 433; fail: 1; error: 4 Build 1: aarch64/2014/apr/24 pass: 416; fail: 1; error: 21 Build 2: aarch64/2014/apr/25 pass: 425; fail: 1; error: 12 Build 3: aarch64/2014/apr/26 pass: 423; fail: 1; error: 14 Build 4: aarch64/2014/apr/30 pass: 421; fail: 2; error: 15 Build 5: aarch64/2014/may/02 pass: 434; fail: 2; error: 2 Build 6: aarch64/2014/may/10 pass: 421; fail: 2; error: 15 Build 7: aarch64/2014/may/13 pass: 427; fail: 2; error: 9 Build 8: aarch64/2014/may/15 pass: 409; fail: 2; error: 27 Build 9: aarch64/2014/may/24 pass: 423; fail: 2; error: 13 ------------------------------------------------------------------------------- =============================================================================== server-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/24 pass: 2,894; error: 78 Build 1: aarch64/2014/apr/25 pass: 2,911; error: 61 Build 2: aarch64/2014/apr/26 pass: 2,912; error: 60 Build 3: aarch64/2014/apr/30 pass: 2,915; error: 57 Build 4: aarch64/2014/may/02 pass: 2,939; error: 33 Build 5: aarch64/2014/may/10 pass: 2,911; error: 61 Build 6: aarch64/2014/may/13 pass: 2,910; error: 62 Build 7: aarch64/2014/may/14 pass: 2,859; error: 113 Build 8: aarch64/2014/may/15 pass: 2,894; error: 78 Build 9: aarch64/2014/may/24 pass: 2,911; fail: 1; error: 60 1 fatal errors were detected; please follow the link above for more detail. 
------------------------------------------------------------------------------- =============================================================================== server-release/jdk =============================================================================== Build 0: aarch64/2014/apr/24 pass: 4,699; fail: 472; error: 279 Build 1: aarch64/2014/apr/25 pass: 4,725; fail: 472; error: 253 Build 2: aarch64/2014/apr/26 pass: 4,723; fail: 475; error: 251 Build 3: aarch64/2014/apr/30 pass: 4,725; fail: 470; error: 254 Build 4: aarch64/2014/may/02 pass: 5,285; fail: 124; error: 40 Build 5: aarch64/2014/may/10 pass: 4,725; fail: 470; error: 254 Build 6: aarch64/2014/may/13 pass: 4,723; fail: 470; error: 256 Build 7: aarch64/2014/may/14 pass: 4,674; fail: 482; error: 293 Build 8: aarch64/2014/may/15 pass: 4,705; fail: 473; error: 271 Build 9: aarch64/2014/may/24 pass: 4,724; fail: 482; error: 243 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From openjdk-testing at linaro.org Sat May 24 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Sat, 24 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140524130032.485231FCC3@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/144/summary.html =============================================================================== client-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/24 pass: 418; fail: 5; error: 15 Build 1: aarch64/2014/apr/25 pass: 421; fail: 5; error: 12 Build 2: aarch64/2014/apr/26 pass: 421; fail: 5; error: 12 Build 3: aarch64/2014/apr/30 pass: 422; fail: 5; error: 11 Build 4: aarch64/2014/may/02 pass: 431; fail: 5; error: 2 Build 5: aarch64/2014/may/10 pass: 421; fail: 5; error: 12 Build 6: aarch64/2014/may/13 pass: 422; fail: 5; error: 11 Build 7: aarch64/2014/may/14 pass: 418; fail: 5; error: 15 Build 8: aarch64/2014/may/15 pass: 418; fail: 5; error: 15 Build 9: aarch64/2014/may/24 pass: 421; fail: 5; error: 12 ------------------------------------------------------------------------------- =============================================================================== client-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/24 pass: 2,916; error: 56 Build 1: aarch64/2014/apr/25 pass: 2,917; error: 55 Build 2: aarch64/2014/apr/26 pass: 2,917; error: 55 Build 3: aarch64/2014/apr/30 pass: 2,917; error: 55 Build 4: aarch64/2014/may/02 pass: 2,941; error: 31 Build 5: aarch64/2014/may/10 pass: 2,917; error: 55 Build 6: aarch64/2014/may/13 pass: 2,921; error: 51 Build 7: aarch64/2014/may/14 pass: 2,908; error: 64 Build 8: aarch64/2014/may/15 pass: 2,915; error: 57 Build 9: aarch64/2014/may/24 pass: 2,917; error: 55 ------------------------------------------------------------------------------- =============================================================================== client-release/jdk =============================================================================== Build 0: aarch64/2014/apr/24 
pass: 4,886; fail: 474; error: 90 Build 1: aarch64/2014/apr/25 pass: 4,905; fail: 474; error: 71 Build 2: aarch64/2014/apr/26 pass: 4,905; fail: 472; error: 72 Build 3: aarch64/2014/apr/30 pass: 4,904; fail: 473; error: 72 Build 4: aarch64/2014/may/02 pass: 5,272; fail: 131; error: 46 Build 5: aarch64/2014/may/10 pass: 4,906; fail: 472; error: 71 Build 6: aarch64/2014/may/13 pass: 4,904; fail: 473; error: 72 Build 7: aarch64/2014/may/14 pass: 4,862; fail: 474; error: 113 Build 8: aarch64/2014/may/15 pass: 4,890; fail: 473; error: 86 Build 9: aarch64/2014/may/24 pass: 4,906; fail: 472; error: 71 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From ed at camswl.com Sat May 24 19:31:52 2014 From: ed at camswl.com (ed at camswl.com) Date: Sat, 24 May 2014 19:31:52 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Add support for CRC32 intrinsic Message-ID: <201405241931.s4OJVvPi014661@aojmv0008> Changeset: 14bba87e055e Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-24 20:31 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/14bba87e055e Add support for CRC32 intrinsic ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp ! src/cpu/aarch64/vm/c1_LIRGenerator_aarch64.cpp ! src/cpu/aarch64/vm/interpreterGenerator_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp ! src/cpu/aarch64/vm/stubRoutines_aarch64.cpp ! src/cpu/aarch64/vm/stubRoutines_aarch64.hpp ! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp ! 
src/cpu/aarch64/vm/vm_version_aarch64.cpp From openjdk-testing at linaro.org Mon May 26 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Mon, 26 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140526130032.C1AF31FCCC@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/146/summary.html =============================================================================== server-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/24 pass: 416; fail: 1; error: 21 Build 1: aarch64/2014/apr/25 pass: 425; fail: 1; error: 12 Build 2: aarch64/2014/apr/26 pass: 423; fail: 1; error: 14 Build 3: aarch64/2014/apr/30 pass: 421; fail: 2; error: 15 Build 4: aarch64/2014/may/02 pass: 434; fail: 2; error: 2 Build 5: aarch64/2014/may/10 pass: 421; fail: 2; error: 15 Build 6: aarch64/2014/may/13 pass: 427; fail: 2; error: 9 Build 7: aarch64/2014/may/15 pass: 409; fail: 2; error: 27 Build 8: aarch64/2014/may/24 pass: 423; fail: 2; error: 13 Build 9: aarch64/2014/may/26 pass: 422; fail: 2; error: 14 ------------------------------------------------------------------------------- =============================================================================== server-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/25 pass: 2,911; error: 61 Build 1: aarch64/2014/apr/26 pass: 2,912; error: 60 Build 2: aarch64/2014/apr/30 pass: 2,915; error: 57 Build 3: aarch64/2014/may/02 pass: 2,939; error: 33 Build 4: aarch64/2014/may/10 pass: 2,911; error: 61 Build 5: aarch64/2014/may/13 pass: 2,910; error: 62 Build 6: aarch64/2014/may/14 pass: 
2,859; error: 113 Build 7: aarch64/2014/may/15 pass: 2,894; error: 78 Build 8: aarch64/2014/may/24 pass: 2,911; fail: 1; error: 60 Build 9: aarch64/2014/may/26 pass: 2,909; error: 63 ------------------------------------------------------------------------------- =============================================================================== server-release/jdk =============================================================================== Build 0: aarch64/2014/apr/25 pass: 4,725; fail: 472; error: 253 Build 1: aarch64/2014/apr/26 pass: 4,723; fail: 475; error: 251 Build 2: aarch64/2014/apr/30 pass: 4,725; fail: 470; error: 254 Build 3: aarch64/2014/may/02 pass: 5,285; fail: 124; error: 40 Build 4: aarch64/2014/may/10 pass: 4,725; fail: 470; error: 254 Build 5: aarch64/2014/may/13 pass: 4,723; fail: 470; error: 256 Build 6: aarch64/2014/may/14 pass: 4,674; fail: 482; error: 293 Build 7: aarch64/2014/may/15 pass: 4,705; fail: 473; error: 271 Build 8: aarch64/2014/may/24 pass: 4,724; fail: 482; error: 243 Build 9: aarch64/2014/may/26 pass: 4,723; fail: 473; error: 253 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From openjdk-testing at linaro.org Mon May 26 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Mon, 26 May 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140526130032.C40A41F536@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/146/summary.html =============================================================================== client-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/25 pass: 421; fail: 5; error: 12 Build 1: aarch64/2014/apr/26 pass: 421; fail: 5; error: 12 Build 2: aarch64/2014/apr/30 pass: 422; fail: 5; error: 11 Build 3: aarch64/2014/may/02 pass: 431; fail: 5; error: 2 Build 4: aarch64/2014/may/10 pass: 421; fail: 5; error: 12 Build 5: aarch64/2014/may/13 pass: 422; fail: 5; error: 11 Build 6: aarch64/2014/may/14 pass: 418; fail: 5; error: 15 Build 7: aarch64/2014/may/15 pass: 418; fail: 5; error: 15 Build 8: aarch64/2014/may/24 pass: 421; fail: 5; error: 12 Build 9: aarch64/2014/may/26 pass: 420; fail: 5; error: 13 ------------------------------------------------------------------------------- =============================================================================== client-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/25 pass: 2,917; error: 55 Build 1: aarch64/2014/apr/26 pass: 2,917; error: 55 Build 2: aarch64/2014/apr/30 pass: 2,917; error: 55 Build 3: aarch64/2014/may/02 pass: 2,941; error: 31 Build 4: aarch64/2014/may/10 pass: 2,917; error: 55 Build 5: aarch64/2014/may/13 pass: 2,921; error: 51 Build 6: aarch64/2014/may/14 pass: 2,908; error: 64 Build 7: aarch64/2014/may/15 pass: 2,915; error: 57 Build 8: aarch64/2014/may/24 pass: 2,917; error: 55 Build 9: aarch64/2014/may/26 pass: 2,917; error: 55 ------------------------------------------------------------------------------- =============================================================================== client-release/jdk =============================================================================== Build 0: aarch64/2014/apr/25 
pass: 4,905; fail: 474; error: 71 Build 1: aarch64/2014/apr/26 pass: 4,905; fail: 472; error: 72 Build 2: aarch64/2014/apr/30 pass: 4,904; fail: 473; error: 72 Build 3: aarch64/2014/may/02 pass: 5,272; fail: 131; error: 46 Build 4: aarch64/2014/may/10 pass: 4,906; fail: 472; error: 71 Build 5: aarch64/2014/may/13 pass: 4,904; fail: 473; error: 72 Build 6: aarch64/2014/may/14 pass: 4,862; fail: 474; error: 113 Build 7: aarch64/2014/may/15 pass: 4,890; fail: 473; error: 86 Build 8: aarch64/2014/may/24 pass: 4,906; fail: 472; error: 71 Build 9: aarch64/2014/may/26 pass: 4,906; fail: 473; error: 70 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From aph at redhat.com Tue May 27 09:04:51 2014 From: aph at redhat.com (Andrew Haley) Date: Tue, 27 May 2014 10:04:51 +0100 Subject: [aarch64-port-dev ] RFR: JDK8: Add support for CRC32 intrinsic In-Reply-To: <538076E7.4010804@redhat.com> References: <1400839342.14801.20.camel@localhost.localdomain> <538076E7.4010804@redhat.com> Message-ID: <53845533.1000506@redhat.com> On 05/24/2014 11:39 AM, Andrew Haley wrote: > + // _areturn >> + // __ mov(sp, r13); // set sp to sender sp >> + __ ret(rscratch1); >> + Also, I think we should restore the sender's SP. It is the contract of an entry point that at return it restore the sender's SP from R13, but unfortunately we are not consistent about this. The problem is that if your interpreter version of CRC32.update is ever called from a C2I adapter it will fail to restore the caller's SP and the caller will crash when you return to it. This can't happen at the moment because there is a compiler version of CRC32.update, but I think we should at least try to be consistent. I think it just needs at the end __ andr(sp, r13, -16); Andrew. 
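As a small illustration of the suggested andr(sp, r13, -16): ANDing with -16 clears the low four bits, i.e. it rounds the value down to a multiple of 16. The C++ sketch below shows only that arithmetic. The names align_down and restore_sp_from_sender are hypothetical, and the stated reason for the masking (the sender SP held in r13 may not be 16-byte aligned, while AArch64 generally expects SP to be 16-byte aligned when it is used to address memory) is an assumption on my part rather than something spelled out in the message above.

```cpp
#include <cstdint>

// Round value down to a multiple of alignment (alignment must be a
// power of two). Hypothetical helper for illustration only.
constexpr std::uint64_t align_down(std::uint64_t value,
                                   std::uint64_t alignment) {
  return value & ~(alignment - 1);
}

// Models the effect of "andr(sp, r13, -16)" on a 64-bit register value:
// the immediate -16 is the mask 0xFFFFFFFFFFFFFFF0.
constexpr std::uint64_t restore_sp_from_sender(std::uint64_t sender_sp) {
  return sender_sp & static_cast<std::uint64_t>(-16);
}

// The two formulations use the same mask.
static_assert(static_cast<std::uint64_t>(-16) == ~std::uint64_t{15},
              "-16 and ~15 are the same 64-bit mask");
```

An already aligned value passes through unchanged, so the masking is harmless when the sender SP happens to be aligned.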
From ed at camswl.com Wed May 28 09:08:55 2014 From: ed at camswl.com (ed at camswl.com) Date: Wed, 28 May 2014 09:08:55 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Restore sp from sender sp, r13 in crc32 code Message-ID: <201405280908.s4S98wAl025376@aojmv0008> Changeset: fc99103df98d Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-28 10:08 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/fc99103df98d Restore sp from sender sp, r13 in crc32 code ! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp From aph at redhat.com Thu May 29 17:24:07 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 29 May 2014 18:24:07 +0100 Subject: [aarch64-port-dev ] Don't use any form of _call_VM_leaf when we're calling a stub Message-ID: <53876D37.2080506@redhat.com> The CRC32 patch broke the builtin simulator. It was calling an (AArch64 code) stub routine via blrt, which is really only for runtime code. Fixed thusly. Andrew. # HG changeset patch # User aph # Date 1401381523 -3600 # Thu May 29 17:38:43 2014 +0100 # Node ID 79225ea063f38fab53c57211175d2588218c7871 # Parent fc99103df98d5697bb3008aafb8242d6937e3425 Don't use any form of _call_VM_leaf when we're calling a stub. Jump directly to the stub after adjusting the stack. diff -r fc99103df98d -r 79225ea063f3 src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp --- a/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Wed May 28 10:08:48 2014 +0100 +++ b/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Thu May 29 17:38:43 2014 +0100 @@ -775,11 +775,10 @@ // Can now load 'len' since we're finished with 'off' __ ldrw(len, Address(esp, 0x0)); // Length - __ mov(rscratch1, lr); // saved by call_VM_leaf - __ super_call_VM_leaf(CAST_FROM_FN_PTR(address, StubRoutines::updateBytesCRC32()), crc, buf, len); + __ andr(sp, r13, -16); // Restore the caller's SP - __ andr(sp, r13, -16); - __ ret(rscratch1); + // We are frameless so we can just jump to the stub.
+ __ b(CAST_FROM_FN_PTR(address, StubRoutines::updateBytesCRC32())); // generate a vanilla native entry as the slow path __ bind(slow_path); @@ -1215,7 +1214,7 @@ // not properly paired (was bug - gri 11/22/99). __ notify_method_exit(vtos, InterpreterMacroAssembler::NotifyJVMTI); - // restore potential result in edx:eax, call result handler to + // restore potential result in r0:d0, call result handler to // restore potential result in ST0 & handle result __ pop(ltos); From aph at redhat.com Thu May 29 17:25:42 2014 From: aph at redhat.com (Andrew Haley) Date: Thu, 29 May 2014 18:25:42 +0100 Subject: [aarch64-port-dev ] Fix a tonne of bogus comments Message-ID: <53876D96.4070809@redhat.com> These are mostly left-over x86isms. Andrew. # HG changeset patch # User aph # Date 1401381560 -3600 # Thu May 29 17:39:20 2014 +0100 # Node ID 02139cd80d48b9c6c30302e2a5f543ef9bc4e53e # Parent 79225ea063f38fab53c57211175d2588218c7871 Fix a tonne of bogus comments. diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/aarch64_call.cpp --- a/src/cpu/aarch64/vm/aarch64_call.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64_call.cpp Thu May 29 17:39:20 2014 +0100 @@ -180,7 +180,7 @@ default: break; case MacroAssembler::ret_type_integral: - // this overwrites the saved rax + // this overwrites the saved r0 *return_slot = sim->getCPUState().xreg(R0, 0); break; case MacroAssembler::ret_type_float: diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp --- a/src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -209,7 +209,7 @@ __ bl(RuntimeAddress(Runtime1::entry_for(_stub_id))); ce->add_call_info_here(_info); ce->verify_oop_map(_info); - assert(_result->as_register() == r0, "result must in rax,"); + assert(_result->as_register() == r0, "result must in r0,"); __ b(_continuation); } diff -r 79225ea063f3 -r 02139cd80d48 
src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -378,7 +378,7 @@ int offset = code_offset(); - // the exception oop and pc are in rax, and rdx + // the exception oop and pc are in r0, and r3 // no other registers need to be preserved, so invalidate them __ invalidate_registers(false, true, true, false, true, true); @@ -2073,7 +2073,7 @@ add_call_info(pc_for_athrow_offset, info); // for exception handler __ verify_not_null_oop(r0); - // search an exception handler (rax: exception oop, rdx: throwing pc) + // search an exception handler (r0: exception oop, r3: throwing pc) if (compilation()->has_fpu_code()) { unwind_id = Runtime1::handle_exception_id; } else { diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp --- a/src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp Thu May 29 17:39:20 2014 +0100 @@ -54,7 +54,7 @@ Register result); // locking - // hdr : must be rax, contents destroyed + // hdr : must be r0, contents destroyed // obj : must point to the object to lock, contents preserved // disp_hdr: must point to the displaced header location, contents preserved // scratch : scratch register, contents destroyed @@ -64,7 +64,7 @@ // unlocking // hdr : contents destroyed // obj : must point to the object to lock, contents preserved - // disp_hdr: must be eax & must point to the displaced header location, contents destroyed + // disp_hdr: must be r0 & must point to the displaced header location, contents destroyed void unlock_object(Register swap, Register obj, Register lock, Label& slow_case); void initialize_object( @@ -79,7 +79,7 @@ // allocation of fixed-size objects // (can also be used to allocate fixed-size arrays, by setting // hdr_size correctly and storing the array length 
afterwards) - // obj : must be rax, will contain pointer to allocated object + // obj : will contain pointer to allocated object // t1, t2 : scratch registers - contents destroyed // header_size: size of object header in words // object_size: total size of object in words @@ -91,7 +91,7 @@ }; // allocation of arrays - // obj : must be rax, will contain pointer to allocated object + // obj : will contain pointer to allocated object // len : array length in number of elements // t : scratch register - contents destroyed // header_size: size of object header in words diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp --- a/src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -850,7 +850,7 @@ // refilling the TLAB or allocating directly from eden. Label retry_tlab, try_eden; const Register thread = - __ tlab_refill(retry_tlab, try_eden, slow_path); // preserves rbx & rdx, returns rdi + __ tlab_refill(retry_tlab, try_eden, slow_path); // preserves r19 & r3, returns rthread __ bind(retry_tlab); @@ -945,7 +945,7 @@ oop_maps->add_gc_map(call_offset, map); restore_live_registers_except_r0(sasm); - // rax,: new multi array + // r0,: new multi array __ verify_oop(r0); } break; diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/compiledIC_aarch64.cpp --- a/src/cpu/aarch64/vm/compiledIC_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/compiledIC_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -81,8 +81,8 @@ void CompiledStaticCall::emit_to_interp_stub(CodeBuffer &cbuf, address mark) { // Stub is fixed up when the corresponding call is converted from // calling compiled code to calling interpreted code. - // movq rbx, 0 - // jmp -5 # to self + // movq rmethod, 0 + // jmp -4 # to self // address mark = cbuf.insts_mark(); // Get mark within main instrs section. 
diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/interp_masm_aarch64.cpp --- a/src/cpu/aarch64/vm/interp_masm_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/interp_masm_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -263,7 +263,7 @@ profile_typecheck(r2, Rsub_klass, r5); // blows r2, reloads r5 // Do the check. - check_klass_subtype(Rsub_klass, r0, r2, ok_is_subtype); // blows rcx + check_klass_subtype(Rsub_klass, r0, r2, ok_is_subtype); // blows r2 // Profile the failure of the check. profile_typecheck_failed(r2); // blows r2 @@ -721,7 +721,7 @@ save_bcp(); // Save in case of exception // Convert from BasicObjectLock structure to object and BasicLock - // structure Store the BasicLock address into %rax + // structure Store the BasicLock address into %r0 lea(swap_reg, Address(lock_reg, BasicObjectLock::lock_offset_in_bytes())); // Load oop into obj_reg(%c_rarg3) diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/interp_masm_aarch64.hpp --- a/src/cpu/aarch64/vm/interp_masm_aarch64.hpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/interp_masm_aarch64.hpp Thu May 29 17:39:20 2014 +0100 @@ -166,14 +166,14 @@ // Dispatching void dispatch_prolog(TosState state, int step = 0); void dispatch_epilog(TosState state, int step = 0); - // dispatch via ebx (assume ebx is loaded already) + // dispatch via rscratch1 void dispatch_only(TosState state); - // dispatch normal table via ebx (assume ebx is loaded already) + // dispatch normal table via rscratch1 (assume rscratch1 is loaded already) void dispatch_only_normal(TosState state); void dispatch_only_noverify(TosState state); - // load ebx from [esi + step] and dispatch via ebx + // load rscratch1 from [rbcp + step] and dispatch via rscratch1 void dispatch_next(TosState state, int step = 0); - // load ebx from [esi] and dispatch via ebx and table + // load rscratch1 from [esi] and dispatch via rscratch1 and table void dispatch_via (TosState state, address* table); // jump to an invoked 
target diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -1933,9 +1933,6 @@ BytecodeCounter::print(); } #endif - // To see where a verify_oop failed, get $ebx+40/X for this frame. - // XXX correct this offset for amd64 - // This is the value of eip which points to where verify_oop will return. if (os::message_box(msg, "Execution stopped, print registers?")) { ttyLocker ttyl; tty->print_cr(" pc = 0x%016lx", pc); diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/macroAssembler_aarch64.hpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Thu May 29 17:39:20 2014 +0100 @@ -103,7 +103,7 @@ // Biased locking support // lock_reg and obj_reg must be loaded up with the appropriate values. - // swap_reg must be rax, and is killed. + // swap_reg is killed. // tmp_reg is optional. If it is supplied (i.e., != noreg) it will // be killed; if not supplied, push/pop will be used internally to // allocate a temporary (inefficient, avoid if possible). @@ -765,88 +765,6 @@ void int3(); #endif - // currently unimplemented -#if 0 - // Long operation macros for a 32bit cpu - // Long negation for Java - void lneg(Register hi, Register lo); - - // Long multiplication for Java - // (destroys contents of eax, ebx, ecx and edx) - void lmul(int x_rsp_offset, int y_rsp_offset); // rdx:rax = x * y - - // Long shifts for Java - // (semantics as described in JVM spec.) - void lshl(Register hi, Register lo); // hi:lo << (rcx & 0x3f) - void lshr(Register hi, Register lo, bool sign_extension = false); // hi:lo >> (rcx & 0x3f) - - // Long compare for Java - // (semantics as described in JVM spec.) 
- void lcmp2int(Register x_hi, Register x_lo, Register y_hi, Register y_lo); // x_hi = lcmp(x, y) - - - // misc - - // Sign extension - void sign_extend_short(Register reg); - void sign_extend_byte(Register reg); - - // Division by power of 2, rounding towards 0 - void division_with_shift(Register reg, int shift_value); -#endif - - // unimpelements -#if 0 - // Compares the top-most stack entries on the FPU stack and sets the eflags as follows: - // - // CF (corresponds to C0) if x < y - // PF (corresponds to C2) if unordered - // ZF (corresponds to C3) if x = y - // - // The arguments are in reversed order on the stack (i.e., top of stack is first argument). - // tmp is a temporary register, if none is available use noreg (only matters for non-P6 code) - void fcmp(Register tmp); - // Variant of the above which allows y to be further down the stack - // and which only pops x and y if specified. If pop_right is - // specified then pop_left must also be specified. - void fcmp(Register tmp, int index, bool pop_left, bool pop_right); - - // Floating-point comparison for Java - // Compares the top-most stack entries on the FPU stack and stores the result in dst. - // The arguments are in reversed order on the stack (i.e., top of stack is first argument). - // (semantics as described in JVM spec.) - void fcmp2int(Register dst, bool unordered_is_less); - // Variant of the above which allows y to be further down the stack - // and which only pops x and y if specified. If pop_right is - // specified then pop_left must also be specified. 
- void fcmp2int(Register dst, bool unordered_is_less, int index, bool pop_left, bool pop_right); - - // Floating-point remainder for Java (ST0 = ST0 fremr ST1, ST1 is empty afterwards) - // tmp is a temporary register, if none is available use noreg - void fremr(Register tmp); - - - // Inlined sin/cos generator for Java; must not use CPU instruction - // directly on Intel as it does not have high enough precision - // outside of the range [-pi/4, pi/4]. Extra argument indicate the - // number of FPU stack slots in use; all but the topmost will - // require saving if a slow case is necessary. Assumes argument is - // on FP TOS; result is on FP TOS. No cpu registers are changed by - // this code. - void trigfunc(char trig, int num_fpu_regs_in_use = 1); - - // branch to L if FPU flag C2 is set/not set - // tmp is a temporary register, if none is available use noreg - void jC2 (Register tmp, Label& L); - void jnC2(Register tmp, Label& L); - - void push_IU_state(); - void pop_IU_state(); - - void push_FPU_state(); - void pop_FPU_state(); -#endif - void push_CPU_state(); void pop_CPU_state() ; @@ -1011,33 +929,6 @@ // Support for serializing memory accesses between threads void serialize_memory(Register thread, Register tmp); - // unimplemented -#if 0 - void verify_tlab(); - - // Biased locking support - // lock_reg and obj_reg must be loaded up with the appropriate values. - // swap_reg must be rax, and is killed. - // tmp_reg is optional. If it is supplied (i.e., != noreg) it will - // be killed; if not supplied, push/pop will be used internally to - // allocate a temporary (inefficient, avoid if possible). - // Optional slow case is for implementations (interpreter and C1) which branch to - // slow case directly. Leaves condition codes set for C2's Fast_Lock node. - // Returns offset of first potentially-faulting instruction for null - // check info (currently consumed only by C1). 
If - // swap_reg_contains_mark is true then returns -1 as it is assumed - // the calling code has already passed any potential faults. - int biased_locking_enter(Register lock_reg, Register obj_reg, - Register swap_reg, Register tmp_reg, - bool swap_reg_contains_mark, - Label& done, Label* slow_case = NULL, - BiasedLockingCounters* counters = NULL); - void biased_locking_exit (Register obj_reg, Register temp_reg, Label& done); - - - Condition negate_condition(Condition cond); -#endif - // Arithmetics void addptr(Address dst, int32_t src) { diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/methodHandles_aarch64.cpp --- a/src/cpu/aarch64/vm/methodHandles_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/methodHandles_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -262,7 +262,7 @@ // temps used in this code are not used in *either* compiled or interpreted calling sequences Register temp1 = r10; Register temp2 = r11; - Register temp3 = r14; // r13 is live ty this point: it contains the sender SP + Register temp3 = r14; // r13 is live by this point: it contains the sender SP if (for_compiler_entry) { assert(receiver_reg == (iid == vmIntrinsics::_linkToStatic ? noreg : j_rarg0), "only valid assignment"); assert_different_registers(temp1, j_rarg0, j_rarg1, j_rarg2, j_rarg3, j_rarg4, j_rarg5, j_rarg6, j_rarg7); @@ -331,7 +331,7 @@ // Live registers at this point: // member_reg - MemberName that was the trailing argument // temp1_recv_klass - klass of stacked receiver, if needed - // rsi/r13 - interpreter linkage (if interpreted) ??? FIXME + // r13 - interpreter linkage (if interpreted) ??? FIXME // r1 ... r0 - compiler arguments (if compiled) Label L_incompatible_class_change_error; @@ -416,7 +416,7 @@ break; } - // live at this point: rmethod, rsi/r13 (if interpreted) + // live at this point: rmethod, r13 (if interpreted) // After figuring out which concrete method to call, jump into it. 
// Note that this works in the interpreter with no data motion. diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp --- a/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -479,7 +479,7 @@ // stack pointer. It also recalculates and aligns sp. // A c2i adapter is frameless because the *callee* frame, which is - // interpreted, routinely repairs its caller's es (from sender_sp, + // interpreted, routinely repairs its caller's sp (from sender_sp, // which is set up via the senderSP register). // In other words, if *either* the caller or callee is interpreted, we can @@ -702,7 +702,7 @@ AArch64Simulator *sim = NULL; size_t len = 65536; if (NotifySimulator) { - name = new char[len]; + name = NEW_C_HEAP_ARRAY(char, len, mtInternal); } if (name) { @@ -757,7 +757,7 @@ name[0] = 'c'; name[2] = 'i'; sim->notifyCompile(name, c2i_entry); - delete[] name; + FREE_C_HEAP_ARRAY(char, name, mtInternal); } #endif @@ -1608,9 +1608,6 @@ // Mark location of rfp (someday) // map->set_callee_saved(VMRegImpl::stack2reg( stack_slots - 2), stack_slots * 2, 0, vmreg(rfp)); - // Use eax, ebx as temporaries during any memory-memory moves we have to do - // All inbound args are referenced based on rfp and all outbound args via sp. - int float_args = 0; int int_args = 0; @@ -1959,9 +1956,6 @@ // Don't use call_VM as it will see a possible pending exception and forward it // and never return here preventing us from clearing _last_native_pc down below. - // Also can't use call_VM_leaf either as it will check to see if rsi & rdi are - // preserved and correspond to the bcp/locals pointers. So we do a runtime call - // by hand. 
// save_native_result(masm, ret_type, stack_slots); __ mov(c_rarg0, rthread); @@ -2887,7 +2881,7 @@ oop_maps->add_gc_map( __ offset() - start, map); - // rax contains the address we are going to jump to assuming no exception got installed + // r0 contains the address we are going to jump to assuming no exception got installed // clear last_Java_sp __ reset_last_Java_frame(false, true); @@ -2990,7 +2984,6 @@ // Store exception in Thread object. We cannot pass any arguments to the // handle_exception call, since we do not want to make any assumption // about the size of the frame where the exception happened in. - // c_rarg0 is either rdi (Linux) or rcx (Windows). __ str(r0, Address(rthread, JavaThread::exception_oop_offset())); __ str(r3, Address(rthread, JavaThread::exception_pc_offset())); diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/stubGenerator_aarch64.cpp --- a/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -456,7 +456,7 @@ // not the case if the callee is compiled code => need to setup the // rsp. // - // rax: exception oop + // r0: exception oop // NOTE: this is used as a target from the signal handler so it // needs an x86 prolog which returns into the current simulator @@ -850,21 +850,6 @@ void array_overlap_test(Label& L_no_overlap, Address::sxtw sf) { __ b(L_no_overlap); } void array_overlap_test(address no_overlap_target, Label* NOLp, int sf) { Unimplemented(); } - // Shuffle first three arg regs on Windows into Linux/Solaris locations. - // - // Outputs: - // rdi - rcx - // rsi - rdx - // rdx - r8 - // rcx - r9 - // - // Registers r9 and r10 are used to save rdi and rsi on Windows, which latter - // are non-volatile. r9 and r10 should not be used by the caller. 
- // - void setup_arg_regs(int nargs = 3) { Unimplemented(); } - - void restore_arg_regs() { Unimplemented(); } - // Generate code for an array write pre barrier // // addr - starting address @@ -1796,8 +1781,8 @@ // rsp+40 - element count (32-bits) // // Output: - // rax == 0 - success - // rax == -1^K - failure, where K is partial transfer count + // r0 == 0 - success + // r0 == -1^K - failure, where K is partial transfer count // address generate_generic_copy(const char *name, address byte_copy_entry, address short_copy_entry, @@ -1952,8 +1937,12 @@ * c_rarg1 - byte* buf * c_rarg2 - int length * - * Ouput: - * rax - int crc result + * Output: + * r0 - int crc result + * + * Preserves: + * r13 + * */ address generate_updateBytesCRC32() { assert(UseCRC32Intrinsics, "what are we doing here?"); diff -r 79225ea063f3 -r 02139cd80d48 src/cpu/aarch64/vm/templateTable_aarch64.cpp --- a/src/cpu/aarch64/vm/templateTable_aarch64.cpp Thu May 29 17:38:43 2014 +0100 +++ b/src/cpu/aarch64/vm/templateTable_aarch64.cpp Thu May 29 17:39:20 2014 +0100 @@ -2724,7 +2724,7 @@ // access constant pool cache __ get_cache_and_index_at_bcp(r2, r1, 1); - // test for volatile with rdx + // test for volatile with r3 __ ldrw(r3, Address(r2, in_bytes(base + ConstantPoolCacheEntry::flags_offset()))); @@ -3188,7 +3188,7 @@ // r0: CallSite object (from cpool->resolved_references[]) // rmethod: MH.linkToCallSite method (from f2) - // Note: rax_callsite is already pushed by prepare_invoke + // Note: r0_callsite is already pushed by prepare_invoke // %%% should make a type profile for any invokedynamic that takes a ref argument // profile this call @@ -3657,7 +3657,6 @@ __ should_not_reach_here(); // call run-time routine - // rsi: points to monitor entry __ bind(found); __ push_ptr(r0); // make sure object is on stack (contract with oopMaps) __ unlock_object(c_rarg1); From aph at redhat.com Thu May 29 17:26:07 2014 From: aph at redhat.com (aph at redhat.com) Date: Thu, 29 May 2014 17:26:07 +0000 
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: 2 new changesets Message-ID: <201405291726.s4THQ9cc016545@aojmv0008> Changeset: 79225ea063f3 Author: aph Date: 2014-05-29 17:38 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/79225ea063f3 Don't use any form of _call_VM_leaf when we're calling a stub. Jump directly to the stub after adjusting the stack. ! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Changeset: 02139cd80d48 Author: aph Date: 2014-05-29 17:39 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/02139cd80d48 Fix a tonne of bogus comments. ! src/cpu/aarch64/vm/aarch64_call.cpp ! src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp ! src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp ! src/cpu/aarch64/vm/compiledIC_aarch64.cpp ! src/cpu/aarch64/vm/interp_masm_aarch64.cpp ! src/cpu/aarch64/vm/interp_masm_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/methodHandles_aarch64.cpp ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp ! src/cpu/aarch64/vm/templateTable_aarch64.cpp From edward.nevill at linaro.org Fri May 30 07:54:05 2014 From: edward.nevill at linaro.org (Edward Nevill) Date: Fri, 30 May 2014 08:54:05 +0100 Subject: [aarch64-port-dev ] Don't use any form of _call_VM_leaf when we're calling a stub In-Reply-To: <53876D37.2080506@redhat.com> References: <53876D37.2080506@redhat.com> Message-ID: <1401436445.21304.31.camel@localhost.localdomain> oops, sorry, I should have done a smoke test on the builtin sim, Ed. On Thu, 2014-05-29 at 18:24 +0100, Andrew Haley wrote: > The CRC32 patch broke the builtin simulator. It was calling a > (AArch64 code) stub routine via blrt, which is really only for runtime > code. Fixed thusly. > > Andrew.
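[Editorial note] The CRC32 intrinsic referred to in this thread is described by its author, in the RFR mail on this list, as a table-driven routine that uses four lookup tables. For readers unfamiliar with that approach, the following is a minimal sketch of the general "slicing-by-4" technique in portable C++. It is not the AArch64 stub code from the patch: the names `Crc32Tables`, `make_crc32_tables` and `crc32_slice4` are hypothetical, the sketch consumes 4 bytes per step rather than the 16 bytes per iteration reported for the patch, and it assumes the standard reflected CRC-32 polynomial 0xEDB88320.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: four 256-entry tables for the reflected
// CRC-32 polynomial 0xEDB88320 (an assumption; the patch itself is not shown here).
using Crc32Tables = std::array<std::array<std::uint32_t, 256>, 4>;

static Crc32Tables make_crc32_tables() {
  Crc32Tables t{};
  // Table 0 is the classic byte-at-a-time table.
  for (std::uint32_t i = 0; i < 256; i++) {
    std::uint32_t c = i;
    for (int k = 0; k < 8; k++) {
      c = (c & 1u) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
    }
    t[0][i] = c;
  }
  // Table j advances the value from table j-1 by one further zero byte.
  for (std::uint32_t i = 0; i < 256; i++) {
    for (int j = 1; j < 4; j++) {
      t[j][i] = (t[j - 1][i] >> 8) ^ t[0][t[j - 1][i] & 0xFFu];
    }
  }
  return t;
}

// Hypothetical entry point: updates a running CRC over len bytes of buf.
// Pass crc = 0 for the first chunk and the previous result for later chunks.
static std::uint32_t crc32_slice4(std::uint32_t crc, const unsigned char* buf,
                                  std::size_t len) {
  static const Crc32Tables t = make_crc32_tables();
  crc = ~crc;
  // Main loop: fold 4 input bytes into the CRC with four table lookups.
  while (len >= 4) {
    // Assemble the word byte by byte so the result does not depend on host endianness.
    crc ^= static_cast<std::uint32_t>(buf[0]) |
           (static_cast<std::uint32_t>(buf[1]) << 8) |
           (static_cast<std::uint32_t>(buf[2]) << 16) |
           (static_cast<std::uint32_t>(buf[3]) << 24);
    crc = t[3][crc & 0xFFu] ^ t[2][(crc >> 8) & 0xFFu] ^
          t[1][(crc >> 16) & 0xFFu] ^ t[0][crc >> 24];
    buf += 4;
    len -= 4;
  }
  // Tail: fall back to one byte at a time for the remaining 0 to 3 bytes.
  while (len-- > 0) {
    crc = (crc >> 8) ^ t[0][(crc ^ *buf++) & 0xFFu];
  }
  return ~crc;
}
```

A caller would start with `crc32_slice4(0, data, n)` and feed the returned value back in for subsequent chunks. Using more tables (for example eight) trades extra table memory for fewer dependent steps per byte, which is the trade-off the RFR mail alludes to.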
From aph at redhat.com Fri May 30 10:34:11 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 30 May 2014 10:34:11 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/hotspot: 2 new changesets Message-ID: <201405301034.s4UAYEsR016953@aojmv0008> Changeset: 61d3642d6ed3 Author: aph Date: 2014-05-29 05:51 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/61d3642d6ed3 DSB is unnecessary here. ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Changeset: 99788e00cc4b Author: aph Date: 2014-05-29 05:53 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/99788e00cc4b Implement various locked memory operations. ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp From aph at redhat.com Fri May 30 10:39:46 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 30 May 2014 10:39:46 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: 2 new changesets Message-ID: <201405301039.s4UAdmtQ017990@aojmv0008> Changeset: a80e7c1b07ad Author: aph Date: 2014-05-29 13:27 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/a80e7c1b07ad Delete useless instruction. ! src/cpu/aarch64/vm/aarch64.ad Changeset: a4a33014c25d Author: aph Date: 2014-05-29 13:27 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/a4a33014c25d Merge From aph at redhat.com Fri May 30 10:42:21 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 30 May 2014 10:42:21 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/hotspot: 2 new changesets Message-ID: <201405301042.s4UAgOXv018480@aojmv0008> Changeset: 5046dff011d9 Author: aph Date: 2014-05-15 07:37 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/5046dff011d9 Correct costs for operations with shifts. ! src/cpu/aarch64/vm/aarch64.ad ! 
src/cpu/aarch64/vm/aarch64_ad.m4 Changeset: 56eef46546fd Author: aph Date: 2014-05-15 08:15 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/56eef46546fd Correct OptoAssembly for prologs and epilogs. ! src/cpu/aarch64/vm/aarch64.ad From aph at redhat.com Fri May 30 10:56:59 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 30 May 2014 11:56:59 +0100 Subject: [aarch64-port-dev ] RFR: JDK8: Add support for CRC32 intrinsic In-Reply-To: <1400839342.14801.20.camel@localhost.localdomain> References: <1400839342.14801.20.camel@localhost.localdomain> Message-ID: <538863FB.4050101@redhat.com> On 05/23/2014 11:02 AM, Edward Nevill wrote: > Hi, > > The following patch adds support for CRC32 intrinsic. > > The patch is a non neon patch. IE. It uses only the base aarch64 instruction set. Patch for neon to follow. > > Even without neon it gets 4.5 x improvement on my test case > > http://people.linaro.org/~edward.nevill/crc32/CRCTest.java > > As the patch is quite big (38K) I have put a copy of the patch @ > > http://people.linaro.org/~edward.nevill/crc32/crc32.patch > > which may be easier to apply if anyone wishes to try this out. > > The algorithm uses 4 x tables and handles 16 bytes (1 ldp worth) per iteration. I experimented doing 32 bytes per loop but I could not measure the difference so I left it at 16. There are also algorithms that use 8 tables (Google slice by 8) but I think the returns from this over the simpler by 4 algorithm are minimal. > > The guts of the algorithm are in kernel_crc32 if anyone is interested! Please make sure this goes into JDK9. Andrew. From aph at redhat.com Fri May 30 11:00:34 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 30 May 2014 11:00:34 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Implement various locked memory operations. 
Message-ID: <201405301100.s4UB0a5R021361@aojmv0008> Changeset: 72b29bfe67fa Author: aph Date: 2014-05-29 05:53 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/72b29bfe67fa Implement various locked memory operations. ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp From aph at redhat.com Fri May 30 11:34:30 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 30 May 2014 11:34:30 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk9/hotspot: Fix a tonne of bogus comments. Message-ID: <201405301134.s4UBYWQZ026610@aojmv0008> Changeset: 815ebdcd571e Author: aph Date: 2014-05-30 07:08 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk9/hotspot/rev/815ebdcd571e Fix a tonne of bogus comments. ! src/cpu/aarch64/vm/aarch64_call.cpp ! src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp ! src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp ! src/cpu/aarch64/vm/compiledIC_aarch64.cpp ! src/cpu/aarch64/vm/interp_masm_aarch64.cpp ! src/cpu/aarch64/vm/interp_masm_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/methodHandles_aarch64.cpp ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp ! src/cpu/aarch64/vm/templateTable_aarch64.cpp From aph at redhat.com Fri May 30 12:12:17 2014 From: aph at redhat.com (Andrew Haley) Date: Fri, 30 May 2014 13:12:17 +0100 Subject: [aarch64-port-dev ] AArch64 process Message-ID: <538875A1.8060101@redhat.com> I managed to confuse myself completely by checking some patches into JDK8 and some into JDK9. I think it's straightened out now. I think we need a rule, so here it is: all patches that are not JDK9 specific should go into JDK8 then JDK9. As before, no changes to shared code are allowed into JDK9 except under special circumstances. 
JDK7u should use the same HotSpot as JDK8. Andrew. From aph at redhat.com Fri May 30 15:54:22 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 30 May 2014 15:54:22 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk7u/hotspot: 38 new changesets Message-ID: <201405301554.s4UFsrLv005986@aojmv0008> Changeset: 378b010e4b60 Author: aph Date: 2014-03-26 06:38 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/378b010e4b60 C1: Fix offset overflow when profiling. ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Changeset: f2658ddb105c Author: aph Date: 2014-03-27 08:02 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/f2658ddb105c Offsets in lookupswitch instructions should be signed. ! src/cpu/aarch64/vm/templateTable_aarch64.cpp Changeset: e176eb39c5f5 Author: aph Date: 2014-03-31 10:20 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/e176eb39c5f5 Remove special-case handling of division arguments. AArch64 doesn't need it. ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: 780ed75ea21a Author: aph Date: 2014-04-01 12:22 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/780ed75ea21a Remove unnecessary memory barriers around CAS operations ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: 273f8f0e7109 Author: Edward Nevill edward.nevill at linaro.org Date: 2014-04-02 11:41 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/273f8f0e7109 Preserve callee save FP registers around call to java code ! src/cpu/aarch64/vm/frame_aarch64.hpp ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Changeset: 5a8c184c37d4 Author: Edward Nevill edward.nevill at linaro.org Date: 2014-04-03 22:51 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/5a8c184c37d4 Use gcc __clear_cache instead of doing it ourselves ! src/cpu/aarch64/vm/icache_aarch64.cpp ! src/cpu/aarch64/vm/icache_aarch64.hpp ! 
src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Changeset: a16c651450e4 Author: aph Date: 2014-04-08 14:58 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/a16c651450e4 New cost model for instruction selection. ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/aarch64_ad.m4 Changeset: d9468835bc51 Author: aph Date: 2014-04-10 06:50 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/d9468835bc51 Rewrite CAS operations to be more conservative ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Changeset: 4c3b20781d5d Author: aph Date: 2014-04-22 18:54 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/4c3b20781d5d Use an explicit set of registers rather than a bitmap for psh and pop operations. ! src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp ! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp ! src/cpu/aarch64/vm/interp_masm_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/methodHandles_aarch64.cpp ! src/cpu/aarch64/vm/register_aarch64.hpp ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp ! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Changeset: 563e44ab11a3 Author: aph Date: 2014-04-23 09:26 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/563e44ab11a3 Add a constructor as a conversion from Register - RegSet. Use it. ! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp ! src/cpu/aarch64/vm/register_aarch64.hpp Changeset: ef2aa7fd06f3 Author: Edward Nevill edward.nevill at linaro.org Date: 2014-04-24 10:43 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/ef2aa7fd06f3 Fix biased locking and enable as default ! src/cpu/aarch64/vm/globals_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp ! src/cpu/zero/vm/globals_zero.hpp ! 
src/share/vm/runtime/globals.hpp Changeset: 9d641fdeea4d Author: Edward Nevill edward.nevill at linaro.org Date: 2014-04-29 14:58 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/9d641fdeea4d Minor optimisation for divide by 2 ! src/cpu/aarch64/vm/aarch64.ad Changeset: f67f9b1b52ae Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-01 14:57 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/f67f9b1b52ae Fix instruction size from 8 to 4 ! src/cpu/aarch64/vm/nativeInst_aarch64.hpp Changeset: 8a569467b81b Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-07 16:41 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/8a569467b81b Improvements to safepoint polling ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/c1_globals_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: 99180a14ca07 Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-12 13:39 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/99180a14ca07 Optimise C2 entry point verification ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Changeset: 6523308f9626 Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-12 13:41 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/6523308f9626 Make code entry alignment 64 for C2 ! src/cpu/aarch64/vm/globals_aarch64.hpp Changeset: 0ca397cbac95 Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-13 15:15 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/0ca397cbac95 Stop spurious O_BUFLEN warnings ! src/share/vm/runtime/globals.cpp Changeset: 1fcabae0e46f Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-13 16:09 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/1fcabae0e46f Optimise long divide by 2 ! 
src/cpu/aarch64/vm/aarch64.ad Changeset: ac30fdebd5f5 Author: aph Date: 2014-05-12 14:34 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/ac30fdebd5f5 Fix opto assembly for shifts. ! src/cpu/aarch64/vm/aarch64.ad Changeset: 3852a506a19b Author: aph Date: 2014-05-12 16:26 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/3852a506a19b Tidy up stack frame handling. ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: 92cd832e8f78 Author: aph Date: 2014-05-13 15:57 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/92cd832e8f78 Improve code generation for pop(), as suggested by Edward Nevill. ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: a1b63a9c0d1f Author: aph Date: 2014-05-13 16:28 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/a1b63a9c0d1f Add RegSet::operator+=. ! src/cpu/aarch64/vm/register_aarch64.hpp Changeset: 4d1f5e7d102c Author: aph Date: 2014-05-13 16:49 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/4d1f5e7d102c Tidy up register usage in push/pop instructions. ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/os_cpu/linux_aarch64/vm/assembler_linux_aarch64.cpp Changeset: 202a78c1caef Author: aph Date: 2014-05-12 11:28 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/202a78c1caef Merge ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: a7c6a42da087 Author: aph Date: 2014-05-13 11:51 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/a7c6a42da087 Merge ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: e7b46e8cc544 Author: aph Date: 2014-05-13 17:06 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/e7b46e8cc544 Merge ! 
src/cpu/aarch64/vm/aarch64.ad Changeset: 639009aad87b Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-13 20:22 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/639009aad87b Optimise addressing of card table byte map base ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/relocInfo_aarch64.cpp Changeset: 9d3bc0f40cce Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-14 15:43 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/9d3bc0f40cce Backout 6713:0ca397cbac95 ! src/share/vm/runtime/globals.cpp Changeset: a2e9ac7b3434 Author: aph Date: 2014-05-15 07:37 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/a2e9ac7b3434 Correct costs for operations with shifts. ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/aarch64_ad.m4 Changeset: b8ec31c74e2d Author: aph Date: 2014-05-15 08:15 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/b8ec31c74e2d Correct OptoAssembly for prologs and epilogs. ! src/cpu/aarch64/vm/aarch64.ad Changeset: 14bba87e055e Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-24 20:31 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/14bba87e055e Add support for CRC32 intrinsic ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp ! src/cpu/aarch64/vm/c1_LIRGenerator_aarch64.cpp ! src/cpu/aarch64/vm/interpreterGenerator_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp ! src/cpu/aarch64/vm/stubRoutines_aarch64.cpp ! src/cpu/aarch64/vm/stubRoutines_aarch64.hpp ! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp ! 
src/cpu/aarch64/vm/vm_version_aarch64.cpp Changeset: fc99103df98d Author: Edward Nevill edward.nevill at linaro.org Date: 2014-05-28 10:08 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/fc99103df98d Restore sp from sender sp, r13 in crc32 code ! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Changeset: 79225ea063f3 Author: aph Date: 2014-05-29 17:38 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/79225ea063f3 Don't use any form of _call_VM_leaf when we're calling a stub. Jump directly to the stub after adjusting the stack. ! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Changeset: 02139cd80d48 Author: aph Date: 2014-05-29 17:39 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/02139cd80d48 Fix a tonne of bogus comments. ! src/cpu/aarch64/vm/aarch64_call.cpp ! src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp ! src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp ! src/cpu/aarch64/vm/compiledIC_aarch64.cpp ! src/cpu/aarch64/vm/interp_masm_aarch64.cpp ! src/cpu/aarch64/vm/interp_masm_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/methodHandles_aarch64.cpp ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp ! src/cpu/aarch64/vm/templateTable_aarch64.cpp Changeset: a80e7c1b07ad Author: aph Date: 2014-05-29 13:27 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/a80e7c1b07ad Delete useless instruction. ! src/cpu/aarch64/vm/aarch64.ad Changeset: a4a33014c25d Author: aph Date: 2014-05-29 13:27 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/a4a33014c25d Merge Changeset: 72b29bfe67fa Author: aph Date: 2014-05-29 05:53 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/72b29bfe67fa Implement various locked memory operations. ! 
src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Changeset: d17532dbc6a7 Author: aph Date: 2014-05-30 11:45 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/d17532dbc6a7 Merge From aph at redhat.com Fri May 30 15:59:08 2014 From: aph at redhat.com (aph at redhat.com) Date: Fri, 30 May 2014 15:59:08 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk7u/hotspot: Added tag jdk7u60_b04_aarch64_832 for changeset d17532dbc6a7 Message-ID: <201405301559.s4UFx9Le006547@aojmv0008> Changeset: f6121012b666 Author: aph Date: 2014-05-30 11:58 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk7u/hotspot/rev/f6121012b666 Added tag jdk7u60_b04_aarch64_832 for changeset d17532dbc6a7 ! .hgtags