From edward.nevill at gmail.com Mon Feb 1 20:33:57 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Mon, 01 Feb 2016 20:33:57 +0000 Subject: [aarch64-port-dev ] RFR: 8148783: aarch64: SEGV running SpecJBB2013 Message-ID: <1454358837.11463.14.camel@mint> Hi, Please review the following webrev http://cr.openjdk.java.net/~enevill/8148783/webrev.0/ JIRA Issue: https://bugs.openjdk.java.net/browse/JDK-8148783 The bug is explained in some detail in the JIRA issue. The problem is that the sign is not preserved in the following code from adrp(...) long offset = dest_page - pc_page; offset = (offset & ((1<<20)-1)) << 12; This generally works because the following movk overwrites bits 32..47 However on larger memory systems of 256 Gb it could happen that the PC address was 0x0000ffffXXXXXXXX in which case the falsely positive offset could wrap to 0x00010000XXXXXXXX Bit 48 does not get overwritten by the following movk, hence forming an invalid address. The solution is to use int32_t for offset instead of long, so it gets sign extended correctly when added to the pc(). All the best, Ed. From aph at redhat.com Tue Feb 2 15:16:13 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 2 Feb 2016 15:16:13 +0000 Subject: [aarch64-port-dev ] RFR: 8148783: aarch64: SEGV running SpecJBB2013 In-Reply-To: <1454358837.11463.14.camel@mint> References: <1454358837.11463.14.camel@mint> Message-ID: <56B0C83D.900@redhat.com> Hi, On 02/01/2016 08:33 PM, Edward Nevill wrote: > JIRA Issue: https://bugs.openjdk.java.net/browse/JDK-8148783 > > The bug is explained in some detail in the JIRA issue. > > The problem is that the sign is not preserved in the following code > from adrp(...) 
> > long offset = dest_page - pc_page; > offset = (offset & ((1<<20)-1)) << 12; > > This generally works because the following movk overwrites bits 32..47 > > However on larger memory systems of 256 Gb it could happen that the > PC address was > > 0x0000ffffXXXXXXXX > > in which case the falsely positive offset could wrap to > > 0x00010000XXXXXXXX > > Bit 48 does not get overwritten by the following movk, hence forming > an invalid address. > > The solution is to use int32_t for offset instead of long, so it > gets sign extended correctly when added to the pc(). I can't accept that patch because the overflowing assignment from long to int32_t is undefined behaviour. It is also very obscure code. Can you test the patch I've appended instead? It tiptoes around the UB and should be OK. Thanks, Andrew. diff --git a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp @@ -3980,6 +3980,14 @@ return inst_mark(); } +int64_t MacroAssembler::truncate_signed_bitfield(int64_t n, int width) { + // Left shifts of a signed integer are UB in Standard C++ but + // well-defined in GNU C++. + n <<= 64 - width; + n >>= 64 - width; + return n; +} + void MacroAssembler::adrp(Register reg1, const Address &dest, unsigned long &byte_offset) { relocInfo::relocType rtype = dest.rspec().reloc()->type(); unsigned long low_page = (unsigned long)CodeCache::low_bound() >> 12; @@ -3999,8 +4007,18 @@ _adrp(reg1, dest.target()); } else { unsigned long pc_page = (unsigned long)pc() >> 12; - long offset = dest_page - pc_page; - offset = (offset & ((1<<20)-1)) << 12; + unsigned long page_offset = dest_page - pc_page; + + // The signed offset (in 4k pages) from PC to dest page. We use a + // reference in order to avoid UB when converting from unsigned to + // signed. 
+    long offset = reinterpret_cast<long&>(page_offset);
+
+    // The signed offset (in bytes) from the PC to the destination
+    // page. We only want the 32 LSBs of the offset because the range
+    // of ADRP is +-2G, i.e. 32 bits.
+    offset = truncate_signed_bitfield(offset << 12, 32);
+
     _adrp(reg1, pc()+offset);
     movk(reg1, (unsigned long)dest.target() >> 32, 32);
   }
diff --git a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
@@ -85,9 +85,10 @@
   void call_VM_helper(Register oop_result, address entry_point, int number_of_arguments, bool check_exceptions = true);
- // Maximum size of class area in Metaspace when compressed
   uint64_t use_XOR_for_compressed_class_base;
+  int64_t truncate_signed_bitfield(int64_t n, int width);
+
  public:
   MacroAssembler(CodeBuffer* code) : Assembler(code) {
     use_XOR_for_compressed_class_base

From edward.nevill at gmail.com Wed Feb 3 11:45:52 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 03 Feb 2016 11:45:52 +0000
Subject: [aarch64-port-dev ] RFR: 8148948: aarch64: generate_copy_longs calls align() incorrectly
Message-ID: <1454499952.2021.7.camel@mylittlepony.linaroharston>

Hi,

Please review the following:

http://cr.openjdk.java.net/~enevill/8148948/webrev.0/

JIRA: https://bugs.openjdk.java.net/browse/JDK-8148948

The issue is that there are align statements of the form

  __ align(6)

in generate_copy_longs() whereas the correct alignment statement should be

  __ align(64)

In the proposed webrev I have changed the statements to

  __ align(CodeEntryAlignment);

However, CodeEntryAlignment is set to 16 for C1 and 64 for C2. I can see no reason why this is the case, so I am also proposing changing CodeEntryAlignment to 64 for both C1 & C2.

Other arches set CodeEntryAlignment as follows

sparc: 32
ppc: 128
x86: 32 (c2), 16 (c1)

Thanks,
Ed.
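
For illustration only, here is a minimal C sketch of why align(6) and align(64) differ. It assumes that align(n) pads with 4-byte nops until the current code offset is a multiple of n, which is what the message above implies. The names aligned_offset and NOP_BYTES are hypothetical and do not come from the webrev.

```c
#include <stdint.h>

/* Assumed size of one padding instruction (an AArch64 nop). */
enum { NOP_BYTES = 4 };

/*
 * Model of an align(modulus) that emits nops until the code offset is a
 * multiple of modulus.  Returns the padded offset, or UINT64_MAX when the
 * inputs make no sense (zero modulus, offset not instruction-aligned).
 */
uint64_t aligned_offset(uint64_t offset, uint64_t modulus)
{
    if (modulus == 0 || offset % NOP_BYTES != 0)
        return UINT64_MAX;
    /* Pad one nop at a time until the offset is a multiple of modulus. */
    while (offset % modulus != 0)
        offset += NOP_BYTES;
    return offset;
}
```

Under these assumptions, starting from offset 100, a modulus of 6 stops at 108, which is not 64-byte aligned, whereas a modulus of 64 stops at 128.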
From aph at redhat.com Wed Feb 3 11:48:48 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 3 Feb 2016 11:48:48 +0000 Subject: [aarch64-port-dev ] RFR: 8148948: aarch64: generate_copy_longs calls align() incorrectly In-Reply-To: <1454499952.2021.7.camel@mylittlepony.linaroharston> References: <1454499952.2021.7.camel@mylittlepony.linaroharston> Message-ID: <56B1E920.3050705@redhat.com> On 03/02/16 11:45, Edward Nevill wrote: > Hi, > > Please review the following: > > http://cr.openjdk.java.net/~enevill/8148948/webrev.0/ Yes, OK. Andrew. From edward.nevill at gmail.com Thu Feb 4 16:46:07 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Thu, 04 Feb 2016 16:46:07 +0000 Subject: [aarch64-port-dev ] RFR: 8148783: aarch64: SEGV running SpecJBB2013 In-Reply-To: <56B0C83D.900@redhat.com> References: <1454358837.11463.14.camel@mint> <56B0C83D.900@redhat.com> Message-ID: <1454604367.22510.28.camel@mylittlepony.linaroharston> On Tue, 2016-02-02 at 15:16 +0000, Andrew Haley wrote: > Hi, > > On 02/01/2016 08:33 PM, Edward Nevill wrote: > > > JIRA Issue: https://bugs.openjdk.java.net/browse/JDK-8148783 > Can you test the patch I've appended instead? It tiptoes around the UB > and should be OK. Hi, Unfortunately this still fails. I have written a small simulacrum of the problem in C below. The following is the output. ed at arm64:~/tmp/adrp$ ./adrp original_adrp: pc = 0xffff70000000, dest = 0xfffe00000000, offset = 0x90000000, addr = 0x1000000000000 original_adrp: pc = 0xfffffffff000, dest = 0xfffe00000000, offset = 0x1000, addr = 0x1000000000000 new_adrp: pc = 0xffff70000000, dest = 0xfffe00000000, offset = 0xffffffff90000000, addr = 0xffff00000000 new_adrp: pc = 0xfffffffff000, dest = 0xfffe00000000, offset = 0x1000, addr = 0x1000000000000 <<<<< HERE bit 48 set The original generated an invalid address in both cases (where offset is +ve and -ve). 
The new version generates the correct output when the offset is -ve; however, a +ve offset still generates an address with bit 48 set.

A second problem is the following code in pd_patch_instruction

      // movk #imm16<<32
      Instruction_aarch64::patch(branch + 4, 20, 5, (uint64_t)target >> 32);
      offset &= (1<<20)-1;
      instructions = 2;

This is essentially doing the same thing as the original adrp, so even when the original adrp got the instruction correct the subsequent patching broke it again.

I have attached a new webrev which fixes both these issues in a much simpler manner.

http://cr.openjdk.java.net/~enevill/8148783/webrev.2

The key is to construct the instructions exactly as we are using them. When we use an adrp/movk combination to construct a 48 bit address we are using the adrp to construct the bottom 32 bits (with the bottom 12 bits zero) and the movk to construct bits 32..47, overwriting any values the adrp may have put in bits 32..47.

So the instruction sequence is

  adrp Xn, 0xXXXXAAAAA000
  movk Xn, 0xAAAA00000000

Where A represents required bits of the address and XXXX represent don't care bits.

The only requirement on the XXXX bits is that they must be reachable using the adrp instruction.

The webrev ensures this by using bits 32..47 from the PC and bits 0..31 from the destination address. The fact that we use the XXXX bits from the PC ensures the requirement that the address is reachable, and using only the bottom 32 bits of the dest ensures we only get the bits we actually want the adrp instruction to construct and not any extraneous bits in bits 48 etc.

The code that does this is

  unsigned long adrp_target = (target & 0xffffffffUL) | (source & 0xffff00000000UL);

and this is also reflected in pd_patch_instruction to calculate the adrp target there.

All the best,
Ed.
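
The adrp/movk construction described above can be modelled in a few lines of C. This is only a sketch: model_adrp, model_movk32 and form_page_address are hypothetical helpers, and the adrp model simply returns the 4k page of its operand, which assumes the operand is within reach of the pc (the property that taking bits 32..47 from the pc is meant to guarantee).

```c
#include <stdint.h>

/* Address handed to adrp: bits 0..31 from the target, bits 32..47 from the pc. */
uint64_t adrp_target(uint64_t pc, uint64_t target)
{
    return (target & 0xffffffffULL) | (pc & 0xffff00000000ULL);
}

/* Model of adrp: the register receives the 4k page address of the operand. */
uint64_t model_adrp(uint64_t addr)
{
    return addr & ~0xfffULL;
}

/* Model of movk with a 32 bit shift: replaces bits 32..47, keeps all others. */
uint64_t model_movk32(uint64_t reg, uint64_t imm16)
{
    return (reg & ~(0xffffULL << 32)) | ((imm16 & 0xffffULL) << 32);
}

/* The two-instruction sequence: adrp on the mixed address, then movk. */
uint64_t form_page_address(uint64_t pc, uint64_t target)
{
    uint64_t reg = model_adrp(adrp_target(pc, target));
    return model_movk32(reg, target >> 32);
}
```

With the two pc values used in adrp.c (0xffff70000000 and 0xfffffffff000) and dest 0xfffe00000000, this model yields 0xfffe00000000 in both cases, so bit 48 is never set.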
--- adrp.c ---
#include <stdio.h>

void original_adrp(unsigned long pc, unsigned long dest)
{
  unsigned long dest_page = dest >> 12;
  unsigned long pc_page = pc >> 12;
  long offset = dest_page - pc_page;
  offset = (offset & ((1<<20)-1)) << 12;
  printf("original_adrp: pc = 0x%lx, dest = 0x%lx, offset = 0x%lx, addr = 0x%lx\n", pc, dest, offset, pc+offset);
}

long truncate_signed_bitfield(long n, int width)
{
  // Left shifts of a signed integer are UB in Standard C++ but
  // well-defined in GNU C++.
  n <<= 64 - width;
  n >>= 64 - width;
  return n;
}

void new_adrp(unsigned long pc, unsigned long dest)
{
  unsigned long dest_page = dest >> 12;
  unsigned long pc_page = pc >> 12;
  unsigned long page_offset = dest_page - pc_page;
  long offset = page_offset;
  offset = truncate_signed_bitfield(offset << 12, 32);
  printf("new_adrp: pc = 0x%lx, dest = 0x%lx, offset = 0x%lx, addr = 0x%lx\n", pc, dest, offset, pc+offset);
}

int main(void)
{
  original_adrp(0x0000ffff70000000, 0x0000fffe00000000);
  original_adrp(0x0000fffffffff000, 0x0000fffe00000000);
  new_adrp(0x0000ffff70000000, 0x0000fffe00000000);
  new_adrp(0x0000fffffffff000, 0x0000fffe00000000);
  return 0;
}
--- cut here ---

From aph at redhat.com Thu Feb 4 17:02:05 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 4 Feb 2016 17:02:05 +0000
Subject: [aarch64-port-dev ] RFR: 8148783: aarch64: SEGV running SpecJBB2013
In-Reply-To: <1454604367.22510.28.camel@mylittlepony.linaroharston>
References: <1454358837.11463.14.camel@mint> <56B0C83D.900@redhat.com> <1454604367.22510.28.camel@mylittlepony.linaroharston>
Message-ID: <56B3840D.9060301@redhat.com>

On 02/04/2016 04:46 PM, Edward Nevill wrote:
>
> The webrev ensures this by using bits 32..47 from the PC and bits
> 0..31 from the destination address.
> The fact that we use the XXXX
> bits from the PC ensures the requirement that the address is
> reachable and using only the bottom 32 bits of the dest ensures we
> only get the bits we actually want the adrp instruction to construct
> and not any extraneous bits in bits 48 etc.
>
> The code that does this is
>
> unsigned long adrp_target = (target & 0xffffffffUL) | (source & 0xffff00000000UL);
>
> and this is also reflected in pd_patch_instruction to calculate the adrp target there.

Much better, but this still is confusing. Surely you can do

    unsigned long target = (unsigned long)dest.target();
    unsigned long adrp_target
      = (target & 0xffffffffUL) | ((unsigned long)pc() & 0xffff00000000UL);

    _adrp(reg1, (address)adrp_target);
    movk(reg1, target >> 32, 32);
  }

"source" doesn't really mean anything here.

OK with that change.

Andrew.

From hui.shi at linaro.org Fri Feb 5 12:47:35 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Fri, 5 Feb 2016 20:47:35 +0800
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code
Message-ID:

Hi,

Would someone help review this changeset? This improves performance for code such as string builder and concat on aarch64.

Bug: https://bugs.openjdk.java.net/browse/JDK-8149080
webrev: http://cr.openjdk.java.net/~hshi/8149080/webrev/

Arraycopy without overlapping is faster than overlapped copy. If overlap information is unknown at JIT time, the stub code checks at runtime whether the arraycopy src and dest arrays overlap; if they do not, the stub performs the faster non-overlapping array copy.

In the current aarch64 implementation, the stub code only checks whether dest is below src. This doesn't cover the case where dest is above src but the arrays still do not overlap (a case that x86 does handle). The fix is to check both conditions: if (dest-src) is above or equal to the copy size, there is no overlap and the stub code can jump to the non-overlapping copy.
Another modification is adding StubCodeMark for backward/forward copy longs on aarch64, so code in these sections can get profiled with correct stub name. Regards Hui From aph at redhat.com Fri Feb 5 12:58:55 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 5 Feb 2016 12:58:55 +0000 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: References: Message-ID: <56B49C8F.5030409@redhat.com> On 02/05/2016 12:47 PM, Hui Shi wrote: > Arraycopy without overlapping is faster than overlapped copy. The only thing which varies is the direction of copying. I'm not aware of anything which makes one direction faster than the other. Measurements, please. Andrew. From edward.nevill at gmail.com Fri Feb 5 14:32:41 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Fri, 05 Feb 2016 14:32:41 +0000 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: <56B49C8F.5030409@redhat.com> References: <56B49C8F.5030409@redhat.com> Message-ID: <1454682761.26562.19.camel@mint> On Fri, 2016-02-05 at 12:58 +0000, Andrew Haley wrote: > On 02/05/2016 12:47 PM, Hui Shi wrote: > > Arraycopy without overlapping is faster than overlapped copy. > > The only thing which varies is the direction of copying. I'm not > aware of anything which makes one direction faster than the other. > Measurements, please. Copy backwards doesn't prefetch. The difference with and without prefetch can be very significant on some micro-arches. if (direction == copy_forwards && PrefetchCopyIntervalInBytes > 0) __ prfm(Address(s, PrefetchCopyIntervalInBytes), PLDL1KEEP); I have done some experiments with prefetch enabled for backwards copy and it shows almost identical performance to forwards copy. Regards, Ed. 
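
To make the two points in this thread concrete, here is a small C sketch, not the stub generator itself. It assumes a GCC or Clang compiler for __builtin_prefetch; PREFETCH_INTERVAL_BYTES is a hypothetical stand-in for PrefetchCopyIntervalInBytes, and forward_copy_is_safe and copy_longs are made-up names. The first function is the single unsigned compare that covers both "dest below src" and "dest at least the copy size above src"; the second prefetches ahead of the source pointer in whichever direction it copies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PrefetchCopyIntervalInBytes. */
enum { PREFETCH_INTERVAL_BYTES = 256 };

/*
 * Nonzero when a forward copy cannot overwrite source bytes that have not
 * been read yet.  If dst is below src the unsigned subtraction wraps to a
 * very large value, so one compare covers that case as well as the case
 * where dst is at least `bytes` above src.
 */
int forward_copy_is_safe(const void *dst, const void *src, size_t bytes)
{
    return (uintptr_t)dst - (uintptr_t)src >= bytes;
}

/* Copy count 64-bit words, choosing the direction and prefetching the source. */
void copy_longs(int64_t *dst, const int64_t *src, size_t count)
{
    if (forward_copy_is_safe(dst, src, count * sizeof(int64_t))) {
        for (size_t i = 0; i < count; i++) {
            /* Hint at the source data a fixed distance ahead of the read. */
            uintptr_t ahead = (uintptr_t)&src[i] + PREFETCH_INTERVAL_BYTES;
            __builtin_prefetch((const void *)ahead, 0, 3);
            dst[i] = src[i];
        }
    } else {
        for (size_t i = count; i-- > 0; ) {
            /* Backward copy: the data needed next lies at lower addresses. */
            uintptr_t behind = (uintptr_t)&src[i] - PREFETCH_INTERVAL_BYTES;
            __builtin_prefetch((const void *)behind, 0, 3);
            dst[i] = src[i];
        }
    }
}
```

A real copy loop would issue one prefetch per block rather than one per word; the per-word hint here only keeps the sketch short.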
From aph at redhat.com Fri Feb 5 14:37:46 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 5 Feb 2016 14:37:46 +0000 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: <1454682761.26562.19.camel@mint> References: <56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint> Message-ID: <56B4B3BA.602@redhat.com> On 02/05/2016 02:32 PM, Edward Nevill wrote: > On Fri, 2016-02-05 at 12:58 +0000, Andrew Haley wrote: >> On 02/05/2016 12:47 PM, Hui Shi wrote: >>> Arraycopy without overlapping is faster than overlapped copy. >> >> The only thing which varies is the direction of copying. I'm not >> aware of anything which makes one direction faster than the other. >> Measurements, please. > > Copy backwards doesn't prefetch. The difference with and without > prefetch can be very significant on some micro-arches. > > if (direction == copy_forwards && PrefetchCopyIntervalInBytes > 0) > __ prfm(Address(s, PrefetchCopyIntervalInBytes), PLDL1KEEP); > > I have done some experiments with prefetch enabled for backwards copy > and it shows almost identical performance to forwards copy. OK, so let's do that, then. Andrew. From hui.shi at linaro.org Sat Feb 6 11:52:19 2016 From: hui.shi at linaro.org (Hui Shi) Date: Sat, 6 Feb 2016 19:52:19 +0800 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: <56B4B3BA.602@redhat.com> References: <56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint> <56B4B3BA.602@redhat.com> Message-ID: Thanks Andrew and Edward! Code sequence for backward and forward array copy is almost same except prefetch. Performance test is based on http://cr.openjdk.java.net/~hshi/8149080/testcase/StringConcatTest.java run with "java StringConcatTest 5000". 
I tried disabling prefetch and comparing performance between backward and forward array copy (all forward with my patch; all backward forced by commenting out the branch to the nooverlap target and forcing jshort_disjoint_copy to generate a conjoint copy). Forward array copy is much faster than backward: backward takes about 85s and forward about 60s.

This test tries to reflect common cases like string builder/buffer append and string concatenation. These are disjoint array copies, and forward array copy is better than backward in the following two aspects:

1. Forward array copy can prefetch dest address needed in next string append.

Most string append/concatenation operations append chars after previously appended char arrays.
For example, str = str1 + str2 + str3
   1. When str1 is appended in forward order, the result value array (str.value) is prefetched beyond str1's length by the hardware prefetcher.
   2. When str2.value is stored into str.value, str.value is already prefetched, so there are fewer cache misses when copying str2.value into str.value.
If the copy is done in backward order, then after copying str1.value into str.value it is the addresses before str.value[0] that get prefetched, which is not useful for the next append.

Checking the following PMU events on A57 (http://cr.openjdk.java.net/~hshi/8149080/testcase/backward.perf, http://cr.openjdk.java.net/~hshi/8149080/testcase/forward.perf), forward array copy has a more accurate hardware prefetcher result (more of the issued requests are used). Comparing forward copy with and without the prefetch instruction shows no performance difference; the hardware prefetcher might be good enough.

0x167 Level 2 prefetcher request used (or demanded)
0x168 Level 2 prefetcher request issued

In forward array copy 94% of issued requests are used (r167/r168).
In backward array copy 67% of issued requests are used.

http://cr.openjdk.java.net/~hshi/8149080/testcase/DiscreteCopy.java tests array copy that is not in append mode; each array copy is performed on a separate address.
Run with "java DiscreteCopy 3000" Forward copy takes 58s and backward array copy takes 70s. Gap is decreased. 2. Backward array copy might cause much more unaligned memory access in string append/concatenation. Current array copy implementation is: 1. peel source array address for 16 bytes alignment (backward will perform peel from source end) copy 8,4,2,1 bytes 2. perform copy_longs 3. tail copy less than 16 bytes, copy 8,4,2,1 bytes In string append/concatenation cases, source string value array is usually 8 bytes or 16 bytes align. Suppose source address is 16 byte align and size is n*16+14;. With forward array copy: n ld/st pair, then store 8 bytes align, then store 4 bytes align, then store 2 bytes align. With backward array copy need peel source end address first (checking copy_memory_small): store 8 bytes unaligned, store 4 bytes unaligned, store 2 bytes aligned, n ld/st pair. Perform unaligned access profiling with perf on DiscreteCopy, massive unaligned access for backward array copy, while not found for forward array copy. http://cr.openjdk.java.net/~hshi/8149080/testcase/AlignedDiscreteCopy.java testing array copy with 16 bytes aligned size, performance is identical for backward and forward array copy, both are about 64s. Perform forward array copy when possible will not make things worse and benefit common cases like string append/concatenation. This is the original logic when generate conjoint array copy, this patch complete this logic by recognize all disjoint array copy. Does this make sense? Regards Hui On 5 February 2016 at 22:37, Andrew Haley wrote: > On 02/05/2016 02:32 PM, Edward Nevill wrote: > > On Fri, 2016-02-05 at 12:58 +0000, Andrew Haley wrote: > >> On 02/05/2016 12:47 PM, Hui Shi wrote: > >>> Arraycopy without overlapping is faster than overlapped copy. > >> > >> The only thing which varies is the direction of copying. I'm not > >> aware of anything which makes one direction faster than the other. > >> Measurements, please. 
> >
> > Copy backwards doesn't prefetch. The difference with and without
> > prefetch can be very significant on some micro-arches.
> >
> > if (direction == copy_forwards && PrefetchCopyIntervalInBytes > 0)
> >   __ prfm(Address(s, PrefetchCopyIntervalInBytes), PLDL1KEEP);
> >
> > I have done some experiments with prefetch enabled for backwards copy
> > and it shows almost identical performance to forwards copy.
>
> OK, so let's do that, then.
>
> Andrew.
>
>
>

From hui.shi at linaro.org Sat Feb 6 12:24:03 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Sat, 6 Feb 2016 20:24:03 +0800
Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal support
Message-ID:

Hi All,

Would someone help review this patch for adding byte array equal support on aarch64?

bug: https://bugs.openjdk.java.net/browse/JDK-8149100
webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/

For http://cr.openjdk.java.net/~hshi/8149100/testcase/ArrayEqual.java, a debug build run fails with a "bad AD file" assertion on aarch64.

# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/matcher.cpp:1605
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/home/shihui/jdk9-hs-comp/hotspot/src/share/vm/opto/matcher.cpp:1605), pid=8501, tid=8746
# assert(false) failed: bad AD file
#

Debugging shows that the AryEqNode's encoding is StrIntrinsicNode::LL, which is not supported on aarch64 now.

1605 assert( false, "bad AD file" );
(gdb) p ((AryEqNode*)n)->encoding()
$1 = StrIntrinsicNode::LL
(gdb)

The fix is to add support for the array equal operation with StrIntrinsicNode::LL encoding, as Latin String compare might become important in JDK9 with the new String.

1. Add MacroAssembler::byte_arrays_equals to support the byte array equals check.
2. Add a new array_equalsB rule for when the AryEq encoding is StrIntrinsicNode::LL.

http://cr.openjdk.java.net/~hshi/8149100/testcase/byte_array_equals.asm shows newly generated assembly.
Before this patch a release build invokes the Array.equals method; with this patch there is a significant improvement on the ArrayEqual case.

time -p openjdk-9-internal.base/bin/java ArrayEqual
real 54.98
user 55.13

time -p openjdk-9-internal.byteEquals/bin/java ArrayEqual
real 28.59
user 28.62
sys 0.14

The following code sequence can be replaced with tbz (when the tst constant is exactly a power of two). This code sequence also exists in other places (MacroAssembler::char_arrays_equals, interpreter, etc). I would like to clean them all up together in a separate changeset.

  tst(cnt1, 0b10);
  br(EQ, TAIL01);

Regards
Hui

From aph at redhat.com Sun Feb 7 10:24:39 2016
From: aph at redhat.com (Andrew Haley)
Date: Sun, 7 Feb 2016 10:24:39 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code
In-Reply-To:
References: <56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint> <56B4B3BA.602@redhat.com>
Message-ID: <56B71B67.8080304@redhat.com>

On 06/02/16 11:52, Hui Shi wrote:

> Code sequence for backward and forward array copy is almost same
> except prefetch. Performance test is based on
> http://cr.openjdk.java.net/~hshi/8149080/testcase/StringConcatTest.java
> run with "java StringConcatTest 5000". I tried disabling prefetch
> and compare performance between backward and forward array copy (all
> forward with my patch, force all backward by commenting out branch
> to nooverlap target and force jshort_disjoint_copy generate conjoint
> copy), forward array copy is much faster than backward. backward is
> about 85s and forward copy is about 60s.
>
> This test is try to reflect common cases like string builder/buffer
> append and string concatenation, these are disjoint array copy and
> forward array copy is better than backward in following two aspects:

You're confusing me. String concatenation is a disjoint array copy.
Therefore it always copies forwards, does it not?

> 1. Forward array copy can prefetch dest address needed in next string
> append.
So can backwards array copy, surely. > 2. Backward array copy might cause much more unaligned memory access in > string append/concatenation. Okay, I see. That is fixable: we can make sure that there are no more misaligned accesses in either direction. > Perform forward array copy when possible will not make things worse and > benefit common cases like string append/concatenation. This is the original > logic when generate conjoint array copy, this patch complete this logic by > recognize all disjoint array copy. Does this make sense? Yes, but it's a kludge. I'd much rather fix backwards copies so that they were just as fast. If that's not possible then your patch may be acceptable, but I think we should first try to fix backwards copies. We should be able to fix this the *right way*, by using prefetch instructions and making sure copies are aligned where possible. When I did my testing misaligned fetches were quite fast, and it didn't seem worth the effort to fix it. But I'm really mystified by why String concatenation doesn't always use forward copies anyway. Andrew. From aph at redhat.com Sun Feb 7 10:35:10 2016 From: aph at redhat.com (Andrew Haley) Date: Sun, 7 Feb 2016 10:35:10 +0000 Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal support In-Reply-To: References: Message-ID: <56B71DDE.2040109@redhat.com> On 06/02/16 12:24, Hui Shi wrote: > Hi All, > > Would someone help review this patch for adding byte array equal support on > aarch64? > > bug: https://bugs.openjdk.java.net/browse/JDK-8149100 > webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/ Ok, thanks. > Following code sequence can be replaced with tbz (when tst has constant > exactly two?s n times value), these code sequence exist in other > places(MacroAssembler::char_arrays_equals, interpreter, etc). I would like > clean all together in another separate changeset. > tst(cnt1, 0b10); > br(EQ, TAIL01); Right. Andrew. 
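
As a side note on the tst/br versus tbz point, the condition under which the replacement is valid can be sketched in C. The helper names below (single_bit_index, tst_eq_taken, tbz_taken) are hypothetical and only model the branch decisions; they are not MacroAssembler functions.

```c
#include <stdint.h>

/* Returns the bit number if mask has exactly one bit set, otherwise -1. */
int single_bit_index(uint64_t mask)
{
    if (mask == 0 || (mask & (mask - 1)) != 0)
        return -1;
    int bit = 0;
    while ((mask >> bit) != 1)
        bit++;
    return bit;
}

/* Decision made by tst(reg, mask); br(EQ, L): taken when no masked bit is set. */
int tst_eq_taken(uint64_t reg, uint64_t mask)
{
    return (reg & mask) == 0;
}

/* Decision made by tbz(reg, bit, L): taken when that one bit is zero. */
int tbz_taken(uint64_t reg, int bit)
{
    return ((reg >> bit) & 1) == 0;
}
```

For the mask 0b10 used in the quoted sequence, single_bit_index gives bit 1, and the two branch decisions agree for every register value, which is why a single tbz on bit 1 can stand in for the pair.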
From edward.nevill at gmail.com Mon Feb 8 08:39:00 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Mon, 08 Feb 2016 08:39:00 +0000 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: References: <56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint> <56B4B3BA.602@redhat.com> Message-ID: <1454920740.26562.28.camel@mint> On Sat, 2016-02-06 at 19:52 +0800, Hui Shi wrote: > Code sequence for backward and forward array copy is almost same except prefetch. Performance test is based on http://cr.openjdk.java.net/~hshi/8149080/testcase/StringConcatTest.java run with "java StringConcatTest 5000". I tried disabling prefetch and compare performance between backward and forward array copy (all Hi, How did you disable the prefetch? Did you use -XX:PrefetchCopyIntervalInBytes=0? There is a bug/feature in vm_version_aarch64.cpp where it does FLAG_SET_DEFAULT(PrefetchCopyIntervalInBytes, 256); overwriting any previous value, whereas it should do if (FLAG_IS_DEFAULT(PrefetchCopyIntervalInBytes)) FLAG_SET_DEFAULT(PrefetchCopyIntervalInBytes, 256); > 1. Forward array copy can prefetch dest address needed in next string append. > > Most string append/concatenation operations will append chars after early appened char arrays. > For example, str = str1 + str2 + str3 > 1. when append str1 in forward order, result value array(str.value) will be prefetched beyond str1's length with hardware prefetcher > 2. when store str2.value into str.value, str.value is already prefetched, less cache miss when copy str2.value into str.value > If copy in backward order, after copy str1.value into str.value, it's address before str.value[0] get prefetched, this is not useful for next append. I assume you are talking about automatic hardware prefetching here since the SW implementation does not do any prefetching on the destination? 
In that case I can see how repeated forward copys may be more efficient for string concatenation. All the best, Ed. From hui.shi at linaro.org Mon Feb 8 11:55:34 2016 From: hui.shi at linaro.org (Hui Shi) Date: Mon, 8 Feb 2016 19:55:34 +0800 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: <56B71B67.8080304@redhat.com> References: <56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint> <56B4B3BA.602@redhat.com> <56B71B67.8080304@redhat.com> Message-ID: Thanks Andrew! > You're confusing me. String concatenation is a disjoint array copy. > Therefore it always copies forwards, does it not? > Yes, it would be better JIT can recognize this at compile time. Previous performance data is collected on Java8 (so copy is performed in short array copy, while in JDK9 it is byte array copy). I check both JDK8 and JDK9, both invoke Stub::jshort_arraycopy and Stub::jbyte_arraycopy. One reason might be JIT time determination is not important as there is run time check for disjoint array copy. > > 1. Forward array copy can prefetch dest address needed in next string > > append. > > So can backwards array copy, surely. > Could you please give more details about how backward array copy can also utilize hardware prefetcher in multiple string append case? > > > 2. Backward array copy might cause much more unaligned memory access in > > string append/concatenation. > > Okay, I see. That is fixable: we can make sure that there are no > more misaligned accesses in either direction. > > > Perform forward array copy when possible will not make things worse and > > benefit common cases like string append/concatenation. This is the > original > > logic when generate conjoint array copy, this patch complete this logic > by > > recognize all disjoint array copy. Does this make sense? > > Yes, but it's a kludge. I'd much rather fix backwards copies so that > they were just as fast. 
If that's not possible then your patch may be > acceptable, but I think we should first try to fix backwards copies. > We should be able to fix this the *right way*, by using prefetch > instructions and making sure copies are aligned where possible. When > I did my testing misaligned fetches were quite fast, and it didn't > seem worth the effort to fix it. > > I agree inserting prefetch in backward copy and make backward array copy more faster. For mis-aligned issue in backward array copy, we might copy in 1,2,4,8 order to make it align. > But I'm really mystified by why String concatenation doesn't always > use forward copies anyway. > > Andrew. > From hui.shi at linaro.org Mon Feb 8 11:58:15 2016 From: hui.shi at linaro.org (Hui Shi) Date: Mon, 8 Feb 2016 19:58:15 +0800 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: <1454920740.26562.28.camel@mint> References: <56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint> <56B4B3BA.602@redhat.com> <1454920740.26562.28.camel@mint> Message-ID: Thanks Edward! > > > How did you disable the prefetch? Did you use > -XX:PrefetchCopyIntervalInBytes=0? I disable prefetch by removing prefetch in stub code and rebuild. > > > > 1. Forward array copy can prefetch dest address needed in next string > append. > > > > Most string append/concatenation operations will append chars after > early appened char arrays. > > For example, str = str1 + str2 + str3 > > 1. when append str1 in forward order, result value array(str.value) > will be prefetched beyond str1's length with hardware prefetcher > > 2. when store str2.value into str.value, str.value is already > prefetched, less cache miss when copy str2.value into str.value > > If copy in backward order, after copy str1.value into str.value, it's > address before str.value[0] get prefetched, this is not useful for next > append. 
> > I assume you are talking about automatic hardware prefetching here since > the SW implementation does not do any prefetching on the destination? In > that case I can see how repeated forward copys may be more efficient for > string concatenation. > > Yes, it's the hardware prefetcher. perf profiling reports the hardware-generated prefetches issued and used, and the hardware prefetcher hit rate is much higher in forward array copy. Regards Hui From hui.shi at linaro.org Mon Feb 8 11:59:40 2016 From: hui.shi at linaro.org (Hui Shi) Date: Mon, 8 Feb 2016 19:59:40 +0800 Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal support In-Reply-To: <56B71DDE.2040109@redhat.com> References: <56B71DDE.2040109@redhat.com> Message-ID: Thanks Andrew! Could someone help push this change? Regards Hui On 7 February 2016 at 18:35, Andrew Haley wrote: > On 06/02/16 12:24, Hui Shi wrote: > > Hi All, > > > > Would someone help review this patch for adding byte array equal support on > > aarch64? > > > > bug: https://bugs.openjdk.java.net/browse/JDK-8149100 > > webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/ > > Ok, thanks. > > > Following code sequence can be replaced with tbz (when tst has a constant > > that is exactly a power of two), this code sequence exists in other > > places (MacroAssembler::char_arrays_equals, interpreter, etc). I would like > > to clean them all up together in another separate changeset. > > tst(cnt1, 0b10); > > br(EQ, TAIL01); > > Right. > > Andrew.
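[Editor's illustration, not part of the original thread.] The tst/br(EQ) versus tbz remark above can be modelled in plain Java. This is only a sketch of the bit logic, assuming the tst constant is a power of two; the class and method names are hypothetical and nothing here is HotSpot code.

```java
// Hypothetical model of the two AArch64 branch forms discussed above.
final class SingleBitTest {
    // tst(cnt, mask); br(EQ, L): the branch is taken when (cnt & mask) == 0.
    static boolean tstBranchTaken(long cnt, long mask) {
        return (cnt & mask) == 0;
    }

    // tbz(cnt, bit, L): the branch is taken when the given bit of cnt is zero.
    static boolean tbzBranchTaken(long cnt, int bit) {
        return ((cnt >>> bit) & 1L) == 0;
    }

    // The replacement is only valid when the mask has exactly one bit set.
    static boolean isPowerOfTwo(long mask) {
        return mask != 0 && (mask & (mask - 1)) == 0;
    }

    public static void main(String[] args) {
        long mask = 0b10;                              // the constant from tst(cnt1, 0b10)
        int bit = Long.numberOfTrailingZeros(mask);    // bit index 1
        boolean same = isPowerOfTwo(mask);
        for (long cnt = 0; cnt < 16; cnt++) {
            same &= tstBranchTaken(cnt, mask) == tbzBranchTaken(cnt, bit);
        }
        System.out.println("tst/br(EQ) and tbz agree: " + same);
    }
}
```

For a mask with more than one bit set (for example 0b11) no single tbz is equivalent, which is why the condition on the constant matters.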
> > > From aph at redhat.com Mon Feb 8 14:32:44 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 8 Feb 2016 14:32:44 +0000 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: References: <56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint> <56B4B3BA.602@redhat.com> <56B71B67.8080304@redhat.com> Message-ID: <56B8A70C.6050300@redhat.com> On 08/02/16 11:55, Hui Shi wrote: > Could you please give more details about how backward array copy can also > utilize hardware prefetcher in multiple string append case? I do not understand this question. There is AFAIK no "multiple string append case": string concatenation is done by a single array copy, and multiple concatenations are done as several array copies. Therefore, all we have to do is make sure that char/byte array copies are fast in both directions. And if the problem is prefetching, we know how to do that. Andrew. From edward.nevill at gmail.com Mon Feb 8 14:37:26 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Mon, 08 Feb 2016 14:37:26 +0000 Subject: [aarch64-port-dev ] RFR: 8149365: memory copy does not prefetch on backwards copy Message-ID: <1454942246.11423.18.camel@mint> Hi, The following webrev http://cr.openjdk.java.net/~enevill/8149365/webrev.0/ adds support for prefetch on backwards copies (previously prefetch was only done on forwards copies). It also fixes a 'feature' where the command line option -XX:PrefetchCopyIntervalInBytes=N is ignored and the value 256 always used instead. I have benchmarked it using the following test progam http://cr.openjdk.java.net/~enevill/8149365/ArrayCopyTest.java Which allows you to test memory copies of different sizes from a start size to an end size in step units. The test does both backwards and forwards copies. 
Usage: java ArrayCopyTest I have generated the results obtained before and after the above patch on 4 different partners HW (A,B,C,D) and a summary of the results is available at http://people.linaro.org/~edward.nevill/prefetch/prefetch.pdf For partner A I tested 3 ranges 0-64 bytes in units of 1 0-512 bytes in units of 8 0-4096 bytes in units of 64 The latter 2 clearly show the benefit of prefetching on backwards copies. For partners B, C & D, I only tested 0-4096 bytes in units of 64. I also tested B, C & D with -XX:PrefetchCopyIntervalInBytes=0. On these 3 partners disabling prefetch seemed to have no effect indicating that either prefetch is not implemented, or it implements automatic hardware prefetch. A summary of the results is that it improves performance significantly on partner A and has no effect on partners B,C & D. OK to push? Ed. From aph at redhat.com Mon Feb 8 14:57:23 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 8 Feb 2016 14:57:23 +0000 Subject: [aarch64-port-dev ] RFR: 8149365: memory copy does not prefetch on backwards copy In-Reply-To: <1454942246.11423.18.camel@mint> References: <1454942246.11423.18.camel@mint> Message-ID: <56B8ACD3.3040602@redhat.com> On 08/02/16 14:37, Edward Nevill wrote: > OK to push? OK, thanks. Andrew. From aph at redhat.com Tue Feb 9 13:54:25 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 9 Feb 2016 13:54:25 +0000 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: References: Message-ID: <56B9EF91.8000809@redhat.com> On 05/02/16 12:47, Hui Shi wrote: > Would some one help review this changeset? This improves performance for > codes like string builder and concat on aarch64. > Bug: https://bugs.openjdk.java.net/browse/JDK-8149080 > webrev: http://cr.openjdk.java.net/~hshi/8149080/webrev/ After some discussion with Edward Nevill, I am persuaded to accept this patch. 
While I'm not really happy that the backwards copy is so much slower than forwards, this patch is very low risk. I have checked the boundary conditions of (unsigned long)(d - s) >= (unsigned long)size and I'm convinced it's the correct test in this case. However, the comment // no overlap when (d-s) above_equal (count*size) is wrong. If d < s, unsigned(d-s) is >= (count*size) but the two strings may still overlap. This doesn't affect correctness in this case, because the forwards copy is the right one to use. Having said that, if someone changes nooverlap_target so that it is incorrect when copying overlapping arrays we'll have a problem. Thanks, Andrew, From edward.nevill at gmail.com Tue Feb 9 14:03:02 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Tue, 09 Feb 2016 14:03:02 +0000 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: <56B9EF91.8000809@redhat.com> References: <56B9EF91.8000809@redhat.com> Message-ID: <1455026582.32182.1.camel@mylittlepony.linaroharston> On Tue, 2016-02-09 at 13:54 +0000, Andrew Haley wrote: > On 05/02/16 12:47, Hui Shi wrote: > > > Would some one help review this changeset? This improves performance for > > codes like string builder and concat on aarch64. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8149080 > > webrev: http://cr.openjdk.java.net/~hshi/8149080/webrev/ > > However, the comment > > // no overlap when (d-s) above_equal (count*size) Shall I just change the comment to // use fwd copy when (d-s) above_equal (count*size) when I do the push? Regards, Ed. 
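[Editor's illustration, not part of the original thread.] The boundary condition Andrew describes, (unsigned long)(d - s) >= (unsigned long)size, can be checked with a small Java model. This is a sketch under the assumption that addresses are modelled as longs; the names CopyDirection, useForwardCopy and overlaps are hypothetical, and the real test is emitted by the AArch64 stub generator.

```java
// Hypothetical model of the copy-direction test; addresses are plain longs.
final class CopyDirection {
    // Forward copy is chosen when unsigned(d - s) >= unsigned(sizeInBytes).
    static boolean useForwardCopy(long s, long d, long sizeInBytes) {
        return Long.compareUnsigned(d - s, sizeInBytes) >= 0;
    }

    // True when [s, s+size) and [d, d+size) share at least one byte.
    static boolean overlaps(long s, long d, long sizeInBytes) {
        return s < d + sizeInBytes && d < s + sizeInBytes;
    }

    public static void main(String[] args) {
        long size = 16;
        // d at or beyond s + size: disjoint, forward copy is chosen.
        System.out.println(useForwardCopy(100, 116, size) + " " + overlaps(100, 116, size));
        // d below s: d - s wraps to a huge unsigned value, so forward copy is
        // chosen even though the ranges overlap. Forward is still the safe
        // direction here because the destination starts below the source.
        System.out.println(useForwardCopy(100, 92, size) + " " + overlaps(100, 92, size));
        // d slightly above s: overlapping, so a backward copy is required.
        System.out.println(useForwardCopy(100, 108, size) + " " + overlaps(100, 108, size));
    }
}
```

The second case is the one that makes the "no overlap" wording of the comment inaccurate: the test selects the forward copy, not disjointness.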
From aph at redhat.com Tue Feb 9 14:03:45 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 9 Feb 2016 14:03:45 +0000 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: <1455026582.32182.1.camel@mylittlepony.linaroharston> References: <56B9EF91.8000809@redhat.com> <1455026582.32182.1.camel@mylittlepony.linaroharston> Message-ID: <56B9F1C1.6010605@redhat.com> On 09/02/16 14:03, Edward Nevill wrote: >> However, the comment >> > >> > // no overlap when (d-s) above_equal (count*size) > Shall I just change the comment to > > // use fwd copy when (d-s) above_equal (count*size) > > when I do the push? OK, thanks. Andrew. From hui.shi at linaro.org Wed Feb 10 00:37:44 2016 From: hui.shi at linaro.org (Hui Shi) Date: Wed, 10 Feb 2016 08:37:44 +0800 Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code In-Reply-To: <56B9F1C1.6010605@redhat.com> References: <56B9EF91.8000809@redhat.com> <1455026582.32182.1.camel@mylittlepony.linaroharston> <56B9F1C1.6010605@redhat.com> Message-ID: Thanks Andrew and Edward! I will follow up with misaligned issue for 16 byte alignment peeling before copy longs. Regards Hui On 9 February 2016 at 22:03, Andrew Haley wrote: > On 09/02/16 14:03, Edward Nevill wrote: > >> However, the comment > >> > > >> > // no overlap when (d-s) above_equal (count*size) > > Shall I just change the comment to > > > > // use fwd copy when (d-s) above_equal (count*size) > > > > when I do the push? > > OK, thanks. > > Andrew. > > From aph at redhat.com Wed Feb 10 13:05:26 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 10 Feb 2016 13:05:26 +0000 Subject: [aarch64-port-dev ] Use jmh for benchmarks [Was: RFR(s): AArch64: 8149080: Recoginize disjoint array copy in stub code] In-Reply-To: References: Message-ID: <56BB3596.2070000@redhat.com> It's very important to use JMH for HotSpot benchmarks. 
Without JMH, it is very hard to tell if you're measuring the right thing. In order to help you get started, I've appended a JMH version of your benchmark. Run it with:

build/linux-aarch64-normal-server-release/jdk/bin/java -jar \
    jmh-samples/target/microbenchmarks.jar '.*JMHSample_96.*' -wi 5 -i 10 \
    -f 0

Andrew.
-----------------------------------------------------------------------
package org.openjdk.jmh.samples;

import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.concurrent.TimeUnit;
import java.nio.*;
import java.util.*;
import java.util.concurrent.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class JMHSample_96_StringAppend {

    @State(Scope.Benchmark)
    public static class BenchmarkState {
        final String[] strs = {
            "aoiod", // 5
            "adsdefrgda", // 10
            "dsadsadadsdsadiomjdas", // 20
            "djsadusahdusaufdoaaiffjdkdpjikl", // 30
            "dsudhusuhfudhaufhduahfduafhdkaffhdjafjdfa", // 40
            "dhsuafydagfydagfdafdajlkejwfjfuhfuafjhdahfldjksl90s", // 50
            "dsajufhdaufhdasuifhdasjkfndasjkfgbaduygbiafjioeawjfioiopjsdljl", // 60
            "dshaudshauidshauidhsiufhdasjklfdbnasjkvbauyvbdyargfwrheuifgeuijikalkjfds", // 70
            "nvfjsvnfusdbvfuyafbduyasfdsjkfhdjkasfhdjksafhdjksfhdjksfhasdjkncxsvnxcm,fdjklfjdkf", // 80
            "fdhuafdhasuifhdasuigbdjkbvcjksbdfhduasfhduasifhdasjkfhdasjkfhdjklasfoeurieoiruwiowurieoureik", // 90
            "dshfudahfduiashfduiasnvdjkvnuiarheuirheiodfhdjksafhuiheuiafheaskfdhjkasfhdjkashfdjkashfdjkasuipiuk890f", // 100
        };
    }

    @GenerateMicroBenchmark
    public StringBuilder doIt(BenchmarkState state) {
        StringBuilder strBuf = new StringBuilder();
        for (String s : state.strs) {
            strBuf.append(s);
        }
        return strBuf;
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(".*" + JMHSample_96_StringAppend.class.getSimpleName() + ".*")
                .warmupIterations(5)
                .measurementIterations(5)
                .forks(1)
                .build();
        new Runner(opt).run();
    }
}

From aph at redhat.com Wed Feb 10 15:41:13 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 10 Feb 2016 15:41:13 +0000 Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal support In-Reply-To: <56B71DDE.2040109@redhat.com> References: <56B71DDE.2040109@redhat.com> Message-ID: <56BB5A19.60001@redhat.com> On 02/07/2016 10:35 AM, Andrew Haley wrote: > On 06/02/16 12:24, Hui Shi wrote: >> Hi All, >> >> Would someone help review this patch for adding byte array equal support on >> aarch64? >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8149100 >> webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/ > > Ok, thanks. Having said that, I'm unhappy that this code is almost exactly the same as char_arrays_equals, seeming to have a slab of identical code. We've also got char_arrays_equals, which seems to do the same thing as string_equals. Andrew. From gnu.andrew at redhat.com Thu Feb 11 18:21:00 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Thu, 11 Feb 2016 13:21:00 -0500 (EST) Subject: [aarch64-port-dev ] Sync AArch64 8u JDK Repository with upstream 8u72 In-Reply-To: <2089452226.20004508.1455214656138.JavaMail.zimbra@redhat.com> Message-ID: <1530040291.20005017.1455214860067.JavaMail.zimbra@redhat.com> I did a comparison of the AArch64 jdk8u repository [0] with the upstream jdk8u72-b15 tag and found a couple of differences that could be resolved. 1. The change "8131105: Header Template for nroff man pages *.1 files contains errors" seems to have been reverted.
Re-applying this fixes the issue. 2. Two solutions for a libpng on ARM issue [1] seem to have been applied. Now that the upstream 8078245 version is present, we can revert the change to the libpng sources, keeping them pristine and making it easier to apply future upstream updates to them. Webrev: http://cr.openjdk.java.net/~andrew/aarch64-8/sync/jdk.webrev.01/webrev/ Ok to push? [0] http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/ [1] https://bugs.openjdk.java.net/browse/JDK-8078245 Thanks, -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From aph at redhat.com Fri Feb 12 09:50:43 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 12 Feb 2016 09:50:43 +0000 Subject: [aarch64-port-dev ] Sync AArch64 8u JDK Repository with upstream 8u72 In-Reply-To: <1530040291.20005017.1455214860067.JavaMail.zimbra@redhat.com> References: <1530040291.20005017.1455214860067.JavaMail.zimbra@redhat.com> Message-ID: <56BDAAF3.3020904@redhat.com> On 11/02/16 18:21, Andrew Hughes wrote: > Ok to push? OK, thanks. Sounds like 8u has been pretty quiet. Andrew. From adinn at redhat.com Fri Feb 12 10:42:15 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 12 Feb 2016 10:42:15 +0000 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> Message-ID: <56BDB707.6090409@redhat.com> Hi Roland, A patch for the AArch64 C2 volatile/CAS generation code which deals with the effects of your proposed C2 patch is available as a webrev http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/ The webrev includes your patch and mine and is based on the latest hs-comp. 
n.b. I have /not/ created a separate issue for the AArch64 part of this fix. I am not sure whether you want to combine it with your patch or push it as a separate stage. n.b. your patch allowed the AArch64 C2 code to be significantly simplified. That's because it ensures that the Raw memory flows associated with the GC card marks no longer intermingle with the AliasIdxBot and oop flows associated with the volatile store/CAS. This means the job of recognising the signature memory configuration between leading and trailing memory barriers is much easier. Testing: I have verified that this generates correct code for volatile put and CAS on AArch64 in all 5 relevant GC configurations: +UseG1GC +UseConcMarkSweepGC +CondCardMark +UseConcMarkSweepGC -CondCardMark +UseParallelGC +CondCardMark +UseParallelGC -CondCardMark A review from an AArch64 reviewer would be welcome. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From aph at redhat.com Fri Feb 12 10:49:31 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 12 Feb 2016 10:49:31 +0000 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <56BDB707.6090409@redhat.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> Message-ID: <56BDB8BB.9040305@redhat.com> On 12/02/16 10:42, Andrew Dinn wrote: > A review from an AArch64 reviewer would be welcome. Crikey. Well, it looks okay, but wow... :-) One question: if those code fails because of a different shape of ideal graph than it expects, all that happens is slightly suboptimal code, right? Andrew. 
From hui.shi at linaro.org Fri Feb 12 11:10:07 2016 From: hui.shi at linaro.org (Hui Shi) Date: Fri, 12 Feb 2016 19:10:07 +0800 Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal support In-Reply-To: <56BB5A19.60001@redhat.com> References: <56B71DDE.2040109@redhat.com> <56BB5A19.60001@redhat.com> Message-ID: Hi Andrew, Are you suggesting we should refactoring these similar code to make them share most part? Similar with handling different array copies? Regards Hui On 10 February 2016 at 23:41, Andrew Haley wrote: > On 02/07/2016 10:35 AM, Andrew Haley wrote: > > On 06/02/16 12:24, Hui Shi wrote: > >> Hi All, > >> > >> Would someone help review this patch for adding byte array equal > support on > >> aarch64? > >> > >> bug: https://bugs.openjdk.java.net/browse/JDK-8149100 > >> webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/ > > > > Ok, thanks. > > Having said that, I'm unhappy that this code is almost exactly the same > as char_arrays_equals, seeming to have a slab of identical code. We've > also got char_arrays_equals, which seems to do the same thing as > string_equals. > > Andrew. > > > From aph at redhat.com Fri Feb 12 11:18:24 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 12 Feb 2016 11:18:24 +0000 Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal support In-Reply-To: References: <56B71DDE.2040109@redhat.com> <56BB5A19.60001@redhat.com> Message-ID: <56BDBF80.6020903@redhat.com> On 12/02/16 11:10, Hui Shi wrote: > Are you suggesting we should refactoring these similar code to make them > share most part? Similar with handling different array copies? Yes, absolutely. There are almost no code differences. Code duplication of this kind has what we call a "bad smell". It is not necessarily wrong, and there may be a good reason for it, but it is always suspicious. Andrew. 
From hui.shi at linaro.org Fri Feb 12 11:21:39 2016 From: hui.shi at linaro.org (Hui Shi) Date: Fri, 12 Feb 2016 19:21:39 +0800 Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal support In-Reply-To: <56BDBF80.6020903@redhat.com> References: <56B71DDE.2040109@redhat.com> <56BB5A19.60001@redhat.com> <56BDBF80.6020903@redhat.com> Message-ID: You're right! Checking X86 implementation it use same arrays_equals implementation for all these operations, we can do this for AArch64 too. I'll create a new work item and follow up with this. Regards Hui On 12 February 2016 at 19:18, Andrew Haley wrote: > On 12/02/16 11:10, Hui Shi wrote: > > Are you suggesting we should refactoring these similar code to make > them > > share most part? Similar with handling different array copies? > > Yes, absolutely. There are almost no code differences. > > Code duplication of this kind has what we call a "bad smell". It is > not necessarily wrong, and there may be a good reason for it, but it > is always suspicious. > > Andrew. > From adinn at redhat.com Fri Feb 12 11:25:44 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 12 Feb 2016 11:25:44 +0000 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <56BDB8BB.9040305@redhat.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <56BDB8BB.9040305@redhat.com> Message-ID: <56BDC138.8050903@redhat.com> On 12/02/16 10:49, Andrew Haley wrote: > On 12/02/16 10:42, Andrew Dinn wrote: >> A review from an AArch64 reviewer would be welcome. > > Crikey. Well, it looks okay, but wow... :-) well, yes . . . wow! 
But then again, this is what we had to expect when we decided to rely on matching subgraph shapes in the back end -- a change to the details of how stores are generated in generic code will have implications for the back end. The flip side of this is twofold. Firstly, changes of this sort will always be few and far between. Secondly, Roland's change has simplified something that was over-complex in the first place; this has not only unlatched some generic optimizations that should have just worked but also, by the same token, reduced the complexity of the AArch64 back end code. > One question: if those code fails because of a different shape of > ideal graph than it expects, all that happens is slightly suboptimal > code, right? Not quite. Roland's change without the AArch64 patch triggered an assert during CAS generation when the expected subgraph was not found. Also, the current code is not built to expect whatever barrier Shenandoah might insert. It ought to fall into much the same case as G1 and CMS + CondCardMark (depending upon how the GC barriers are generated). However, a check which folds Shenandoah into the same bucket as those two still needs explicitly wiring in. This patch is probably a better place to start from than the previous version in order to add that case handling. Roland's fix decouples the effects of the GC barrier from the ones associated with oop updates. That means that following this patch any changes in the way the GC barriers are generated are less likely to impact the Aarch64 back end code that is interested in memory barriers associated with oop updates. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 
3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From aph at redhat.com Fri Feb 12 11:31:15 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 12 Feb 2016 11:31:15 +0000 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <56BDC138.8050903@redhat.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <56BDB8BB.9040305@redhat.com> <56BDC138.8050903@redhat.com> Message-ID: <56BDC283.40503@redhat.com> On 12/02/16 11:25, Andrew Dinn wrote: > Not quite. Roland's change without the AArch64 patch triggered an assert > during CAS generation when the expected subgraph was not found. Hmm. Can this code not be changed to fail quietly, with no change to the code? That's what optimizations generally do. > Also, the current code is not built to expect whatever barrier > Shenandoah might insert. It ought to fall into much the same case as G1 > and CMS + CondCardMark (depending upon how the GC barriers are > generated). However, a check which folds Shenandoah into the same bucket > as those two still needs explicitly wiring in. Sure. Andrew. From adinn at redhat.com Fri Feb 12 12:03:36 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 12 Feb 2016 12:03:36 +0000 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <56BDC283.40503@redhat.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <56BDB8BB.9040305@redhat.com> <56BDC138.8050903@redhat.com> <56BDC283.40503@redhat.com> Message-ID: <56BDCA18.60906@redhat.com> On 12/02/16 11:31, Andrew Haley wrote: > On 12/02/16 11:25, Andrew Dinn wrote: >> Not quite. 
Roland's change without the AArch64 patch triggered an assert >> during CAS generation when the expected subgraph was not found. > > Hmm. Can this code not be changed to fail quietly, with no change > to the code? That's what optimizations generally do. The assert is employed when generating AArch64 code for a CAS because every CAS should *always* be capable of being optimized to use an ldaxr/stlxr pair without the need for a top and tail dmb pair i.e. we don't have a fall back. If we see a CompareAndSwap node in a subgraph that does not have the expected shape then this can only mean that the AArch64 code has gone out of sync with a change in the generic code. The assert is used to find this mismatch during development/testing. By placing as much as possible of this checking code in an ifdef ASSERT region we avoid executing the check in production. An assert is not employed when generating AArch64 code for a StoreX because not all StoreX operations are volatile stores susceptible to optimization. So, in this case if we don't find the relevant subgraph then we just fall back to generating dmbs. The predicates which control dmb generation apply to multiple rules -- those for StoreX, MemBarRelease and MemBarVolatile. The predicates are all supposed to operate consistently so that when one predicate falls back then they all do. However, that's only guaranteed by them knowing exactly which shape to look for and correctly identifying it i.e. by me having coded it correctly (but then that's true of a lot of code:-). It would be nice to be able to cross-validate the actions of these rules applied to some set of StoreN, MemBarRelease, MemBarVolatile and CompareAndSwap nodes. However, I cannot see any way of correlating a rule application to some given node with rule applications to related nodes. 
Rule applications in a given sequence are not easily associated with an originating volatile put/CAS (even when you know which ones they are, as happens during debugging, they don't necessarily occur in a fixed order). regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From roland.westrelin at oracle.com Fri Feb 12 12:36:09 2016 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Fri, 12 Feb 2016 13:36:09 +0100 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <56BDB707.6090409@redhat.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> Message-ID: <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> Hi Andrew, > A patch for the AArch64 C2 volatile/CAS generation code which deals with > the effects of your proposed C2 patch is available as a webrev > > http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/ Thanks for putting that together. I didn't expect that simple change to cause so much trouble. > n.b. I have /not/ created a separate issue for the AArch64 part of this > fix. I am not sure whether you want to combine it with your patch or > push it as a separate stage. I can push everything together and list you as a contributor (in the contributed-by field) if that works for you. Vladimir, can you take another look at this? Your two objections were: > Also we have specialized insert_mem_bar_volatile() if we don't want wide memory effect. Why not use it? The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile().
As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way. > And we need to keep precedent edge link to oop store in case EA eliminates related allocation. Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier. Roland. From adinn at redhat.com Fri Feb 12 13:51:40 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 12 Feb 2016 13:51:40 +0000 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> Message-ID: <56BDE36C.4020101@redhat.com> Hi Roland, On 12/02/16 12:36, Roland Westrelin wrote: > Hi Andrew, > >> A patch for the AArch64 C2 volatile/CAS generation code which deals >> with the effects of your proposed C2 patch is available as a >> webrev >> >> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/ > > Thanks for putting that together. I didn't expect that simple change > to cause so much trouble. It was my decision to employ back end rule predicates which poke around in the graph that led to this -- it's not anything to do with your choice. I think your fix is correct and valuable in its own right, yet more so because it has simplified that back end code substantially. >> n.b. I have /not/ created a separate issue for the AArch64 part of >> this fix. I am not sure whether you want to combine it with your >> patch or push it as a separate stage. > > I can push everything together and list you as a contributor (in the > contributed-by field) if that works for you. > Yes please.
I think Andrew Haley's responses so far mean that has agreed the AArch64 part of this change. Perhaps he can confirm? > Vladimir, can you take another look at this? Your two objections > were: > >> Also we have specialized insert_mem_bar_volatile() if we don't want >> wide memory affect. Why not use it? > > The membar in the change takes the entire memory state as input but > only changes raw memory. I don?t think that can be achieved with > insert_mem_bar_volatile(). As explained by Mikael, the membar is here > to force ordering between the oop store and the card table load. > That?s why I think the membar?s inputs and outputs should be set up > that way. Not that I am an official reviewer but I agree with you here. >> And we need to keep precedent edge link to oop store in case EA >> eliminates related allocation. > > Mikael said it?s not ok to eliminate the memory barrier if we leave > the gc barrier. Also in agreement with this. For both G1GC and CMS +CondCardMark a StoreLoad barrier is necessary to ensure that the StoreX is visible before the LoadB/StoreCM pair which implement the conditional card mark. For these configurations AArch64 detects any MemBarVolatile associated with the card mark and inserts a dmb ish instruction (StoreLoad implementation) before the ldrb/strb. With CMS -CondCardMark the generic code does not insert a memory barrier. However, for correctness on non-TSO architectures we need a StoreStore barrier between the StoreX and the StoreCM implementing the card mark. That ensures that these writes cannot be observed by GC threads out of order (it might cause the GC to miss the write). This special case is handled on AArch64 by translating StoreCM to include a dmb ishst instruction (StoreStore implementation) before the strb. 
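[Editor's illustration, not part of the original thread.] The two card-mark orderings described above can be sketched in Java using the VarHandle fence methods. This is a simulation only: the class, field names, slot-indexed "heap" and the 512-entry card size are assumptions made for the example, and HotSpot emits the real barriers (dmb ish, dmb ishst) in generated machine code rather than through this API.

```java
import java.lang.invoke.VarHandle;

// Hypothetical simulation of a card-marking write barrier.
final class CardMarkSketch {
    static final int CARD_SHIFT = 9;        // assumed: one card covers 512 slots
    static final byte DIRTY = 0;            // assumed card encodings
    static final byte CLEAN = (byte) 0xff;

    final Object[] heap;
    final byte[] cardTable;

    CardMarkSketch(int slots) {
        heap = new Object[slots];
        cardTable = new byte[(slots >> CARD_SHIFT) + 1];
        java.util.Arrays.fill(cardTable, CLEAN);
    }

    // Conditional card mark (the G1 and CMS +CondCardMark shape): the store
    // must be visible before the card is loaded, so a StoreLoad barrier sits
    // between them (dmb ish on AArch64).
    void storeConditional(int slot, Object value) {
        heap[slot] = value;                  // StoreX
        VarHandle.fullFence();               // StoreLoad
        int card = slot >> CARD_SHIFT;
        if (cardTable[card] != DIRTY) {      // LoadB
            cardTable[card] = DIRTY;         // StoreCM
        }
    }

    // Unconditional card mark (the CMS -CondCardMark shape): only the order
    // of the two stores matters, so a StoreStore barrier is enough
    // (dmb ishst on AArch64).
    void storeUnconditional(int slot, Object value) {
        heap[slot] = value;                  // StoreX
        VarHandle.storeStoreFence();         // StoreStore
        cardTable[slot >> CARD_SHIFT] = DIRTY; // StoreCM
    }
}
```

The sketch is single-threaded, so it shows where each barrier sits rather than demonstrating the reordering the barriers prevent.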
regards, Andrew Dinn ----------- From aph at redhat.com Fri Feb 12 14:07:21 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 12 Feb 2016 14:07:21 +0000 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <56BDE36C.4020101@redhat.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> <56BDE36C.4020101@redhat.com> Message-ID: <56BDE719.2010703@redhat.com> On 12/02/16 13:51, Andrew Dinn wrote: > Yes please. I think Andrew Haley's responses so far mean that has > agreed the AArch64 part of this change. Perhaps he can confirm? Sure. Andrew. From gnu.andrew at redhat.com Fri Feb 12 17:45:21 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Fri, 12 Feb 2016 17:45:21 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jdk: 2 new changesets Message-ID: <201602121745.u1CHjL9B029815@aojmv0008.oracle.com> Changeset: c57b985d9249 Author: mfang Date: 2015-07-15 12:12 -0700 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/c57b985d9249 8131105: Header Template for nroff man pages *.1 files contains errors Reviewed-by: katleman ! src/bsd/doc/man/appletviewer.1 ! src/bsd/doc/man/extcheck.1 ! src/bsd/doc/man/idlj.1 ! src/bsd/doc/man/jar.1 ! src/bsd/doc/man/jarsigner.1 ! src/bsd/doc/man/java.1 ! src/bsd/doc/man/javac.1 ! src/bsd/doc/man/javadoc.1 ! src/bsd/doc/man/javah.1 ! src/bsd/doc/man/javap.1 ! src/bsd/doc/man/jcmd.1 ! src/bsd/doc/man/jconsole.1 ! src/bsd/doc/man/jdb.1 ! src/bsd/doc/man/jdeps.1 ! src/bsd/doc/man/jhat.1 ! src/bsd/doc/man/jinfo.1 ! src/bsd/doc/man/jjs.1 ! src/bsd/doc/man/jmap.1 ! src/bsd/doc/man/jps.1 ! src/bsd/doc/man/jrunscript.1 ! src/bsd/doc/man/jsadebugd.1 ! src/bsd/doc/man/jstack.1 ! src/bsd/doc/man/jstat.1 ! src/bsd/doc/man/jstatd.1 ! 
src/bsd/doc/man/keytool.1 ! src/bsd/doc/man/native2ascii.1 ! src/bsd/doc/man/orbd.1 ! src/bsd/doc/man/pack200.1 ! src/bsd/doc/man/policytool.1 ! src/bsd/doc/man/rmic.1 ! src/bsd/doc/man/rmid.1 ! src/bsd/doc/man/rmiregistry.1 ! src/bsd/doc/man/schemagen.1 ! src/bsd/doc/man/serialver.1 ! src/bsd/doc/man/servertool.1 ! src/bsd/doc/man/tnameserv.1 ! src/bsd/doc/man/unpack200.1 ! src/bsd/doc/man/wsgen.1 ! src/bsd/doc/man/wsimport.1 ! src/bsd/doc/man/xjc.1 ! src/linux/doc/man/appletviewer.1 ! src/linux/doc/man/extcheck.1 ! src/linux/doc/man/idlj.1 ! src/linux/doc/man/ja/appletviewer.1 ! src/linux/doc/man/ja/extcheck.1 ! src/linux/doc/man/ja/idlj.1 ! src/linux/doc/man/ja/jar.1 ! src/linux/doc/man/ja/jarsigner.1 ! src/linux/doc/man/ja/java.1 ! src/linux/doc/man/ja/javac.1 ! src/linux/doc/man/ja/javadoc.1 ! src/linux/doc/man/ja/javah.1 ! src/linux/doc/man/ja/javap.1 ! src/linux/doc/man/ja/javaws.1 ! src/linux/doc/man/ja/jcmd.1 ! src/linux/doc/man/ja/jconsole.1 ! src/linux/doc/man/ja/jdb.1 ! src/linux/doc/man/ja/jdeps.1 ! src/linux/doc/man/ja/jhat.1 ! src/linux/doc/man/ja/jinfo.1 ! src/linux/doc/man/ja/jjs.1 ! src/linux/doc/man/ja/jmap.1 ! src/linux/doc/man/ja/jps.1 ! src/linux/doc/man/ja/jrunscript.1 ! src/linux/doc/man/ja/jsadebugd.1 ! src/linux/doc/man/ja/jstack.1 ! src/linux/doc/man/ja/jstat.1 ! src/linux/doc/man/ja/jstatd.1 ! src/linux/doc/man/ja/jvisualvm.1 ! src/linux/doc/man/ja/keytool.1 ! src/linux/doc/man/ja/native2ascii.1 ! src/linux/doc/man/ja/orbd.1 ! src/linux/doc/man/ja/pack200.1 ! src/linux/doc/man/ja/policytool.1 ! src/linux/doc/man/ja/rmic.1 ! src/linux/doc/man/ja/rmid.1 ! src/linux/doc/man/ja/rmiregistry.1 ! src/linux/doc/man/ja/schemagen.1 ! src/linux/doc/man/ja/serialver.1 ! src/linux/doc/man/ja/servertool.1 ! src/linux/doc/man/ja/tnameserv.1 ! src/linux/doc/man/ja/unpack200.1 ! src/linux/doc/man/ja/wsgen.1 ! src/linux/doc/man/ja/wsimport.1 ! src/linux/doc/man/ja/xjc.1 ! src/linux/doc/man/jar.1 ! src/linux/doc/man/jarsigner.1 ! src/linux/doc/man/java.1 ! 
src/linux/doc/man/javac.1 ! src/linux/doc/man/javadoc.1 ! src/linux/doc/man/javah.1 ! src/linux/doc/man/javap.1 ! src/linux/doc/man/jcmd.1 ! src/linux/doc/man/jconsole.1 ! src/linux/doc/man/jdb.1 ! src/linux/doc/man/jdeps.1 ! src/linux/doc/man/jhat.1 ! src/linux/doc/man/jinfo.1 ! src/linux/doc/man/jjs.1 ! src/linux/doc/man/jmap.1 ! src/linux/doc/man/jps.1 ! src/linux/doc/man/jrunscript.1 ! src/linux/doc/man/jsadebugd.1 ! src/linux/doc/man/jstack.1 ! src/linux/doc/man/jstat.1 ! src/linux/doc/man/jstatd.1 ! src/linux/doc/man/keytool.1 ! src/linux/doc/man/native2ascii.1 ! src/linux/doc/man/orbd.1 ! src/linux/doc/man/pack200.1 ! src/linux/doc/man/policytool.1 ! src/linux/doc/man/rmic.1 ! src/linux/doc/man/rmid.1 ! src/linux/doc/man/rmiregistry.1 ! src/linux/doc/man/schemagen.1 ! src/linux/doc/man/serialver.1 ! src/linux/doc/man/servertool.1 ! src/linux/doc/man/tnameserv.1 ! src/linux/doc/man/unpack200.1 ! src/linux/doc/man/wsgen.1 ! src/linux/doc/man/wsimport.1 ! src/linux/doc/man/xjc.1 ! src/solaris/doc/sun/man/man1/appletviewer.1 ! src/solaris/doc/sun/man/man1/extcheck.1 ! src/solaris/doc/sun/man/man1/idlj.1 ! src/solaris/doc/sun/man/man1/ja/appletviewer.1 ! src/solaris/doc/sun/man/man1/ja/extcheck.1 ! src/solaris/doc/sun/man/man1/ja/idlj.1 ! src/solaris/doc/sun/man/man1/ja/jar.1 ! src/solaris/doc/sun/man/man1/ja/jarsigner.1 ! src/solaris/doc/sun/man/man1/ja/java.1 ! src/solaris/doc/sun/man/man1/ja/javac.1 ! src/solaris/doc/sun/man/man1/ja/javadoc.1 ! src/solaris/doc/sun/man/man1/ja/javah.1 ! src/solaris/doc/sun/man/man1/ja/javap.1 ! src/solaris/doc/sun/man/man1/ja/jcmd.1 ! src/solaris/doc/sun/man/man1/ja/jconsole.1 ! src/solaris/doc/sun/man/man1/ja/jdb.1 ! src/solaris/doc/sun/man/man1/ja/jdeps.1 ! src/solaris/doc/sun/man/man1/ja/jhat.1 ! src/solaris/doc/sun/man/man1/ja/jinfo.1 ! src/solaris/doc/sun/man/man1/ja/jjs.1 ! src/solaris/doc/sun/man/man1/ja/jmap.1 ! src/solaris/doc/sun/man/man1/ja/jps.1 ! src/solaris/doc/sun/man/man1/ja/jrunscript.1 ! 
src/solaris/doc/sun/man/man1/ja/jsadebugd.1 ! src/solaris/doc/sun/man/man1/ja/jstack.1 ! src/solaris/doc/sun/man/man1/ja/jstat.1 ! src/solaris/doc/sun/man/man1/ja/jstatd.1 ! src/solaris/doc/sun/man/man1/ja/jvisualvm.1 ! src/solaris/doc/sun/man/man1/ja/keytool.1 ! src/solaris/doc/sun/man/man1/ja/native2ascii.1 ! src/solaris/doc/sun/man/man1/ja/orbd.1 ! src/solaris/doc/sun/man/man1/ja/pack200.1 ! src/solaris/doc/sun/man/man1/ja/policytool.1 ! src/solaris/doc/sun/man/man1/ja/rmic.1 ! src/solaris/doc/sun/man/man1/ja/rmid.1 ! src/solaris/doc/sun/man/man1/ja/rmiregistry.1 ! src/solaris/doc/sun/man/man1/ja/schemagen.1 ! src/solaris/doc/sun/man/man1/ja/serialver.1 ! src/solaris/doc/sun/man/man1/ja/servertool.1 ! src/solaris/doc/sun/man/man1/ja/tnameserv.1 ! src/solaris/doc/sun/man/man1/ja/unpack200.1 ! src/solaris/doc/sun/man/man1/ja/wsgen.1 ! src/solaris/doc/sun/man/man1/ja/wsimport.1 ! src/solaris/doc/sun/man/man1/ja/xjc.1 ! src/solaris/doc/sun/man/man1/jar.1 ! src/solaris/doc/sun/man/man1/jarsigner.1 ! src/solaris/doc/sun/man/man1/java.1 ! src/solaris/doc/sun/man/man1/javac.1 ! src/solaris/doc/sun/man/man1/javadoc.1 ! src/solaris/doc/sun/man/man1/javah.1 ! src/solaris/doc/sun/man/man1/javap.1 ! src/solaris/doc/sun/man/man1/jcmd.1 ! src/solaris/doc/sun/man/man1/jconsole.1 ! src/solaris/doc/sun/man/man1/jdb.1 ! src/solaris/doc/sun/man/man1/jdeps.1 ! src/solaris/doc/sun/man/man1/jhat.1 ! src/solaris/doc/sun/man/man1/jinfo.1 ! src/solaris/doc/sun/man/man1/jjs.1 ! src/solaris/doc/sun/man/man1/jmap.1 ! src/solaris/doc/sun/man/man1/jps.1 ! src/solaris/doc/sun/man/man1/jrunscript.1 ! src/solaris/doc/sun/man/man1/jsadebugd.1 ! src/solaris/doc/sun/man/man1/jstack.1 ! src/solaris/doc/sun/man/man1/jstat.1 ! src/solaris/doc/sun/man/man1/jstatd.1 ! src/solaris/doc/sun/man/man1/keytool.1 ! src/solaris/doc/sun/man/man1/native2ascii.1 ! src/solaris/doc/sun/man/man1/orbd.1 ! src/solaris/doc/sun/man/man1/pack200.1 ! src/solaris/doc/sun/man/man1/policytool.1 ! 
src/solaris/doc/sun/man/man1/rmic.1 ! src/solaris/doc/sun/man/man1/rmid.1 ! src/solaris/doc/sun/man/man1/rmiregistry.1 ! src/solaris/doc/sun/man/man1/schemagen.1 ! src/solaris/doc/sun/man/man1/serialver.1 ! src/solaris/doc/sun/man/man1/servertool.1 ! src/solaris/doc/sun/man/man1/tnameserv.1 ! src/solaris/doc/sun/man/man1/unpack200.1 ! src/solaris/doc/sun/man/man1/wsgen.1 ! src/solaris/doc/sun/man/man1/wsimport.1 ! src/solaris/doc/sun/man/man1/xjc.1 Changeset: f9d3631fbc8f Author: andrew Date: 2016-02-08 14:55 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/f9d3631fbc8f Revert changes to libpng source code now 8078245 is in place. ! src/share/native/sun/awt/libpng/pngpriv.h From gnu.andrew at redhat.com Fri Feb 12 17:55:09 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Fri, 12 Feb 2016 12:55:09 -0500 (EST) Subject: [aarch64-port-dev ] Sync AArch64 8u JDK Repository with upstream 8u72 In-Reply-To: <56BDAAF3.3020904@redhat.com> References: <1530040291.20005017.1455214860067.JavaMail.zimbra@redhat.com> <56BDAAF3.3020904@redhat.com> Message-ID: <609425171.20390154.1455299709519.JavaMail.zimbra@redhat.com> ----- Original Message ----- > On 11/02/16 18:21, Andrew Hughes wrote: > > Ok to push? > > OK, thanks. Sounds like 8u has been pretty quiet. > > Andrew. > > Done. Thanks. -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. 
(http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From vladimir.kozlov at oracle.com Fri Feb 12 19:44:18 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 12 Feb 2016 11:44:18 -0800 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> Message-ID: <56BE3612.4080305@oracle.com> Roland, Can you create new webrev which includes everything (aarch64)? And I am satisfied with your answers to my objections. Thanks, Vladimir On 2/12/16 4:36 AM, Roland Westrelin wrote: > Hi Andrew, > >> A patch for the AArch64 C2 volatile/CAS generation code which deals with >> the effects of your proposed C2 patch is available as a webrev >> >> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/ > > Thanks for putting that together. I didn't expect that simple change to cause so much trouble. > >> n.b. I have /not/ created a separate issue for the AArch64 part of this >> fix. I am not sure whether you want to combine it with your patch or >> push it as a separate stage. > > I can push everything together and list you as a contributor (in the contributed-by field) if that works for you. > > Vladimir, can you take another look at this? Your two objections were: > >> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory affect. Why not use it? > > The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile().
As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way. > >> And we need to keep precedent edge link to oop store in case EA eliminates related allocation. > > Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier. > > Roland. > From roland.westrelin at oracle.com Mon Feb 15 09:21:43 2016 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Mon, 15 Feb 2016 10:21:43 +0100 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <56BE3612.4080305@oracle.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> <56BE3612.4080305@oracle.com> Message-ID: > Can you create new webrev which includes everything (aarch64)? Here it is: http://cr.openjdk.java.net/~roland/8087341/webrev.01/ Roland. > And I am satisfied with your answers to my objections. > > Thanks, > Vladimir > > On 2/12/16 4:36 AM, Roland Westrelin wrote: >> Hi Andrew, >> >>> A patch for the AArch64 C2 volatile/CAS generation code which deals with >>> the effects of your proposed C2 patch is available as a webrev >>> >>> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/ >> >> Thanks for putting that together. I didn't expect that simple change to cause so much trouble. >> >>> n.b. I have /not/ created a separate issue for the AArch64 part of this >>> fix. I am not sure whether you want to combine it with your patch or >>> push it as a separate stage. >> >> I can push everything together and list you as a contributor (in the contributed-by field) if that works for you. >> >> Vladimir, can you take another look at this?
Your two objections were: >> >>> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory affect. Why not use it? >> >> The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile(). As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way. >> >>> And we need to keep precedent edge link to oop store in case EA eliminates related allocation. >> >> Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier. >> >> Roland. >> From adinn at redhat.com Mon Feb 15 11:08:07 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 15 Feb 2016 11:08:07 +0000 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> <56BE3612.4080305@oracle.com> Message-ID: <56C1B197.7060708@redhat.com> On 15/02/16 09:21, Roland Westrelin wrote: > >> Can you create new webrev which includes everything (aarch64)? > > Here it is: > http://cr.openjdk.java.net/~roland/8087341/webrev.01/ Thanks Roland. Looks good to go. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No.
3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From vladimir.kozlov at oracle.com Mon Feb 15 17:33:08 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 15 Feb 2016 09:33:08 -0800 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> <56BE3612.4080305@oracle.com> Message-ID: <56C20BD4.8030800@oracle.com> Good. Thank you. Vladimir On 2/15/16 1:21 AM, Roland Westrelin wrote: > >> Can you create new webrev which includes everything (aarch64)? > > Here it is: > http://cr.openjdk.java.net/~roland/8087341/webrev.01/ > > Roland. > >> And I am satisfied with your answers to my objections. >> >> Thanks, >> Vladimir >> >> On 2/12/16 4:36 AM, Roland Westrelin wrote: >>> Hi Andrew, >>> >>>> A patch for the AArch64 C2 volatile/CAS generation code which deals with >>>> the effects of your proposed C2 patch is available as a webrev >>>> >>>> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/ >>> >>> Thanks for putting that together. I didn't expect that simple change to cause so much trouble. >>> >>>> n.b. I have /not/ created a separate issue for the AArch64 part of this >>>> fix. I am not sure whether you want to combine it with your patch or >>>> push it as a separate stage. >>> >>> I can push everything together and list you as a contributor (in the contributed-by field) if that works for you. >>> >>> Vladimir, can you take another look at this? Your two objections were: >>> >>>> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory affect. Why not use it? >>> >>> The membar in the change takes the entire memory state as input but only changes raw memory.
I don't think that can be achieved with insert_mem_bar_volatile(). As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way. >>> >>>> And we need to keep precedent edge link to oop store in case EA eliminates related allocation. >>> >>> Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier. >>> >>> Roland. >>> > From felix.yang at linaro.org Tue Feb 16 11:28:13 2016 From: felix.yang at linaro.org (Felix Yang) Date: Tue, 16 Feb 2016 19:28:13 +0800 Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair instructions in call_stub Message-ID: Hi, Please review the following webrev: http://cr.openjdk.java.net/~fyang/8149907/webrev.00/ Jira issue: https://bugs.openjdk.java.net/browse/JDK-8149907 This patch makes use of load/store pair instructions in call_stub, saving 24 load/store instructions. Tested with jtreg hotspot & langtools. Is it OK? Thanks, Felix. From aph at redhat.com Tue Feb 16 11:46:54 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 16 Feb 2016 11:46:54 +0000 Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair instructions in call_stub In-Reply-To: References: Message-ID: <56C30C2E.4080206@redhat.com> On 02/16/2016 11:28 AM, Felix Yang wrote: > Please review the following webrev: > http://cr.openjdk.java.net/~fyang/8149907/webrev.00/ I guess this is okay, but it's a lot less self-documenting than it was. If there are any unused locals (e.g. r27_save) you must delete them or use them in assertions. Andrew. From hui.shi at linaro.org Wed Feb 17 13:21:11 2016 From: hui.shi at linaro.org (Hui Shi) Date: Wed, 17 Feb 2016 21:21:11 +0800 Subject: [aarch64-port-dev ] AArch64: follow up array copy investigation on misaligned peeling Message-ID: Hi Andrew and all, Follow up with early discussion about forward and backward array copy performance, current finding is 1.
Optimizing misaligned load/store in backward array copy doesn't help on array copy performance, I suggest leave it unchanged now. 2. There is some chances to optimizing array copy peeling/tailing with combined 8 byte load/store. But might introduce extra stubs and complicate code. Would you please help comment? Firstly, remove unaligned reference by reorder copy orders from small to large (copy 1 byte first, 8 byte at last) when peeling. However it is even a little bit slow compared with original implementation. Test case is http://people.linaro.org/~hui.shi/arraycopy/TestPeelAlign.java Performance result in http://people.linaro.org/~hui.shi/arraycopy/arraycopy_align_and_combine_Test.pdf Patch is http://people.linaro.org/~hui.shi/arraycopy/peelingFromSmall.patch Test case is typical backward array copy scenario (insert some element in array and move tail array backward). From profiling, UNALIGNED_LDST_SPEC event drops a lot with patch. In my understanding, load address cross cache line boundary might trigger hardware prefetcher earlier than aligned access. So fixing unaligned access seems not helpful in array copy peeling. Secondly, as unaligned access doesn't show degradation in this case, further experiment is folding consecutive branches/load/stores into one 8 byte unaligned load/store. Following is updated stub code for byte array copy. This is legal when src and dst distance is bigger than 8 bytes. This is safe in cases like String.getChars String.getBytes. Perform different combination tests, it works best for byte array copy and still helpful for short array copy. Check result in pdf "opt" column is for this optimization. For StringConcat test ( http://people.linaro.org/~hui.shi/arraycopy/StringConcatTest.java), though array copy only takes 25% cycles in this test, entire test can still see 3.5% improvement with this combine load/store optimization. However I wondering if this is the proper way to improve these test-bit-load-store code sequence. 
This will requires extra really "disjoint" array copy stub code, current disjoint array copy only means it can safely perform forward array copy. Or introduce no "overlap" test at runtime. My personal tradeoff is leaving array copy code unchanged and keep it simply and consistent now.

Before patch
StubRoutines::jbyte_disjoint_arraycopy [0x0000ffff7897f7c0, 0x0000ffff7897f860[ (160 bytes)
0x0000ffff7897f7e0: tbz w9, #3, Stub::jbyte_disjoint_arraycopy+44 0x0000ffff7897f7ec
0x0000ffff7897f7e4: ldr x8, [x0],#8
0x0000ffff7897f7e8: str x8, [x1],#8
0x0000ffff7897f7ec: tbz w9, #2, Stub::jbyte_disjoint_arraycopy+56 0x0000ffff7897f7f8
0x0000ffff7897f7f0: ldr w8, [x0],#4
0x0000ffff7897f7f4: str w8, [x1],#4
0x0000ffff7897f7f8: tbz w9, #1, Stub::jbyte_disjoint_arraycopy+68 0x0000ffff7897f804
0x0000ffff7897f7fc: ldrh w8, [x0],#2
0x0000ffff7897f800: strh w8, [x1],#2
0x0000ffff7897f804: tbz w9, #0, Stub::jbyte_disjoint_arraycopy+80 0x0000ffff7897f810
0x0000ffff7897f808: ldrb w8, [x0],#1
0x0000ffff7897f80c: strb w8, [x1],#1
0x0000ffff7897f810: cmp x2, #0x10
0x0000ffff7897f814: b.lt Stub::jbyte_disjoint_arraycopy+96 0x0000ffff7897f820
0x0000ffff7897f818: lsr x9, x2, #3
0x0000ffff7897f81c: bl Stub::foward_copy_longs+28 0x0000ffff7897f5c0

Code after patch
StubRoutines::jbyte_disjoint_arraycopy [0x0000ffff6c97f7c0, 0x0000ffff6c97f87c[ (188 bytes)
// peeling for alignment
0x0000ffff6c97f7e0: tbz w9, #3, Stub::jbyte_disjoint_arraycopy+48 0x0000ffff6c97f7f0
0x0000ffff6c97f7e4: sub x9, x9, #0x8
0x0000ffff6c97f7e8: ldr x8, [x0],#8
0x0000ffff6c97f7ec: str x8, [x1],#8
0x0000ffff6c97f7f0: ldr x8, [x0]
0x0000ffff6c97f7f4: str x8, [x1]
0x0000ffff6c97f7f8: add x0, x0, x9
0x0000ffff6c97f7fc: add x1, x1, x9
0x0000ffff6c97f800: cmp x2, #0x10
0x0000ffff6c97f804: b.lt Stub::jbyte_disjoint_arraycopy+124 0x0000ffff6c97f83c
0x0000ffff6c97f808: lsr x9, x2, #3
0x0000ffff6c97f80c: bl Stub::foward_copy_longs+28 0x0000ffff6c97f5c0

Regards Hui From aph at
redhat.com (Andrew Haley) Date: Wed, 17 Feb 2016 13:34:02 +0000 Subject: [aarch64-port-dev ] AArch64: follow up array copy investigation on misaligned peeling In-Reply-To: References: Message-ID: <56C476CA.9070600@redhat.com> On 02/17/2016 01:21 PM, Hui Shi wrote: > For StringConcat test ( > http://people.linaro.org/~hui.shi/arraycopy/StringConcatTest.java), though > array copy only takes 25% cycles in this test, entire test can still see > 3.5% improvement with this combine load/store optimization. However I > wondering if this is the proper way to improve these test-bit-load-store > code sequence. This will requires extra really "disjoint" array copy stub > code, current disjoint array copy only means it can safely perform forward > array copy. Or introduce no "overlap" test at runtime. My personal tradeoff > is leaving array copy code unchanged and keep it simply and consistent now. OK, that makes sense. My plan (such as it is) for tidying up the tail code is to convert three bit-test-and-branches into a single 8-way computed jump with an optimum sequence for all 8 cases. Sure, it will usually be mispredicted, but it's just a single jump. But really, once we're down to 3.5% of a contrived string- concatenation intensive test, it's questionable whether this is what we need to be spending time on. Thanks, Andrew. From felix.yang at linaro.org Wed Feb 17 14:11:33 2016 From: felix.yang at linaro.org (Felix Yang) Date: Wed, 17 Feb 2016 22:11:33 +0800 Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair instructions in call_stub In-Reply-To: <56C30C2E.4080206@redhat.com> References: <56C30C2E.4080206@redhat.com> Message-ID: Hi Andrew, Thanks for the suggestions. I have updated the patch with the unused locals removed. New webrev: http://cr.openjdk.java.net/~fyang/8149907/webrev.01/ How about this one?
Thanks for your help, Felix On 16 February 2016 at 19:46, Andrew Haley wrote: > On 02/16/2016 11:28 AM, Felix Yang wrote: > > Please review the following webrev: > > http://cr.openjdk.java.net/~fyang/8149907/webrev.00/ > > I guess this is okay, but it's a lot less self-documenting than it > was. > > If there are any unused locals (e.g. r27_save) you must delete > them or use them in assertions. > > Andrew. > > From aph at redhat.com Wed Feb 17 14:16:10 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 17 Feb 2016 14:16:10 +0000 Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair instructions in call_stub In-Reply-To: References: <56C30C2E.4080206@redhat.com> Message-ID: <56C480AA.2090606@redhat.com> On 02/17/2016 02:11 PM, Felix Yang wrote: > Thanks for the suggestions. I have updated the patch with the unused > locals removed. > New webrev: http://cr.openjdk.java.net/~fyang/8149907/webrev.01/ > How about this one? What are r19_off and its friends used for now? Why are they still defined? Andrew. From felix.yang at linaro.org Wed Feb 17 14:33:21 2016 From: felix.yang at linaro.org (Felix Yang) Date: Wed, 17 Feb 2016 22:33:21 +0800 Subject: [aarch64-port-dev ] RFR: 8150038: aarch64: make use of CBZ and CBNZ when comparing narrow pointer with zero Message-ID: Hi, Please review the following webrev: http://cr.openjdk.java.net/~fyang/8150038/webrev.00/ Jira issue: https://bugs.openjdk.java.net/browse/JDK-8150038 Several times I have noticed the following pattern in C2 JIT code (the java heap size is set to 200MB):
0x0000007f6c9419c4: ldr w14, [x11,#32] ;*getfield buffer
0x0000007f6c9419c8: cmp w14, wzr
0x0000007f6c9419cc: b.eq 0x0000007f6c9425e4 ;*invokevirtual reset
The cmp and b.eq instructions can be combined into one "cbz" instruction. Currently, the aarch64 port only makes use of CBZ and CBNZ when comparing operands with Integer/Long/Pointer type with zero.
Patch fixes the issue by adding one similar combine pattern in the AD file for Narrow pointer types(just like the sparc port does). Tested with jtreg hotspot & langtools. Is it OK? Thanks, Felix. From aph at redhat.com Wed Feb 17 15:07:26 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 17 Feb 2016 15:07:26 +0000 Subject: [aarch64-port-dev ] RFR: 8150038: aarch64: make use of CBZ and CBNZ when comparing narrow pointer with zero In-Reply-To: References: Message-ID: <56C48CAE.9040704@redhat.com> On 02/17/2016 02:33 PM, Felix Yang wrote: > Tested with jtreg hotspot & langtools. Is it OK? Sure, that looks fine. Thanks, Andrew. From edward.nevill at gmail.com Wed Feb 17 19:29:18 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Wed, 17 Feb 2016 19:29:18 +0000 Subject: [aarch64-port-dev ] arraycopy optimisations on aarch64 Message-ID: <1455737358.14578.45.camel@mint> Hi, There have been a number of ongoing efforts at optimising array copy recently. Rather than have multiple webrevs and multiple JIRA issues I would like to collect all the efforts under a single JIRA issue. I have created the following JIRA issue for all work relating to optimising array copy. https://bugs.openjdk.java.net/browse/JDK-8150082 We can then review all the array copy optimisation proposals on the aarch64-port-dev mailing list rather than cc'ing the whole of hotspot-compiler-dev with every intricate detail of array copys on aarch64. Once we have a complete version of array copy code we are happy with I can submit a single CR for review. All contributions will be acknowledged in the "Contributed-by" section. To further muddy the waters I have two patches I would like to forward for your discussion. 1) http://cr.openjdk.java.net/~enevill/memopts/small.patch This improves the performance of copying small (0 to 80 bytes) arrays. The copy code is inlined (rather than calling out to copy_longs). 
The copy forwards and copy backwards cases are identical because the small copy code reads all data into registers before writing any. Thankfully aarch64 has plenty of registers. The rationale for choosing 80 as the limit is that it provides a guarantee that copy_longs is always called with at least 64 bytes, even after worst case alignment fixup. This means the small case code in copy_longs can be deleted (I have put an assert in copy_longs to check it is never called with < 64 bytes). 2) http://cr.openjdk.java.net/~enevill/memopts/simd.patch This uses SIMD ldp/stp Qx, Qy instructions instead of scalar ldp/stp instructions, thereby loading/storing 32 bytes at a time instead of 16. It also extends the small copy code to copy 0-96 instead of 0-80 (because 80 is not divisible by 32). This improves performance on some micro-arches and not on others so I have provided a -XX:+UseSIMDForMemoryOps switch which defaults to false (we could look at enabling this by default for micro-arches where we know SIMD is better). I have prepared a set of performance measurements on memory copies between 0 & 96 bytes in steps of 1 (which shows the effect of the small copy optimisations) and also between 0 & 1024 in steps of 16. I have prepared these for 3 different micro-arches. The results are at http://cr.openjdk.java.net/~enevill/memopts/twoopts.pdf In these charts the blue 'original' line is the jdk9 tip as of earlier today. The red 'small copy' line is after application of the small copy patch above. The yellow 'SIMD' line is after the cumulative application of the small copy patch and the simd patch. The charts show time taken so smaller is better. I have normalised the charts by varying the number of iterations so all results are in the 0-1200 range. Because the number of iterations was different for each micro-arch no information should be inferred as to the relative performance of different micro-arches.
The charts should only be used to compare the performance before and after application of the above patches. All the best, Ed. From felix.yang at linaro.org Thu Feb 18 15:02:06 2016 From: felix.yang at linaro.org (Felix Yang) Date: Thu, 18 Feb 2016 23:02:06 +0800 Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair instructions in call_stub In-Reply-To: <56C480AA.2090606@redhat.com> References: <56C30C2E.4080206@redhat.com> <56C480AA.2090606@redhat.com> Message-ID: Hi, I updated the webrev with the unused ENUM members removed. New webrev: http://cr.openjdk.java.net/~fyang/8149907/webrev.02/ Thanks, Felix On 17 February 2016 at 22:16, Andrew Haley wrote: > On 02/17/2016 02:11 PM, Felix Yang wrote: > > > Thanks for the suggestions. I have updated the patch with the unused > > locals removed. > > New webrev: http://cr.openjdk.java.net/~fyang/8149907/webrev.01/ > > How about this one? > > What are r19_off and its friends used for now? Why are they still > defined? > > Andrew. > > From roland.westrelin at oracle.com Fri Feb 19 07:54:26 2016 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Fri, 19 Feb 2016 08:54:26 +0100 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <56C20BD4.8030800@oracle.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> <56BE3612.4080305@oracle.com> <56C20BD4.8030800@oracle.com> Message-ID: Thanks for the review, Vladimir. Roland. > On Feb 15, 2016, at 6:33 PM, Vladimir Kozlov wrote: > > Good. Thank you. > > Vladimir > > On 2/15/16 1:21 AM, Roland Westrelin wrote: >> >>> Can you create new webrev which includes everything (aarch64)? >> >> Here it is: >> http://cr.openjdk.java.net/~roland/8087341/webrev.01/ >> >> Roland. 
>> >>> And I am satisfied with your answers to my objections. >>> >>> Thanks, >>> Vladimir >>> >>> On 2/12/16 4:36 AM, Roland Westrelin wrote: >>>> Hi Andrew, >>>> >>>>> A patch for the AArch64 C2 volatile/CAS generation code which deals with >>>>> the effects of your proposed C2 patch is available as a webrev >>>>> >>>>> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/ >>>> >>>> Thanks for putting that together. I didn't expect that simple change to cause so much trouble. >>>> >>>>> n.b. I have /not/ created a separate issue for the AArch64 part of this >>>>> fix. I am not sure whether you want to combine it with your patch or >>>>> push it as a separate stage. >>>> >>>> I can push everything together and list you as a contributor (in the contributed-by field) if that works for you. >>>> >>>> Vladimir, can you take another look at this? Your two objections were: >>>> >>>>> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory affect. Why not use it? >>>> >>>> The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile(). As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way. >>>> >>>>> And we need to keep precedent edge link to oop store in case EA eliminates related allocation. >>>> >>>> Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier. >>>> >>>> Roland.
>>>> >> From roland.westrelin at oracle.com Fri Feb 19 07:54:55 2016 From: roland.westrelin at oracle.com (Roland Westrelin) Date: Fri, 19 Feb 2016 08:54:55 +0100 Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1 In-Reply-To: <56C1B197.7060708@redhat.com> References: <56AA260B.8080101@redhat.com> <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com> <56BB01E2.2090004@redhat.com> <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com> <56BDB707.6090409@redhat.com> <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com> <56BE3612.4080305@oracle.com> <56C1B197.7060708@redhat.com> Message-ID: <8112862F-B94D-4043-AD3C-DF59BB8FB862@oracle.com> Thanks, Andrew! Roland. > On Feb 15, 2016, at 12:08 PM, Andrew Dinn wrote: > > On 15/02/16 09:21, Roland Westrelin wrote: >> >>> Can you create new webrev which includes everything (aarch64)? >> >> Here it is: >> http://cr.openjdk.java.net/~roland/8087341/webrev.01/ > > Thanks Roland. Looks good to go. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in UK and Wales under Company Registration No. 3798903 > Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul > Argiry (US) From hui.shi at linaro.org Fri Feb 19 12:13:28 2016 From: hui.shi at linaro.org (Hui Shi) Date: Fri, 19 Feb 2016 20:13:28 +0800 Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor char_array_equals/byte_array_equals/string_equals Message-ID: Hi, Could some one help review this patch? This patch mainly aims to refactoring similar code on AArch64 for string equals/ char array equals and byte array equals. JIRA: https://bugs.openjdk.java.net/browse/JDK-8149733 webrev: http://cr.openjdk.java.net/~hshi/8149733/webrev/ Patch includes: 1. Add new method MacroAssembler::generic_arrays_equals method, its implementation combines string_equals and char/byte_array_equals. For array length >= 8 bytes, compare main body and tail bytes in 8 bytes wide. 
This is the same as the current string_equals implementation. It eliminates tail branches and loads and improves performance on short-length compares. For array length < 8 bytes, compare in a test-ld-cmp sequence. It outperforms the loop in string_equals.
2. Remove unnecessary lea address computation (mov array pointer to last word) in string_equals.
3. Remove unnecessary tmp register for string_equals.

JTreg doesn't show any regression, and performance doesn't degrade with different length combinations. Small char/byte equals improves in most tests. There is one slow run with the new implementation, because the last several chars are different in that test case. The original char array equals can find the difference with the first test-ld-cmp check, so a test-ld-cmp sequence might be faster than an entire unaligned 8-byte compare in some corner cases. When the differing chars are in the middle of the string, performance is close for both implementations.

Test case: http://cr.openjdk.java.net/~hshi/8149733/TestArrayEqual.java
Test result: http://cr.openjdk.java.net/~hshi/8149733/ArrayEqual.pdf

Regards
Hui

From felix.yang at linaro.org Fri Feb 19 12:23:11 2016 From: felix.yang at linaro.org (Felix Yang) Date: Fri, 19 Feb 2016 20:23:11 +0800 Subject: [aarch64-port-dev ] RFR: 8150229: aarch64: c2 fix pipeline class for several instructions. Message-ID:

Hi,

Please review the following webrev: http://cr.openjdk.java.net/~fyang/8150229/webrev.00/
Jira issue: https://bugs.openjdk.java.net/browse/JDK-8150229

The pipeline class for some instructions is not set correctly. An example:

  instruct MoveF2I_reg_reg(iRegINoSp dst, vRegF src) %{
    match(Set dst (MoveF2I src));
    effect(DEF dst, USE src);
    ins_cost(INSN_COST);
    format %{ "fmovs $dst, $src\t# MoveF2I_reg_reg" %}
    ins_encode %{
      __ fmovs($dst$$Register, as_FloatRegister($src$$reg));
    %}
    ins_pipe(pipe_class_memory);   => Should be "fp_f2i"
  %}

The patch fixes this issue. Tested with jtreg hotspot. Please help commit this patch if it's OK.

Thanks, Felix.
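[Editorial sketch, referring back to item 1 of the 8149733 patch description above.] The idea of comparing the main body and the tail "in 8 bytes wide" can be illustrated in plain C++. This is only a minimal sketch under stated assumptions, not the HotSpot routine: the names bytes_equal_wide and load_unaligned are hypothetical, both inputs are assumed to have the same length n, and the short-array branch is one possible reading of the "test-ld-cmp" sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Load a T from a possibly unaligned address; memcpy keeps this free of UB.
template <typename T>
static inline T load_unaligned(const unsigned char* p) {
  T v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Hypothetical helper (not the HotSpot code): true if the n bytes at a and b match.
bool bytes_equal_wide(const void* a, const void* b, std::size_t n) {
  const unsigned char* pa = static_cast<const unsigned char*>(a);
  const unsigned char* pb = static_cast<const unsigned char*>(b);
  assert(n == 0 || (pa != nullptr && pb != nullptr));

  if (n >= 8) {
    // Main body: whole 8-byte words, always leaving 1..8 bytes for the tail.
    std::size_t i = 0;
    for (; i + 8 < n; i += 8) {
      if (load_unaligned<std::uint64_t>(pa + i) !=
          load_unaligned<std::uint64_t>(pb + i)) {
        return false;
      }
    }
    // Tail: a single 8-byte load that ends exactly at n. It may overlap
    // bytes already compared, which is harmless and avoids tail branches.
    return load_unaligned<std::uint64_t>(pa + n - 8) ==
           load_unaligned<std::uint64_t>(pb + n - 8);
  }

  // Short arrays: test a bit of n, load, compare (4 bytes, then 2, then 1).
  std::size_t i = 0;
  if (n & 4) {
    if (load_unaligned<std::uint32_t>(pa + i) !=
        load_unaligned<std::uint32_t>(pb + i)) {
      return false;
    }
    i += 4;
  }
  if (n & 2) {
    if (load_unaligned<std::uint16_t>(pa + i) !=
        load_unaligned<std::uint16_t>(pb + i)) {
      return false;
    }
    i += 2;
  }
  if (n & 1) {
    if (pa[i] != pb[i]) return false;
  }
  return true;
}
```

The overlapping tail load is what removes the per-size tail branches for lengths of 8 and above. The cost, as the message notes, is that a difference in the last few bytes is only seen after the whole main body has been compared.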
From aph at redhat.com Fri Feb 19 12:23:25 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 19 Feb 2016 12:23:25 +0000 Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor char_array_equals/byte_array_equals/string_equals In-Reply-To: References: Message-ID: <56C7093D.9070408@redhat.com> On 02/19/2016 12:13 PM, Hui Shi wrote: > Could some one help review this patch? This patch mainly aims to > refactoring similar code on AArch64 for string equals/ char array equals > and byte array equals. I'm looking at it. It's quite complex and I won't reply immediately. Thanks, Andrew. From aleksey.shipilev at oracle.com Fri Feb 19 12:36:18 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Fri, 19 Feb 2016 15:36:18 +0300 Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor char_array_equals/byte_array_equals/string_equals In-Reply-To: References: Message-ID: <56C70C42.5020309@oracle.com> Hi Hui, On 02/19/2016 03:13 PM, Hui Shi wrote: > webrev: http://cr.openjdk.java.net/~hshi/8149733/webrev/ > Not savvy with AArch64 assembly, but it does not look bad. My other comments are superficial: * Desperately needs spell-checking: "implenetaions", "implemenation", "eqauls", "comapre" * Inconsistent naming, e.g. "... = wordSize/step_size;" * "if (is_string_equal == false) {" * "if (exact_log >0 )" * Shouldn't be: 4533 ldrw(cnt1, Address(ary1, length_offset)); 4534 ldrw(tmp2, Address(ary2, length_offset)); 4535 cmp(cnt1, tmp2); spelled like: 4533 ldrw(cnt1, Address(ary1, length_offset)); 4534 ldrw(cnt2, Address(ary2, length_offset)); 4535 cmp(cnt1, cnt2); * Would be nice to keep the comments like "// 0-7 bytes left, cnt1 = #bytes left - 4" * Why TAIL01 block is predicated on (step_size == 1) now? 
> Test case: http://cr.openjdk.java.net/~hshi/8149733/TestArrayEqual.java > I think you should really, really, really use JMH for these benchmarks: http://openjdk.java.net/projects/code-tools/jmh/ It would also provide you an easy access to generated code profiling, with -prof perfasm. It is usually pretty clear from that output if your generated code needs even more tuneups. Cheers, -Aleksey From aph at redhat.com Sat Feb 20 10:00:55 2016 From: aph at redhat.com (Andrew Haley) Date: Sat, 20 Feb 2016 10:00:55 +0000 Subject: [aarch64-port-dev ] arraycopy optimisations on aarch64 In-Reply-To: <1455737358.14578.45.camel@mint> References: <1455737358.14578.45.camel@mint> Message-ID: <56C83957.4090908@redhat.com> On 17/02/16 19:29, Edward Nevill wrote: > To further muddy the waters I have two patches I would like to forward for your discussion. Webrevs, please. Andrew. From edward.nevill at gmail.com Sat Feb 20 15:23:53 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Sat, 20 Feb 2016 15:23:53 +0000 Subject: [aarch64-port-dev ] arraycopy optimisations on aarch64 In-Reply-To: <56C83957.4090908@redhat.com> References: <1455737358.14578.45.camel@mint> <56C83957.4090908@redhat.com> Message-ID: <1455981833.4817.2.camel@mint> On Sat, 2016-02-20 at 10:00 +0000, Andrew Haley wrote: > On 17/02/16 19:29, Edward Nevill wrote: > > To further muddy the waters I have two patches I would like to forward for your discussion. > > Webrevs, please. Here they are http://cr.openjdk.java.net/~enevill/8150082/webrev.0/ http://cr.openjdk.java.net/~enevill/8150313/webrev.0/ Note: They are not independent, the first must be applied before the second. Regards, Ed. 
From hui.shi at linaro.org Mon Feb 22 11:56:25 2016 From: hui.shi at linaro.org (Hui Shi) Date: Mon, 22 Feb 2016 19:56:25 +0800 Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor char_array_equals/byte_array_equals/string_equals In-Reply-To: <56C70C42.5020309@oracle.com> References: <56C70C42.5020309@oracle.com> Message-ID:

Thanks Aleksey & Andrew!

The patch is updated at http://cr.openjdk.java.net/~hshi/8149733/webrev2/ . It adds:
1. Fix misc spelling and format issues
2. Use cnt2 for array length compare, comment that cnt2 can't be used after length compare
3. Add more comments for tail handling

JMH test: http://cr.openjdk.java.net/~hshi/8149733/webrev2/JMHSample_97_ArrayEqual.java
Run with: java -jar ../benchmarks.jar '.*JMHSample_97*' -w 5 -wi 3 -i 5 -r 10 -f 0

The following are the test results before and after applying this patch. The refactoring looks better in most cases.

Length 1-8 before
Benchmark                                 Mode  Cnt       Score       Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5    2954.349 ±     0.076  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5    3232.505 ±     7.050  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5    2916.643 ±     0.126  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5    2778.486 ±     3.539  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5    4411.364 ±     0.149  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5    3898.965 ±     0.122  us/op

Length 1-8 after
Benchmark                                 Mode  Cnt       Score       Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5    2890.122 ±     1.279  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5    2893.002 ±     5.914  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5    2735.193 ±     0.096  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5    2753.818 ±     0.708  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5    4162.080 ±   818.652  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5    3824.308 ±     0.621  us/op

Length 9-16 before
Benchmark                                 Mode  Cnt       Score       Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5    4193.783 ±    22.731  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5    3819.967 ±    61.053  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5    5780.135 ±   104.966  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5    5694.717 ±    87.426  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5    6741.276 ±     1.112  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5    6439.345 ±   161.295  us/op

Length 9-16 after
Benchmark                                 Mode  Cnt       Score       Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5    2937.688 ±     0.074  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5    2842.832 ±     0.038  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5    5274.417 ±     0.912  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5    4611.007 ±     0.592  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5    6778.782 ±    28.918  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5    6455.762 ±    10.674  us/op

Length 32-39 before
Benchmark                                 Mode  Cnt       Score       Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5    5519.248 ±     1.799  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5    7204.390 ±    72.663  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5    7891.681 ±     4.859  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5    9830.466 ±     0.800  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5   10087.074 ±     1.976  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5   11383.347 ±     1.712  us/op

Length 32-39 after
Benchmark                                 Mode  Cnt       Score       Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5    5445.432 ±     1.396  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5    5856.414 ±     0.996  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5    7864.556 ±     1.408  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5    9274.953 ±    30.892  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5    9841.792 ±     0.721  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5   10750.615 ±     1.252  us/op

Length 1025-1032 before
Benchmark                                 Mode  Cnt       Score       Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5  103655.644 ± 15794.521  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5   90908.990 ±   120.387  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5  155515.192 ±   233.650  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5  148312.632 ±    59.342  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5  134281.945 ±    20.829  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  138580.479 ±   137.336  us/op

Length 1025-1032 after
Benchmark                                 Mode  Cnt       Score       Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5  102232.913 ±  1950.542  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5   90179.625 ±   102.160  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5  152515.169 ±   167.507  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5  140293.463 ±   198.916  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5  141776.676 ±    42.597  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  130141.577 ±    29.875  us/op

Regards
Hui

On 19 February 2016 at 20:36, Aleksey Shipilev wrote: > Hi Hui, > > On 02/19/2016 03:13 PM, Hui Shi wrote: > > webrev: http://cr.openjdk.java.net/~hshi/8149733/webrev/ > > > > Not savvy with AArch64 assembly, but it does not look bad. > > My other comments are superficial: > > * Desperately needs spell-checking: "implenetaions", "implemenation", > "eqauls", "comapre" > > * Inconsistent naming, e.g. "... = wordSize/step_size;" > > * "if (is_string_equal == false) {" > > * "if (exact_log >0 )" > > * Shouldn't be: > > 4533 ldrw(cnt1, Address(ary1, length_offset)); > 4534 ldrw(tmp2, Address(ary2, length_offset)); > 4535 cmp(cnt1, tmp2); > > spelled like: > > 4533 ldrw(cnt1, Address(ary1, length_offset)); > 4534 ldrw(cnt2, Address(ary2, length_offset)); > 4535 cmp(cnt1, cnt2); > > * Would be nice to keep the comments like "// 0-7 bytes left, cnt1 = > #bytes left - 4" > > * Why TAIL01 block is predicated on (step_size == 1) now? > > > Test case: http://cr.openjdk.java.net/~hshi/8149733/TestArrayEqual.java > > > > I think you should really, really, really use JMH for these benchmarks: > http://openjdk.java.net/projects/code-tools/jmh/ > > It would also provide you an easy access to generated code profiling, > with -prof perfasm.
It is usually pretty clear from that output if your > generated code needs even more tuneups. > > Cheers, > -Aleksey > > From edward.nevill at gmail.com Mon Feb 22 20:32:10 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Mon, 22 Feb 2016 20:32:10 +0000 Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions Message-ID: <1456173130.2735.8.camel@mint> Hi, Please review the following webrev http://cr.openjdk.java.net/~enevill/8150394/webrev.0/ This adds support for the CAS instructions in armv8.1. The use of the instructions is enabled/disabled by the use of -XX:+/-UseLSE. This is enabled automatically if detected in the hwcap. If UseLSE is enabled on a CPU which does not support these instructions a warning is issued but the instructions are enabled in any case. This is to allow use of the LSE extensions on 8.1 systems which are running older kernels. Tested before and after with jcstress (default mode). In both cases there was 1 failure which is due to a missing Unsafe method and always occurs with jdk9. Thanks for the review, Ed. From aleksey.shipilev at oracle.com Tue Feb 23 10:01:28 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Tue, 23 Feb 2016 13:01:28 +0300 Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor char_array_equals/byte_array_equals/string_equals In-Reply-To: References: <56C70C42.5020309@oracle.com> Message-ID: <56CC2DF8.60806@oracle.com> On 02/22/2016 02:56 PM, Hui Shi wrote: > Thanks Aleksey & Andrew! > > Patch is updated in http://cr.openjdk.java.net/~hshi/8149733/webrev2/ > , it adds on > 1. Fix misc spelling and format issues > 2. Use cnt2 for array length compare, comment that cnt2 can't be used > after length compare > 3. Add more comments for tail handling Still not getting this part: 4526 cmp(ary1, ary2); 4527 mov(result, false); 4528 br(Assembler::EQ, SAME); Should the mov be *after* the branch?
Also, "if (is_string_equal == false) {" should be "if (!is_string_equals)". > JMH test > in http://cr.openjdk.java.net/~hshi/8149733/webrev2/JMHSample_97_ArrayEqual.java > > . Run with java -jar ../benchmarks.jar '.*JMHSample_97*' -w 5 -wi 3 -i 5 > -r 10 -f 0 Um, -f 0 is bad: you contaminate the profiles. Also, the benchmark could be made much more idiomatic, solving a few other benchmarking pitfalls: http://cr.openjdk.java.net/~shade/scratch/ByteArrayEquals.java You might want to re-run with your updated code. Cheers, -Aleksey From aph at redhat.com Tue Feb 23 13:28:40 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 23 Feb 2016 13:28:40 +0000 Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor char_array_equals/byte_array_equals/string_equals In-Reply-To: References: <56CC2DF8.60806@oracle.com> <56C70C42.5020309@oracle.com> Message-ID: <56CC5E88.7040309@redhat.com> On 02/23/2016 11:33 AM, hui.shi wrote: > thanks! I will update and rerun JMH. I want to make some changes. Please wait until then. Thanks, Andrew. From aph at redhat.com Tue Feb 23 16:17:52 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 23 Feb 2016 16:17:52 +0000 Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor char_array_equals/byte_array_equals/string_equals In-Reply-To: References: <56CC2DF8.60806@oracle.com> <56C70C42.5020309@oracle.com> Message-ID: <56CC8630.2020708@redhat.com> My version is at http://cr.openjdk.java.net/~aph/8149733/ The changes I made are: I rewrote most of the comments because I couldn't understand them. I intend no criticism, and I understand that English isn't the language of your birth. Please tell me if you can understand my comments. "generic_array_equals" -> "arrays_equals" Reason: it's not generic, it's only bytes and chars. Also, this is what x86_64 calls the same routine.
"ary1" -> "a" Reason: "ary" just looks odd. Also, these are the names in the java code. "cmp; br.nz" -> "eor, bnz" Reason: Don't clobber flags for no reason. There's no need to check for the same arrays if we're comparing strings. Otherwise, the code is the same. I haven't much tested this, but it should give the same performance. Please test it, and tell me if I've broken anything. Thanks, Andrew. From aph at redhat.com Wed Feb 24 10:58:23 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 24 Feb 2016 10:58:23 +0000 Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions In-Reply-To: <1456173130.2735.8.camel@mint> References: <1456173130.2735.8.camel@mint> Message-ID: <56CD8CCF.1030404@redhat.com> On 22/02/16 20:32, Edward Nevill wrote: > http://cr.openjdk.java.net/~enevill/8150394/webrev.0/ > > This adds support for the CAS instructions in armv8.1. The C2 code for aarch64_enc_cmpxchg* is missing. It's quite tricky to refactor to allow LSE instructions. I'd add a wordsize parameter to the cas instruction, like this: #define INSN(NAME, a, r) \ void NAME(operand_size sz, Register Rs, Register Rt, Register Rn) { \ assert(Rs != Rn && Rs != Rt, "unpredictable instruction"); \ compare_and_swap(Rs, Rt, Rn, sz, 1, a, r); \ } INSN(cas, 0, 0) And this gets rid of a ton of instruction definitions: we only need CAS{A,L,AL}. Pass the operand size down to MacroAssembler::cmpxchgw: enc_class aarch64_enc_cmpxchgw(memory mem, iRegINoSp oldval, iRegINoSp newval) %{ MacroAssembler _masm(&cbuf); guarantee($mem$$index == -1 && $mem$$disp == 0, "impossible encoding"); __ cmpxchg(Assembler::word, $mem$$base$$Register, $oldval$$Register, $newval$$Register, &Assembler::ldxrw, &MacroAssembler::cmpw, &Assembler::stlxrw); %} void MacroAssembler::cmpxchgw(operand_size sz, Register oldv, Register newv, Register addr, Register tmp, Label &succeed, Label *fail) { if (UseLSE) { ... It'll be necessary to pass a memory barrier flag too. Andrew. 
From aph at redhat.com Wed Feb 24 12:49:30 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 24 Feb 2016 12:49:30 +0000 Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions In-Reply-To: <56CD8CCF.1030404@redhat.com> References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com> Message-ID: <56CDA6DA.3080901@redhat.com> One more thing: with 8148146 we have new entry points and C2 nodes for WeakCompareAndSwapX. We'll need to add a "bool weak" parameter to MacroAssembler::cmpxchgw. I suppose it's OK for this to be done in a later commit. Andrew. From aph at redhat.com Wed Feb 24 12:59:16 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 24 Feb 2016 12:59:16 +0000 Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions In-Reply-To: <56CDA6DA.3080901@redhat.com> References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com> <56CDA6DA.3080901@redhat.com> Message-ID: <56CDA924.6050106@redhat.com> On 02/24/2016 12:49 PM, Andrew Haley wrote: > One more thing: with 8148146 we have new entry points and C2 nodes > for WeakCompareAndSwapX. We'll need to add a "bool weak" parameter > to MacroAssembler::cmpxchgw. I suppose it's OK for this to be done > in a later commit. Forget that: hs-comp is currently broken because of Unsafe changes. I'm going to make it build again and push. Your changes can go on top of that. Andrew. From hui.shi at linaro.org Wed Feb 24 13:02:48 2016 From: hui.shi at linaro.org (Hui Shi) Date: Wed, 24 Feb 2016 21:02:48 +0800 Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor char_array_equals/byte_array_equals/string_equals In-Reply-To: <56CC8630.2020708@redhat.com> References: <56CC2DF8.60806@oracle.com> <56C70C42.5020309@oracle.com> <56CC8630.2020708@redhat.com> Message-ID: Thanks Andrew!
Your comments look much better, and performance doesn't change when running the JMHSample_97_ArrayEqual.java test.

Latest webrev: http://cr.openjdk.java.net/~hshi/8149733/webrev3/

Several small naming and format issues:
1. "ary1" -> "a" in method declaration
2. Use tmp1 instead of rscratch1 directly
   + ldrw(cnt1, Address(a1, length_offset));
   + ldrw(cnt2, Address(a2, length_offset));
   + eorw(rscratch1, cnt1, cnt2);
   + cbnzw(rscratch1, DONE);
3. Blank after "!"
   + if (! is_string) {

The following are results with Aleksey's updated test case (-w 5 -wi 3 -i3 -r 10). The first 4 groups are for the base run, with base string lengths 0, 8, 31 and 1024. Performance with the patch doesn't show the same improvement as in the earlier test. Only the small-length string equals tests still show an obvious improvement.

grep -A 6 "^Benchmark" base.result
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                   0     500  avgt    9   15.563 ± 0.005  us/op
JMH_ArrayEquals.byte_not_equal               0     500  avgt    9   16.425 ± 0.167  us/op
JMH_ArrayEquals.char_equal                   0     500  avgt    9   15.635 ± 0.294  us/op
JMH_ArrayEquals.char_not_equal               0     500  avgt    9   15.557 ± 0.377  us/op
JMH_ArrayEquals.string_equal                 0     500  avgt    9   22.307 ± 0.063  us/op
JMH_ArrayEquals.string_not_equal             0     500  avgt    9   21.368 ± 0.025  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                   8     500  avgt    9   16.058 ± 0.012  us/op
JMH_ArrayEquals.byte_not_equal               8     500  avgt    9   16.910 ± 0.574  us/op
JMH_ArrayEquals.char_equal                   8     500  avgt    9   17.094 ± 0.008  us/op
JMH_ArrayEquals.char_not_equal               8     500  avgt    9   17.114 ± 0.156  us/op
JMH_ArrayEquals.string_equal                 8     500  avgt    9   25.033 ± 0.074  us/op
JMH_ArrayEquals.string_not_equal             8     500  avgt    9   24.968 ± 0.244  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                  31     500  avgt    9   18.821 ± 0.091  us/op
JMH_ArrayEquals.byte_not_equal              31     500  avgt    9   19.763 ± 0.002  us/op
JMH_ArrayEquals.char_equal                  31     500  avgt    9   24.210 ± 0.033  us/op
JMH_ArrayEquals.char_not_equal              31     500  avgt    9   27.400 ± 0.382  us/op
JMH_ArrayEquals.string_equal                31     500  avgt    9   29.825 ± 0.098  us/op
JMH_ArrayEquals.string_not_equal            31     500  avgt    9   31.918 ± 0.100  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                1024     500  avgt    9  188.613 ± 7.386  us/op
JMH_ArrayEquals.byte_not_equal            1024     500  avgt    9  193.399 ± 4.448  us/op
JMH_ArrayEquals.char_equal                1024     500  avgt    9  316.324 ± 9.976  us/op
JMH_ArrayEquals.char_not_equal            1024     500  avgt    9  341.307 ± 1.082  us/op
JMH_ArrayEquals.string_equal              1024     500  avgt    9  324.059 ± 2.352  us/op
JMH_ArrayEquals.string_not_equal          1024     500  avgt    9  326.954 ± 1.121  us/op

grep -A 6 "^Benchmark" opt.result
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                   0     500  avgt    9   15.923 ± 0.132  us/op
JMH_ArrayEquals.byte_not_equal               0     500  avgt    9   15.996 ± 0.336  us/op
JMH_ArrayEquals.char_equal                   0     500  avgt    9   16.001 ± 0.127  us/op
JMH_ArrayEquals.char_not_equal               0     500  avgt    9   15.361 ± 0.004  us/op
JMH_ArrayEquals.string_equal                 0     500  avgt    9   21.083 ± 0.337  us/op
JMH_ArrayEquals.string_not_equal             0     500  avgt    9   19.887 ± 0.479  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                   8     500  avgt    9   16.574 ± 0.148  us/op
JMH_ArrayEquals.byte_not_equal               8     500  avgt    9   16.596 ± 0.719  us/op
JMH_ArrayEquals.char_equal                   8     500  avgt    9   17.874 ± 0.431  us/op
JMH_ArrayEquals.char_not_equal               8     500  avgt    9   17.831 ± 0.284  us/op
JMH_ArrayEquals.string_equal                 8     500  avgt    9   24.279 ± 0.033  us/op
JMH_ArrayEquals.string_not_equal             8     500  avgt    9   22.850 ± 0.444  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                  31     500  avgt    9   19.010 ± 0.006  us/op
JMH_ArrayEquals.byte_not_equal              31     500  avgt    9   19.962 ± 0.071  us/op
JMH_ArrayEquals.char_equal                  31     500  avgt    9   25.038 ± 0.108  us/op
JMH_ArrayEquals.char_not_equal              31     500  avgt    9   27.268 ± 0.063  us/op
JMH_ArrayEquals.string_equal                31     500  avgt    9   29.366 ± 0.103  us/op
JMH_ArrayEquals.string_not_equal            31     500  avgt    9   31.357 ± 0.047  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                1024     500  avgt    9  190.034 ± 4.067  us/op
JMH_ArrayEquals.byte_not_equal            1024     500  avgt    9  192.504 ± 4.675  us/op
JMH_ArrayEquals.char_equal                1024     500  avgt    9  313.925 ± 8.476  us/op
JMH_ArrayEquals.char_not_equal            1024     500  avgt    9  342.520 ± 7.915  us/op
JMH_ArrayEquals.string_equal              1024     500  avgt    9  326.392 ± 2.009  us/op
JMH_ArrayEquals.string_not_equal          1024     500  avgt    9  328.526 ± 3.617  us/op

Regards
Hui

On 24 February 2016 at 00:17, Andrew Haley wrote: > My version is at > > http://cr.openjdk.java.net/~aph/8149733/ > > The changes I made are: > > I rewrote most of the comments because I couldn't understand > them. I intend no criticism, and I understand that English > isn't the language of your birth. Please tell me if you can > understand my comments. > > "generic_array_equals" -> "arrays_equals" > Reason: it's not generic, it's only bytes and chars. > Also, this is what x86_64 calls the same routine. > > "ary1" -> "a" > Reason: "ary" just looks odd. Also, these are the names in the > java code. > > "cmp; br.nz" -> "eor, bnz" > Reason: Don't clobber flags for no reason. > > There's no need to check for the same arrays if we're > comparing strings. > > Otherwise, the code is the same. I haven't much tested this, but it > should give the same performance. Please test it, and tell me if I've > broken anything. > > Thanks, > > Andrew. > From aleksey.shipilev at oracle.com Wed Feb 24 14:24:42 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 24 Feb 2016 17:24:42 +0300 Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor char_array_equals/byte_array_equals/string_equals In-Reply-To: References: <56CC2DF8.60806@oracle.com> <56C70C42.5020309@oracle.com> <56CC8630.2020708@redhat.com> Message-ID: <56CDBD2A.9050002@oracle.com> On 02/24/2016 04:02 PM, Hui Shi wrote: > Thanks Andrew!
Your comment looks really better and performance doesn't > change when run JMHSample_97_ArrayEqual.java > test. > > latest webrev http://cr.openjdk.java.net/~hshi/8149733/webrev3/ > Good. > Following is result with Aleksey's updated test case (-w 5 -wi 3 -i3 -r > 10), first 4 group are for base run with base string length 0, 8, 31, > 1024. Performance with patch doesn't show same improvement with early > test. Only small length string equal tests still show obvious improvement. ...and that's okay for refactoring. Cheers, -Aleksey From adinn at redhat.com Wed Feb 24 16:50:58 2016 From: adinn at redhat.com (Andrew Dinn) Date: Wed, 24 Feb 2016 16:50:58 +0000 Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions In-Reply-To: <56CD8CCF.1030404@redhat.com> References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com> Message-ID: <56CDDF72.9080304@redhat.com> On 24/02/16 10:58, Andrew Haley wrote: > On 22/02/16 20:32, Edward Nevill wrote: >> http://cr.openjdk.java.net/~enevill/8150394/webrev.0/ >> >> This adds support for the CAS instructions in armv8.1. > > The C2 code for aarch64_enc_cmpxchg* is missing. > > It's quite tricky to refactor to allow LSE instructions. I'd add > a wordsize parameter to the cas instruction, like this: > > #define INSN(NAME, a, r) \ > void NAME(operand_size sz, Register Rs, Register Rt, Register Rn) { \ > assert(Rs != Rn && Rs != Rt, "unpredictable instruction"); \ > compare_and_swap(Rs, Rt, Rn, sz, 1, a, r); \ > } > INSN(cas, 0, 0) > > And this gets rid of a ton of instruction definitions: we only need > CAS{A,L,AL}. 
> > Pass the operand size down to MacroAssembler::cmpxchgw: > > enc_class aarch64_enc_cmpxchgw(memory mem, iRegINoSp oldval, iRegINoSp newval) %{ > MacroAssembler _masm(&cbuf); > guarantee($mem$$index == -1 && $mem$$disp == 0, "impossible encoding"); > __ cmpxchg(Assembler::word, $mem$$base$$Register, $oldval$$Register, > $newval$$Register, > &Assembler::ldxrw, &MacroAssembler::cmpw, &Assembler::stlxrw); > %} > > void MacroAssembler::cmpxchgw(operand_size sz, Register oldv, > Register newv, Register addr, Register tmp, > Label &succeed, Label *fail) { > > if (UseLSE) { > ... > > It'll be necessary to pass a memory barrier flag too. You mean to deal with the difference between aarch64_enc_cmpxchg and aarch64_enc_cmpxchg_acq? The former uses ldxr and is employed for CAS when UseBarriersForVolatile is true. The latter uses ldaxr and is employed when we optimize CAS because UseBarriersForVolatile is false. We need to use the relevant flavour of casxx inside cmpxchg for each of these two encodings. I was also going to recommend using LSE in cmpxchg but I was not sure exactly how it would need to work. The lock code does not loop when the stlxr fails (it branches to cas_failed). However the CAS code loops back to retry the load. If cmpxchg is rewritten to use casal (or casl) does it not still need to loop? Also, what does casxx allow us to do to implement the weaker variants of the new unsafe CAS API other than to include or exclude the acquire? Is there a variant of CAS operations which could use casa rather than casal? or even just cas?
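[Editorial sketch.] The two questions above, which ordering flavour a CAS carries and whether a retry loop is needed, can be illustrated outside HotSpot with standard C++ atomics. This is a minimal sketch under stated assumptions, not the patch under review: cas_strong and cas_via_weak_loop are hypothetical names, and the correspondence between the four orderings and CAS, CASA, CASL and CASAL is only what a compiler targeting ARMv8.1 LSE may choose to emit, not something this code guarantees.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Strong CAS with a caller-chosen ordering. Returns true if the swap happened.
// Assumption: with LSE available, relaxed / acquire / release / acq_rel may be
// lowered to CAS / CASA / CASL / CASAL respectively.
inline bool cas_strong(std::atomic<std::int32_t>& cell, std::int32_t expected,
                       std::int32_t desired, std::memory_order order) {
  // The failure ordering may not be release or acq_rel, so derive a legal one.
  std::memory_order fail =
      (order == std::memory_order_acq_rel) ? std::memory_order_acquire
    : (order == std::memory_order_release) ? std::memory_order_relaxed
    : order;
  return cell.compare_exchange_strong(expected, desired, order, fail);
}

// A weak CAS models a single load-exclusive/store-exclusive attempt: it may
// fail even when the value matched. Strong semantics then need a retry loop.
inline bool cas_via_weak_loop(std::atomic<std::int32_t>& cell,
                              std::int32_t expected, std::int32_t desired) {
  std::int32_t observed = expected;
  while (!cell.compare_exchange_weak(observed, desired,
                                     std::memory_order_acq_rel,
                                     std::memory_order_acquire)) {
    if (observed != expected) {
      return false;  // genuine mismatch: another value is stored
    }
    // Spurious failure: the value still matches, so try again.
  }
  return true;
}
```

In this model the loop exists only to absorb spurious failures of the weak form. A strong compare-exchange reports failure only when the stored value differs from the expected one, so a caller that wants weak semantics can drop the loop and a caller that uses the strong form does not need one.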
regards, Andrew Dinn ----------- From aph at redhat.com Wed Feb 24 16:53:49 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 24 Feb 2016 16:53:49 +0000 Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions In-Reply-To: <56CDDF72.9080304@redhat.com> References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com> <56CDDF72.9080304@redhat.com> Message-ID: <56CDE01D.6080702@redhat.com> On 02/24/2016 04:50 PM, Andrew Dinn wrote: > You mean to deal with the difference between aarch64_enc_cmpxchg and > aarch64_enc_cmpxchg_acq? We now have CAS for non-ordered memory too. I'm preparing a patch as we speak. It should become clearer. Andrew. From gnu.andrew at redhat.com Thu Feb 25 00:34:08 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Wed, 24 Feb 2016 19:34:08 -0500 (EST) Subject: [aarch64-port-dev ] [PATCH] [jdk8u] jvm.cfg parsing broken, resulting in broken JDK In-Reply-To: <262219393.26885369.1456334677753.JavaMail.zimbra@redhat.com> Message-ID: <1220398795.26965429.1456360448570.JavaMail.zimbra@redhat.com> Webrev: http://cr.openjdk.java.net/~andrew/aarch64-8/sync/jdk.webrev.02/ This change: http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/2940c1ead99bd7635 introduced a change to jvm.cfg parsing local to the 8u port. Then, this change: http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/9399aa7ef558 removed part of that patch, breaking jvm.cfg parsing: "Error: missing `client' JVM at `/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-6.b15.fc24.aarch64/openjdk/build/jdk8.build/images/j2sdk-image/jre/lib/aarch64/client/libjvm.so'." [0] This webrev removes the rest of 2940c1ead99bd7635 and replaces jvm.cfg with the version used in OpenJDK 9, allowing aarch64/jdk8u to build again. Ok to push? [0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c14 -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. 
(http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From gnu.andrew at redhat.com Thu Feb 25 00:41:12 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Wed, 24 Feb 2016 19:41:12 -0500 (EST) Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template which breaks builds with GCC 6 In-Reply-To: <1527617114.26965440.1456360493395.JavaMail.zimbra@redhat.com> Message-ID: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com> Webrev: http://cr.openjdk.java.net/~andrew/aarch64-8/gcc6/webrev.01/ The min template: template static const T& min (const T& a, const T& b) causes the build to fail with GCC 6 [0], where the default C++ standard (-std=gnu++98) has to be explicitly specified, as the default has changed. /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b15.fc24.aarch64/openjdk/hotspot/src/share/vm/utilities/globalDefinitions.hpp:1113:18: warning: variable templates only available with -std=c++14 or -std=gnu++14 #define min(a,b) Do_not_use_min_use_MIN2_instead ^ /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b15.fc24.aarch64/openjdk/hotspot/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp:197:36: note: in expansion of macro 'min' template static const T& min (const T& a, const T& b) { ^~~ /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b15.fc24.aarch64/openjdk/hotspot/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp:198:3: error: expected ';' before 'return' return (a > b) ? b : a; ^~~~~~ /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b15.fc24.aarch64/openjdk/hotspot/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp:199:1: error: expected declaration before '}' token } ^ The template appears to be unused and removing it allows the build to succeed with GCC 6. Ok to push? [0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c1 -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. 
(http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From adinn at redhat.com Thu Feb 25 08:42:36 2016 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 Feb 2016 08:42:36 +0000 Subject: [aarch64-port-dev ] [PATCH] [jdk8u] jvm.cfg parsing broken, resulting in broken JDK In-Reply-To: <1220398795.26965429.1456360448570.JavaMail.zimbra@redhat.com> References: <1220398795.26965429.1456360448570.JavaMail.zimbra@redhat.com> Message-ID: <56CEBE7C.8070808@redhat.com> On 25/02/16 00:34, Andrew Hughes wrote: > Webrev: > http://cr.openjdk.java.net/~andrew/aarch64-8/sync/jdk.webrev.02/ > > This change: > > http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/2940c1ead99bd7635 > > introduced a change to jvm.cfg parsing local to the 8u port. Then, > this change: > > http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/9399aa7ef558 > > removed part of that patch, breaking jvm.cfg parsing: > > "Error: missing `client' JVM at > `/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-6.b15.fc24.aarch64/openjdk/build/jdk8.build/images/j2sdk-image/jre/lib/aarch64/client/libjvm.so'." > [0] > > This webrev removes the rest of 2940c1ead99bd7635 and replaces > jvm.cfg with the version used in OpenJDK 9, allowing aarch64/jdk8u to > build again. > > Ok to push? > > [0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c14 Yes, this looks right to me. It also explains why my latest build is having problems when I run without -server. So, please push. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 
3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From adinn at redhat.com Thu Feb 25 08:43:16 2016 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 Feb 2016 08:43:16 +0000 Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template which breaks builds with GCC 6 In-Reply-To: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com> References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com> Message-ID: <56CEBEA4.4050301@redhat.com> On 25/02/16 00:41, Andrew Hughes wrote: > Webrev: http://cr.openjdk.java.net/~andrew/aarch64-8/gcc6/webrev.01/ > . . . > Ok to push? Yes, please. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From aph at redhat.com Thu Feb 25 09:25:14 2016 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 Feb 2016 09:25:14 +0000 Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template which breaks builds with GCC 6 In-Reply-To: <56CEBEA4.4050301@redhat.com> References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com> <56CEBEA4.4050301@redhat.com> Message-ID: <56CEC87A.5070600@redhat.com> On 25/02/16 08:43, Andrew Dinn wrote: > Yes, please. GCC 6 changes should be made to JDK 9 upstream and backported. JDK8u is not really for development. Andrew. From aph at redhat.com Thu Feb 25 09:58:31 2016 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 Feb 2016 09:58:31 +0000 Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template which breaks builds with GCC 6 In-Reply-To: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com> References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com> Message-ID: <56CED047.4060604@redhat.com> On 25/02/16 00:41, Andrew Hughes wrote: > Ok to push? > > [0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c1 No. 
This should go to JDK 9 and get backported. Only JDK8-specific
patches get reviewed on 8.

Andrew.

From edward.nevill at gmail.com Thu Feb 25 10:06:26 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 25 Feb 2016 10:06:26 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions
In-Reply-To: <56CD8CCF.1030404@redhat.com>
References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com>
Message-ID: <1456394786.1383.18.camel@mint>

On Wed, 2016-02-24 at 10:58 +0000, Andrew Haley wrote:
> On 22/02/16 20:32, Edward Nevill wrote:
> And this gets rid of a ton of instruction definitions: we only need
> CAS{A,L,AL}.
>
> Pass the operand size down to MacroAssembler::cmpxchgw:
>
> enc_class aarch64_enc_cmpxchgw(memory mem, iRegINoSp oldval, iRegINoSp newval) %{
>   MacroAssembler _masm(&cbuf);
>   guarantee($mem$$index == -1 && $mem$$disp == 0, "impossible encoding");
>   __ cmpxchg(Assembler::word, $mem$$base$$Register, $oldval$$Register,
>              $newval$$Register,
>              &Assembler::ldxrw, &MacroAssembler::cmpw, &Assembler::stlxrw);
> %}
>
> void MacroAssembler::cmpxchgw(operand_size sz, Register oldv,
>                               Register newv, Register addr, Register tmp,
>                               Label &succeed, Label *fail) {
>
>   if (UseLSE) {
>   ...
>
> It'll be necessary to pass a memory barrier flag too.

Hi,

Is this something like what you had in mind?

http://cr.openjdk.java.net/~enevill/8150394/webrev.1/

WRT WeakCompareAndSwap I think it would be better if that went in as a
separate change as we will have to backport this to jdk8 and doing it
as one change means unpicking it later.

Tested with jcstress with and without -XX:+UseLSE

All the best,
Ed.

From aph at redhat.com Thu Feb 25 10:11:44 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 10:11:44 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
Message-ID: <56CED360.1000000@redhat.com>

The jdk8 development tree at aarch64/jdk8 has been in use for some
time. I don't think we need it any more.
aarch64/jdk8u tracks upstream jdk8u far more closely: it differs from
upstream only in the minimum number of places needed to get AArch64 to
work.

I'm proposing to close aarch64/jdk8 to all updates. aarch64/jdk8u
will be used for all commits. jdk8 is interesting from a historical
point of view, so it will be made read only.

The rules for committing to aarch64/jdk8u are:

1. All non-AArch64-specific patches come from jdk8u. If you want to
   change anything non-AArch64 submit it to jdk8u.

2. All AArch64-specific patches, if they are relevant to jdk9, must
   be submitted there first and back-ported to jdk8u.

Andrew.

From aph at redhat.com Thu Feb 25 10:16:45 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 10:16:45 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions
In-Reply-To: <1456394786.1383.18.camel@mint>
References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com> <1456394786.1383.18.camel@mint>
Message-ID: <56CED48D.9090702@redhat.com>

On 25/02/16 10:06, Edward Nevill wrote:
> Is this something like what you had in mind?
>
> http://cr.openjdk.java.net/~enevill/8150394/webrev.1/

Something like. I'll integrate what I've got with this and post it soon.

Andrew.

From edward.nevill at gmail.com Thu Feb 25 10:25:40 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 25 Feb 2016 10:25:40 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CED360.1000000@redhat.com>
References: <56CED360.1000000@redhat.com>
Message-ID: <1456395940.7333.2.camel@mint>

On Thu, 2016-02-25 at 10:11 +0000, Andrew Haley wrote:
> The jdk8 development tree at aarch64/jdk8 has been in use for some
> time. I don't think we need it any more. aarch64/jdk8u tracks
> upstream jdk8u far more closely: it differs from upstream only in the
> minimum number of places needed to get AArch64 to work.
Perhaps to close out the jdk8 tree it might be good to backport the following patch http://openjdk.linaro.org/releases/1602/jdk8/patches/8592.patch This is a really nasty, critical bug and people are still building from this tree, Regards, Ed. From aph at redhat.com Thu Feb 25 10:45:46 2016 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 Feb 2016 10:45:46 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <1456395940.7333.2.camel@mint> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> Message-ID: <56CEDB5A.3090200@redhat.com> On 02/25/2016 10:25 AM, Edward Nevill wrote: > On Thu, 2016-02-25 at 10:11 +0000, Andrew Haley wrote: >> The jdk8 development tree at aarch64/jdk8 has been in use for some >> time. I don't think we need it any more. aarch64/jdk8u tracks >> upstream jdk8u far more closely: it differs from upstream only in the >> minimum number of places needed to get AArch64 to work. > > Perhaps to close out the jdk8 tree it might be good to backport the following patch > > http://openjdk.linaro.org/releases/1602/jdk8/patches/8592.patch > > This is a really nasty, critical bug and people are still building from this tree, Andrew Dinn is supposed to be doing this. Andrew. From adinn at redhat.com Thu Feb 25 11:37:00 2016 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 Feb 2016 11:37:00 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <1456395940.7333.2.camel@mint> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> Message-ID: <56CEE75C.9080102@redhat.com> On 25/02/16 10:25, Edward Nevill wrote: > On Thu, 2016-02-25 at 10:11 +0000, Andrew Haley wrote: >> The jdk8 development tree at aarch64/jdk8 has been in use for some >> time. I don't think we need it any more. aarch64/jdk8u tracks >> upstream jdk8u far more closely: it differs from upstream only in the >> minimum number of places needed to get AArch64 to work. 
> > Perhaps to close out the jdk8 tree it might be good to backport the following patch > > http://openjdk.linaro.org/releases/1602/jdk8/patches/8592.patch > > This is a really nasty, critical bug and people are still building from this tree, I'm in the process of backporting all missing jdk8 hotspot patches into jdk8u. Unfortunately, one of the patches applied cleanly but made the changes in the wrong place (all those 0x1f and 0x3f substitutions for shifts were carefully misaligned by hg patch into the wrong cases). I have just started testing a newly patched build which corrects for this failure. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From adinn at redhat.com Thu Feb 25 11:53:23 2016 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 Feb 2016 11:53:23 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56CEE75C.9080102@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> Message-ID: <56CEEB33.2060400@redhat.com> On 25/02/16 11:37, Andrew Dinn wrote: > I'm in the process of backporting all missing jdk8 hotspot patches into > jdk8u. Unfortunately, one of the patches applied cleanly but made the > changes in the wrong place (all those 0x1f and 0x3f substitutions for > shifts were carefully misaligned by hg patch into the wrong cases). I > have just started testing a newly patched build which corrects for this > failure. I have successfully backported all the jdk8 hotspot patches which were not yet included in jdk8u. It seemed appropriate for all of them to go in (including test fixes). The list of patches is included below. It's basically everything from aarch64/jdk hotspot revision 8559 to revision 8590 excluding revision 8577 which had already been cherry-picked for prior inclusion. 
All patches were created by exporting the jdk8u revision. I only had two
problems, applying revisions 8571 (more32bitshifts.patch) and 8576
(largecodecache.patch). Both were caused by the out of order
cherry-pick. I tweaked both cases by hand and diffed with the
corresponding version and with head to make sure that we ended up with
the correct code.

The resulting jdk8u tree builds and runs these basic smoke tests

  java Hello
  javac Hello.java
  netbeans (edit, build and run sample project)

Shall I push these changes now? Or do you want to vet some of the patches?

regards,

Andrew Dinn
-----------

Patches Backported from jdk8 to jdk8u
[listed in order of application]

8134322.patch
revid: 8559
Fix several errors in C2 biased locking implementation

8136524.patch
revid: 8560
test/compiler/runtime/7196199/Test7196199.java fails

8136596.patch
revid: 8561
Remove MemBarRelease when final field's allocation is NoEscape or ArgEscape

8136615.patch
revid: 8562
elide DecodeN when followed by CmpP 0

8136165.patch
revid: 8563
Tidy up compiled native calls

8138641.patch
revid: 8564
Disable C2 peephole by default for aarch64

8138575.patch
revid: 8565
Improve generated code for profile counters

8139674.patch
revid: 8566
guarantee failure in TestOptionsWithRanges.java

8131645.patch
revid: 8567
crash on Cavium when using G1

volcas.patch
revid: 8568
Backport optimization of volatile puts/gets and CAS to use ldar/stlr

8131645-correction.patch
revid: 8569
Fix thinko when backporting 8131645. Table ends up being allocated twice.

8140611.patch
revid: 8570
jtreg test jdk/tools/pack200/UnpackerMemoryTest.java SEGVs

more32bitshifts.patch
revid: 8571
Some 32 bit shifts still being anded with 0x3f instead of 0x1f.

  Applied without reporting an error but all the offsets got shifted,
  causing 0x1f and 0x3f to be substituted in the wrong places. Had to
  redo this part of the patch by hand until it looked like the jdk8
  version. Also, checked by eyeball that all L instructions used 0x3f
  and all I instructions used 0x1f.

8135157.patch
revid: 8572
DMB elimination in AArch64 C2 synchronization implementation

8138966.patch
revid: 8573
Intermittent SEGV running ParallelGC

8143067.patch
revid: 8574
guarantee failure in javac

8143285.patch
revid: 8575
Missing load acquire when checking if ConstantPoolCacheEntry is resolved

largecodecache.patch
revid: 8576
Add support for large code cache

  Failed to apply as is because of a conflict with a cherry-picked patch
  applied out of order (Remove AArch64-specific code in
  generateOptoStub.cpp). The failure relates to 2 problems. The out of
  order patch corrected a typo in the comment text used to contextualize
  the 2nd hunk in this patch. It also made an incompatible change to the
  computation of the instruction count, i.e. there is a real merge
  conflict here when the patch is applied out of order. So, the
  cherry-picked patch must itself have been tweaked to be applied out of
  order.

largecodecache-correction.patch
revid: 8578
Fix client build after addition of large code cache support

8146286.patch
revid: 8579
guarantee failures with large code cache sizes on jtreg test java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java

8143584.patch
revid: 8580
Load constant pool tag and class status with load acquire

8144028.patch
revid: 8581
Use AArch64 bit-test instructions in C2

8144587.patch
revid: 8582
generate vectorized MLA/MLS instructions

8145438.patch
revid: 8583
Guarantee failures since 8144028: Use AArch64 bit-test instructions in C2

8144582.patch
revid: 8584
AArch64 does not generate correct branch profile data

8144201.patch
revid: 8585
jdk/test/com/sun/net/httpserver/Test6a.java fails with --enable-unlimited-crypto

8146678.patch
revid: 8586
assertion failure: call instruction in an infinite loop

8146843.patch
revid: 8587
add scheduling support for FP and vector instructions

8146709.patch
revid: 8588
Incorrect use of ADRP for byte_map_base

8147805.patch
revid: 8589
C1 segmentation fault due to inline Unsafe.getAndSetObject

8148240.patch
revid: 8590
random infrequent null pointer exceptions in javac

From aph at redhat.com Thu Feb 25 11:57:41 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 11:57:41 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CEEB33.2060400@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
Message-ID: <56CEEC35.2020101@redhat.com>

On 02/25/2016 11:53 AM, Andrew Dinn wrote:
> Shall I push these changes now? Or do you want to vet some of the patches?

Are any of them outside AArch64-specific directories?

Andrew.

From adinn at redhat.com Thu Feb 25 12:26:18 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 25 Feb 2016 12:26:18 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CEEC35.2020101@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com>
Message-ID: <56CEF2EA.7090701@redhat.com>

On 25/02/16 11:57, Andrew Haley wrote:
> On 02/25/2016 11:53 AM, Andrew Dinn wrote:
>> Shall I push these changes now? Or do you want to vet some of the patches?
>
> Are any of them outside AArch64-specific directories?

Yes, in quite a few cases -- but they all appear to be backports of
changes also made upstream. See below for details. n.b. revision ids are
for aarch64/jdk8/hotspot tree.

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US)

8136596.patch
revid: 8561
8136596: Remove MemBarRelease when final field's allocation is NoEscape or ArgEscape

this changes 3 files.
the first is src/share/vm/opto/callnode.hpp @@ -894,6 +894,18 @@ // Convenience for initialization->maybe_set_complete(phase) bool maybe_set_complete(PhaseGVN* phase); + + // Return true if allocation doesn't escape thread, its escape state + // needs be noEscape or ArgEscape. InitializeNode._does_not_escape + // is true when its allocation's escape state is noEscape or + // ArgEscape. In case allocation's InitializeNode is NULL, check + // AlllocateNode._is_non_escaping flag. + // AlllocateNode._is_non_escaping is true when its escape state is + // noEscape. + bool does_not_escape_thread() { + InitializeNode* init = NULL; + return _is_non_escaping || (((init = initialization()) != NULL) && init->does_not_escape()); + } }; //------------------------------AllocateArray--------------------------------- the second is src/share/vm/opto/macro.cpp @@ -1385,7 +1385,8 @@ // MemBarStoreStore so that stores that initialize this object // can't be reordered with a subsequent store that makes this // object accessible by other threads. - if (init == NULL || (!init->is_complete_with_arraycopy() && !init->does_not_escape())) { + if (!alloc->does_not_escape_thread() && + (init == NULL || !init->is_complete_with_arraycopy())) { if (init == NULL || init->req() < InitializeNode::RawStores) { // No InitializeNode or no stores captured by zeroing // elimination. Simply add the MemBarStoreStore after object the third is src/share/vm/opto/memnode.cpp @@ -3065,7 +3065,7 @@ // Final field stores. Node* alloc = AllocateNode::Ideal_allocation(in(MemBarNode::Precedent), phase); if ((alloc != NULL) && alloc->is_Allocate() && - alloc->as_Allocate()->_is_non_escaping) { + alloc->as_Allocate()->does_not_escape_thread()) { // The allocated object does not escape. 
       eliminate = true;
     }

8131645.patch
revid: 8567
8131645: crash on Cavium when using G1

this changes src/share/vm/gc_implementation/g1/g1CodeCacheRemSet.cpp

@@ -200,6 +200,9 @@

 void G1CodeRootSet::allocate_small_table() {
   _table = new CodeRootSetTable(SmallSize);
+  CodeRootSetTable* temp = new CodeRootSetTable(SmallSize);
+
+  OrderAccess::release_store_ptr(&_table, temp);
 }

 void CodeRootSetTable::purge_list_append(CodeRootSetTable* table) {

volcas.patch
revid: 8568
Backport optimization of volatile puts/gets and CAS to use ldar/stlr

this changes src/share/vm/opto/graphKit.cpp

@@ -3803,7 +3803,7 @@

   // Smash zero into card
   if( !UseConcMarkSweepGC ) {
-    __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::release);
+    __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::unordered);
   } else {
     // Specialized path for CM store barrier
     __ storeCM(__ ctrl(), card_adr, zero, oop_store, adr_idx, bt, adr_type);

8131645-correction.patch
revid: 8569
Fix thinko when backporting 8131645. Table ends up being allocated twice.

@@ -199,7 +199,6 @@
 }

 void G1CodeRootSet::allocate_small_table() {
-  _table = new CodeRootSetTable(SmallSize);
   CodeRootSetTable* temp = new CodeRootSetTable(SmallSize);

   OrderAccess::release_store_ptr(&_table, temp);

8138966.patch
revid: 8573
8138966: Intermittent SEGV running ParallelGC

this changes src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp

@@ -348,7 +348,7 @@
     HeapWord* _partial_obj_addr;
     region_sz_t _partial_obj_size;
     region_sz_t volatile _dc_and_los;
-    bool _blocks_filled;
+    bool volatile _blocks_filled;

 #ifdef ASSERT
     size_t _blocks_filled_count; // Number of block table fills.
@@ -499,7 +499,9 @@ inline bool ParallelCompactData::RegionData::blocks_filled() const { - return _blocks_filled; + bool result = _blocks_filled; + OrderAccess::acquire(); + return result; } #ifdef ASSERT @@ -513,6 +515,7 @@ inline void ParallelCompactData::RegionData::set_blocks_filled() { + OrderAccess::release(); _blocks_filled = true; // Debug builds count the number of times the table was filled. DEBUG_ONLY(Atomic::inc_ptr(&_blocks_filled_count)); largecodecache.patch revid: 8576 Add support for large code cache this makes changes to two files. firstly src/share/vm/runtime/arguments.cpp @@ -1137,9 +1137,8 @@ } // Increase the code cache size - tiered compiles a lot more. if (FLAG_IS_DEFAULT(ReservedCodeCacheSize)) { - FLAG_SET_DEFAULT(ReservedCodeCacheSize, ReservedCodeCacheSize * 5); - // The maximum B/BL offset range on AArch64 is 128MB - AARCH64_ONLY(FLAG_SET_DEFAULT(ReservedCodeCacheSize, MIN2(ReservedCodeCacheSize, 128*M))); + FLAG_SET_DEFAULT(ReservedCodeCacheSize, + MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5)); } if (!UseInterpreter) { // -Xcomp Tier3InvokeNotifyFreqLog = 0; @@ -2476,11 +2475,11 @@ "Invalid ReservedCodeCacheSize=%dK. Must be at least %uK.\n", ReservedCodeCacheSize/K, min_code_cache_size/K); status = false; - } else if (ReservedCodeCacheSize > 2*G) { - // Code cache size larger than MAXINT is not supported. + } else if (ReservedCodeCacheSize > CODE_CACHE_SIZE_LIMIT) { + // Code cache size larger than CODE_CACHE_SIZE_LIMIT is not supported. jio_fprintf(defaultStream::error_stream(), "Invalid ReservedCodeCacheSize=%dM. Must be at most %uM.\n", ReservedCodeCacheSize/M, - (2*G)/M); + CODE_CACHE_SIZE_LIMIT/M); status = false; } and also src/share/vm/utilities/globalDefinitions.hpp @@ -414,6 +414,11 @@ ProfileRTM = 0x0 // Use RTM with abort ratio calculation }; +// The maximum size of the code cache. Can be overridden by targets. +#define CODE_CACHE_SIZE_LIMIT (2*G) +// Allow targets to reduce the default size of the code cache. 
+#define CODE_CACHE_DEFAULT_LIMIT CODE_CACHE_SIZE_LIMIT + #ifdef TARGET_ARCH_x86 # include "globalDefinitions_x86.hpp" #endif 8145438.patch revid: 8583 8145438: Guarantee failures since 8144028: Use AArch64 bit-test instructions in C2 this makes a small change to src/share/vm/adlc/formssel.cpp @@ -1239,7 +1239,8 @@ !is_short_branch() && // Don't match another short branch variant reduce_result() != NULL && strcmp(reduce_result(), short_branch->reduce_result()) == 0 && - _matrule->equivalent(AD.globalNames(), short_branch->_matrule)) { + _matrule->equivalent(AD.globalNames(), short_branch->_matrule) && + equivalent_predicates(this, short_branch)) { // The instructions are equivalent. // Now verify that both instructions have the same parameters and From aph at redhat.com Thu Feb 25 13:38:32 2016 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 Feb 2016 13:38:32 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56CEF2EA.7090701@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> Message-ID: <56CF03D8.8070101@redhat.com> On 02/25/2016 12:26 PM, Andrew Dinn wrote: > > > On 25/02/16 11:57, Andrew Haley wrote: >> On 02/25/2016 11:53 AM, Andrew Dinn wrote: >>> Shall I push these changes now? Or do you want to vet some of the patches? >> >> Are any of them outside AArch64-specific directories? > > Yes, in quite a few cases -- but they all appear to be backports of > changes also made upstream. See below for details. n.b. revision ids are > for aarch64/jdk8/hotspot tree. > > > 8136596.patch > revid: 8561 > 8136596: Remove MemBarRelease when final field's allocation is NoEscape > or ArgEscape This is a minor optimization, not submitted to jdk8u. It's also somewhat risky. OK if you make it AARCH64_ONLY. > 8131645.patch > revid: 8567 > 8131645: crash on Cavium when using G1 OK: serious crasher bug fix. 
Ed, has this been submitted to jdk8u? It should be because it's not
AArch64-specific.

> volcas.patch
> revid: 8568
> Backport optimization of volatile puts/gets and CAS to use ldar/stlr

It's probably safe, but there have been significant reworkings in
this area.

Maybe make this AARCH64_ONLY ? Please have a look at the generated
code when running G1.

> 8131645-correction.patch
> revid: 8569
> Fix thinko when backporting 8131645. Table ends up being allocated twice.

Yes, needed for 8131645.

> 8138966.patch
> revid: 8573
> 8138966: Intermittent SEGV running ParallelGC

Yes, serious crasher bug. This is already in jdk8u.
http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/110735ab93ec

Please ask Andrew Hughes about this one: it should already be in.

> largecodecache.patch
> revid: 8576
> Add support for large code cache
>
> this makes changes to two files.

The non-AArch64-specific parts of this patch are not used by anything
so should not be included. The rest is OK.

> 8145438.patch
> revid: 8583
> 8145438: Guarantee failures since 8144028: Use AArch64 bit-test
> instructions in C2
>
> this makes a small change to src/share/vm/adlc/formssel.cpp

This is OK. It can't be submitted to jdk8u upstream because
it's only needed for AArch64. Make it AARCH64_ONLY, just for
safety.

Andrew.

From adinn at redhat.com Thu Feb 25 14:20:51 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 25 Feb 2016 14:20:51 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CF03D8.8070101@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com>
Message-ID: <56CF0DC2.8080104@redhat.com>

On 25/02/16 13:38, Andrew Haley wrote:
. . .
>> 8136596.patch >> revid: 8561 >> 8136596: Remove MemBarRelease when final field's allocation is NoEscape >> or ArgEscape > > This is a minor optimization, not submitted to jdk8u. It's also > somewhat risky. OK if you make it AARCH64_ONLY. Ok, will do. >> volcas.patch >> revid: 8568 >> Backport optimization of volatile puts/gets and CAS to use ldar/stlr > > It's probably safe, but there have been significant reworkings in > this area. > > Maybe make this AARCH64_ONLY ? Please have a look at the generated > code when running G1. This shared change reverts part of a modification you had previously made to the shared code in an earlier attempt to implement optimization of volatile puts on AArch64. So, applying this patch merely restores the status quo as regards the shared code that is currently in both jdk8 and upstream jdk8u. Also, the same reversion was applied and is still present in jdk9. It has no bearing on any of the reworkings that have happened since the original patch was added to jdk9 and I don't envisage it having any such effect. So. I don't think there is any reason to make this AARCH64_ONLY. >> 8138966.patch >> revid: 8573 >> 8138966: Intermittent SEGV running ParallelGC > > Yes, serious crasher bug. This is already in jdk8u. > http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/110735ab93ec > > Please Ask Andrew Hughes about this one: it should already be in. We already agreed that I would include this as part of my patch. >> largecodecache.patch >> revid: 8576 >> Add support for large code cache >> >> this makes changes to two files. > > The non-AArch64-specific parts of this patch are not used by anything > so should not be included. The rest is OK. Ok, I will rework this to include only the AArch64-specific code. >> 8145438.patch >> revid: 8583 >> 8145438: Guarantee failures since 8144028: Use AArch64 bit-test >> instructions in C2 >> >> this makes a small change to src/share/vm/adlc/formssel.cpp > > This is OK. 
> It can't be submitted to jdk8u upstream because
> it's only needed for AArch64. Make it AARCH64_ONLY, just for
> safety.

Ok, will do.

I'll build and test with the revised patches and report (this time
with a webrev) when I have managed to get it to pass basic smoke tests.

regards,

Andrew Dinn
-----------

From aph at redhat.com Thu Feb 25 14:25:30 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 14:25:30 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CF0DC2.8080104@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com>
Message-ID: <56CF0EDA.9070808@redhat.com>

On 02/25/2016 02:20 PM, Andrew Dinn wrote:
> On 25/02/16 13:38, Andrew Haley wrote:
> . . .
>
>>> volcas.patch
>>> revid: 8568
>>> Backport optimization of volatile puts/gets and CAS to use ldar/stlr
>>
>> It's probably safe, but there have been significant reworkings in
>> this area.
>>
>> Maybe make this AARCH64_ONLY ? Please have a look at the generated
>> code when running G1.
>
> This shared change reverts part of a modification you had previously
> made to the shared code in an earlier attempt to implement optimization
> of volatile puts on AArch64. So, applying this patch merely restores the
> status quo as regards the shared code that is currently in both jdk8 and
> upstream jdk8u.
>
> Also, the same reversion was applied and is still present in jdk9. It
> has no bearing on any of the reworkings that have happened since the
> original patch was added to jdk9 and I don't envisage it having any such
> effect.
>
> So. I don't think there is any reason to make this AARCH64_ONLY.

OK. I'm a bit mystified by the history of this one, but as long as
in the end we don't diverge from upstream jdk8u I'm happy.
>>> 8138966.patch >>> revid: 8573 >>> 8138966: Intermittent SEGV running ParallelGC >> >> Yes, serious crasher bug. This is already in jdk8u. >> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/110735ab93ec >> >> Please Ask Andrew Hughes about this one: it should already be in. > > We already agreed that I would include this as part of my patch. OK. Andrew. From aph at redhat.com Thu Feb 25 14:49:54 2016 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 Feb 2016 14:49:54 +0000 Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions In-Reply-To: <1456394786.1383.18.camel@mint> References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com> <1456394786.1383.18.camel@mint> Message-ID: <56CF1492.1000400@redhat.com> Here's what I've got, merging my changes for VarHandles with yours for LSE CAS: http://cr.openjdk.java.net/~aph/aarch64-lse-cas/ Please test it. Thanks, Andrew. From aph at redhat.com Thu Feb 25 15:06:45 2016 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 Feb 2016 15:06:45 +0000 Subject: [aarch64-port-dev ] RFR: 8150652: Remove unused code in AArch64 back end Message-ID: <56CF1885.1060600@redhat.com> Defining min in this way breaks compilation if min is already a #define, which it is on some compilers. http://cr.openjdk.java.net/~aph/8150652/ Andrew. From adinn at redhat.com Thu Feb 25 17:35:54 2016 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 Feb 2016 17:35:54 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56CF0EDA.9070808@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com> <56CF0EDA.9070808@redhat.com> Message-ID: <56CF3B7A.9080301@redhat.com> Ok, here is the webrev for the final set of patches. 
http://cr.openjdk.java.net/~adinn/jdk8u-aarch64-update/webrev.00 I built this on both AArch64 and x86_64 and ran the usual smoke tests successfully on both java Hello javac Hello.java netbeans (clean, build and run sample project) Note that the patch to rectify the jvm.cfg config and the code which processes it is still missing, so javac and netbeans require -J-server to make them work properly. Andrew Hughes has a patch queued to fix this. Ok to push the changes in the webrev? regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From edward.nevill at gmail.com Thu Feb 25 17:44:46 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Thu, 25 Feb 2016 17:44:46 +0000 Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions In-Reply-To: <56CF1492.1000400@redhat.com> References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com> <1456394786.1383.18.camel@mint> <56CF1492.1000400@redhat.com> Message-ID: <1456422286.21810.2.camel@mint> On Thu, 2016-02-25 at 14:49 +0000, Andrew Haley wrote: > Here's what I've got, merging my changes for VarHandles with yours > for LSE CAS: > > http://cr.openjdk.java.net/~aph/aarch64-lse-cas/ > > Please test it. > Hi, Clean run through jcstress with -XX:+UseLSE. Also clean on some partners tests with and without -XX:+UseLSE. Looks fine, Ed.
From aph at redhat.com Thu Feb 25 17:56:38 2016 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 Feb 2016 17:56:38 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56CF03D8.8070101@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> Message-ID: <56CF4056.6010000@redhat.com> On 02/25/2016 01:38 PM, Andrew Haley wrote: >> largecodecache.patch >> > revid: 8576 >> > Add support for large code cache >> > >> > this makes changes to two files. > The non-AArch64-specific parts of this patch are not used by anything > so should not be included. The rest is OK. Oh sorry, I messed that up. CODE_CACHE_DEFAULT_LIMIT *is* used, and the shared bits are OK. Andrew. From gnu.andrew at redhat.com Thu Feb 25 18:55:38 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Thu, 25 Feb 2016 13:55:38 -0500 (EST) Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56CF0EDA.9070808@redhat.com> References: <56CED360.1000000@redhat.com> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com> <56CF0EDA.9070808@redhat.com> Message-ID: <100280441.402786.1456426538535.JavaMail.zimbra@redhat.com> ----- Original Message ----- > On 02/25/2016 02:20 PM, Andrew Dinn wrote: > > On 25/02/16 13:38, Andrew Haley wrote: > > . . . > > > >>> volcas.patch > >>> revid: 8568 > >>> Backport optimization of volatile puts/gets and CAS to use ldar/stlr > >> > >> It's probably safe, but there have been significant reworkings in > >> this area. > >> > >> Maybe make this AARCH64_ONLY ? Please have a look at the generated > >> code when running G1. 
> > > > This shared change reverts part of a modification you had previously > > made to the shared code in an earlier attempt to implement optimization > > of volatile puts on AArch64. So, applying this patch merely restores the > > status quo as regards the shared code that is currently in both jdk8 and > > upstream jdk8u. > > > > Also, the same reversion was applied and is still present in jdk9. It > > has no bearing on any of the reworkings that have happened since the > > original patch was added to jdk9 and I don't envisage it having any such > > effect. > > > > So. I don't think there is any reason to make this AARCH64_ONLY. > > OK. I'm a bit mystified by the history of this one, but as long > as in the end we don't diverge from upstream jdk8u I'm happy. > > >>> 8138966.patch > >>> revid: 8573 > >>> 8138966: Intermittent SEGV running ParallelGC > >> > >> Yes, serious crasher bug. This is already in jdk8u. > >> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/110735ab93ec > >> > >> Please Ask Andrew Hughes about this one: it should already be in. > > > > We already agreed that I would include this as part of my patch. > > OK. > I believe that was 8147805, not 8138966. According to https://bugs.openjdk.java.net/browse/JDK-8138966, this fix is in 8u76, so we'll pick it up when we merge 8u76 in April. I don't see any problem with including it earlier though, if it's a serious issue. Mercurial is intelligent enough to see that we've already applied the change. > Andrew. > > -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. 
(http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From gnu.andrew at redhat.com Thu Feb 25 18:58:06 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Thu, 25 Feb 2016 13:58:06 -0500 (EST) Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template which breaks builds with GCC 6 In-Reply-To: <56CED047.4060604@redhat.com> References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com> <56CED047.4060604@redhat.com> Message-ID: <1948009987.404665.1456426686320.JavaMail.zimbra@redhat.com> ----- Original Message ----- > On 25/02/16 00:41, Andrew Hughes wrote: > > Ok to push? > > > > [0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c1 > > No. This should go to JDK 9 and get backported. Only JDK8- > specific patches get reviewed on 8. > Ok, the issue there is testing a build of OpenJDK 9 with GCC 6 on AArch64. I'll look into it. > Andrew. > > -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From christian.thalinger at oracle.com Thu Feb 25 19:45:57 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 25 Feb 2016 09:45:57 -1000 Subject: [aarch64-port-dev ] RFR: 8150652: Remove unused code in AArch64 back end In-Reply-To: <56CF1885.1060600@redhat.com> References: <56CF1885.1060600@redhat.com> Message-ID: <7BB0CCFA-51C0-4C04-8BB4-53A3A8B1D25C@oracle.com> Looks good. > On Feb 25, 2016, at 5:06 AM, Andrew Haley wrote: > > Defining min in this way breaks compilation if min is already a #define, > which it is on some compilers. > > http://cr.openjdk.java.net/~aph/8150652/ > > Andrew. 
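[Editorial sketch, not part of the thread.] The failure mode described for 8150652 above can be illustrated with a small C++ fragment. This is only a sketch under stated assumptions: the `min` macro below is a hypothetical stand-in for whatever a platform or compiler header might define, and `min2_sketch` and `smaller_of` are illustrative names, not HotSpot code. It shows why a function or template named `min` stops compiling once a function-like macro of the same name is in scope, and two common ways to sidestep the clash.

```cpp
#include <algorithm>

// Hypothetical stand-in for a system header that defines min as a macro.
#define min(a, b) (((a) < (b)) ? (a) : (b))

// With the macro in scope, a declaration such as
//   template <class T> T min(T a, T b);
// is rewritten by the preprocessor into an expression and no longer parses.

// Workaround 1: give the helper a name the macro cannot capture.
template <class T>
inline T min2_sketch(T a, T b) {
  return a < b ? a : b;
}

// Workaround 2: parenthesise the name so that the function-like macro is
// not expanded (the token after "min" is ')' rather than '(').
inline int smaller_of(int a, int b) {
  return (std::min)(a, b);
}
```

Removing an unused definition, as the patch under review does, avoids the clash altogether; the workarounds only matter where such a helper is actually needed.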
From aph at redhat.com Thu Feb 25 20:35:17 2016 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 Feb 2016 20:35:17 +0000 Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template which breaks builds with GCC 6 In-Reply-To: <1948009987.404665.1456426686320.JavaMail.zimbra@redhat.com> References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com> <56CED047.4060604@redhat.com> <1948009987.404665.1456426686320.JavaMail.zimbra@redhat.com> Message-ID: <56CF6585.1010304@redhat.com> On 02/25/2016 06:58 PM, Andrew Hughes wrote: > Ok, the issue there is testing a build of OpenJDK 9 with > GCC 6 on AArch64. I'll look into it. I already submitted it for review. It's approved. Andrew. From gnu.andrew at redhat.com Fri Feb 26 04:41:22 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Fri, 26 Feb 2016 04:41:22 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jdk: Completey revert 2940c1ead99bd7635 and sync jvm.cfg with OpenJDK 9 version. Message-ID: <201602260441.u1Q4fMp3021155@aojmv0008.oracle.com> Changeset: b39ade4fa554 Author: andrew Date: 2016-02-26 04:40 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/b39ade4fa554 Completey revert 2940c1ead99bd7635 and sync jvm.cfg with OpenJDK 9 version. ! src/share/bin/java.c ! src/share/bin/java.h ! 
src/solaris/bin/aarch64/jvm.cfg From adinn at redhat.com Fri Feb 26 08:46:44 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 26 Feb 2016 08:46:44 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56CF4056.6010000@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF4056.6010000@redhat.com> Message-ID: <56D010F4.8070903@redhat.com> On 25/02/16 17:56, Andrew Haley wrote: > On 02/25/2016 01:38 PM, Andrew Haley wrote: >>> largecodecache.patch >>>> revid: 8576 >>>> Add support for large code cache >>>> >>>> this makes changes to two files. >> The non-AArch64-specific parts of this patch are not used by anything >> so should not be included. The rest is OK. > > Oh sorry, I messed that up. > > CODE_CACHE_DEFAULT_LIMIT *is* used, and the shared bits are OK. Well, I knew Ed added it for a reason :-) I just assumed you were happy to stick with the generic 2G limit. If I redo the patches with this restored is it then ok to push (assuming it passes basic tests) or do you want to see another webrev? regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 
3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From aph at redhat.com Fri Feb 26 09:14:33 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 26 Feb 2016 09:14:33 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56D010F4.8070903@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF4056.6010000@redhat.com> <56D010F4.8070903@redhat.com> Message-ID: <56D01779.10309@redhat.com> On 26/02/16 08:46, Andrew Dinn wrote: > On 25/02/16 17:56, Andrew Haley wrote: >> On 02/25/2016 01:38 PM, Andrew Haley wrote: >>>> largecodecache.patch >>>>> revid: 8576 >>>>> Add support for large code cache >>>>> >>>>> this makes changes to two files. >>> The non-AArch64-specific parts of this patch are not used by anything >>> so should not be included. The rest is OK. >> >> Oh sorry, I messed that up. >> >> CODE_CACHE_DEFAULT_LIMIT *is* used, and the shared bits are OK. > > Well, I knew Ed added it for a reason :-) I just assumed you were happy > to stick with the generic 2G limit. > > If I redo the patches with this restored is it then ok to push (assuming > it passes basic tests) or do you want to see another webrev? Yes, I think so. Then we have to do some fairly serious testing, on AArch64 and to make sure we haven't broken x86 and others. Thanks, Andrew. 
From adinn at redhat.com Fri Feb 26 10:03:08 2016 From: adinn at redhat.com (adinn at redhat.com) Date: Fri, 26 Feb 2016 10:03:08 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/hotspot: 31 new changesets Message-ID: <201602261003.u1QA3850029592@aojmv0008.oracle.com> Changeset: 98e4d7b5ff2b Author: adinn Date: 2015-08-26 17:13 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/98e4d7b5ff2b 8134322: AArch64: Fix several errors in C2 biased locking implementation Summary: Several errors in C2 biased locking require fixing Reviewed-by: kvn Contributed-by: hui.shi at linaro.org ! src/cpu/aarch64/vm/aarch64.ad Changeset: b212413cdaef Author: enevill Date: 2015-09-15 12:59 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/b212413cdaef 8136524: aarch64: test/compiler/runtime/7196199/Test7196199.java fails Summary: Fix safepoint handlers to save 128 bits on vector poll Reviewed-by: kvn Contributed-by: felix.yang at linaro.org ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Changeset: 641806b9d29d Author: roland Date: 2016-02-25 09:43 -0500 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/641806b9d29d 8136596: Remove aarch64: MemBarRelease when final field's allocation is NoEscape or ArgEscape Summary: elide MemBar when AllocateNode _is_non_escaping Reviewed-by: kvn, roland Contributed-by: hui.shi at linaro.org ! src/share/vm/opto/callnode.hpp ! src/share/vm/opto/macro.cpp ! src/share/vm/opto/memnode.cpp Changeset: caab2df44238 Author: enevill Date: 2015-09-16 13:50 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/caab2df44238 8136615: aarch64: elide DecodeN when followed by CmpP 0 Summary: remove DecodeN when comparing a narrow oop with 0 Reviewed-by: kvn, adinn ! 
src/cpu/aarch64/vm/aarch64.ad Changeset: e499a51eaef1 Author: aph Date: 2015-09-28 16:18 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/e499a51eaef1 8136165: AARCH64: Tidy up compiled native calls Summary: Do some cleaning Reviewed-by: roland, kvn, enevill ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Changeset: 82141dab8ec8 Author: aph Date: 2015-09-30 13:23 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/82141dab8ec8 8138641: Disable C2 peephole by default for aarch64 Reviewed-by: roland Contributed-by: felix.yang at linaro.org ! src/cpu/aarch64/vm/c2_globals_aarch64.hpp Changeset: 8d382116b8d0 Author: aph Date: 2015-09-29 17:01 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/8d382116b8d0 8138575: Improve generated code for profile counters Reviewed-by: kvn ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Changeset: fa47c6788466 Author: enevill Date: 2015-10-15 15:33 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/fa47c6788466 8139674: aarch64: guarantee failure in TestOptionsWithRanges.java Summary: Fix negative overflow in instruction field Reviewed-by: kvn, roland, adinn, aph ! src/cpu/aarch64/vm/interp_masm_aarch64.cpp Changeset: c63eff2bbad8 Author: ecaspole Date: 2015-09-21 10:36 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/c63eff2bbad8 8131645: [ARM64] crash on Cavium when using G1 Summary: Add a fence when creating the CodeRootSetTable so the readers do not see invalid memory. Reviewed-by: aph, tschatzl ! src/share/vm/gc_implementation/g1/g1CodeCacheRemSet.cpp Changeset: 17b38ca19e23 Author: adinn Date: 2015-10-08 11:06 -0400 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/17b38ca19e23 Backport optimization of volatile puts/gets and CAS to use ldar/stlr ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/globals_aarch64.hpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! 
src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/vm_version_aarch64.cpp ! src/share/vm/opto/graphKit.cpp Changeset: 4470d1a7ab47 Author: enevill Date: 2015-10-28 17:47 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/4470d1a7ab47 Fix thinko when backporting 8131645. Table ends up being allocated twice. ! src/share/vm/gc_implementation/g1/g1CodeCacheRemSet.cpp Changeset: d29561a8480e Author: enevill Date: 2015-10-28 17:51 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/d29561a8480e 8140611: aarch64: jtreg test jdk/tools/pack200/UnpackerMemoryTest.java SEGVs Summary: Fix register usage on calling native synchronized methods Reviewed-by: kvn, adinn ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Changeset: c6c45e635f58 Author: enevill Date: 2016-02-25 05:44 -0500 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/c6c45e635f58 Some 32 bit shifts still being anded with 0x3f instead of 0x1f. ! src/cpu/aarch64/vm/aarch64.ad Changeset: 0d26ab01110c Author: aph Date: 2015-09-08 14:08 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/0d26ab01110c 8135157: DMB elimination in AArch64 C2 synchronization implementation Summary: Reduce memory barrier usage in C2 fast lock and unlock. Reviewed-by: kvn Contributed-by: wei.tang at linaro.org, aph at redhat.com ! src/cpu/aarch64/vm/aarch64.ad Changeset: 9b02e63a10cf Author: aph Date: 2015-11-04 13:38 +0100 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/9b02e63a10cf 8138966: Intermittent SEGV running ParallelGC Summary: Add necessary memory fences so that the parallel threads are unable to observe partially filled block tables. Reviewed-by: tschatzl ! 
src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp Changeset: 69461ddc6e21 Author: enevill Date: 2015-11-19 15:15 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/69461ddc6e21 8143067: aarch64: guarantee failure in javac Summary: Fix adrp going out of range during code relocation Reviewed-by: aph, kvn ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Changeset: 2a885c3fa856 Author: hshi Date: 2015-11-24 09:02 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/2a885c3fa856 8143285: aarch64: Missing load acquire when checking if ConstantPoolCacheEntry is resolved Reviewed-by: roland, aph ! src/cpu/aarch64/vm/interp_masm_aarch64.cpp Changeset: df9fe5e4b123 Author: enevill Date: 2016-02-26 03:44 -0500 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/df9fe5e4b123 Add support for large code cache ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/assembler_aarch64.cpp ! src/cpu/aarch64/vm/assembler_aarch64.hpp ! src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.hpp ! src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp ! src/cpu/aarch64/vm/compiledIC_aarch64.cpp ! src/cpu/aarch64/vm/globalDefinitions_aarch64.hpp ! src/cpu/aarch64/vm/globals_aarch64.hpp ! src/cpu/aarch64/vm/icBuffer_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/methodHandles_aarch64.cpp ! src/cpu/aarch64/vm/nativeInst_aarch64.cpp ! src/cpu/aarch64/vm/nativeInst_aarch64.hpp ! src/cpu/aarch64/vm/relocInfo_aarch64.cpp ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp ! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp ! src/cpu/aarch64/vm/vtableStubs_aarch64.cpp ! src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp ! src/share/vm/runtime/arguments.cpp ! 
src/share/vm/utilities/globalDefinitions.hpp Changeset: fdd053ca3236 Author: enevill Date: 2016-01-05 17:40 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/fdd053ca3236 Fix client build after addition of large code cache support ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/vm_version_aarch64.cpp Changeset: ebff70c35409 Author: enevill Date: 2015-12-29 16:47 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/ebff70c35409 8146286: aarch64: guarantee failures with large code cache sizes on jtreg test java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java Summary: patch trampoline calls with special case bl to itself which does not cause guarantee failure Reviewed-by: aph ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/relocInfo_aarch64.cpp Changeset: a8e2e5e2062b Author: hshi Date: 2015-11-26 15:37 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/a8e2e5e2062b 8143584: Load constant pool tag and class status with load acquire Reviewed-by: roland, aph ! src/cpu/aarch64/vm/templateTable_aarch64.cpp Changeset: ab88ec370d76 Author: aph Date: 2015-11-25 18:13 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/ab88ec370d76 8144028: Use AArch64 bit-test instructions in C2 Reviewed-by: kvn ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp + test/compiler/codegen/8144028/BitTests.java Changeset: 30d91d32bb56 Author: fyang Date: 2015-12-07 21:23 +0800 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/30d91d32bb56 8144587: aarch64: generate vectorized MLA/MLS instructions Summary: Add support for MLA/MLS (vector) instructions Reviewed-by: roland ! src/cpu/aarch64/vm/aarch64.ad ! 
src/cpu/aarch64/vm/assembler_aarch64.hpp Changeset: eea9d73ceecb Author: aph Date: 2015-12-15 19:18 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/eea9d73ceecb 8145438: Guarantee failures since 8144028: Use AArch64 bit-test instructions in C2 Summary: Implement short and long versions of bit test instructions. Reviewed-by: kvn ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/interp_masm_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/share/vm/adlc/formssel.cpp Changeset: 797f2d436722 Author: aph Date: 2015-12-16 11:35 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/797f2d436722 8144582: AArch64 does not generate correct branch profile data Reviewed-by: kvn ! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Changeset: eed0f8fbe256 Author: fyang Date: 2015-12-07 21:14 +0800 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/eed0f8fbe256 8144201: aarch64: jdk/test/com/sun/net/httpserver/Test6a.java fails with --enable-unlimited-crypto Summary: Fix typo in stub generate_cipherBlockChaining_decryptAESCrypt Reviewed-by: roland ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Changeset: 33f03ea2712b Author: enevill Date: 2016-01-08 11:39 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/33f03ea2712b 8146678: aarch64: assertion failure: call instruction in an infinite loop Summary: Remove assertion Reviewed-by: aph ! src/cpu/aarch64/vm/relocInfo_aarch64.cpp Changeset: 041044bfded5 Author: enevill Date: 2016-01-12 14:55 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/041044bfded5 8146843: aarch64: add scheduling support for FP and vector instructions Summary: add pipeline classes for FP/vector pipeline Reviewed-by: aph ! 
src/cpu/aarch64/vm/aarch64.ad Changeset: f087cd606b4c Author: aph Date: 2016-01-19 17:52 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/f087cd606b4c 8146709: AArch64: Incorrect use of ADRP for byte_map_base Reviewed-by: roland ! src/cpu/aarch64/vm/aarch64.ad ! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp ! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp ! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Changeset: d3cd1699e84a Author: hshi Date: 2016-01-20 04:56 -0800 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/d3cd1699e84a 8147805: aarch64: C1 segmentation fault due to inline Unsafe.getAndSetObject Summary: In Aarch64 LIR_Assembler.atomic_op, keep stored data reference register in decompressed forms as it may be used later Reviewed-by: aph Contributed-by: hui.shi at linaro.org, felix.yang at linaro.org ! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Changeset: f9b6277551dc Author: enevill Date: 2016-01-26 14:04 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/f9b6277551dc 8148240: aarch64: random infrequent null pointer exceptions in javac Summary: Disable fp as an allocatable register Reviewed-by: aph ! 
src/cpu/aarch64/vm/aarch64.ad From adinn at redhat.com Fri Feb 26 10:10:06 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 26 Feb 2016 10:10:06 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56D01779.10309@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF4056.6010000@redhat.com> <56D010F4.8070903@redhat.com> <56D01779.10309@redhat.com> Message-ID: <56D0247E.5060504@redhat.com> On 26/02/16 09:14, Andrew Haley wrote: > On 26/02/16 08:46, Andrew Dinn wrote: >> If I redo the patches with this restored is it then ok to push (assuming >> it passes basic tests) or do you want to see another webrev? > > Yes, I think so. Then we have to do some fairly serious testing, on > AArch64 and to make sure we haven't broken x86 and others. Ok, pushed. n.b. this tree passed basic smoke tests on both AArch64 and x86_64. Andrew Hughes' push to the jdk tree has removed the need to specify -J-server for javac and netbeans. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From hui.shi at linaro.org Fri Feb 26 14:28:01 2016 From: hui.shi at linaro.org (Hui Shi) Date: Fri, 26 Feb 2016 22:28:01 +0800 Subject: [aarch64-port-dev ] =?utf-8?b?5Zue5aSN77yaUkZSOiA4MTQ5NzMzOiBB?= =?utf-8?q?Arch64=3A_refactorchar=5Farray=5Fequals/byte=5Farray=5Fe?= =?utf-8?q?quals/string=5Fequals?= In-Reply-To: <56CDBD2A.9050002@oracle.com> References: <56CC2DF8.60806@oracle.com> <56C70C42.5020309@oracle.com> <56CC8630.2020708@redhat.com> <56CDBD2A.9050002@oracle.com> Message-ID: Thanks Aleksey! Can I have another review for this patch? 
Regards Hui On 24 February 2016 at 22:24, Aleksey Shipilev wrote: > On 02/24/2016 04:02 PM, Hui Shi wrote: > > Thanks Andrew! Your comment looks really better and performance doesn't > > change when run JMHSample_97_ArrayEqual.java > > < > http://cr.openjdk.java.net/%7Ehshi/8149733/webrev2/JMHSample_97_ArrayEqual.java> > test. > > > > latest webrev http://cr.openjdk.java.net/~hshi/8149733/webrev3/ > > > > Good. > > > Following is result with Aleksey's updated test case (-w 5 -wi 3 -i3 -r > > 10), first 4 group are for base run with base string length 0, 8, 31, > > 1024. Performance with patch doesn't show same improvement with early > > test. Only small length string equal tests still show obvious > improvement. > > ...and that's okay for refactoring. > > Cheers, > -Aleksey > > > From aph at redhat.com Fri Feb 26 14:37:57 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 26 Feb 2016 14:37:57 +0000 Subject: [aarch64-port-dev ] =?utf-8?b?5Zue5aSN77yaUkZSOiA4MTQ5NzMzOiBB?= =?utf-8?q?Arch64=3A_refactorchar=5Farray=5Fequals/byte=5Farray=5Fequals/s?= =?utf-8?q?tring=5Fequals?= In-Reply-To: References: <56CC2DF8.60806@oracle.com> <56C70C42.5020309@oracle.com> <56CC8630.2020708@redhat.com> <56CDBD2A.9050002@oracle.com> Message-ID: <56D06345.7010309@redhat.com> On 02/26/2016 02:28 PM, Hui Shi wrote: > Can I have another review for this patch? If you insist. OK. Andrew. 
From gnu.andrew at redhat.com Fri Feb 26 17:46:06 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Fri, 26 Feb 2016 12:46:06 -0500 (EST) Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template which breaks builds with GCC 6 In-Reply-To: <56CF6585.1010304@redhat.com> References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com> <56CED047.4060604@redhat.com> <1948009987.404665.1456426686320.JavaMail.zimbra@redhat.com> <56CF6585.1010304@redhat.com> Message-ID: <440404060.902512.1456508766393.JavaMail.zimbra@redhat.com> ----- Original Message ----- > On 02/25/2016 06:58 PM, Andrew Hughes wrote: > > Ok, the issue there is testing a build of OpenJDK 9 with > > GCC 6 on AArch64. I'll look into it. > > I already submitted it for review. It's approved. > > Andrew. > > Yes, I saw this after I replied. Thanks, that saved me a lot of hassle! -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From hui.shi at linaro.org Sat Feb 27 11:40:29 2016 From: hui.shi at linaro.org (Hui Shi) Date: Sat, 27 Feb 2016 19:40:29 +0800 Subject: [aarch64-port-dev ] =?utf-8?b?5Zue5aSN77yaUkZSOiA4MTQ5NzMzOiBB?= =?utf-8?q?Arch64=3A_refactorchar=5Farray=5Fequals/byte=5Farray=5Fe?= =?utf-8?q?quals/string=5Fequals?= In-Reply-To: <56D06345.7010309@redhat.com> References: <56CC2DF8.60806@oracle.com> <56C70C42.5020309@oracle.com> <56CC8630.2020708@redhat.com> <56CDBD2A.9050002@oracle.com> <56D06345.7010309@redhat.com> Message-ID: Thanks! On 26 February 2016 at 22:37, Andrew Haley wrote: > On 02/26/2016 02:28 PM, Hui Shi wrote: > > Can I have another review for this patch? > > If you insist. OK. > > Andrew. 
> > From gnu.andrew at redhat.com Mon Feb 29 06:47:16 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Mon, 29 Feb 2016 06:47:16 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u: Added tag aarch64-jdk8u72-b16 for changeset 92af9369869f Message-ID: <201602290647.u1T6lGdh015594@aojmv0008.oracle.com> Changeset: 86030362b0c5 Author: andrew Date: 2016-02-29 06:45 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/rev/86030362b0c5 Added tag aarch64-jdk8u72-b16 for changeset 92af9369869f ! .hgtags From gnu.andrew at redhat.com Mon Feb 29 06:47:24 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Mon, 29 Feb 2016 06:47:24 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/corba: Added tag aarch64-jdk8u72-b16 for changeset d5a3087d60ee Message-ID: <201602290647.u1T6lOXf015687@aojmv0008.oracle.com> Changeset: c44425453bfa Author: andrew Date: 2016-02-29 06:45 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/corba/rev/c44425453bfa Added tag aarch64-jdk8u72-b16 for changeset d5a3087d60ee ! .hgtags From gnu.andrew at redhat.com Mon Feb 29 06:47:31 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Mon, 29 Feb 2016 06:47:31 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jaxp: Added tag aarch64-jdk8u72-b16 for changeset 6769d8017f5d Message-ID: <201602290647.u1T6lVOZ015800@aojmv0008.oracle.com> Changeset: 99056017b4e3 Author: andrew Date: 2016-02-29 06:45 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/jaxp/rev/99056017b4e3 Added tag aarch64-jdk8u72-b16 for changeset 6769d8017f5d ! 
.hgtags From gnu.andrew at redhat.com Mon Feb 29 06:47:38 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Mon, 29 Feb 2016 06:47:38 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jaxws: Added tag aarch64-jdk8u72-b16 for changeset 1ecc978053bf Message-ID: <201602290647.u1T6lcdn015928@aojmv0008.oracle.com> Changeset: eeb105ae870d Author: andrew Date: 2016-02-29 06:45 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/jaxws/rev/eeb105ae870d Added tag aarch64-jdk8u72-b16 for changeset 1ecc978053bf ! .hgtags From gnu.andrew at redhat.com Mon Feb 29 06:47:46 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Mon, 29 Feb 2016 06:47:46 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/langtools: Added tag aarch64-jdk8u72-b16 for changeset b63515578554 Message-ID: <201602290647.u1T6lkxl016020@aojmv0008.oracle.com> Changeset: 109a626b4431 Author: andrew Date: 2016-02-29 06:45 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/langtools/rev/109a626b4431 Added tag aarch64-jdk8u72-b16 for changeset b63515578554 ! .hgtags From gnu.andrew at redhat.com Mon Feb 29 06:47:53 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Mon, 29 Feb 2016 06:47:53 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/hotspot: Added tag aarch64-jdk8u72-b16 for changeset f9b6277551dc Message-ID: <201602290647.u1T6lrQR016088@aojmv0008.oracle.com> Changeset: 4c440540c962 Author: andrew Date: 2016-02-29 06:45 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/4c440540c962 Added tag aarch64-jdk8u72-b16 for changeset f9b6277551dc ! 
.hgtags From gnu.andrew at redhat.com Mon Feb 29 06:48:00 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Mon, 29 Feb 2016 06:48:00 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jdk: Added tag aarch64-jdk8u72-b16 for changeset b39ade4fa554 Message-ID: <201602290648.u1T6m0vx016165@aojmv0008.oracle.com> Changeset: 9331bfc2d798 Author: andrew Date: 2016-02-29 06:45 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/9331bfc2d798 Added tag aarch64-jdk8u72-b16 for changeset b39ade4fa554 ! .hgtags From gnu.andrew at redhat.com Mon Feb 29 06:48:07 2016 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Mon, 29 Feb 2016 06:48:07 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/nashorn: Added tag aarch64-jdk8u72-b16 for changeset 8eb47ddad851 Message-ID: <201602290648.u1T6m7n2016248@aojmv0008.oracle.com> Changeset: af05959dd44b Author: andrew Date: 2016-02-29 06:45 +0000 URL: http://hg.openjdk.java.net/aarch64-port/jdk8u/nashorn/rev/af05959dd44b Added tag aarch64-jdk8u72-b16 for changeset 8eb47ddad851 ! .hgtags From adinn at redhat.com Mon Feb 29 16:04:14 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 29 Feb 2016 16:04:14 +0000 Subject: [aarch64-port-dev ] RFR: AArch64 backport to Icedtea7 of 8146709 Message-ID: <56D46BFE.90509@redhat.com> I have backported the patch for 8146709 (Incorrect use of ADRP for byte_map_base) from aarch64/jdk8u to icedtea7-forest. Here's the webrev: http://cr.openjdk.java.net/~adinn/8146709/webrev.00/ This appears to be the cause of the problem running specjvm referred to in bugzilla 1310061 https://bugzilla.redhat.com/show_bug.cgi?id=1310061 Before the patch I saw the error described in the BZ. After the patch it no longer occurs. The patched code passes basic smoke tests. Can I get an ok from someone else before I push this? regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No.
3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From adinn at redhat.com Mon Feb 29 16:05:08 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 29 Feb 2016 16:05:08 +0000 Subject: [aarch64-port-dev ] RFR: AArch64 backport to Icedtea7 of 8146709 In-Reply-To: <56D46BFE.90509@redhat.com> References: <56D46BFE.90509@redhat.com> Message-ID: <56D46C34.2040708@redhat.com> On 29/02/16 16:04, Andrew Dinn wrote: > I have backported the patch for 8146709 (Incorrect use of ADRP for > byte_map_base) from aarch64/jdk8u to icedtea7-forest. Here's the webrev: > > http://cr.openjdk.java.net/~adinn/8146709/webrev.00/ > > This appears to be the cause of the problem running specjvm referred to > in bugzilla 1310061 > > https://bugzilla.redhat.com/show_bug.cgi?id=1310061 > > Before the patch I saw the error described in the BZ. After the patch it > no longer occurs. > > The patched code passes basic smoke tests. Can I get an ok from someone > else before I push this? Oops, forgot to link the webrev: http://cr.openjdk.java.net/~adinn/8146709/webrev.00/ > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in UK and Wales under Company Registration No. 3798903 > Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul > Argiry (US) From gnu.andrew at redhat.com Mon Feb 29 16:14:56 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Mon, 29 Feb 2016 11:14:56 -0500 (EST) Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56CF0DC2.8080104@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com> Message-ID: <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com> snip...
> > >> largecodecache.patch > >> revid: 8576 > >> Add support for large code cache > >> > >> this makes changes to two files. > > > > The non-AArch64-specific parts of this patch are not used by anything > > so should not be included. The rest is OK. > > Ok, I will rework this to include only the AArch64-specific code. > The other changes appear to have been included and the build on s390 is now broken as a result (mismatch between the types of CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro). Can I revert the changes to these two files? (src/share/vm/utilities/globalDefinitions.hpp & src/share/vm/runtime/arguments.cpp) Thanks, -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From adinn at redhat.com Mon Feb 29 16:19:18 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 29 Feb 2016 16:19:18 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com> <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com> Message-ID: <56D46F86.2090108@redhat.com> On 29/02/16 16:14, Andrew Hughes wrote: > snip... > >> >>>> largecodecache.patch >>>> revid: 8576 >>>> Add support for large code cache >>>> >>>> this makes changes to two files. >>> >>> The non-AArch64-specific parts of this patch are not used by anything >>> so should not be included. The rest is OK. >> >> Ok, I will rework this to include only the AArch64-specific code. 
>> > > The other changes appear to have been included and the build on s390 is > now broken as a result (mismatch between the types of > CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro). > Can I revert the changes to these two files? > (src/share/vm/utilities/globalDefinitions.hpp & src/share/vm/runtime/arguments.cpp) Andrew Haley followed up with a note explaining that these changes are needed on AArch64 (which is why I put them back in again). I think it might be better to fix the breakage to the PPC code. Perhaps Andrew Haley can comment? regards, Andrew Dinn ----------- From gnu.andrew at redhat.com Mon Feb 29 16:37:47 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Mon, 29 Feb 2016 11:37:47 -0500 (EST) Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56D46F86.2090108@redhat.com> References: <56CED360.1000000@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com> <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com> <56D46F86.2090108@redhat.com> Message-ID: <1056751202.1475582.1456763867497.JavaMail.zimbra@redhat.com> ----- Original Message ----- > On 29/02/16 16:14, Andrew Hughes wrote: > > snip... > > > >> > >>>> largecodecache.patch > >>>> revid: 8576 > >>>> Add support for large code cache > >>>> > >>>> this makes changes to two files. > >>> > >>> The non-AArch64-specific parts of this patch are not used by anything > >>> so should not be included. The rest is OK. > >> > >> Ok, I will rework this to include only the AArch64-specific code. > >> > > > > The other changes appear to have been included and the build on s390 is > > now broken as a result (mismatch between the types of > > CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro). > > Can I revert the changes to these two files? 
> > (src/share/vm/utilities/globalDefinitions.hpp & > > src/share/vm/runtime/arguments.cpp) > > Andrew Haley followed up with a note explaining that these changes are > needed on AArch64 (which is why I put them back in again). I think it > might be better to fix the breakage to the PPC code. Perhaps Andrew > Haley can comment? > * s390. The change was there for AArch64 only before. - FLAG_SET_DEFAULT(ReservedCodeCacheSize, - MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5)); + FLAG_SET_DEFAULT(ReservedCodeCacheSize, ReservedCodeCacheSize * 5); + // The maximum B/BL offset range on AArch64 is 128MB + AARCH64_ONLY(FLAG_SET_DEFAULT(ReservedCodeCacheSize, MIN2(ReservedCodeCacheSize, 128*M))); Maybe it's just a case that the first setting needs to be not AArch64 and the AArch64 one needs updating to the new version? e.g. - FLAG_SET_DEFAULT(ReservedCodeCacheSize, - MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5)); + NOT_AARCH64(FLAG_SET_DEFAULT(ReservedCodeCacheSize, ReservedCodeCacheSize * 5)); + AARCH64_ONLY(FLAG_SET_DEFAULT(ReservedCodeCacheSize, + MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5))); We then don't change the behaviour on other architectures. > regards, > > > Andrew Dinn > ----------- > > -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. 
(http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From adinn at redhat.com Mon Feb 29 16:37:52 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 29 Feb 2016 16:37:52 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56D46F86.2090108@redhat.com> References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint> <56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com> <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com> <56D46F86.2090108@redhat.com> Message-ID: <56D473E0.7010905@redhat.com> On 29/02/16 16:19, Andrew Dinn wrote: > On 29/02/16 16:14, Andrew Hughes wrote: >> The other changes appear to have been included and the build on s390 is >> now broken as a result (mismatch between the types of >> CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro). >> Can I revert the changes to these two files? >> (src/share/vm/utilities/globalDefinitions.hpp & src/share/vm/runtime/arguments.cpp) > > Andrew Haley followed up with a note explaining that these changes are > needed on AArch64 (which is why I put them back in again). I think it > might be better to fix the breakage to the PPC code. Perhaps Andrew > Haley can comment? Sorry, that should have read breakage to the PPC build -- since the error appears to be in code that is part of the patch. I am not sure I follow what is happening here. ReservedCodeCacheSize is an int. CODE_CACHE_DEFAULT_LIMIT defaults to CODE_CACHE_SIZE_LIMIT which is defined as (2 * G). AArch64 redefines it to (128 * M). G and M are both of type size_t. Do we just need a cast when we pass the arguments to min2? Andrew Hughes, can you provide more details on the error? 
regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No. 3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From gnu.andrew at redhat.com Mon Feb 29 16:44:01 2016 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Mon, 29 Feb 2016 11:44:01 -0500 (EST) Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <56D473E0.7010905@redhat.com> References: <56CED360.1000000@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com> <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com> <56D46F86.2090108@redhat.com> <56D473E0.7010905@redhat.com> Message-ID: <968027110.1478028.1456764241065.JavaMail.zimbra@redhat.com> ----- Original Message ----- > On 29/02/16 16:19, Andrew Dinn wrote: > > On 29/02/16 16:14, Andrew Hughes wrote: > >> The other changes appear to have been included and the build on s390 is > >> now broken as a result (mismatch between the types of > >> CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro). > >> Can I revert the changes to these two files? > >> (src/share/vm/utilities/globalDefinitions.hpp & > >> src/share/vm/runtime/arguments.cpp) > > > > Andrew Haley followed up with a note explaining that these changes are > > needed on AArch64 (which is why I put them back in again). I think it > > might be better to fix the breakage to the PPC code. Perhaps Andrew > > Haley can comment? > > Sorry, that should have read breakage to the PPC build -- since the > error appears to be in code that is part of the patch. > > I am not sure I follow what is happening here. ReservedCodeCacheSize is > an int. CODE_CACHE_DEFAULT_LIMIT defaults to CODE_CACHE_SIZE_LIMIT which > is defined as (2 * G). AArch64 redefines it to (128 * M). G and M are > both of type size_t. Do we just need a cast when we pass the arguments > to min2? > In short, yes. 
It's s390. On s390, size_t is a long unsigned int, while the right-hand side, ReservedCodeCacheSize * 5, is a uintx: /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b16.el7.s390/openjdk/hotspot/src/share/vm/runtime/arguments.cpp:1141:78: error: no matching function for call to 'MIN2(long unsigned int, uintx)' MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5)); We have a lot of cases of this on s390 which we have to fix, and getting that upstream has been an uphill task, with them throwing rocks down at us all the time. We can fix it with a cast, but here I don't think this change should even be made on non-AArch64, as it's a divergence from 8u. See the suggestion in my previous e-mail. -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From aph at redhat.com Mon Feb 29 16:46:47 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 29 Feb 2016 16:46:47 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <1056751202.1475582.1456763867497.JavaMail.zimbra@redhat.com> References: <56CED360.1000000@redhat.com> <56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com> <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com> <56D46F86.2090108@redhat.com> <1056751202.1475582.1456763867497.JavaMail.zimbra@redhat.com> Message-ID: <56D475F7.3040901@redhat.com> On 02/29/2016 04:37 PM, Andrew Hughes wrote: > e.g. > > - FLAG_SET_DEFAULT(ReservedCodeCacheSize, > - MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5)); > + NOT_AARCH64(FLAG_SET_DEFAULT(ReservedCodeCacheSize, ReservedCodeCacheSize * 5)); > + AARCH64_ONLY(FLAG_SET_DEFAULT(ReservedCodeCacheSize, > + MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5))); > > We then don't change the behaviour on other architectures. Yes.
That's safer all around. Andrew. From adinn at redhat.com Mon Feb 29 16:47:41 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 29 Feb 2016 16:47:41 +0000 Subject: [aarch64-port-dev ] Freeze aarch64/jdk8 In-Reply-To: <968027110.1478028.1456764241065.JavaMail.zimbra@redhat.com> References: <56CED360.1000000@redhat.com> <56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com> <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com> <56D46F86.2090108@redhat.com> <56D473E0.7010905@redhat.com> <968027110.1478028.1456764241065.JavaMail.zimbra@redhat.com> Message-ID: <56D4762D.6030700@redhat.com> On 29/02/16 16:44, Andrew Hughes wrote: > In short, yes. > > It's s390. On s390, size_t is a long unsigned int, while the right-hand > side, ReservedCodeCacheSize * 5, is a uintx: > > /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b16.el7.s390/openjdk/hotspot/src/share/vm/runtime/arguments.cpp:1141:78: error: no matching function for call to 'MIN2(long unsigned int, uintx)' > MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5)); > > We have a lot of cases of this on s390 which we have to fix, and getting > that upstream has been an uphill task, with them throwing rocks down at us > all the time. > > We can fix it with a cast, but here I don't think this change should be even > made on non-AArch64, as it's a divergence from 8u. See the suggestion in my > previous e-mail. Oh. yeeeurch! Yes, your suggestion looks like a much better idea than trying to fix it any other way. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in UK and Wales under Company Registration No.
3798903 Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul Argiry (US) From aph at redhat.com Mon Feb 29 16:52:31 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 29 Feb 2016 16:52:31 +0000 Subject: [aarch64-port-dev ] RFR: AArch64 backport to Icedtea7 of 8146709 In-Reply-To: <56D46C34.2040708@redhat.com> References: <56D46BFE.90509@redhat.com> <56D46C34.2040708@redhat.com> Message-ID: <56D4774F.5060607@redhat.com> On 02/29/2016 04:05 PM, Andrew Dinn wrote: > Oops, forgot to link the webrev: > > http://cr.openjdk.java.net/~adinn/8146709/webrev.00/ OK. Thanks, Andrew.
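For readers following the "Freeze aarch64/jdk8" thread above, the MIN2 failure reported on s390 can be reproduced in miniature. This is only a sketch under stated assumptions: min2_sketch is a hypothetical stand-in for HotSpot's MIN2 (assumed here to take a single template type), limit_t and flag_t are illustrative substitutes for "long unsigned int" and uintx, and capped_code_cache_size is an invented helper, not code from the patch. It shows why two distinct argument types defeat template deduction and how the cast mentioned in the thread would resolve it.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for HotSpot's MIN2: a single template parameter,
// so both arguments must have exactly the same type.
template <class T>
inline T min2_sketch(T a, T b) { return (a < b) ? a : b; }

// Assumed stand-ins for the two types named in the compiler error.
typedef unsigned long limit_t;  // plays "long unsigned int" (size_t in the thread)
typedef unsigned int  flag_t;   // plays uintx (assumed to be a distinct type)

const limit_t kM = 1024UL * 1024UL;
const limit_t kCodeCacheDefaultLimit = 128 * kM;  // the 128 * M AArch64 limit

// Invented helper: caps (reserved * 5) at the limit, as the patched line does.
flag_t capped_code_cache_size(flag_t reserved) {
  // min2_sketch(kCodeCacheDefaultLimit, reserved * 5);
  //   ^ would not compile: T is deduced as both limit_t and flag_t.
  // The cast is only safe because the limit fits in flag_t.
  assert(kCodeCacheDefaultLimit <= std::numeric_limits<flag_t>::max());
  return min2_sketch(static_cast<flag_t>(kCodeCacheDefaultLimit), reserved * 5);
}
```

With these stand-ins, a 48 MB reservation is capped at 128 MB, while a 1 MB reservation simply becomes 5 MB. The approach agreed in the thread avoids the cast altogether by keeping the MIN2 call inside AARCH64_ONLY, so other architectures never compile it.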