From edward.nevill at gmail.com  Mon Feb  1 20:33:57 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Mon, 01 Feb 2016 20:33:57 +0000
Subject: [aarch64-port-dev ] RFR: 8148783: aarch64: SEGV running SpecJBB2013
Message-ID: <1454358837.11463.14.camel@mint>

Hi,

Please review the following webrev

http://cr.openjdk.java.net/~enevill/8148783/webrev.0/

JIRA Issue: https://bugs.openjdk.java.net/browse/JDK-8148783

The bug is explained in some detail in the JIRA issue.

The problem is that the sign is not preserved in the following code from adrp(...)

    long offset = dest_page - pc_page;
    offset = (offset & ((1<<20)-1)) << 12;

This generally works because the following movk overwrites bits 32..47

However, on larger memory systems of 256 GB it could happen that the PC address was

0x0000ffffXXXXXXXX

in which case the falsely positive offset could wrap to

0x00010000XXXXXXXX

Bit 48 does not get overwritten by the following movk, hence forming an invalid address.

The solution is to use int32_t for offset instead of long, so it gets sign extended correctly when added to the pc().

All the best,
Ed.


From aph at redhat.com  Tue Feb  2 15:16:13 2016
From: aph at redhat.com (Andrew Haley)
Date: Tue, 2 Feb 2016 15:16:13 +0000
Subject: [aarch64-port-dev ] RFR: 8148783: aarch64: SEGV running
 SpecJBB2013
In-Reply-To: <1454358837.11463.14.camel@mint>
References: <1454358837.11463.14.camel@mint>
Message-ID: <56B0C83D.900@redhat.com>

Hi,

On 02/01/2016 08:33 PM, Edward Nevill wrote:

> JIRA Issue: https://bugs.openjdk.java.net/browse/JDK-8148783
> 
> The bug is explained in some detail in the JIRA issue.
> 
> The problem is that the sign is not preserved in the following code
> from adrp(...)
> 
>     long offset = dest_page - pc_page;
>     offset = (offset & ((1<<20)-1)) << 12;
> 
> This generally works because the following movk overwrites bits 32..47
> 
> However on larger memory systems of 256 Gb it could happen that the
> PC address was
> 
> 0x0000ffffXXXXXXXX
> 
> in which case the falsely positive offset could wrap to
> 
> 0x00010000XXXXXXXX
> 
> Bit 48 does not get overwritten by the following movk, hence forming
> an invalid address.
> 
> The solution is to use int32_t for offset instead of long, so it
> gets sign extended correctly when added to the pc().

I can't accept that patch because the overflowing assignment from long
to int32_t is undefined behaviour.  It is also very obscure code.

Can you test the patch I've appended instead?  It tiptoes around the UB
and should be OK.

Thanks,

Andrew.


diff --git a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
@@ -3980,6 +3980,14 @@
   return inst_mark();
 }

+int64_t MacroAssembler::truncate_signed_bitfield(int64_t n, int width) {
+  // Left shifts of a signed integer are UB in Standard C++ but
+  // well-defined in GNU C++.
+  n <<= 64 - width;
+  n >>= 64 - width;
+  return n;
+}
+
 void MacroAssembler::adrp(Register reg1, const Address &dest, unsigned long &byte_offset) {
   relocInfo::relocType rtype = dest.rspec().reloc()->type();
   unsigned long low_page = (unsigned long)CodeCache::low_bound() >> 12;
@@ -3999,8 +4007,18 @@
     _adrp(reg1, dest.target());
   } else {
     unsigned long pc_page = (unsigned long)pc() >> 12;
-    long offset = dest_page - pc_page;
-    offset = (offset & ((1<<20)-1)) << 12;
+    unsigned long page_offset = dest_page - pc_page;
+
+    // The signed offset (in 4k pages) from PC to dest page.  We use a
+    // reference in order to avoid UB when converting from unsigned to
+    // signed.
+    long offset = reinterpret_cast<long&>(page_offset);
+
+    // The signed offset (in bytes) from the PC to the destination
+    // page.  We only want the 32 LSBs of the offset because the range
+    // of ADRP is +-2G, i.e. 32 bits.
+    offset = truncate_signed_bitfield(offset << 12, 32);
+
     _adrp(reg1, pc()+offset);
     movk(reg1, (unsigned long)dest.target() >> 32, 32);
   }
diff --git a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
@@ -85,9 +85,10 @@

   void call_VM_helper(Register oop_result, address entry_point, int number_of_arguments, bool check_exceptions = true);

-  // Maximum size of class area in Metaspace when compressed
   uint64_t use_XOR_for_compressed_class_base;

+  int64_t truncate_signed_bitfield(int64_t n, int width);
+
  public:
   MacroAssembler(CodeBuffer* code) : Assembler(code) {
     use_XOR_for_compressed_class_base


From edward.nevill at gmail.com  Wed Feb  3 11:45:52 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 03 Feb 2016 11:45:52 +0000
Subject: [aarch64-port-dev ] RFR: 8148948: aarch64: generate_copy_longs
	calls align() incorrectly
Message-ID: <1454499952.2021.7.camel@mylittlepony.linaroharston>

Hi,

Please review the following:

http://cr.openjdk.java.net/~enevill/8148948/webrev.0/

JIRA: https://bugs.openjdk.java.net/browse/JDK-8148948

The issue is that there are align statements of the form

 __ align(6)

in generate_copy_longs() whereas the correct alignment statement should be

 __ align(64)

In the proposed webrev I have changed the statements to

  __ align(CodeEntryAlignment);

However, CodeEntryAlignment is set to 16 for C1 and 64 for C2. I can see no reason why this is the case, so I am also proposing changing CodeEntryAlignment to 64 for both C1 & C2.

Other arches set CodeEntryAlignment as follows

sparc: 32
ppc: 128
x86: 32 (c2), 16 (c1)
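
As a rough illustration of the difference between align(6) and align(64), here is a sketch in C (not HotSpot code). It assumes align() takes a byte modulus and pads with 4-byte nops until the current offset is a multiple of that modulus; the helper name align_with_nops is made up for this example.

```c
#include <assert.h>

/* Hypothetical model of align(modulus): starting from a 4-byte-aligned
 * code offset, emit 4-byte nops until the offset is a multiple of
 * `modulus`, and return the resulting offset. */
static unsigned align_with_nops(unsigned offset, unsigned modulus)
{
  assert(modulus > 0 && offset % 4 == 0);
  while (offset % modulus != 0)
    offset += 4;                /* one nop */
  return offset;
}
```

Under that assumption, starting from offset 100, a modulus of 64 ends at 128, whereas a modulus of 6 ends at 108, which is not on a 64-byte boundary.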

Thanks,
Ed.


From aph at redhat.com  Wed Feb  3 11:48:48 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 3 Feb 2016 11:48:48 +0000
Subject: [aarch64-port-dev ] RFR: 8148948: aarch64: generate_copy_longs
 calls align() incorrectly
In-Reply-To: <1454499952.2021.7.camel@mylittlepony.linaroharston>
References: <1454499952.2021.7.camel@mylittlepony.linaroharston>
Message-ID: <56B1E920.3050705@redhat.com>

On 03/02/16 11:45, Edward Nevill wrote:
> Hi,
> 
> Please review the following:
> 
> http://cr.openjdk.java.net/~enevill/8148948/webrev.0/

Yes, OK.

Andrew.


From edward.nevill at gmail.com  Thu Feb  4 16:46:07 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 04 Feb 2016 16:46:07 +0000
Subject: [aarch64-port-dev ] RFR: 8148783: aarch64: SEGV running
 SpecJBB2013
In-Reply-To: <56B0C83D.900@redhat.com>
References: <1454358837.11463.14.camel@mint> <56B0C83D.900@redhat.com>
Message-ID: <1454604367.22510.28.camel@mylittlepony.linaroharston>

On Tue, 2016-02-02 at 15:16 +0000, Andrew Haley wrote:
> Hi,
> 
> On 02/01/2016 08:33 PM, Edward Nevill wrote:
> 
> > JIRA Issue: https://bugs.openjdk.java.net/browse/JDK-8148783
> Can you test the patch I've appended instead?  It tiptoes around the UB
> and should be OK.

Hi,

Unfortunately this still fails. I have written a small simulacrum of the problem in C below. The following is the output.

ed at arm64:~/tmp/adrp$ ./adrp
original_adrp: pc = 0xffff70000000, dest = 0xfffe00000000, offset = 0x90000000, addr = 0x1000000000000
original_adrp: pc = 0xfffffffff000, dest = 0xfffe00000000, offset = 0x1000, addr = 0x1000000000000
new_adrp: pc = 0xffff70000000, dest = 0xfffe00000000, offset = 0xffffffff90000000, addr = 0xffff00000000
new_adrp: pc = 0xfffffffff000, dest = 0xfffe00000000, offset = 0x1000, addr = 0x1000000000000  <<<<< HERE bit 48 set

The original generated an invalid address in both cases (where offset is +ve and -ve). The new version generates the correct output when the offset is -ve, however a +ve offset still generates an address with bit 48 set.

A second problem is the following code in pd_patch_instruction

        // movk #imm16<<32
        Instruction_aarch64::patch(branch + 4, 20, 5, (uint64_t)target >> 32);
        offset &= (1<<20)-1;
        instructions = 2;

This is essentially doing the same thing as the original adrp, so even when the original adrp got the instruction correct the subsequent patching broke it again.

I have attached a new webrev which fixes both these issues in a much simpler manner.

http://cr.openjdk.java.net/~enevill/8148783/webrev.2

The key is to construct the instructions exactly as we are using them. When we use an adrp/movk combination to construct a 48 bit address we are using the adrp to construct the bottom 32 bits (with the bottom 12 bits 0) and the movk to construct bits 32..47, overwriting any values the adrp may have put in bits 32..47.

So the instruction sequence is

  adrp Xn, 0xXXXXAAAAA000
  movk Xn, 0xAAAA00000000

Where A represents required bits of the address and XXXX represent don't care bits. The only requirement on the XXXX bits is that they must be reachable using the adrp instruction.

The webrev ensures this by using bits 32..47 from the PC and bits 0..31 from the destination address. The fact that we use the XXXX bits from the PC ensures the requirement that the address is reachable and using only the bottom 32 bits of the dest ensures we only get the bits we actually want the adrp instruction to construct and not any extraneous bits in bits 48 etc.

The code that does this is

   unsigned long adrp_target = (target & 0xffffffffUL) | (source & 0xffff00000000UL);

and this is also reflected in pd_patch_instruction to calculate the adrp target there.
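
The effect of the adrp/movk pair can be modelled in plain C. This is only a sketch under stated assumptions: addresses fit in 48 bits, adrp reaches +-4GB in 4k pages, and the function name adrp_movk_result is invented for the illustration (it is not part of the webrev).

```c
#include <assert.h>
#include <stdint.h>

/* Model of the adrp/movk pair (illustrative only). Returns the value
 * left in the register after both instructions have executed. */
static uint64_t adrp_movk_result(uint64_t pc, uint64_t target)
{
  /* Bits 0..31 come from the target, bits 32..47 from the PC. */
  uint64_t adrp_target = (target & 0xffffffffULL) | (pc & 0xffff00000000ULL);

  /* adrp encodes a signed 21-bit page offset relative to the PC's page. */
  int64_t page_delta = (int64_t)(adrp_target >> 12) - (int64_t)(pc >> 12);
  assert(page_delta >= -(1LL << 20) && page_delta < (1LL << 20));

  /* adrp: PC page plus page offset, low 12 bits cleared. */
  uint64_t reg = ((pc >> 12) + (uint64_t)page_delta) << 12;

  /* movk #imm16, lsl #32: replace bits 32..47, keep all other bits. */
  reg = (reg & ~0xffff00000000ULL) | (((target >> 32) & 0xffffULL) << 32);
  return reg;
}
```

For the failing case from the output above (pc = 0xfffffffff000, dest = 0xfffe00000000) this model yields 0xfffe00000000, with bit 48 clear.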

All the best,
Ed.

--- adrp.c ---
#include <stdio.h>

void original_adrp(unsigned long pc, unsigned long dest)
{
  unsigned long dest_page = dest >> 12;
  unsigned long pc_page = pc >> 12;
  long offset = dest_page - pc_page;
  offset = (offset & ((1<<20)-1)) << 12;
  printf("original_adrp: pc = 0x%lx, dest = 0x%lx, offset = 0x%lx, addr = 0x%lx\n",
         pc, dest, offset, pc+offset);
}

long truncate_signed_bitfield(long n, int width) {
  // Left shifts of a signed integer are UB in Standard C++ but
  // well-defined in GNU C++.
  n <<= 64 - width;
  n >>= 64 - width;
  return n;
}

void new_adrp(unsigned long pc, unsigned long dest)
{
  unsigned long dest_page = dest >> 12;
  unsigned long pc_page = pc >> 12;
  unsigned long page_offset = dest_page - pc_page;
  long offset = page_offset;
  offset = truncate_signed_bitfield(offset << 12, 32);
  printf("new_adrp: pc = 0x%lx, dest = 0x%lx, offset = 0x%lx, addr = 0x%lx\n",
         pc, dest, offset, pc+offset);
}

int main(void)
{
  original_adrp(0x0000ffff70000000, 0x0000fffe00000000);
  original_adrp(0x0000fffffffff000, 0x0000fffe00000000);

  new_adrp(0x0000ffff70000000, 0x0000fffe00000000);
  new_adrp(0x0000fffffffff000, 0x0000fffe00000000);
}
--- cut here ---


From aph at redhat.com  Thu Feb  4 17:02:05 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 4 Feb 2016 17:02:05 +0000
Subject: [aarch64-port-dev ] RFR: 8148783: aarch64: SEGV running
 SpecJBB2013
In-Reply-To: <1454604367.22510.28.camel@mylittlepony.linaroharston>
References: <1454358837.11463.14.camel@mint> <56B0C83D.900@redhat.com>
	<1454604367.22510.28.camel@mylittlepony.linaroharston>
Message-ID: <56B3840D.9060301@redhat.com>

On 02/04/2016 04:46 PM, Edward Nevill wrote:
> 
> The webrev ensures this by using bits 32..47 from the PC and bits
> 0..31 from the destination address. The fact that we use the XXXX
> bits from the PC ensures the requirement that the address is
> reachable and using only the bottom 32 bits of the dest ensures we
> only get the bits we actually want the adrp instruction to construct
> and not any extraneous bits in bits 48 etc.
> 
> The code that does this is
> 
>    unsigned long adrp_target = (target & 0xffffffffUL) | (source & 0xffff00000000UL);
> 
> and this is also reflected in pd_patch_instruction to calculate the adrp target there.

Much better, but this still is confusing.  Surely you can do

    unsigned long target = (unsigned long)dest.target();
    unsigned long adrp_target
      = (target & 0xffffffffUL) | ((unsigned long)pc() & 0xffff00000000UL);

    _adrp(reg1, (address)adrp_target);
    movk(reg1, target >> 32, 32);
  }

"source" doesn't really mean anything here.

OK with that change.

Andrew.

From hui.shi at linaro.org  Fri Feb  5 12:47:35 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Fri, 5 Feb 2016 20:47:35 +0800
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize disjoint
	array copy in stub code
Message-ID: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>

Hi,

Could someone help review this changeset?  It improves performance for
code such as string builder appends and string concatenation on aarch64.
Bug: https://bugs.openjdk.java.net/browse/JDK-8149080
webrev: http://cr.openjdk.java.net/~hshi/8149080/webrev/

Arraycopy without overlap is faster than overlapped copy. If overlap
information is unknown at JIT time, the stub code checks at runtime whether
the arraycopy src and dest arrays overlap; if they do not, the stub performs
the faster non-overlapping array copy. In the current aarch64 implementation,
the stub code only checks whether dest is below src. This doesn't cover the
case where dest is above src but the arrays still do not overlap (which the
x86 stub does handle).

The fix is to check both conditions: if (dest-src) is above or equal to the
copy size, the arrays do not overlap and the stub code can jump to the
non-overlapping copy. Another modification is adding a StubCodeMark for the
backward/forward copy longs on aarch64, so code in these sections gets
profiled with the correct stub name.
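
The combined check can be sketched in C as a single unsigned comparison. This is an illustration only; the function name can_copy_forwards is hypothetical and the stub itself does this in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if copying `size` bytes forwards from src to dest is safe.
 * The unsigned subtraction wraps to a huge value when dest is below src,
 * so one "above or equal" comparison covers both "dest below src" and
 * "dest at or beyond src + size". */
static int can_copy_forwards(uintptr_t src, uintptr_t dest, size_t size)
{
  assert(size <= UINTPTR_MAX / 2);
  return (uintptr_t)(dest - src) >= size;
}
```

Only the case where dest lies inside [src, src + size) fails the comparison and needs the backward (conjoint) copy.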

Regards
Hui

From aph at redhat.com  Fri Feb  5 12:58:55 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 5 Feb 2016 12:58:55 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
Message-ID: <56B49C8F.5030409@redhat.com>

On 02/05/2016 12:47 PM, Hui Shi wrote:
> Arraycopy without overlapping is faster than overlapped copy.

The only thing which varies is the direction of copying.  I'm not
aware of anything which makes one direction faster than the other.
Measurements, please.

Andrew.


From edward.nevill at gmail.com  Fri Feb  5 14:32:41 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Fri, 05 Feb 2016 14:32:41 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <56B49C8F.5030409@redhat.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B49C8F.5030409@redhat.com>
Message-ID: <1454682761.26562.19.camel@mint>

On Fri, 2016-02-05 at 12:58 +0000, Andrew Haley wrote:
> On 02/05/2016 12:47 PM, Hui Shi wrote:
> > Arraycopy without overlapping is faster than overlapped copy.
> 
> The only thing which varies is the direction of copying.  I'm not
> aware of anything which makes one direction faster than the other.
> Measurements, please.

Copy backwards doesn't prefetch. The difference with and without
prefetch can be very significant on some micro-arches.

    if (direction == copy_forwards && PrefetchCopyIntervalInBytes > 0)
      __ prfm(Address(s, PrefetchCopyIntervalInBytes), PLDL1KEEP);

I have done some experiments with prefetch enabled for backwards copy
and it shows almost identical performance to forwards copy.

Regards,
Ed.


From aph at redhat.com  Fri Feb  5 14:37:46 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 5 Feb 2016 14:37:46 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <1454682761.26562.19.camel@mint>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint>
Message-ID: <56B4B3BA.602@redhat.com>

On 02/05/2016 02:32 PM, Edward Nevill wrote:
> On Fri, 2016-02-05 at 12:58 +0000, Andrew Haley wrote:
>> On 02/05/2016 12:47 PM, Hui Shi wrote:
>>> Arraycopy without overlapping is faster than overlapped copy.
>>
>> The only thing which varies is the direction of copying.  I'm not
>> aware of anything which makes one direction faster than the other.
>> Measurements, please.
> 
> Copy backwards doesn't prefetch. The difference with and without
> prefetch can be very significant on some micro-arches.
> 
>     if (direction == copy_forwards && PrefetchCopyIntervalInBytes > 0)
>       __ prfm(Address(s, PrefetchCopyIntervalInBytes), PLDL1KEEP);
> 
> I have done some experiments with prefetch enabled for backwards copy
> and it shows almost identical performance to forwards copy.

OK, so let's do that, then.

Andrew.


From hui.shi at linaro.org  Sat Feb  6 11:52:19 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Sat, 6 Feb 2016 19:52:19 +0800
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <56B4B3BA.602@redhat.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint>
	<56B4B3BA.602@redhat.com>
Message-ID: <CAF1YaiBLSHKe9sJkrEZs6=kZC3=z2iuBujRCyz=a70Q3nu6x+w@mail.gmail.com>

Thanks Andrew and Edward!

The code sequences for backward and forward array copy are almost the same
except for the prefetch. The performance test is based on
http://cr.openjdk.java.net/~hshi/8149080/testcase/StringConcatTest.java run
with "java StringConcatTest 5000". I disabled the prefetch and compared
performance between backward and forward array copy (all forward with my
patch; all backward forced by commenting out the branch to the nooverlap
target and forcing jshort_disjoint_copy to generate a conjoint copy).
Forward array copy is much faster than backward: backward takes about 85s
and forward about 60s.

This test tries to reflect common cases like string builder/buffer append
and string concatenation. These are disjoint array copies, and forward array
copy is better than backward in the following two respects:

1. Forward array copy can prefetch the dest address needed by the next
string append.
  Most string append/concatenation operations append chars after previously
appended char arrays.
  For example, str = str1 + str2 + str3
  1. when str1 is appended in forward order, the result value array
(str.value) is prefetched beyond str1's length by the hardware prefetcher
  2. when str2.value is stored into str.value, str.value is already
prefetched, so there are fewer cache misses when copying str2.value into
str.value
  If the copy is done in backward order, after copying str1.value into
str.value it is the addresses before str.value[0] that get prefetched, which
is not useful for the next append.
  Checking the following PMU events on A57 (
http://cr.openjdk.java.net/~hshi/8149080/testcase/backward.perf,
http://cr.openjdk.java.net/~hshi/8149080/testcase/forward.perf), forward
array copy gets more accurate hardware prefetcher results (more of the
issued requests are used). Comparing forward copy with and without the
prefetch instruction shows no performance difference, so the hardware
prefetcher might be good enough.

  0x167 Level 2 prefetcher request used (or demanded)
  0x168 Level 2 prefetcher request issued
  In forward array copy 94% of issued requests are used (r167/r168)
  In backward array copy 67% of issued requests are used

  http://cr.openjdk.java.net/~hshi/8149080/testcase/DiscreteCopy.java
tests array copy not in append mode: each array copy operates on a separate
address. Run with "java DiscreteCopy 3000", forward copy takes 58s and
backward array copy takes 70s. The gap is smaller.

2. Backward array copy might cause many more unaligned memory accesses in
string append/concatenation.
   The current array copy implementation is:
   1. peel the source array address to 16-byte alignment (backward peels
from the source end), copying 8,4,2,1 bytes
   2. perform copy_longs
   3. copy the tail of fewer than 16 bytes, copying 8,4,2,1 bytes

   In string append/concatenation cases, the source string value array is
usually 8-byte or 16-byte aligned. Suppose the source address is 16-byte
aligned and the size is n*16+14.
   With forward array copy: n ld/st pairs, then an aligned 8-byte store,
then an aligned 4-byte store, then an aligned 2-byte store.
   With backward array copy the source end address must be peeled first (see
copy_memory_small): an unaligned 8-byte store, an unaligned 4-byte store,
an aligned 2-byte store, then n ld/st pairs.
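
The n*16+14 example can be checked with a small C model. It is a simplification for illustration: it only tracks one 16-byte-aligned base address (so it treats source and destination as equally aligned), it assumes the 8,4,2,1 order described above, and the function name unaligned_small_accesses is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Count the 8/4/2/1-byte accesses that land on an address not aligned to
 * their own width. With backwards != 0 the small accesses are peeled from
 * the end of the array; otherwise they form the tail after the 16-byte
 * chunks. */
static int unaligned_small_accesses(uint64_t base, uint64_t size, int backwards)
{
  assert(base % 16 == 0);
  uint64_t rem = size % 16;     /* bytes not covered by 16-byte pairs */
  uint64_t pos = backwards ? base + size : base + size - rem;
  int unaligned = 0;

  for (uint64_t w = 8; w >= 1; w /= 2) {
    if (!(rem & w))
      continue;
    if (backwards)
      pos -= w;                 /* access covers [pos, pos + w) */
    if (pos % w != 0)
      unaligned++;
    if (!backwards)
      pos += w;
  }
  return unaligned;
}
```

For a size of 3*16+14 bytes the model reports no unaligned small accesses in the forward direction and two (the 8-byte and 4-byte ones) in the backward direction, matching the description above.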

   Profiling unaligned accesses with perf on DiscreteCopy shows a massive
number of unaligned accesses for backward array copy and none for forward
array copy.
http://cr.openjdk.java.net/~hshi/8149080/testcase/AlignedDiscreteCopy.java
tests array copy with 16-byte-aligned sizes; performance is identical for
backward and forward array copy, both about 64s.


Performing forward array copy when possible will not make things worse and
benefits common cases like string append/concatenation. This was the original
logic when generating conjoint array copy; this patch completes that logic by
recognizing all disjoint array copies. Does this make sense?

Regards
Hui

On 5 February 2016 at 22:37, Andrew Haley <aph at redhat.com> wrote:

> On 02/05/2016 02:32 PM, Edward Nevill wrote:
> > On Fri, 2016-02-05 at 12:58 +0000, Andrew Haley wrote:
> >> On 02/05/2016 12:47 PM, Hui Shi wrote:
> >>> Arraycopy without overlapping is faster than overlapped copy.
> >>
> >> The only thing which varies is the direction of copying.  I'm not
> >> aware of anything which makes one direction faster than the other.
> >> Measurements, please.
> >
> > Copy backwards doesn't prefetch. The difference with and without
> > prefetch can be very significant on some micro-arches.
> >
> >     if (direction == copy_forwards && PrefetchCopyIntervalInBytes > 0)
> >       __ prfm(Address(s, PrefetchCopyIntervalInBytes), PLDL1KEEP);
> >
> > I have done some experiments with prefetch enabled for backwards copy
> > and it shows almost identical performance to forwards copy.
>
> OK, so let's do that, then.
>
> Andrew.
>
>
>

From hui.shi at linaro.org  Sat Feb  6 12:24:03 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Sat, 6 Feb 2016 20:24:03 +0800
Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal support
Message-ID: <CAF1YaiDW0VB_dh0AyS1AEfwOoqCemGhJXNRPAEU07LfDS-rdKg@mail.gmail.com>

Hi All,

Would someone help review this patch for adding byte array equal support on
aarch64?

bug: https://bugs.openjdk.java.net/browse/JDK-8149100
webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/

For http://cr.openjdk.java.net/~hshi/8149100/testcase/ArrayEqual.java,
a debug build run fails with a "bad AD file" assertion on aarch64.
 # To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/matcher.cpp:1605
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error
(/home/shihui/jdk9-hs-comp/hotspot/src/share/vm/opto/matcher.cpp:1605),
pid=8501, tid=8746
# assert(false) failed: bad AD file
#

Debugging shows that the AryEqNode's encoding is StrIntrinsicNode::LL, which
is not supported on aarch64 at present.
1605 assert( false, "bad AD file" );
(gdb) p ((AryEqNode*)n)->encoding()
$1 = StrIntrinsicNode::LL
(gdb)

The fix is to add support for the StrIntrinsicNode::LL encoding of the array
equals operation, as Latin-1 String compare might become important in JDK9
with the new String.
1. Add MacroAssembler::byte_arrays_equals to support the byte array equals
check.
2. Add a new array_equalsB rule for when the AryEq encoding is
StrIntrinsicNode::LL.

http://cr.openjdk.java.net/~hshi/8149100/testcase/byte_array_equals.asm
shows newly generated assembly.

A release build invokes the Array.equals method before this patch; with this
patch, there is a significant improvement on the ArrayEqual case.
time -p openjdk-9-internal.base/bin/java  ArrayEqual
real 54.98
user 55.13

time -p openjdk-9-internal.byteEquals/bin/java  ArrayEqual
real 28.59
user 28.62
sys 0.14

The following code sequence can be replaced with tbz (when the tst constant
is exactly a power of two). Such sequences exist in other places
(MacroAssembler::char_arrays_equals, interpreter, etc). I would like to
clean them all up together in a separate changeset.
    tst(cnt1, 0b10);
    br(EQ, TAIL01);
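
The condition for this rewrite can be expressed as a small C helper. This is purely illustrative (the name single_bit_index is not from HotSpot): a tst/br(EQ) pair can become a single test-bit-and-branch only when the mask has exactly one bit set, and the bit to test is that bit's index.

```c
#include <assert.h>
#include <stdint.h>

/* If `mask` has exactly one bit set, return that bit's index (the bit
 * number a tbz/tbnz would test); otherwise return -1. */
static int single_bit_index(uint64_t mask)
{
  if (mask == 0 || (mask & (mask - 1)) != 0)
    return -1;                  /* zero, or more than one bit set */
  int bit = 0;
  while ((mask >>= 1) != 0)
    bit++;
  assert(bit >= 0 && bit < 64);
  return bit;
}
```

For the mask 0b10 used above the helper returns 1, so the pair would correspond to a tbz on bit 1 of cnt1.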

Regards
Hui

From aph at redhat.com  Sun Feb  7 10:24:39 2016
From: aph at redhat.com (Andrew Haley)
Date: Sun, 7 Feb 2016 10:24:39 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <CAF1YaiBLSHKe9sJkrEZs6=kZC3=z2iuBujRCyz=a70Q3nu6x+w@mail.gmail.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint>
	<56B4B3BA.602@redhat.com>
	<CAF1YaiBLSHKe9sJkrEZs6=kZC3=z2iuBujRCyz=a70Q3nu6x+w@mail.gmail.com>
Message-ID: <56B71B67.8080304@redhat.com>

On 06/02/16 11:52, Hui Shi wrote:

> Code sequence for backward and forward array copy is almost same
> except prefetch. Performance test is based on
> http://cr.openjdk.java.net/~hshi/8149080/testcase/StringConcatTest.java
> run with "java StringConcatTest 5000". I tried disabling prefetch
> and compare performance between backward and forward array copy (all
> forward with my patch, force all backward by commenting out branch
> to nooverlap target and force jshort_disjoint_copy generate conjoint
> copy), forward array copy is much faster than backward. backward is
> about 85s and forward copy is about 60s.
> 
> This test is try to reflect common cases like string builder/buffer
> append and string concatenation, these are disjoint array copy and
> forward array copy is better than backward in following two aspects:

You're confusing me.  String concatenation is a disjoint array copy.
Therefore it always copies forwards, does it not?

> 1. Forward array copy can prefetch dest address needed in next string
> append.

So can backwards array copy, surely.

> 2. Backward array copy might cause much more unaligned memory access in
> string append/concatenation.

Okay, I see.  That is fixable: we can make sure that there are no
more misaligned accesses in either direction.

> Perform forward array copy when possible will not make things worse and
> benefit common cases like string append/concatenation. This is the original
> logic when generate conjoint array copy, this patch complete this logic by
> recognize all disjoint array copy. Does this make sense?

Yes, but it's a kludge.  I'd much rather fix backwards copies so that
they were just as fast.  If that's not possible then your patch may be
acceptable, but I think we should first try to fix backwards copies.
We should be able to fix this the *right way*, by using prefetch
instructions and making sure copies are aligned where possible.  When
I did my testing misaligned fetches were quite fast, and it didn't
seem worth the effort to fix it.

But I'm really mystified by why String concatenation doesn't always
use forward copies anyway.

Andrew.

From aph at redhat.com  Sun Feb  7 10:35:10 2016
From: aph at redhat.com (Andrew Haley)
Date: Sun, 7 Feb 2016 10:35:10 +0000
Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal
	support
In-Reply-To: <CAF1YaiDW0VB_dh0AyS1AEfwOoqCemGhJXNRPAEU07LfDS-rdKg@mail.gmail.com>
References: <CAF1YaiDW0VB_dh0AyS1AEfwOoqCemGhJXNRPAEU07LfDS-rdKg@mail.gmail.com>
Message-ID: <56B71DDE.2040109@redhat.com>

On 06/02/16 12:24, Hui Shi wrote:
> Hi All,
> 
> Would someone help review this patch for adding byte array equal support on
> aarch64?
> 
> bug: https://bugs.openjdk.java.net/browse/JDK-8149100
> webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/

Ok, thanks.

> Following code sequence can be replaced with tbz (when tst has constant
> exactly two's n times value), these code sequence exist in other
> places(MacroAssembler::char_arrays_equals, interpreter, etc). I would like
> clean all together in another separate changeset.
>     tst(cnt1, 0b10);
>     br(EQ, TAIL01);

Right.

Andrew.


From edward.nevill at gmail.com  Mon Feb  8 08:39:00 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Mon, 08 Feb 2016 08:39:00 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <CAF1YaiBLSHKe9sJkrEZs6=kZC3=z2iuBujRCyz=a70Q3nu6x+w@mail.gmail.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint>
	<56B4B3BA.602@redhat.com>
	<CAF1YaiBLSHKe9sJkrEZs6=kZC3=z2iuBujRCyz=a70Q3nu6x+w@mail.gmail.com>
Message-ID: <1454920740.26562.28.camel@mint>

On Sat, 2016-02-06 at 19:52 +0800, Hui Shi wrote:

> Code sequence for backward and forward array copy is almost same except prefetch. Performance test is based on http://cr.openjdk.java.net/~hshi/8149080/testcase/StringConcatTest.java run with "java StringConcatTest 5000". I tried disabling prefetch and compare performance between backward and forward array copy (all 

Hi,

How did you disable the prefetch? Did you use -XX:PrefetchCopyIntervalInBytes=0?

There is a bug/feature in vm_version_aarch64.cpp where it does

  FLAG_SET_DEFAULT(PrefetchCopyIntervalInBytes, 256);

overwriting any previous value, whereas it should do

  if (FLAG_IS_DEFAULT(PrefetchCopyIntervalInBytes))
    FLAG_SET_DEFAULT(PrefetchCopyIntervalInBytes, 256);


> 1. Forward array copy can prefetch dest address needed in next string append.
> 
>   Most string append/concatenation operations will append chars after early appened char arrays. 
>   For example, str = str1 + str2 + str3
>   1. when append str1 in forward order, result value array(str.value) will be prefetched beyond str1's length with hardware prefetcher
>   2. when store str2.value into str.value, str.value is already prefetched, less cache miss when copy str2.value into str.value
>   If copy in backward order, after copy str1.value into str.value, it's address before str.value[0] get prefetched, this is not useful for next append.

I assume you are talking about automatic hardware prefetching here, since the SW implementation does not do any prefetching on the destination? In that case I can see how repeated forward copies may be more efficient for string concatenation.

All the best,
Ed.


From hui.shi at linaro.org  Mon Feb  8 11:55:34 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Mon, 8 Feb 2016 19:55:34 +0800
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <56B71B67.8080304@redhat.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint>
	<56B4B3BA.602@redhat.com>
	<CAF1YaiBLSHKe9sJkrEZs6=kZC3=z2iuBujRCyz=a70Q3nu6x+w@mail.gmail.com>
	<56B71B67.8080304@redhat.com>
Message-ID: <CAF1YaiC_obw1pPPZnqshcs-X6wV_orcQbVJYXrUogUz93OeU1w@mail.gmail.com>

Thanks Andrew!


> You're confusing me.  String concatenation is a disjoint array copy.
> Therefore it always copies forwards, does it not?
>

Yes, it would be better if the JIT could recognize this at compile time. The
previous performance data was collected on Java 8 (so the copy is a short
array copy, while in JDK9 it is a byte array copy). I checked both JDK8 and
JDK9: they invoke Stub::jshort_arraycopy and Stub::jbyte_arraycopy
respectively. One reason might be that JIT-time determination is not seen as
important because there is a run-time check for disjoint array copy.


> > 1. Forward array copy can prefetch dest address needed in next string
> > append.
>
> So can backwards array copy, surely.
>

Could you please give more details about how backward array copy can also
utilize the hardware prefetcher in the multiple string append case?


>
> > 2. Backward array copy might cause much more unaligned memory access in
> > string append/concatenation.
>
> Okay, I see.  That is fixable: we can make sure that there are no
> more misaligned accesses in either direction.
>
> > Perform forward array copy when possible will not make things worse and
> > benefit common cases like string append/concatenation. This is the
> original
> > logic when generate conjoint array copy, this patch complete this logic
> by
> > recognize all disjoint array copy. Does this make sense?
>
> Yes, but it's a kludge.  I'd much rather fix backwards copies so that
> they were just as fast.  If that's not possible then your patch may be
> acceptable, but I think we should first try to fix backwards copies.
> We should be able to fix this the *right way*, by using prefetch
> instructions and making sure copies are aligned where possible.  When
> I did my testing misaligned fetches were quite fast, and it didn't
> seem worth the effort to fix it.
>
>
I agree with inserting prefetch into the backward copy to make backward array
copy faster. For the misalignment issue in backward array copy, we might copy
in 1, 2, 4, 8 byte order to align it.
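
The 1, 2, 4, 8 peeling idea can be sketched roughly as follows. This is only an illustration in Java, not the stub code: the class BackwardPeel and the method peelSizes are hypothetical names, and the sketch assumes a backward copy that starts at the end address and wants that address 16-byte aligned before the bulk loop.

```java
import java.util.ArrayList;
import java.util.List;

class BackwardPeel {
    // Hypothetical helper: returns the sizes (in bytes) of the small copies
    // a backward copy could peel off, in 1, 2, 4, 8 order, so that the end
    // address becomes 16-byte aligned. A size is skipped when the matching
    // address bit is already clear or fewer than `size` bytes remain.
    static List<Integer> peelSizes(long end, long count) {
        List<Integer> sizes = new ArrayList<>();
        for (int size = 1; size < 16; size <<= 1) {
            if ((end & size) != 0 && count >= size) {
                sizes.add(size);
                end -= size;    // a backward copy moves the end address down
                count -= size;
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(peelSizes(0x100BL, 64));
        System.out.println(peelSizes(0x1000L, 64));
    }
}
```

Each peeled copy is naturally aligned for its own size, because the lower address bits have already been cleared by the earlier, smaller copies.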


> But I'm really mystified by why String concatenation doesn't always
> use forward copies anyway.
>
> Andrew.
>

From hui.shi at linaro.org  Mon Feb  8 11:58:15 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Mon, 8 Feb 2016 19:58:15 +0800
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <1454920740.26562.28.camel@mint>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint>
	<56B4B3BA.602@redhat.com>
	<CAF1YaiBLSHKe9sJkrEZs6=kZC3=z2iuBujRCyz=a70Q3nu6x+w@mail.gmail.com>
	<1454920740.26562.28.camel@mint>
Message-ID: <CAF1YaiCCFFgZ=6Zz2ZWChOmoUud9Ws6TEEVhHMj5YYqZHEbCTQ@mail.gmail.com>

Thanks Edward!

>
>
> How did you disable the prefetch? Did you use
> -XX:PrefetchCopyIntervalInBytes=0?


I disabled prefetch by removing the prefetch instructions from the stub code and rebuilding.

>
>
> > 1. Forward array copy can prefetch dest address needed in next string
> append.
> >
> >   Most string append/concatenation operations will append chars after
> early appened char arrays.
> >   For example, str = str1 + str2 + str3
> >   1. when append str1 in forward order, result value array(str.value)
> will be prefetched beyond str1's length with hardware prefetcher
> >   2. when store str2.value into str.value, str.value is already
> prefetched, less cache miss when copy str2.value into str.value
> >   If copy in backward order, after copy str1.value into str.value, it's
> address before str.value[0] get prefetched, this is not useful for next
> append.
>
> I assume you are talking about automatic hardware prefetching here since
> the SW implementation does not do any prefetching on the destination? In
> that case I can see how repeated forward copys may be more efficient for
> string concatenation.
>
>
Yes, it's the hardware prefetcher. perf profiling shows the hardware-generated
prefetches issued and used, and the hardware prefetcher hit rate is much
higher with forward array copy.

Regards
Hui

From hui.shi at linaro.org  Mon Feb  8 11:59:40 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Mon, 8 Feb 2016 19:59:40 +0800
Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal
	support
In-Reply-To: <56B71DDE.2040109@redhat.com>
References: <CAF1YaiDW0VB_dh0AyS1AEfwOoqCemGhJXNRPAEU07LfDS-rdKg@mail.gmail.com>
	<56B71DDE.2040109@redhat.com>
Message-ID: <CAF1YaiDWx6GJ2XzkwcRRUk4EAvgYkJc4bDFabCWOBQUE3QZquQ@mail.gmail.com>

Thanks Andrew!

Could someone help push this change?

Regards
Hui

On 7 February 2016 at 18:35, Andrew Haley <aph at redhat.com> wrote:

> On 06/02/16 12:24, Hui Shi wrote:
> > Hi All,
> >
> > Would someone help review this patch for adding byte array equal support
> on
> > aarch64?
> >
> > bug: https://bugs.openjdk.java.net/browse/JDK-8149100
> > webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/
>
> Ok, thanks.
>
> > Following code sequence can be replaced with tbz (when tst has constant
> > exactly two's n times value), these code sequence exist in other
> > places(MacroAssembler::char_arrays_equals, interpreter, etc). I would
> like
> > clean all together in another separate changeset.
> >     tst(cnt1, 0b10);
> >     br(EQ, TAIL01);
>
> Right.
>
> Andrew.
>
>
>
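
The condition under which the tst/br pair quoted above can become a single tbz can be sketched like this. It is an illustration only: SingleBitMask, isSingleBit and bitIndex are hypothetical names, not MacroAssembler functions. The idea is that tst with a constant that has exactly one bit set, followed by br(EQ, ...), branches when that bit is zero, which is what tbz tests directly.

```java
class SingleBitMask {
    // True when the mask has exactly one bit set (a power of two).
    static boolean isSingleBit(long mask) {
        return mask != 0 && (mask & (mask - 1)) == 0;
    }

    // Bit number a tbz would test in place of "tst reg, mask; br EQ".
    static int bitIndex(long mask) {
        if (!isSingleBit(mask)) {
            throw new IllegalArgumentException("mask must have exactly one bit set");
        }
        return Long.numberOfTrailingZeros(mask);
    }

    public static void main(String[] args) {
        // For tst(cnt1, 0b10) the tested bit is bit 1.
        System.out.println("bit index for 0b10: " + bitIndex(0b10));
    }
}
```

Under that reading, tst(cnt1, 0b10); br(EQ, TAIL01) would correspond to testing bit 1 of cnt1 and branching to TAIL01 when it is clear.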

From aph at redhat.com  Mon Feb  8 14:32:44 2016
From: aph at redhat.com (Andrew Haley)
Date: Mon, 8 Feb 2016 14:32:44 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <CAF1YaiC_obw1pPPZnqshcs-X6wV_orcQbVJYXrUogUz93OeU1w@mail.gmail.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B49C8F.5030409@redhat.com> <1454682761.26562.19.camel@mint>
	<56B4B3BA.602@redhat.com>
	<CAF1YaiBLSHKe9sJkrEZs6=kZC3=z2iuBujRCyz=a70Q3nu6x+w@mail.gmail.com>
	<56B71B67.8080304@redhat.com>
	<CAF1YaiC_obw1pPPZnqshcs-X6wV_orcQbVJYXrUogUz93OeU1w@mail.gmail.com>
Message-ID: <56B8A70C.6050300@redhat.com>

On 08/02/16 11:55, Hui Shi wrote:
> Could you please give more details about how backward array copy can also
> utilize hardware prefetcher in multiple string append case?

I do not understand this question.  There is AFAIK no "multiple string
append case": string concatenation is done by a single array copy, and
multiple concatenations are done as several array copies.  Therefore,
all we have to do is make sure that char/byte array copies are fast in
both directions.  And if the problem is prefetching, we know how to do
that.

Andrew.
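
For readers following the thread, a rough model of the point above, that several concatenations are simply several array copies into one destination at increasing offsets, might look like this. It is a sketch only: ConcatSketch and concat are hypothetical names and this is not how StringBuilder is implemented internally.

```java
class ConcatSketch {
    // Joins the parts with one System.arraycopy per part. Each source array
    // is disjoint from the destination, and each copy lands immediately
    // after the previous one.
    static char[] concat(char[]... parts) {
        int total = 0;
        for (char[] p : parts) {
            total += p.length;
        }
        char[] out = new char[total];
        int pos = 0;
        for (char[] p : parts) {
            System.arraycopy(p, 0, out, pos, p.length);
            pos += p.length;
        }
        return out;
    }

    public static void main(String[] args) {
        char[] joined = concat("str1".toCharArray(), "str2".toCharArray(), "str3".toCharArray());
        System.out.println(new String(joined));
    }
}
```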


From edward.nevill at gmail.com  Mon Feb  8 14:37:26 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Mon, 08 Feb 2016 14:37:26 +0000
Subject: [aarch64-port-dev ] RFR: 8149365: memory copy does not prefetch on
	backwards copy
Message-ID: <1454942246.11423.18.camel@mint>

Hi,

The following webrev

http://cr.openjdk.java.net/~enevill/8149365/webrev.0/

adds support for prefetch on backwards copies (previously prefetch was only done on forwards copies).

It also fixes a 'feature' where the command line option -XX:PrefetchCopyIntervalInBytes=N is ignored and the value 256 always used instead.

I have benchmarked it using the following test program

http://cr.openjdk.java.net/~enevill/8149365/ArrayCopyTest.java

Which allows you to test memory copies of different sizes from a start size to an end size in step units. The test does both backwards and forwards copies.

Usage:

java ArrayCopyTest <iters> <start> <end> <step>
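
A test of this shape could look roughly like the sketch below. It is not the ArrayCopyTest.java linked above: the class ArrayCopySketch, the method timeCopies and the default argument values are assumptions made for illustration, and the sketch assumes that an overlapping copy with the destination one byte above the source is a case the conjoint stub has to copy backwards. Hand-rolled timing like this is rough and easy to get wrong.

```java
class ArrayCopySketch {
    // Times `iters` overlapping copies of `size` bytes within one buffer.
    // backwards == true puts the destination one byte above the source;
    // otherwise the destination is one byte below it.
    static long timeCopies(int iters, int size, boolean backwards) {
        byte[] buf = new byte[size + 1];
        int src = backwards ? 0 : 1;
        int dst = backwards ? 1 : 0;
        long start = System.nanoTime();
        for (int i = 0; i < iters; i++) {
            System.arraycopy(buf, src, buf, dst, size);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Same argument order as the usage line: <iters> <start> <end> <step>.
        // The defaults are arbitrary values chosen for this sketch.
        int iters = args.length > 0 ? Integer.parseInt(args[0]) : 10000;
        int start = args.length > 1 ? Integer.parseInt(args[1]) : 0;
        int end = args.length > 2 ? Integer.parseInt(args[2]) : 4096;
        int step = Math.max(1, args.length > 3 ? Integer.parseInt(args[3]) : 64);
        for (int size = start; size <= end; size += step) {
            long fwd = timeCopies(iters, size, false);
            long bwd = timeCopies(iters, size, true);
            System.out.println(size + " bytes: forward " + fwd + " ns, backward " + bwd + " ns");
        }
    }
}
```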

I have generated the results obtained before and after the above patch on 4 different partners' HW (A,B,C,D) and a summary of the results is available at

http://people.linaro.org/~edward.nevill/prefetch/prefetch.pdf

For partner A I tested 3 ranges

0-64 bytes in units of 1
0-512 bytes in units of 8
0-4096 bytes in units of 64

The latter 2 clearly show the benefit of prefetching on backwards copies.

For partners B, C & D, I only tested 0-4096 bytes in units of 64. I also tested B, C & D with -XX:PrefetchCopyIntervalInBytes=0. On these 3 partners disabling prefetch seemed to have no effect indicating that either prefetch is not implemented, or it implements automatic hardware prefetch.

A summary of the results is that it improves performance significantly on partner A and has no effect on partners B,C & D.

OK to push?

Ed.


From aph at redhat.com  Mon Feb  8 14:57:23 2016
From: aph at redhat.com (Andrew Haley)
Date: Mon, 8 Feb 2016 14:57:23 +0000
Subject: [aarch64-port-dev ] RFR: 8149365: memory copy does not prefetch
 on backwards copy
In-Reply-To: <1454942246.11423.18.camel@mint>
References: <1454942246.11423.18.camel@mint>
Message-ID: <56B8ACD3.3040602@redhat.com>

On 08/02/16 14:37, Edward Nevill wrote:
> OK to push?

OK, thanks.

Andrew.


From aph at redhat.com  Tue Feb  9 13:54:25 2016
From: aph at redhat.com (Andrew Haley)
Date: Tue, 9 Feb 2016 13:54:25 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
Message-ID: <56B9EF91.8000809@redhat.com>

On 05/02/16 12:47, Hui Shi wrote:

> Would some one help review this changeset?  This improves performance for
> codes like string builder and concat on aarch64.
> Bug: https://bugs.openjdk.java.net/browse/JDK-8149080
> webrev: http://cr.openjdk.java.net/~hshi/8149080/webrev/

After some discussion with Edward Nevill, I am persuaded to accept
this patch.  While I'm not really happy that the backwards copy is so
much slower than forwards, this patch is very low risk.  I have
checked the boundary conditions of

  (unsigned long)(d - s) >= (unsigned long)size

and I'm convinced it's the correct test in this case.

However, the comment

    // no overlap when (d-s) above_equal (count*size)

is wrong.  If d < s, unsigned(d-s) is >= (count*size) but the two
strings may still overlap.  This doesn't affect correctness in this
case, because the forwards copy is the right one to use.  Having said
that, if someone changes nooverlap_target so that it is incorrect when
copying overlapping arrays we'll have a problem.

Thanks,

Andrew,
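
The boundary cases discussed above can be checked with a small model of the unsigned comparison. This is a sketch in Java rather than the stub code: ForwardCopyCheck, useForwardCopy and overlaps are hypothetical names, and the addresses are made-up numbers.

```java
class ForwardCopyCheck {
    // Models the stub's test:
    //   (unsigned long)(d - s) >= (unsigned long)size
    static boolean useForwardCopy(long s, long d, long sizeInBytes) {
        return Long.compareUnsigned(d - s, sizeInBytes) >= 0;
    }

    // True when [s, s+size) and [d, d+size) share at least one byte.
    static boolean overlaps(long s, long d, long sizeInBytes) {
        return s < d + sizeInBytes && d < s + sizeInBytes;
    }

    public static void main(String[] args) {
        long[][] cases = {
            {1000, 2000, 100},  // disjoint, d above s
            {1000, 1050, 100},  // overlapping, d above s
            {1050, 1000, 100},  // overlapping, d below s
        };
        for (long[] c : cases) {
            System.out.println("s=" + c[0] + " d=" + c[1] + " size=" + c[2]
                + " forward=" + useForwardCopy(c[0], c[1], c[2])
                + " overlaps=" + overlaps(c[0], c[1], c[2]));
        }
    }
}
```

In the third case d - s is negative, so it wraps to a very large unsigned value and the test selects the forward copy even though the ranges overlap. That is safe only because a forward copy is the correct direction when the destination lies below the source.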

From edward.nevill at gmail.com  Tue Feb  9 14:03:02 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Tue, 09 Feb 2016 14:03:02 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <56B9EF91.8000809@redhat.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B9EF91.8000809@redhat.com>
Message-ID: <1455026582.32182.1.camel@mylittlepony.linaroharston>

On Tue, 2016-02-09 at 13:54 +0000, Andrew Haley wrote:
> On 05/02/16 12:47, Hui Shi wrote:
> 
> > Would some one help review this changeset?  This improves performance for
> > codes like string builder and concat on aarch64.
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8149080
> > webrev: http://cr.openjdk.java.net/~hshi/8149080/webrev/
> 

> However, the comment
> 
>     // no overlap when (d-s) above_equal (count*size)

Shall I just change the comment to

     // use fwd copy when (d-s) above_equal (count*size)

when I do the push?

Regards,
Ed.


From aph at redhat.com  Tue Feb  9 14:03:45 2016
From: aph at redhat.com (Andrew Haley)
Date: Tue, 9 Feb 2016 14:03:45 +0000
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <1455026582.32182.1.camel@mylittlepony.linaroharston>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B9EF91.8000809@redhat.com>
	<1455026582.32182.1.camel@mylittlepony.linaroharston>
Message-ID: <56B9F1C1.6010605@redhat.com>

On 09/02/16 14:03, Edward Nevill wrote:
>> However, the comment
>> > 
>> >     // no overlap when (d-s) above_equal (count*size)
> Shall I just change the comment to
> 
>      // use fwd copy when (d-s) above_equal (count*size)
> 
> when I do the push?

OK, thanks.

Andrew.


From hui.shi at linaro.org  Wed Feb 10 00:37:44 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Wed, 10 Feb 2016 08:37:44 +0800
Subject: [aarch64-port-dev ] RFR(s): AArch64: 8149080: Recoginize
 disjoint array copy in stub code
In-Reply-To: <56B9F1C1.6010605@redhat.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
	<56B9EF91.8000809@redhat.com>
	<1455026582.32182.1.camel@mylittlepony.linaroharston>
	<56B9F1C1.6010605@redhat.com>
Message-ID: <CAF1YaiBp5+CP7_Vm1DsbBpJJv6TitcTHx+4nTVbs+PC3a6pZJA@mail.gmail.com>

Thanks Andrew and Edward!

I will follow up on the misalignment issue with 16-byte alignment peeling
before copying longs.

Regards
Hui

On 9 February 2016 at 22:03, Andrew Haley <aph at redhat.com> wrote:

> On 09/02/16 14:03, Edward Nevill wrote:
> >> However, the comment
> >> >
> >> >     // no overlap when (d-s) above_equal (count*size)
> > Shall I just change the comment to
> >
> >      // use fwd copy when (d-s) above_equal (count*size)
> >
> > when I do the push?
>
> OK, thanks.
>
> Andrew.
>
>

From aph at redhat.com  Wed Feb 10 13:05:26 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 10 Feb 2016 13:05:26 +0000
Subject: [aarch64-port-dev ] Use jmh for benchmarks [Was: RFR(s): AArch64:
 8149080: Recoginize disjoint array copy in stub code]
In-Reply-To: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
References: <CAF1YaiCAJ3o81OU12nfs2BVP27efZdSiU10Z4cHZMeo1rX=Y1Q@mail.gmail.com>
Message-ID: <56BB3596.2070000@redhat.com>

It's very important to use JMH for HotSpot benchmarks.  Without JMH,
it is very hard to tell if you're measuring the right thing.

In order to help you get started, I've appended a JMH version of your
benchmark.  Run it with:

build/linux-aarch64-normal-server-release/jdk/bin/java -jar \
  jmh-samples/target/microbenchmarks.jar '.*JMHSample_96.*' -wi 5 -i 10 \
  -f 0

Andrew.


-----------------------------------------------------------------------
package org.openjdk.jmh.samples;

import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.concurrent.TimeUnit;

import java.nio.*;
import java.util.*;
import java.util.concurrent.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class JMHSample_96_StringAppend {

    @State(Scope.Benchmark)
    public static class BenchmarkState {
        final String[] strs = {
            "aoiod", // 5
            "adsdefrgda", // 10
            "dsadsadadsdsadiomjdas", // 20
            "djsadusahdusaufdoaaiffjdkdpjikl", // 30
            "dsudhusuhfudhaufhduahfduafhdkaffhdjafjdfa", // 40
            "dhsuafydagfydagfdafdajlkejwfjfuhfuafjhdahfldjksl90s", // 50
            "dsajufhdaufhdasuifhdasjkfndasjkfgbaduygbiafjioeawjfioiopjsdljl", // 60
            "dshaudshauidshauidhsiufhdasjklfdbnasjkvbauyvbdyargfwrheuifgeuijikalkjfds", // 70
            "nvfjsvnfusdbvfuyafbduyasfdsjkfhdjkasfhdjksafhdjksfhdjksfhasdjkncxsvnxcm,fdjklfjdkf", // 80
            "fdhuafdhasuifhdasuigbdjkbvcjksbdfhduasfhduasifhdasjkfhdasjkfhdjklasfoeurieoiruwiowurieoureik", // 90
            "dshfudahfduiashfduiasnvdjkvnuiarheuirheiodfhdjksafhuiheuiafheaskfdhjkasfhdjkashfdjkashfdjkasuipiuk890f", // 100
        };
    }

    @GenerateMicroBenchmark
    public StringBuilder doIt(BenchmarkState state) {
        StringBuilder strBuf = new StringBuilder();
        for (String s : state.strs) {
            strBuf.append(s);
        }
        return strBuf;
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
            .include(".*" + JMHSample_96_StringAppend.class.getSimpleName() + ".*")
            .warmupIterations(5)
            .measurementIterations(5)
            .forks(1)
            .build();

        new Runner(opt).run();
    }

}


From aph at redhat.com  Wed Feb 10 15:41:13 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 10 Feb 2016 15:41:13 +0000
Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal
	support
In-Reply-To: <56B71DDE.2040109@redhat.com>
References: <CAF1YaiDW0VB_dh0AyS1AEfwOoqCemGhJXNRPAEU07LfDS-rdKg@mail.gmail.com>
	<56B71DDE.2040109@redhat.com>
Message-ID: <56BB5A19.60001@redhat.com>

On 02/07/2016 10:35 AM, Andrew Haley wrote:
> On 06/02/16 12:24, Hui Shi wrote:
>> Hi All,
>>
>> Would someone help review this patch for adding byte array equal support on
>> aarch64?
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8149100
>> webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/
> 
> Ok, thanks.

Having said that, I'm unhappy that this code is almost exactly the same
as char_arrays_equals, seeming to have a slab of identical code.  We've
also got char_arrays_equals, which seems to do the same thing as
string_equals.

Andrew.


From gnu.andrew at redhat.com  Thu Feb 11 18:21:00 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Thu, 11 Feb 2016 13:21:00 -0500 (EST)
Subject: [aarch64-port-dev ] Sync AArch64 8u JDK Repository with upstream
	8u72
In-Reply-To: <2089452226.20004508.1455214656138.JavaMail.zimbra@redhat.com>
Message-ID: <1530040291.20005017.1455214860067.JavaMail.zimbra@redhat.com>

I did a comparison of the AArch64 jdk8u repository [0] with the
upstream jdk8u72-b15 tag and found a couple of differences that
could be resolved.

1. The change
"8131105: Header Template for nroff man pages *.1 files contains errors"
seems to have been reverted. Re-applying this fixes the issue.

2. Two solutions for a libpng on ARM issue [1] seem to have been applied.
Now that the upstream 8078245 version is present, we can revert the change
to the libpng sources, keeping them pristine and making it easier to apply
future upstream updates to them.

Webrev: http://cr.openjdk.java.net/~andrew/aarch64-8/sync/jdk.webrev.01/webrev/

Ok to push?

[0] http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/
[1] https://bugs.openjdk.java.net/browse/JDK-8078245

Thanks,
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From aph at redhat.com  Fri Feb 12 09:50:43 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 12 Feb 2016 09:50:43 +0000
Subject: [aarch64-port-dev ] Sync AArch64 8u JDK Repository with
 upstream 8u72
In-Reply-To: <1530040291.20005017.1455214860067.JavaMail.zimbra@redhat.com>
References: <1530040291.20005017.1455214860067.JavaMail.zimbra@redhat.com>
Message-ID: <56BDAAF3.3020904@redhat.com>

On 11/02/16 18:21, Andrew Hughes wrote:
> Ok to push?

OK, thanks.  Sounds like 8u has been pretty quiet.

Andrew.


From adinn at redhat.com  Fri Feb 12 10:42:15 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 12 Feb 2016 10:42:15 +0000
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
Message-ID: <56BDB707.6090409@redhat.com>

Hi Roland,

A patch for the AArch64 C2 volatile/CAS generation code which deals with
the effects of your proposed C2 patch is available as a webrev

 http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/

The webrev includes your patch and mine and is based on the latest hs-comp.

n.b. I have /not/ created a separate issue for the AArch64 part of this
fix. I am not sure whether you want to combine it with your patch or
push it as a separate stage.

n.b. your patch allowed the AArch64 C2 code to be significantly
simplified. That's because it ensures that the Raw memory flows
associated with the GC card marks no longer intermingle with the
AliasIdxBot and oop flows associated with the volatile store/CAS. This
means the job of recognising the signature memory configuration between
leading and trailing memory barriers is much easier.

Testing:

I have verified that this generates correct code for volatile put and
CAS on AArch64 in all 5 relevant GC configurations:

  +UseG1GC
 +UseConcMarkSweepGC +CondCardMark
 +UseConcMarkSweepGC -CondCardMark
 +UseParallelGC +CondCardMark
 +UseParallelGC -CondCardMark

A review from an AArch64 reviewer would be welcome.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From aph at redhat.com  Fri Feb 12 10:49:31 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 12 Feb 2016 10:49:31 +0000
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <56BDB707.6090409@redhat.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
Message-ID: <56BDB8BB.9040305@redhat.com>

On 12/02/16 10:42, Andrew Dinn wrote:
> A review from an AArch64 reviewer would be welcome.

Crikey.  Well, it looks okay, but wow... :-)

One question: if this code fails because of a different shape of
ideal graph than it expects, all that happens is slightly suboptimal
code, right?

Andrew.

From hui.shi at linaro.org  Fri Feb 12 11:10:07 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Fri, 12 Feb 2016 19:10:07 +0800
Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal
	support
In-Reply-To: <56BB5A19.60001@redhat.com>
References: <CAF1YaiDW0VB_dh0AyS1AEfwOoqCemGhJXNRPAEU07LfDS-rdKg@mail.gmail.com>
	<56B71DDE.2040109@redhat.com> <56BB5A19.60001@redhat.com>
Message-ID: <CAF1YaiCcJ71sxkSKscu89sx3=dpT9893g-p-cCA=GyjfuAVGZQ@mail.gmail.com>

Hi Andrew,
   Are you suggesting we should refactor this similar code so that most of it
is shared? Similar to how the different array copies are handled?

Regards
Hui

On 10 February 2016 at 23:41, Andrew Haley <aph at redhat.com> wrote:

> On 02/07/2016 10:35 AM, Andrew Haley wrote:
> > On 06/02/16 12:24, Hui Shi wrote:
> >> Hi All,
> >>
> >> Would someone help review this patch for adding byte array equal
> support on
> >> aarch64?
> >>
> >> bug: https://bugs.openjdk.java.net/browse/JDK-8149100
> >> webrev: http://cr.openjdk.java.net/~hshi/8149100/webrev/
> >
> > Ok, thanks.
>
> Having said that, I'm unhappy that this code is almost exactly the same
> as char_arrays_equals, seeming to have a slab of identical code.  We've
> also got char_arrays_equals, which seems to do the same thing as
> string_equals.
>
> Andrew.
>
>
>

From aph at redhat.com  Fri Feb 12 11:18:24 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 12 Feb 2016 11:18:24 +0000
Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal
	support
In-Reply-To: <CAF1YaiCcJ71sxkSKscu89sx3=dpT9893g-p-cCA=GyjfuAVGZQ@mail.gmail.com>
References: <CAF1YaiDW0VB_dh0AyS1AEfwOoqCemGhJXNRPAEU07LfDS-rdKg@mail.gmail.com>
	<56B71DDE.2040109@redhat.com> <56BB5A19.60001@redhat.com>
	<CAF1YaiCcJ71sxkSKscu89sx3=dpT9893g-p-cCA=GyjfuAVGZQ@mail.gmail.com>
Message-ID: <56BDBF80.6020903@redhat.com>

On 12/02/16 11:10, Hui Shi wrote:
>    Are you suggesting we should refactoring these similar code to make them
> share most part? Similar with handling different array copies?

Yes, absolutely.  There are almost no code differences.

Code duplication of this kind has what we call a "bad smell".  It is
not necessarily wrong, and there may be a good reason for it, but it
is always suspicious.

Andrew.

From hui.shi at linaro.org  Fri Feb 12 11:21:39 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Fri, 12 Feb 2016 19:21:39 +0800
Subject: [aarch64-port-dev ] RFR(s): AAch64: Adding byte array equal
	support
In-Reply-To: <56BDBF80.6020903@redhat.com>
References: <CAF1YaiDW0VB_dh0AyS1AEfwOoqCemGhJXNRPAEU07LfDS-rdKg@mail.gmail.com>
	<56B71DDE.2040109@redhat.com> <56BB5A19.60001@redhat.com>
	<CAF1YaiCcJ71sxkSKscu89sx3=dpT9893g-p-cCA=GyjfuAVGZQ@mail.gmail.com>
	<56BDBF80.6020903@redhat.com>
Message-ID: <CAF1YaiC6rpGnnb4=MmCEv3+R884OBFEsb2jnvpDX5G97A4M=PA@mail.gmail.com>

You're right! Checking the x86 implementation, it uses the same arrays_equals
implementation for all these operations; we can do this for AArch64 too.
I'll create a new work item and follow up on this.

Regards
Hui

On 12 February 2016 at 19:18, Andrew Haley <aph at redhat.com> wrote:

> On 12/02/16 11:10, Hui Shi wrote:
> >    Are you suggesting we should refactoring these similar code to make
> them
> > share most part? Similar with handling different array copies?
>
> Yes, absolutely.  There are almost no code differences.
>
> Code duplication of this kind has what we call a "bad smell".  It is
> not necessarily wrong, and there may be a good reason for it, but it
> is always suspicious.
>
> Andrew.
>

From adinn at redhat.com  Fri Feb 12 11:25:44 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 12 Feb 2016 11:25:44 +0000
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <56BDB8BB.9040305@redhat.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com> <56BDB8BB.9040305@redhat.com>
Message-ID: <56BDC138.8050903@redhat.com>


On 12/02/16 10:49, Andrew Haley wrote:
> On 12/02/16 10:42, Andrew Dinn wrote:
>> A review from an AArch64 reviewer would be welcome.
> 
> Crikey.  Well, it looks okay, but wow... :-)

well, yes . . . wow! But then again, this is what we had to expect when
we decided to rely on matching subgraph shapes in the back end -- a
change to the details of how stores are generated in generic code will
have implications for the back end.

The flip side of this is twofold. Firstly, changes of this sort will
always be few and far between. Secondly, Roland's change has simplified
something that was over-complex in the first place; this has not only
unlatched some generic optimizations that should have just worked but
also, by the same token, reduced the complexity of the AArch64 back end
code.

> One question: if those code fails because of a different shape of
> ideal graph than it expects, all that happens is slightly suboptimal
> code, right?

Not quite. Roland's change without the AArch64 patch triggered an assert
during CAS generation when the expected subgraph was not found.

Also, the current code is not built to expect whatever barrier
Shenandoah might insert. It ought to fall into much the same case as G1
and CMS + CondCardMark (depending upon how the GC barriers are
generated). However, a check which folds Shenandoah into the same bucket
as those two still needs explicitly wiring in.

This patch is probably a better place to start from than the previous
version in order to add that case handling. Roland's fix decouples the
effects of the GC barrier from the ones associated with oop updates.
That means that following this patch any changes in the way the GC
barriers are generated are less likely to impact the AArch64 back end
code that is interested in memory barriers associated with oop updates.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From aph at redhat.com  Fri Feb 12 11:31:15 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 12 Feb 2016 11:31:15 +0000
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <56BDC138.8050903@redhat.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com> <56BDB8BB.9040305@redhat.com>
	<56BDC138.8050903@redhat.com>
Message-ID: <56BDC283.40503@redhat.com>

On 12/02/16 11:25, Andrew Dinn wrote:
> Not quite. Roland's change without the AArch64 patch triggered an assert
> during CAS generation when the expected subgraph was not found.

Hmm.  Can this code not be changed to fail quietly, with no change
to the code?  That's what optimizations generally do.

> Also, the current code is not built to expect whatever barrier
> Shenandoah might insert. It ought to fall into much the same case as G1
> and CMS + CondCardMark (depending upon how the GC barriers are
> generated). However, a check which folds Shenandoah into the same bucket
> as those two still needs explicitly wiring in.

Sure.

Andrew.


From adinn at redhat.com  Fri Feb 12 12:03:36 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 12 Feb 2016 12:03:36 +0000
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <56BDC283.40503@redhat.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com> <56BDB8BB.9040305@redhat.com>
	<56BDC138.8050903@redhat.com> <56BDC283.40503@redhat.com>
Message-ID: <56BDCA18.60906@redhat.com>

On 12/02/16 11:31, Andrew Haley wrote:
> On 12/02/16 11:25, Andrew Dinn wrote:
>> Not quite. Roland's change without the AArch64 patch triggered an assert
>> during CAS generation when the expected subgraph was not found.
> 
> Hmm.  Can this code not be changed to fail quietly, with no change
> to the code?  That's what optimizations generally do.

The assert is employed when generating AArch64 code for a CAS because
every CAS should *always* be capable of being optimized to use an
ldaxr/stlxr pair without the need for a top and tail dmb pair i.e. we
don't have a fall back. If we see a CompareAndSwap node in a subgraph
that does not have the expected shape then this can only mean that the
AArch64 code has gone out of sync with a change in the generic code. The
assert is used to find this mismatch during development/testing. By
placing as much as possible of this checking code in an ifdef ASSERT
region we avoid executing the check in production.

An assert is not employed when generating AArch64 code for a StoreX
because not all StoreX operations are volatile stores susceptible to
optimization. So, in this case if we don't find the relevant subgraph
then we just fall back to generating dmbs.

The predicates which control dmb generation apply to multiple rules --
those for StoreX, MemBarRelease and MemBarVolatile. The predicates are
all supposed to operate consistently so that when one predicate falls
back then they all do. However, that's only guaranteed by them knowing
exactly which shape to look for and correctly identifying it i.e. by me
having coded it correctly (but then that's true of a lot of code:-).

It would be nice to be able to cross-validate the actions of these rules
applied to some set of StoreN, MemBarRelease, MemBarVolatile and
CompareAndSwap nodes. However, I cannot see any way of correlating a
rule application to some given node with rule applications to related
nodes. Rule applications in a given sequence are not easily associated
with an originating volatile put/CAS (even when you know which ones they
are as happens during debugging they don't necessarily occur in a fixed
order).

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From roland.westrelin at oracle.com  Fri Feb 12 12:36:09 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 12 Feb 2016 13:36:09 +0100
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
	redundant memory operations with G1
In-Reply-To: <56BDB707.6090409@redhat.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
Message-ID: <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>

Hi Andrew,

> A patch for the AArch64 C2 volatile/CAS generation code which deals with
> the effects of your proposed C2 patch is available as a webrev
> 
> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/

Thanks for putting that together. I didn't expect that simple change to cause so much trouble.

> n.b. I have /not/ created a separate issue for the AArch64 part of this
> fix. I am not sure whether you want to combine it with your patch or
> push it as a separate stage.

I can push everything together and list you as a contributor (in the contributed-by field) if that works for you. 

Vladimir, can you take another look at this? Your two objections were:

> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory effect. Why not use it?

The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile(). As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way.

> And we need to keep precedent edge link to oop store in case EA eliminates related allocation.

Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier.

Roland.

From adinn at redhat.com  Fri Feb 12 13:51:40 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 12 Feb 2016 13:51:40 +0000
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
	<8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
Message-ID: <56BDE36C.4020101@redhat.com>

Hi Roland,

On 12/02/16 12:36, Roland Westrelin wrote:
> Hi Andrew,
> 
>> A patch for the AArch64 C2 volatile/CAS generation code which deals
>> with the effects of your proposed C2 patch is available as a
>> webrev
>> 
>> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/
> 
> Thanks for putting that together. I didn't expect that simple change
> to cause so much trouble.

It was my decision to employ back end rule predicates which poke around
in the graph that led to this -- it's not anything to do with your
choice. I think your fix is correct and valuable in its own right, yet
more so because it has simplified that back end code substantially.

>> n.b. I have /not/ created a separate issue for the AArch64 part of
>> this fix. I am not sure whether you want to combine it with your
>> patch or push it as a separate stage.
> 
> I can push everything together and list you as a contributor (in the
> contributed-by field) if that works for you.
> 

Yes please. I think Andrew Haley's responses so far mean that he has
agreed the AArch64 part of this change. Perhaps he can confirm?

> Vladimir, can you take another look at this? Your two objections
> were:
> 
>> Also we have specialized insert_mem_bar_volatile() if we don't want
>> wide memory effect. Why not use it?
> 
> The membar in the change takes the entire memory state as input but
> only changes raw memory. I don't think that can be achieved with
> insert_mem_bar_volatile(). As explained by Mikael, the membar is here
> to force ordering between the oop store and the card table load.
> That's why I think the membar's inputs and outputs should be set up
> that way.

Not that I am an official reviewer but I agree with you here.

>> And we need to keep precedent edge link to oop store in case EA
>> eliminates related allocation.
> 
> Mikael said it's not ok to eliminate the memory barrier if we leave
> the gc barrier.

Also in agreement with this.

For both G1GC and CMS +CondCardMark a StoreLoad barrier is necessary to
ensure that the StoreX is visible before the LoadB/StoreCM pair which
implement the conditional card mark. For these configurations AArch64
detects any MemBarVolatile associated with the card mark and inserts a
dmb ish instruction (StoreLoad implementation) before the ldrb/strb.

With CMS -CondCardMark the generic code does not insert a memory
barrier. However, for correctness on non-TSO architectures we need a
StoreStore barrier between the StoreX and the StoreCM implementing the
card mark. That ensures that these writes cannot be observed by GC
threads out of order (which might cause the GC to miss the write). This
special case is handled on AArch64 by translating StoreCM to include a
dmb ishst instruction (StoreStore implementation) before the strb.
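
The two orderings described above can be sketched in portable C++. This is
only an illustration under stated assumptions, not HotSpot code: the names
field, card and both function names are hypothetical, the card encoding
(0xff clean, 0 dirty) is assumed, and C++ fences stand in for the dmb
instructions that the AArch64 back end emits.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for a heap field and its card table entry.
// None of these names come from HotSpot.
static std::atomic<std::uintptr_t> field{0};
static std::atomic<std::uint8_t> card{0xff};  // assumed: 0xff = clean, 0 = dirty

// Conditional card mark (G1, or CMS +CondCardMark): the store must be
// visible before the card is read, so a StoreLoad barrier is required.
// A seq_cst fence is the portable equivalent of dmb ish here.
inline void store_with_conditional_card_mark(std::uintptr_t value) {
  field.store(value, std::memory_order_relaxed);        // StoreX
  std::atomic_thread_fence(std::memory_order_seq_cst);  // StoreLoad
  if (card.load(std::memory_order_relaxed) != 0) {      // LoadB
    card.store(0, std::memory_order_relaxed);           // StoreCM
  }
}

// Unconditional card mark (CMS -CondCardMark): only the two stores need
// ordering. C++ has no pure StoreStore fence, so a release fence is used;
// it is stronger than the dmb ishst described above.
inline void store_with_unconditional_card_mark(std::uintptr_t value) {
  field.store(value, std::memory_order_relaxed);        // StoreX
  std::atomic_thread_fence(std::memory_order_release);  // at least StoreStore
  card.store(0, std::memory_order_relaxed);             // StoreCM
}
```

In both variants the fence sits between the field store and the card
access, which is the position the dmb takes in the generated code.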

regards,


Andrew Dinn
-----------

From aph at redhat.com  Fri Feb 12 14:07:21 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 12 Feb 2016 14:07:21 +0000
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <56BDE36C.4020101@redhat.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
	<8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
	<56BDE36C.4020101@redhat.com>
Message-ID: <56BDE719.2010703@redhat.com>

On 12/02/16 13:51, Andrew Dinn wrote:
> Yes please. I think Andrew Haley's responses so far mean that he has
> agreed the AArch64 part of this change. Perhaps he can confirm?

Sure.

Andrew.


From gnu.andrew at redhat.com  Fri Feb 12 17:45:21 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Fri, 12 Feb 2016 17:45:21 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jdk: 2 new changesets
Message-ID: <201602121745.u1CHjL9B029815@aojmv0008.oracle.com>

Changeset: c57b985d9249
Author:    mfang
Date:      2015-07-15 12:12 -0700
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/c57b985d9249

8131105: Header Template for nroff man pages *.1 files contains errors
Reviewed-by: katleman

! src/bsd/doc/man/appletviewer.1
! src/bsd/doc/man/extcheck.1
! src/bsd/doc/man/idlj.1
! src/bsd/doc/man/jar.1
! src/bsd/doc/man/jarsigner.1
! src/bsd/doc/man/java.1
! src/bsd/doc/man/javac.1
! src/bsd/doc/man/javadoc.1
! src/bsd/doc/man/javah.1
! src/bsd/doc/man/javap.1
! src/bsd/doc/man/jcmd.1
! src/bsd/doc/man/jconsole.1
! src/bsd/doc/man/jdb.1
! src/bsd/doc/man/jdeps.1
! src/bsd/doc/man/jhat.1
! src/bsd/doc/man/jinfo.1
! src/bsd/doc/man/jjs.1
! src/bsd/doc/man/jmap.1
! src/bsd/doc/man/jps.1
! src/bsd/doc/man/jrunscript.1
! src/bsd/doc/man/jsadebugd.1
! src/bsd/doc/man/jstack.1
! src/bsd/doc/man/jstat.1
! src/bsd/doc/man/jstatd.1
! src/bsd/doc/man/keytool.1
! src/bsd/doc/man/native2ascii.1
! src/bsd/doc/man/orbd.1
! src/bsd/doc/man/pack200.1
! src/bsd/doc/man/policytool.1
! src/bsd/doc/man/rmic.1
! src/bsd/doc/man/rmid.1
! src/bsd/doc/man/rmiregistry.1
! src/bsd/doc/man/schemagen.1
! src/bsd/doc/man/serialver.1
! src/bsd/doc/man/servertool.1
! src/bsd/doc/man/tnameserv.1
! src/bsd/doc/man/unpack200.1
! src/bsd/doc/man/wsgen.1
! src/bsd/doc/man/wsimport.1
! src/bsd/doc/man/xjc.1
! src/linux/doc/man/appletviewer.1
! src/linux/doc/man/extcheck.1
! src/linux/doc/man/idlj.1
! src/linux/doc/man/ja/appletviewer.1
! src/linux/doc/man/ja/extcheck.1
! src/linux/doc/man/ja/idlj.1
! src/linux/doc/man/ja/jar.1
! src/linux/doc/man/ja/jarsigner.1
! src/linux/doc/man/ja/java.1
! src/linux/doc/man/ja/javac.1
! src/linux/doc/man/ja/javadoc.1
! src/linux/doc/man/ja/javah.1
! src/linux/doc/man/ja/javap.1
! src/linux/doc/man/ja/javaws.1
! src/linux/doc/man/ja/jcmd.1
! src/linux/doc/man/ja/jconsole.1
! src/linux/doc/man/ja/jdb.1
! src/linux/doc/man/ja/jdeps.1
! src/linux/doc/man/ja/jhat.1
! src/linux/doc/man/ja/jinfo.1
! src/linux/doc/man/ja/jjs.1
! src/linux/doc/man/ja/jmap.1
! src/linux/doc/man/ja/jps.1
! src/linux/doc/man/ja/jrunscript.1
! src/linux/doc/man/ja/jsadebugd.1
! src/linux/doc/man/ja/jstack.1
! src/linux/doc/man/ja/jstat.1
! src/linux/doc/man/ja/jstatd.1
! src/linux/doc/man/ja/jvisualvm.1
! src/linux/doc/man/ja/keytool.1
! src/linux/doc/man/ja/native2ascii.1
! src/linux/doc/man/ja/orbd.1
! src/linux/doc/man/ja/pack200.1
! src/linux/doc/man/ja/policytool.1
! src/linux/doc/man/ja/rmic.1
! src/linux/doc/man/ja/rmid.1
! src/linux/doc/man/ja/rmiregistry.1
! src/linux/doc/man/ja/schemagen.1
! src/linux/doc/man/ja/serialver.1
! src/linux/doc/man/ja/servertool.1
! src/linux/doc/man/ja/tnameserv.1
! src/linux/doc/man/ja/unpack200.1
! src/linux/doc/man/ja/wsgen.1
! src/linux/doc/man/ja/wsimport.1
! src/linux/doc/man/ja/xjc.1
! src/linux/doc/man/jar.1
! src/linux/doc/man/jarsigner.1
! src/linux/doc/man/java.1
! src/linux/doc/man/javac.1
! src/linux/doc/man/javadoc.1
! src/linux/doc/man/javah.1
! src/linux/doc/man/javap.1
! src/linux/doc/man/jcmd.1
! src/linux/doc/man/jconsole.1
! src/linux/doc/man/jdb.1
! src/linux/doc/man/jdeps.1
! src/linux/doc/man/jhat.1
! src/linux/doc/man/jinfo.1
! src/linux/doc/man/jjs.1
! src/linux/doc/man/jmap.1
! src/linux/doc/man/jps.1
! src/linux/doc/man/jrunscript.1
! src/linux/doc/man/jsadebugd.1
! src/linux/doc/man/jstack.1
! src/linux/doc/man/jstat.1
! src/linux/doc/man/jstatd.1
! src/linux/doc/man/keytool.1
! src/linux/doc/man/native2ascii.1
! src/linux/doc/man/orbd.1
! src/linux/doc/man/pack200.1
! src/linux/doc/man/policytool.1
! src/linux/doc/man/rmic.1
! src/linux/doc/man/rmid.1
! src/linux/doc/man/rmiregistry.1
! src/linux/doc/man/schemagen.1
! src/linux/doc/man/serialver.1
! src/linux/doc/man/servertool.1
! src/linux/doc/man/tnameserv.1
! src/linux/doc/man/unpack200.1
! src/linux/doc/man/wsgen.1
! src/linux/doc/man/wsimport.1
! src/linux/doc/man/xjc.1
! src/solaris/doc/sun/man/man1/appletviewer.1
! src/solaris/doc/sun/man/man1/extcheck.1
! src/solaris/doc/sun/man/man1/idlj.1
! src/solaris/doc/sun/man/man1/ja/appletviewer.1
! src/solaris/doc/sun/man/man1/ja/extcheck.1
! src/solaris/doc/sun/man/man1/ja/idlj.1
! src/solaris/doc/sun/man/man1/ja/jar.1
! src/solaris/doc/sun/man/man1/ja/jarsigner.1
! src/solaris/doc/sun/man/man1/ja/java.1
! src/solaris/doc/sun/man/man1/ja/javac.1
! src/solaris/doc/sun/man/man1/ja/javadoc.1
! src/solaris/doc/sun/man/man1/ja/javah.1
! src/solaris/doc/sun/man/man1/ja/javap.1
! src/solaris/doc/sun/man/man1/ja/jcmd.1
! src/solaris/doc/sun/man/man1/ja/jconsole.1
! src/solaris/doc/sun/man/man1/ja/jdb.1
! src/solaris/doc/sun/man/man1/ja/jdeps.1
! src/solaris/doc/sun/man/man1/ja/jhat.1
! src/solaris/doc/sun/man/man1/ja/jinfo.1
! src/solaris/doc/sun/man/man1/ja/jjs.1
! src/solaris/doc/sun/man/man1/ja/jmap.1
! src/solaris/doc/sun/man/man1/ja/jps.1
! src/solaris/doc/sun/man/man1/ja/jrunscript.1
! src/solaris/doc/sun/man/man1/ja/jsadebugd.1
! src/solaris/doc/sun/man/man1/ja/jstack.1
! src/solaris/doc/sun/man/man1/ja/jstat.1
! src/solaris/doc/sun/man/man1/ja/jstatd.1
! src/solaris/doc/sun/man/man1/ja/jvisualvm.1
! src/solaris/doc/sun/man/man1/ja/keytool.1
! src/solaris/doc/sun/man/man1/ja/native2ascii.1
! src/solaris/doc/sun/man/man1/ja/orbd.1
! src/solaris/doc/sun/man/man1/ja/pack200.1
! src/solaris/doc/sun/man/man1/ja/policytool.1
! src/solaris/doc/sun/man/man1/ja/rmic.1
! src/solaris/doc/sun/man/man1/ja/rmid.1
! src/solaris/doc/sun/man/man1/ja/rmiregistry.1
! src/solaris/doc/sun/man/man1/ja/schemagen.1
! src/solaris/doc/sun/man/man1/ja/serialver.1
! src/solaris/doc/sun/man/man1/ja/servertool.1
! src/solaris/doc/sun/man/man1/ja/tnameserv.1
! src/solaris/doc/sun/man/man1/ja/unpack200.1
! src/solaris/doc/sun/man/man1/ja/wsgen.1
! src/solaris/doc/sun/man/man1/ja/wsimport.1
! src/solaris/doc/sun/man/man1/ja/xjc.1
! src/solaris/doc/sun/man/man1/jar.1
! src/solaris/doc/sun/man/man1/jarsigner.1
! src/solaris/doc/sun/man/man1/java.1
! src/solaris/doc/sun/man/man1/javac.1
! src/solaris/doc/sun/man/man1/javadoc.1
! src/solaris/doc/sun/man/man1/javah.1
! src/solaris/doc/sun/man/man1/javap.1
! src/solaris/doc/sun/man/man1/jcmd.1
! src/solaris/doc/sun/man/man1/jconsole.1
! src/solaris/doc/sun/man/man1/jdb.1
! src/solaris/doc/sun/man/man1/jdeps.1
! src/solaris/doc/sun/man/man1/jhat.1
! src/solaris/doc/sun/man/man1/jinfo.1
! src/solaris/doc/sun/man/man1/jjs.1
! src/solaris/doc/sun/man/man1/jmap.1
! src/solaris/doc/sun/man/man1/jps.1
! src/solaris/doc/sun/man/man1/jrunscript.1
! src/solaris/doc/sun/man/man1/jsadebugd.1
! src/solaris/doc/sun/man/man1/jstack.1
! src/solaris/doc/sun/man/man1/jstat.1
! src/solaris/doc/sun/man/man1/jstatd.1
! src/solaris/doc/sun/man/man1/keytool.1
! src/solaris/doc/sun/man/man1/native2ascii.1
! src/solaris/doc/sun/man/man1/orbd.1
! src/solaris/doc/sun/man/man1/pack200.1
! src/solaris/doc/sun/man/man1/policytool.1
! src/solaris/doc/sun/man/man1/rmic.1
! src/solaris/doc/sun/man/man1/rmid.1
! src/solaris/doc/sun/man/man1/rmiregistry.1
! src/solaris/doc/sun/man/man1/schemagen.1
! src/solaris/doc/sun/man/man1/serialver.1
! src/solaris/doc/sun/man/man1/servertool.1
! src/solaris/doc/sun/man/man1/tnameserv.1
! src/solaris/doc/sun/man/man1/unpack200.1
! src/solaris/doc/sun/man/man1/wsgen.1
! src/solaris/doc/sun/man/man1/wsimport.1
! src/solaris/doc/sun/man/man1/xjc.1

Changeset: f9d3631fbc8f
Author:    andrew
Date:      2016-02-08 14:55 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/f9d3631fbc8f

Revert changes to libpng source code now 8078245 is in place.

! src/share/native/sun/awt/libpng/pngpriv.h


From gnu.andrew at redhat.com  Fri Feb 12 17:55:09 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Fri, 12 Feb 2016 12:55:09 -0500 (EST)
Subject: [aarch64-port-dev ] Sync AArch64 8u JDK Repository with
 upstream 8u72
In-Reply-To: <56BDAAF3.3020904@redhat.com>
References: <1530040291.20005017.1455214860067.JavaMail.zimbra@redhat.com>
	<56BDAAF3.3020904@redhat.com>
Message-ID: <609425171.20390154.1455299709519.JavaMail.zimbra@redhat.com>


----- Original Message -----
> On 11/02/16 18:21, Andrew Hughes wrote:
> > Ok to push?
> 
> OK, thanks.  Sounds like 8u has been pretty quiet.
> 
> Andrew.
> 
> 

Done. Thanks.
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From vladimir.kozlov at oracle.com  Fri Feb 12 19:44:18 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 12 Feb 2016 11:44:18 -0800
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
	<8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
Message-ID: <56BE3612.4080305@oracle.com>

Roland,

Can you create new webrev which includes everything (aarch64)?
And I am satisfied with your answers to my objections.

Thanks,
Vladimir

On 2/12/16 4:36 AM, Roland Westrelin wrote:
> Hi Andrew,
>
>> A patch for the AArch64 C2 volatile/CAS generation code which deals with
>> the effects of your proposed C2 patch is available as a webrev
>>
>> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/
>
> Thanks for putting that together. I didn't expect that simple change to cause so much trouble.
>
>> n.b. I have /not/ created a separate issue for the AArch64 part of this
>> fix. I am not sure whether you want to combine it with your patch or
>> push it as a separate stage.
>
> I can push everything together and list you as a contributor (in the contributed-by field) if that works for you.
>
> Vladimir, can you take another look at this? Your two objections were:
>
>> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory effect. Why not use it?
>
> The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile(). As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way.
>
>> And we need to keep precedent edge link to oop store in case EA eliminates related allocation.
>
> Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier.
>
> Roland.
>

From roland.westrelin at oracle.com  Mon Feb 15 09:21:43 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 15 Feb 2016 10:21:43 +0100
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
	redundant memory operations with G1
In-Reply-To: <56BE3612.4080305@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
	<8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
	<56BE3612.4080305@oracle.com>
Message-ID: <BE8116BA-2D07-4579-BC71-2F9A4E5B027C@oracle.com>


> Can you create new webrev which includes everything (aarch64)?

Here it is:
http://cr.openjdk.java.net/~roland/8087341/webrev.01/

Roland.

> And I am satisfied with your answers to my objections.
> 
> Thanks,
> Vladimir
> 
> On 2/12/16 4:36 AM, Roland Westrelin wrote:
>> Hi Andrew,
>> 
>>> A patch for the AArch64 C2 volatile/CAS generation code which deals with
>>> the effects of your proposed C2 patch is available as a webrev
>>> 
>>> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/
>> 
>> Thanks for putting that together. I didn't expect that simple change to cause so much trouble.
>> 
>>> n.b. I have /not/ created a separate issue for the AArch64 part of this
>>> fix. I am not sure whether you want to combine it with your patch or
>>> push it as a separate stage.
>> 
>> I can push everything together and list you as a contributor (in the contributed-by field) if that works for you.
>> 
>> Vladimir, can you take another look at this? Your two objections were:
>> 
>>> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory effect. Why not use it?
>> 
>> The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile(). As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way.
>> 
>>> And we need to keep precedent edge link to oop store in case EA eliminates related allocation.
>> 
>> Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier.
>> 
>> Roland.
>> 


From adinn at redhat.com  Mon Feb 15 11:08:07 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 15 Feb 2016 11:08:07 +0000
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <BE8116BA-2D07-4579-BC71-2F9A4E5B027C@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
	<8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
	<56BE3612.4080305@oracle.com>
	<BE8116BA-2D07-4579-BC71-2F9A4E5B027C@oracle.com>
Message-ID: <56C1B197.7060708@redhat.com>

On 15/02/16 09:21, Roland Westrelin wrote:
> 
>> Can you create new webrev which includes everything (aarch64)?
> 
> Here it is:
> http://cr.openjdk.java.net/~roland/8087341/webrev.01/

Thanks Roland. Looks good to go.

regards,


Andrew Dinn
-----------

From vladimir.kozlov at oracle.com  Mon Feb 15 17:33:08 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 15 Feb 2016 09:33:08 -0800
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
 redundant memory operations with G1
In-Reply-To: <BE8116BA-2D07-4579-BC71-2F9A4E5B027C@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
	<8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
	<56BE3612.4080305@oracle.com>
	<BE8116BA-2D07-4579-BC71-2F9A4E5B027C@oracle.com>
Message-ID: <56C20BD4.8030800@oracle.com>

Good. Thank you.

Vladimir

On 2/15/16 1:21 AM, Roland Westrelin wrote:
>
>> Can you create new webrev which includes everything (aarch64)?
>
> Here it is:
> http://cr.openjdk.java.net/~roland/8087341/webrev.01/
>
> Roland.
>
>> And I am satisfied with your answers to my objections.
>>
>> Thanks,
>> Vladimir
>>
>> On 2/12/16 4:36 AM, Roland Westrelin wrote:
>>> Hi Andrew,
>>>
>>>> A patch for the AArch64 C2 volatile/CAS generation code which deals with
>>>> the effects of your proposed C2 patch is available as a webrev
>>>>
>>>> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/
>>>
>>> Thanks for putting that together. I didn't expect that simple change to cause so much trouble.
>>>
>>>> n.b. I have /not/ created a separate issue for the AArch64 part of this
>>>> fix. I am not sure whether you want to combine it with your patch or
>>>> push it as a separate stage.
>>>
>>> I can push everything together and list you as a contributor (in the contributed-by field) if that works for you.
>>>
>>> Vladimir, can you take another look at this? Your two objections were:
>>>
>>>> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory effect. Why not use it?
>>>
>>> The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile(). As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way.
>>>
>>>> And we need to keep precedent edge link to oop store in case EA eliminates related allocation.
>>>
>>> Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier.
>>>
>>> Roland.
>>>
>

From felix.yang at linaro.org  Tue Feb 16 11:28:13 2016
From: felix.yang at linaro.org (Felix Yang)
Date: Tue, 16 Feb 2016 19:28:13 +0800
Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair
	instructions in call_stub
Message-ID: <CACc5Y6RZCFf1vL1BT9-h05EixaUHnwCTEg2Sa9kJ3EWY72zcTw@mail.gmail.com>

Hi,

Please review the following webrev:
http://cr.openjdk.java.net/~fyang/8149907/webrev.00/

Jira issue: https://bugs.openjdk.java.net/browse/JDK-8149907

This patch makes use of load/store pair instructions in call_stub,
saving 24 load/store instructions.

Tested with jtreg hotspot & langtools.  Is it OK?

Thanks,
Felix.

From aph at redhat.com  Tue Feb 16 11:46:54 2016
From: aph at redhat.com (Andrew Haley)
Date: Tue, 16 Feb 2016 11:46:54 +0000
Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair
 instructions in call_stub
In-Reply-To: <CACc5Y6RZCFf1vL1BT9-h05EixaUHnwCTEg2Sa9kJ3EWY72zcTw@mail.gmail.com>
References: <CACc5Y6RZCFf1vL1BT9-h05EixaUHnwCTEg2Sa9kJ3EWY72zcTw@mail.gmail.com>
Message-ID: <56C30C2E.4080206@redhat.com>

On 02/16/2016 11:28 AM, Felix Yang wrote:
> Please review the following webrev:
> http://cr.openjdk.java.net/~fyang/8149907/webrev.00/

I guess this is okay, but it's a lot less self-documenting than it
was.

If there are any unused locals (e.g. r27_save) you must delete
them or use them in assertions.

Andrew.


From hui.shi at linaro.org  Wed Feb 17 13:21:11 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Wed, 17 Feb 2016 21:21:11 +0800
Subject: [aarch64-port-dev ] AArch64: follow up array copy investigation on
	misaligned peeling
Message-ID: <CAF1YaiAn8Z1cTmSvCvu+yz_QYKy=5BfpKMKfCL9Yd+GuxouEtg@mail.gmail.com>

Hi Andrew and all,

Following up on the earlier discussion about forward and backward array
copy performance, the current findings are:
1. Optimizing misaligned load/store in backward array copy doesn't help
array copy performance; I suggest leaving it unchanged for now.
2. There is some chance of optimizing array copy peeling/tailing with a
combined 8-byte load/store, but it might introduce extra stubs and
complicate the code.
Would you please comment?

Firstly, I removed the unaligned references by reordering the copies from
small to large (copy 1 byte first, 8 bytes last) when peeling. However, it
is even a little slower than the original implementation.
Test case is  http://people.linaro.org/~hui.shi/arraycopy/TestPeelAlign.java
Performance result in
http://people.linaro.org/~hui.shi/arraycopy/arraycopy_align_and_combine_Test.pdf
Patch is http://people.linaro.org/~hui.shi/arraycopy/peelingFromSmall.patch
The test case is a typical backward array copy scenario (insert some
elements into an array and move the tail of the array backward). From
profiling, the UNALIGNED_LDST_SPEC event count drops a lot with the patch.
In my understanding, a load whose address crosses a cache line boundary
might trigger the hardware prefetcher earlier than an aligned access does.
So fixing unaligned accesses does not seem helpful in array copy peeling.

Secondly, as unaligned access doesn't show degradation in this case, a
further experiment is folding consecutive branches/loads/stores into one
8-byte unaligned load/store. The updated stub code for byte array copy is
at the end of this mail. This is legal when the distance between src and
dst is bigger than 8 bytes. This is safe in cases like String.getChars and
String.getBytes. Having run tests with different combinations, it works
best for byte array copy and still helps short array copy. See the "opt"
column in the PDF for the results of this optimization.
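
The folding idea can be sketched in C++ as follows. This is a simplified
illustration, not the stub itself: copy_with_folded_peel is a hypothetical
name, the peel is assumed to align the source pointer, the bulk loop is
replaced by memcpy, and the ranges are assumed not to overlap at all (which
is stronger than the 8-byte distance mentioned above).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies n bytes from src to dst. Assumes n >= 8 and fully
// non-overlapping ranges (an assumption made for this sketch).
inline void copy_with_folded_peel(std::uint8_t* dst, const std::uint8_t* src,
                                  std::size_t n) {
  assert(n >= 8);
  // Bytes needed to bring src up to the next 8-byte boundary (0..7).
  std::size_t peel = (8 - (reinterpret_cast<std::uintptr_t>(src) & 7)) & 7;

  // One unaligned 8-byte load/store replaces the separate 4-, 2- and
  // 1-byte test-and-copy steps.
  std::uint64_t word;
  std::memcpy(&word, src, 8);
  std::memcpy(dst, &word, 8);

  // Advance by the peel count only. The bytes copied beyond it are
  // written again, with the same values, by the bulk copy.
  src += peel;
  dst += peel;
  n -= peel;
  std::memcpy(dst, src, n);  // stand-in for the aligned bulk loop
}
```

The over-copy is harmless only because the bulk copy rewrites the same
bytes from an unmodified source; with ranges closer than 8 bytes the first
store could clobber source bytes that have not been read yet.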

For the StringConcat test (
http://people.linaro.org/~hui.shi/arraycopy/StringConcatTest.java), though
array copy only takes 25% of the cycles in this test, the entire test still
sees a 3.5% improvement with this combined load/store optimization.
However, I wonder if this is the proper way to improve these
test-bit-load-store code sequences. It would require an extra, really
"disjoint" array copy stub, because the current disjoint array copy only
means that a forward array copy can be performed safely. Alternatively, it
would need a "no overlap" test at runtime. My personal preference is to
leave the array copy code unchanged and keep it simple and consistent for
now.

Before patch
StubRoutines::jbyte_disjoint_arraycopy [0x0000ffff7897f7c0, 0x0000ffff7897f860[ (160 bytes)
  0x0000ffff7897f7e0: tbz       w9, #3, Stub::jbyte_disjoint_arraycopy+44 0x0000ffff7897f7ec
  0x0000ffff7897f7e4: ldr       x8, [x0],#8
  0x0000ffff7897f7e8: str       x8, [x1],#8
  0x0000ffff7897f7ec: tbz       w9, #2, Stub::jbyte_disjoint_arraycopy+56 0x0000ffff7897f7f8
  0x0000ffff7897f7f0: ldr       w8, [x0],#4
  0x0000ffff7897f7f4: str       w8, [x1],#4
  0x0000ffff7897f7f8: tbz       w9, #1, Stub::jbyte_disjoint_arraycopy+68 0x0000ffff7897f804
  0x0000ffff7897f7fc: ldrh      w8, [x0],#2
  0x0000ffff7897f800: strh      w8, [x1],#2
  0x0000ffff7897f804: tbz       w9, #0, Stub::jbyte_disjoint_arraycopy+80 0x0000ffff7897f810
  0x0000ffff7897f808: ldrb      w8, [x0],#1
  0x0000ffff7897f80c: strb      w8, [x1],#1
  0x0000ffff7897f810: cmp       x2, #0x10
  0x0000ffff7897f814: b.lt      Stub::jbyte_disjoint_arraycopy+96 0x0000ffff7897f820
  0x0000ffff7897f818: lsr       x9, x2, #3
  0x0000ffff7897f81c: bl        Stub::foward_copy_longs+28 0x0000ffff7897f5c0

Code after patch
StubRoutines::jbyte_disjoint_arraycopy [0x0000ffff6c97f7c0, 0x0000ffff6c97f87c[ (188 bytes)
// peeling for alignment
  0x0000ffff6c97f7e0: tbz       w9, #3, Stub::jbyte_disjoint_arraycopy+48 0x0000ffff6c97f7f0
  0x0000ffff6c97f7e4: sub       x9, x9, #0x8
  0x0000ffff6c97f7e8: ldr       x8, [x0],#8
  0x0000ffff6c97f7ec: str       x8, [x1],#8
  0x0000ffff6c97f7f0: ldr       x8, [x0]
  0x0000ffff6c97f7f4: str       x8, [x1]
  0x0000ffff6c97f7f8: add       x0, x0, x9
  0x0000ffff6c97f7fc: add       x1, x1, x9
  0x0000ffff6c97f800: cmp       x2, #0x10
  0x0000ffff6c97f804: b.lt      Stub::jbyte_disjoint_arraycopy+124 0x0000ffff6c97f83c
  0x0000ffff6c97f808: lsr       x9, x2, #3
  0x0000ffff6c97f80c: bl        Stub::foward_copy_longs+28 0x0000ffff6c97f5c0

Regards
Hui

From aph at redhat.com  Wed Feb 17 13:34:02 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 17 Feb 2016 13:34:02 +0000
Subject: [aarch64-port-dev ] AArch64: follow up array copy investigation
 on misaligned peeling
In-Reply-To: <CAF1YaiAn8Z1cTmSvCvu+yz_QYKy=5BfpKMKfCL9Yd+GuxouEtg@mail.gmail.com>
References: <CAF1YaiAn8Z1cTmSvCvu+yz_QYKy=5BfpKMKfCL9Yd+GuxouEtg@mail.gmail.com>
Message-ID: <56C476CA.9070600@redhat.com>

On 02/17/2016 01:21 PM, Hui Shi wrote:
> For the StringConcat test (
> http://people.linaro.org/~hui.shi/arraycopy/StringConcatTest.java), though
> array copy only takes 25% of the cycles in this test, the entire test still
> sees a 3.5% improvement with this combined load/store optimization.
> However, I wonder if this is the proper way to improve these
> test-bit-load-store code sequences. It would require an extra, really
> "disjoint" array copy stub, because the current disjoint array copy only
> means that a forward array copy can be performed safely. Alternatively, it
> would need a "no overlap" test at runtime. My personal preference is to
> leave the array copy code unchanged and keep it simple and consistent for
> now.

OK, that makes sense.

My plan (such as it is) for tidying up the tail code is to convert
three bit-test-and-branches into a single 8-way computed jump with an
optimum sequence for all 8 cases.  Sure, it will usually be
mispredicted, but it's just a single jump.
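
As a rough illustration of the shape such a dispatch could take, here is a
C++ sketch. It is an assumption-laden stand-in, not the planned stub code:
copy_tail is a hypothetical name, the per-case sequences are not claimed to
be optimal, and a compiler may or may not lower the switch to a jump table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies the final (n & 7) bytes, starting at dst/src, with one
// multi-way dispatch instead of three successive bit tests.
inline void copy_tail(std::uint8_t* dst, const std::uint8_t* src,
                      std::size_t n) {
  switch (n & 7) {
    case 0: break;
    case 1: std::memcpy(dst, src, 1); break;
    case 2: std::memcpy(dst, src, 2); break;
    case 3: std::memcpy(dst, src, 2);
            std::memcpy(dst + 2, src + 2, 1); break;
    case 4: std::memcpy(dst, src, 4); break;
    case 5: std::memcpy(dst, src, 4);
            std::memcpy(dst + 4, src + 4, 1); break;
    case 6: std::memcpy(dst, src, 4);
            std::memcpy(dst + 4, src + 4, 2); break;
    case 7: std::memcpy(dst, src, 4);
            std::memcpy(dst + 4, src + 4, 2);
            std::memcpy(dst + 6, src + 6, 1); break;
  }
}
```

Each case is straight-line code, so the only branch on the path is the
dispatch itself.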

But really, once we're down to 3.5% of a contrived
string-concatenation-intensive test, it's questionable whether this is
what we need to be spending time on.

Thanks,

Andrew.

From felix.yang at linaro.org  Wed Feb 17 14:11:33 2016
From: felix.yang at linaro.org (Felix Yang)
Date: Wed, 17 Feb 2016 22:11:33 +0800
Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair
 instructions in call_stub
In-Reply-To: <56C30C2E.4080206@redhat.com>
References: <CACc5Y6RZCFf1vL1BT9-h05EixaUHnwCTEg2Sa9kJ3EWY72zcTw@mail.gmail.com>
	<56C30C2E.4080206@redhat.com>
Message-ID: <CACc5Y6R83U_Xk7kkqTc1eY3P8entxn=8O9zMdXJKn5YwOt4jrQ@mail.gmail.com>

Hi Andrew,

    Thanks for the suggestions.  I have updated the patch with the unused
locals removed.
    New webrev: http://cr.openjdk.java.net/~fyang/8149907/webrev.01/
    How about this one?

Thanks for your help,
Felix

On 16 February 2016 at 19:46, Andrew Haley <aph at redhat.com> wrote:

> On 02/16/2016 11:28 AM, Felix Yang wrote:
> > Please review the following webrev:
> > http://cr.openjdk.java.net/~fyang/8149907/webrev.00/
>
> I guess this is okay, but it's a lot less self-documenting than it
> was.
>
> If there are any unused locals (e.g. r27_save) you must delete
> them or use them in assertions.
>
> Andrew.
>
>

From aph at redhat.com  Wed Feb 17 14:16:10 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 17 Feb 2016 14:16:10 +0000
Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair
 instructions in call_stub
In-Reply-To: <CACc5Y6R83U_Xk7kkqTc1eY3P8entxn=8O9zMdXJKn5YwOt4jrQ@mail.gmail.com>
References: <CACc5Y6RZCFf1vL1BT9-h05EixaUHnwCTEg2Sa9kJ3EWY72zcTw@mail.gmail.com>
	<56C30C2E.4080206@redhat.com>
	<CACc5Y6R83U_Xk7kkqTc1eY3P8entxn=8O9zMdXJKn5YwOt4jrQ@mail.gmail.com>
Message-ID: <56C480AA.2090606@redhat.com>

On 02/17/2016 02:11 PM, Felix Yang wrote:

>     Thanks for the suggestions.  I have updated the patch with the unused
> locals removed.
>     New webrev: http://cr.openjdk.java.net/~fyang/8149907/webrev.01/
>     How about this one?

What are r19_off and its friends used for now?  Why are they still
defined?

Andrew.


From felix.yang at linaro.org  Wed Feb 17 14:33:21 2016
From: felix.yang at linaro.org (Felix Yang)
Date: Wed, 17 Feb 2016 22:33:21 +0800
Subject: [aarch64-port-dev ] RFR: 8150038: aarch64: make use of CBZ and CBNZ
 when comparing narrow pointer with zero
Message-ID: <CACc5Y6R_6o3eogEa1EV1NMZ0ymGLEhRV2ObhspoXg9jap-Atcw@mail.gmail.com>

Hi,

Please review the following webrev:
http://cr.openjdk.java.net/~fyang/8150038/webrev.00/

Jira issue: https://bugs.openjdk.java.net/browse/JDK-8150038

Several times I have noticed the following pattern in C2 JIT code (the
java heap size is set to 200MB):


   2042   0x0000007f6c9419c4: ldr       w14, [x11,#32]  ;*getfield buffer
   2048   0x0000007f6c9419c8: cmp       w14, wzr
   2049   0x0000007f6c9419cc: b.eq      0x0000007f6c9425e4  ;*invokevirtual reset

The cmp and b.eq instructions can be combined into one "cbz" instruction.
Currently, the aarch64 port only makes use of CBZ and CBNZ when
comparing operands of Integer/Long/Pointer type with zero.

The patch fixes the issue by adding a similar combine pattern to the AD
file for narrow pointer types (just like the sparc port does).
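
For illustration only, a Java source shape of roughly this kind can lead to a
compare of a narrow pointer against zero. The class, field and method names
below are hypothetical, and the instruction comments are an assumed mapping,
not output from a real compilation:

```java
public class NarrowNullCheckSketch {
    // Hypothetical stand-in for the object held in the "buffer" field.
    static final class Buffer {
        int resets;
        void reset() { resets++; }
    }

    // With compressed oops enabled, this reference field is held in memory
    // as a 32-bit narrow pointer.
    Buffer buffer;

    // Loads the field and null-checks it before the call. Conceptually:
    //   ldr  w14, [x11,#32]      (getfield buffer)
    //   cmp  w14, wzr ; b.eq ... (or, after the patch, one cbz w14, ...)
    boolean resetIfPresent() {
        Buffer b = buffer;
        if (b == null) {
            return false;
        }
        b.reset();
        return true;
    }

    public static void main(String[] args) {
        NarrowNullCheckSketch s = new NarrowNullCheckSketch();
        System.out.println(s.resetIfPresent());
        s.buffer = new Buffer();
        System.out.println(s.resetIfPresent());
    }
}
```

Whether C2 emits an explicit compare or an implicit null check for a given
method depends on profiling and flags, so this is only a sketch of the source
pattern.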

Tested with jtreg hotspot & langtools.  Is it OK?

Thanks,
Felix.

From aph at redhat.com  Wed Feb 17 15:07:26 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 17 Feb 2016 15:07:26 +0000
Subject: [aarch64-port-dev ] RFR: 8150038: aarch64: make use of CBZ and
 CBNZ when comparing narrow pointer with zero
In-Reply-To: <CACc5Y6R_6o3eogEa1EV1NMZ0ymGLEhRV2ObhspoXg9jap-Atcw@mail.gmail.com>
References: <CACc5Y6R_6o3eogEa1EV1NMZ0ymGLEhRV2ObhspoXg9jap-Atcw@mail.gmail.com>
Message-ID: <56C48CAE.9040704@redhat.com>

On 02/17/2016 02:33 PM, Felix Yang wrote:
> Tested with jtreg hotspot & langtools.  Is it OK?

Sure, that looks fine.

Thanks,

Andrew.


From edward.nevill at gmail.com  Wed Feb 17 19:29:18 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 17 Feb 2016 19:29:18 +0000
Subject: [aarch64-port-dev ] arraycopy optimisations on aarch64
Message-ID: <1455737358.14578.45.camel@mint>

Hi,

There have been a number of ongoing efforts at optimising array copy recently.

Rather than have multiple webrevs and multiple JIRA issues I would like to collect all the efforts under a single JIRA issue. I have created the following JIRA issue for all work relating to optimising array copy.

https://bugs.openjdk.java.net/browse/JDK-8150082

We can then review all the array copy optimisation proposals on the aarch64-port-dev mailing list rather than cc'ing the whole of hotspot-compiler-dev with every intricate detail of array copies on aarch64.

Once we have a complete version of array copy code we are happy with I can submit a single CR for review. All contributions will be acknowledged in the "Contributed-by" section.

To further muddy the waters I have two patches I would like to forward for your discussion.

1) http://cr.openjdk.java.net/~enevill/memopts/small.patch

This improves the performance of copying small (0 to 80 bytes) arrays. The copy code is inlined (rather than calling out to copy_longs).

The copy forwards and copy backwards cases are identical because the small copy code reads all data into registers before writing any. Thankfully aarch64 has plenty of registers.

The rationale for choosing 80 as the limit is that it provides a guarantee that copy_longs is always called with at least 64 bytes, even after worst case alignment fixup. This means the small case code in copy_longs can be deleted (I have put an assert in copy_longs to check it is never called with < 64 bytes).
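
The direction-independence of the small copy can be sketched in Java. This is an illustration only: the class and method names are hypothetical, a temporary array stands in for the registers, and the real stub is AArch64 assembly rather than Java.

```java
import java.util.Arrays;

public class SmallCopySketch {
    // Copies len bytes (len <= 80) from a[src..] to a[dst..] within one array.
    // Every source byte is loaded before any byte is stored, so overlapping
    // ranges give the right result whichever way the ranges overlap.
    static void smallCopy(byte[] a, int src, int dst, int len) {
        if (len > 80) {
            throw new IllegalArgumentException("small copy handles at most 80 bytes");
        }
        byte[] regs = new byte[len];          // stands in for the register file
        for (int i = 0; i < len; i++) {
            regs[i] = a[src + i];             // all loads first
        }
        for (int i = 0; i < len; i++) {
            a[dst + i] = regs[i];             // then all stores
        }
    }

    public static void main(String[] args) {
        byte[] up = {1, 2, 3, 4, 5, 6, 7, 8};
        smallCopy(up, 0, 2, 6);               // destination overlaps above source
        System.out.println(Arrays.toString(up));

        byte[] down = {1, 2, 3, 4, 5, 6, 7, 8};
        smallCopy(down, 2, 0, 6);             // destination overlaps below source
        System.out.println(Arrays.toString(down));
    }
}
```

Because nothing is stored until everything has been loaded, the same code serves both copy directions, which is why no separate backwards path is needed for the small case.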

2) http://cr.openjdk.java.net/~enevill/memopts/simd.patch

This uses SIMD ldp/stp Qx, Qy instructions instead of scalar ldp/stp instructions, thereby loading/storing 32 bytes at a time instead of 16.

It also extends the small copy code to copy 0-96 instead of 0-80 (because 80 is not divisible by 32).

This improves performance on some micro-arches and not on others so I have provided a -XX:+UseSIMDForMemoryOps switch which defaults to false (we could look at enabling this by default for micro-arches where we know SIMD is better).

I have prepared a set of performance measurements on memory copies between 0 & 96 bytes in steps of 1 (which shows the effect of the small copy optimisations) and also between 0 & 1024 in steps of 16. I have prepared these for 3 different micro-arches. The results are at

http://cr.openjdk.java.net/~enevill/memopts/twoopts.pdf

In these charts the blue 'original' line is the jdk9 tip as of earlier today. The red 'small copy' line is after application of the small copy patch above. The yellow 'SIMD' line is after the cumulative application of the small copy patch and the simd patch.

The charts show time taken so smaller is better. I have normalised the charts by varying the number of iterations so all results are in the 0-1200 range. Because the number of iterations was different for each micro-arch no information should be inferred as to the relative performance of different micro-arches. The charts should only be used to compare the performance before and after application of the above patches.

All the best,
Ed.


From felix.yang at linaro.org  Thu Feb 18 15:02:06 2016
From: felix.yang at linaro.org (Felix Yang)
Date: Thu, 18 Feb 2016 23:02:06 +0800
Subject: [aarch64-port-dev ] RFR: 8149907: aarch64: use load/store pair
 instructions in call_stub
In-Reply-To: <56C480AA.2090606@redhat.com>
References: <CACc5Y6RZCFf1vL1BT9-h05EixaUHnwCTEg2Sa9kJ3EWY72zcTw@mail.gmail.com>
	<56C30C2E.4080206@redhat.com>
	<CACc5Y6R83U_Xk7kkqTc1eY3P8entxn=8O9zMdXJKn5YwOt4jrQ@mail.gmail.com>
	<56C480AA.2090606@redhat.com>
Message-ID: <CACc5Y6Q5sRBu_5xCJFSMSySx+dnmpwXaqPsxtKV_tjiveUj4cg@mail.gmail.com>

Hi,

    I updated the webrev with the unused ENUM members removed.
    New webrev: http://cr.openjdk.java.net/~fyang/8149907/webrev.02/

Thanks,
Felix

On 17 February 2016 at 22:16, Andrew Haley <aph at redhat.com> wrote:

> On 02/17/2016 02:11 PM, Felix Yang wrote:
>
> >     Thanks for the suggestions.  I have updated the patch with the unused
> > locals removed.
> >     New webrev: http://cr.openjdk.java.net/~fyang/8149907/webrev.01/
> >     How about this one?
>
> What are r19_off and its friends used for now?  Why are they still
> defined?
>
> Andrew.
>
>

From roland.westrelin at oracle.com  Fri Feb 19 07:54:26 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 19 Feb 2016 08:54:26 +0100
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
	redundant memory operations with G1
In-Reply-To: <56C20BD4.8030800@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
	<8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
	<56BE3612.4080305@oracle.com>
	<BE8116BA-2D07-4579-BC71-2F9A4E5B027C@oracle.com>
	<56C20BD4.8030800@oracle.com>
Message-ID: <A3C7FDDC-A7DA-4CA1-B025-7A0402033473@oracle.com>

Thanks for the review, Vladimir.

Roland.

> On Feb 15, 2016, at 6:33 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good. Thank you.
> 
> Vladimir
> 
> On 2/15/16 1:21 AM, Roland Westrelin wrote:
>> 
>>> Can you create new webrev which includes everything (aarch64)?
>> 
>> Here it is:
>> http://cr.openjdk.java.net/~roland/8087341/webrev.01/
>> 
>> Roland.
>> 
>>> And I am satisfied with your answers to my objections.
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>> On 2/12/16 4:36 AM, Roland Westrelin wrote:
>>>> Hi Andrew,
>>>> 
>>>>> A patch for the AArch64 C2 volatile/CAS generation code which deals with
>>>>> the effects of your proposed C2 patch is available as a webrev
>>>>> 
>>>>> http://cr.openjdk.java.net/~adinn/8087341-aarch64/webrev.00/
>>>> 
>>>> Thanks for putting that together. I didn't expect that simple change to cause so much trouble.
>>>> 
>>>>> n.b. I have /not/ created a separate issue for the AArch64 part of this
>>>>> fix. I am not sure whether you want to combine it with your patch or
>>>>> push it as a separate stage.
>>>> 
>>>> I can push everything together and list you as a contributor (in the contributed-by field) if that works for you.
>>>> 
>>>> Vladimir, can you take another look at this? Your two objections were:
>>>> 
>>>>> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory affect. Why not use it?
>>>> 
>>>> The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile(). As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way.
>>>> 
>>>>> And we need to keep precedent edge link to oop store in case EA eliminates related allocation.
>>>> 
>>>> Mikael said it's not ok to eliminate the memory barrier if we leave the gc barrier.
>>>> 
>>>> Roland.
>>>> 
>> 


From roland.westrelin at oracle.com  Fri Feb 19 07:54:55 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 19 Feb 2016 08:54:55 +0100
Subject: [aarch64-port-dev ] RFR(S): 8087341: C2 doesn't optimize
	redundant memory operations with G1
In-Reply-To: <56C1B197.7060708@redhat.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
	<434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>
	<56BB01E2.2090004@redhat.com>
	<9C43B8E9-A34B-41EC-A433-CCA9B67623F5@oracle.com>
	<56BDB707.6090409@redhat.com>
	<8339FB73-9C21-4513-B07B-5DEBB8583188@oracle.com>
	<56BE3612.4080305@oracle.com>
	<BE8116BA-2D07-4579-BC71-2F9A4E5B027C@oracle.com>
	<56C1B197.7060708@redhat.com>
Message-ID: <8112862F-B94D-4043-AD3C-DF59BB8FB862@oracle.com>

Thanks, Andrew!

Roland.

> On Feb 15, 2016, at 12:08 PM, Andrew Dinn <adinn at redhat.com> wrote:
> 
> On 15/02/16 09:21, Roland Westrelin wrote:
>> 
>>> Can you create new webrev which includes everything (aarch64)?
>> 
>> Here it is:
>> http://cr.openjdk.java.net/~roland/8087341/webrev.01/
> 
> Thanks Roland. Looks good to go.
> 
> regards,
> 
> 
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in UK and Wales under Company Registration No. 3798903
> Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
> Argiry (US)


From hui.shi at linaro.org  Fri Feb 19 12:13:28 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Fri, 19 Feb 2016 20:13:28 +0800
Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor
	char_array_equals/byte_array_equals/string_equals
Message-ID: <CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>

Hi,

Could someone help review this patch? It mainly aims to refactor
similar code on AArch64 for string equals, char array equals
and byte array equals.

JIRA: https://bugs.openjdk.java.net/browse/JDK-8149733
webrev: http://cr.openjdk.java.net/~hshi/8149733/webrev/

The patch includes:
1. Add a new method MacroAssembler::generic_arrays_equals, whose
implementation combines string_equals and char/byte_array_equals.
   For array length >= 8 bytes, compare the main body and the tail bytes
8 bytes at a time, the same as the current string_equals implementation.
This eliminates tail branches and loads and improves performance on short
compares.
   For array length < 8 bytes, compare with a test-ld-cmp sequence. It
outperforms the loop in string_equals.
2. Remove the unnecessary lea address computation (moving the array pointer
to the last word) in string_equals.
3. Remove an unnecessary tmp register in string_equals.
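
As a rough illustration of item 1, the 8-byte-wide compare with a wide tail
can be sketched in Java. This is one reading of the description above, not the
patch itself: the class and method names are hypothetical, and the short case
is simplified to a plain byte loop instead of a test-ld-cmp sequence.

```java
import java.nio.ByteBuffer;

public class WideEqualsSketch {
    static boolean wideEquals(byte[] a, byte[] b) {
        if (a == b) {
            return true;                      // same array
        }
        if (a.length != b.length) {
            return false;
        }
        int n = a.length;
        if (n < 8) {
            // Short arrays: simplified here to a byte-by-byte compare.
            for (int i = 0; i < n; i++) {
                if (a[i] != b[i]) {
                    return false;
                }
            }
            return true;
        }
        ByteBuffer x = ByteBuffer.wrap(a);
        ByteBuffer y = ByteBuffer.wrap(b);
        // Main body: compare 8 bytes per step.
        for (int i = 0; i + 8 <= n; i += 8) {
            if (x.getLong(i) != y.getLong(i)) {
                return false;
            }
        }
        // Tail: one (possibly overlapping) 8-byte compare of the last 8 bytes,
        // so no per-byte tail branches are needed. When n is a multiple of 8
        // this re-checks the last word, which is redundant but harmless.
        return x.getLong(n - 8) == y.getLong(n - 8);
    }

    public static void main(String[] args) {
        System.out.println(wideEquals("abcdefghijk".getBytes(), "abcdefghijk".getBytes()));
        System.out.println(wideEquals("abcdefghijk".getBytes(), "abcdefghijX".getBytes()));
    }
}
```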

JTreg doesn't show any regression, and performance doesn't degrade with
different length combinations. Small char/byte equals improves in most
tests. There is one slow run with the new implementation, because the last
several chars differ in that test case. The original char array equals can
find the difference with the first test-ld-cmp check, so a test-ld-cmp
sequence might be faster than a full unaligned 8-byte compare in some corner
cases. When the differing chars are in the middle of the string, performance
is close for both implementations.

Test case: http://cr.openjdk.java.net/~hshi/8149733/TestArrayEqual.java
Test result: http://cr.openjdk.java.net/~hshi/8149733/ArrayEqual.pdf

Regards
Hui

From felix.yang at linaro.org  Fri Feb 19 12:23:11 2016
From: felix.yang at linaro.org (Felix Yang)
Date: Fri, 19 Feb 2016 20:23:11 +0800
Subject: [aarch64-port-dev ] RFR: 8150229: aarch64: c2 fix pipeline class
	for several instructions.
Message-ID: <CACc5Y6T5SpRgQVXC4LVEGsbKNR_rW=TtAe1pECC3-aK+64=fPA@mail.gmail.com>

Hi,

Please review the following webrev:

    http://cr.openjdk.java.net/~fyang/8150229/webrev.00/

Jira issue: https://bugs.openjdk.java.net/browse/JDK-8150229

The pipeline class for some instructions is not set correctly.  An example:

instruct MoveF2I_reg_reg(iRegINoSp dst, vRegF src) %{
  match(Set dst (MoveF2I src));
  effect(DEF dst, USE src);

  ins_cost(INSN_COST);
  format %{ "fmovs $dst, $src\t# MoveF2I_reg_reg" %}

  ins_encode %{
    __ fmovs($dst$$Register, as_FloatRegister($src$$reg));
  %}

  ins_pipe(pipe_class_memory);    =>    Should be "fp_f2i"
%}

Patch fixes this issue.  Tested with jtreg hotspot.  Please help
commit this patch if it's OK.

Thanks,
Felix.

From aph at redhat.com  Fri Feb 19 12:23:25 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 19 Feb 2016 12:23:25 +0000
Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor
 char_array_equals/byte_array_equals/string_equals
In-Reply-To: <CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
References: <CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
Message-ID: <56C7093D.9070408@redhat.com>

On 02/19/2016 12:13 PM, Hui Shi wrote:
> Could some one help review this patch? This patch mainly aims to
> refactoring similar code on AArch64 for string equals/ char array equals
> and byte array equals.

I'm looking at it.  It's quite complex and I won't reply immediately.

Thanks,

Andrew.


From aleksey.shipilev at oracle.com  Fri Feb 19 12:36:18 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Fri, 19 Feb 2016 15:36:18 +0300
Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor
 char_array_equals/byte_array_equals/string_equals
In-Reply-To: <CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
References: <CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
Message-ID: <56C70C42.5020309@oracle.com>

Hi Hui,

On 02/19/2016 03:13 PM, Hui Shi wrote:
> webrev: http://cr.openjdk.java.net/~hshi/8149733/webrev/

Not savvy with AArch64 assembly, but it does not look bad.

My other comments are superficial:

 * Desperately needs spell-checking: "implenetaions", "implemenation",
"eqauls", "comapre"

 * Inconsistent naming, e.g. "... = wordSize/step_size;"

 * "if (is_string_equal == false) {"

 * "if (exact_log >0 )"

 * Shouldn't be:

	4533     ldrw(cnt1, Address(ary1, length_offset));
	4534     ldrw(tmp2, Address(ary2, length_offset));
	4535     cmp(cnt1, tmp2);

  spelled like:

	4533 ldrw(cnt1, Address(ary1, length_offset));
	4534 ldrw(cnt2, Address(ary2, length_offset));
	4535 cmp(cnt1, cnt2);

 * Would be nice to keep the comments like "// 0-7 bytes left, cnt1 =
#bytes left - 4"

 * Why TAIL01 block is predicated on (step_size == 1) now?

> Test case: http://cr.openjdk.java.net/~hshi/8149733/TestArrayEqual.java

I think you should really, really, really use JMH for these benchmarks:
 http://openjdk.java.net/projects/code-tools/jmh/

It would also provide you an easy access to generated code profiling,
with -prof perfasm. It is usually pretty clear from that output if your
generated code needs even more tuneups.

Cheers,
-Aleksey


From aph at redhat.com  Sat Feb 20 10:00:55 2016
From: aph at redhat.com (Andrew Haley)
Date: Sat, 20 Feb 2016 10:00:55 +0000
Subject: [aarch64-port-dev ] arraycopy optimisations on aarch64
In-Reply-To: <1455737358.14578.45.camel@mint>
References: <1455737358.14578.45.camel@mint>
Message-ID: <56C83957.4090908@redhat.com>

On 17/02/16 19:29, Edward Nevill wrote:
> To further muddy the waters I have two patches I would like to forward for your discussion.

Webrevs, please.

Andrew.


From edward.nevill at gmail.com  Sat Feb 20 15:23:53 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Sat, 20 Feb 2016 15:23:53 +0000
Subject: [aarch64-port-dev ] arraycopy optimisations on aarch64
In-Reply-To: <56C83957.4090908@redhat.com>
References: <1455737358.14578.45.camel@mint>  <56C83957.4090908@redhat.com>
Message-ID: <1455981833.4817.2.camel@mint>

On Sat, 2016-02-20 at 10:00 +0000, Andrew Haley wrote:
> On 17/02/16 19:29, Edward Nevill wrote:
> > To further muddy the waters I have two patches I would like to forward for your discussion.
> 
> Webrevs, please.

Here they are

http://cr.openjdk.java.net/~enevill/8150082/webrev.0/

http://cr.openjdk.java.net/~enevill/8150313/webrev.0/

Note: They are not independent, the first must be applied before the second.

Regards,
Ed.
 

From hui.shi at linaro.org  Mon Feb 22 11:56:25 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Mon, 22 Feb 2016 19:56:25 +0800
Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor
	char_array_equals/byte_array_equals/string_equals
In-Reply-To: <56C70C42.5020309@oracle.com>
References: <CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
	<56C70C42.5020309@oracle.com>
Message-ID: <CAF1YaiDT7hy9tMzr0aTrMY6fp-q=EfpqDWKbB-4_PeGPiROrrg@mail.gmail.com>

Thanks Aleksey & Andrew!

The patch is updated at http://cr.openjdk.java.net/~hshi/8149733/webrev2/ .
It adds the following:
1. Fix misc spelling and format issues
2. Use cnt2 for the array length compare, with a comment that cnt2 can't be
used after the length compare
3. Add more comments for tail handling

The JMH test is at
http://cr.openjdk.java.net/~hshi/8149733/webrev2/JMHSample_97_ArrayEqual.java
Run with: java -jar ../benchmarks.jar '.*JMHSample_97*' -w 5 -wi 3 -i 5 -r 10 -f 0

The following are the test results before and after applying this patch. The
refactoring looks better in most cases.

Length 1-8 before
Benchmark                                 Mode  Cnt     Score   Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5  2954.349 ± 0.076  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5  3232.505 ± 7.050  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5  2916.643 ± 0.126  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5  2778.486 ± 3.539  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5  4411.364 ± 0.149  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  3898.965 ± 0.122  us/op

Length 1-8 after
Benchmark                                 Mode  Cnt     Score     Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5  2890.122 ±   1.279  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5  2893.002 ±   5.914  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5  2735.193 ±   0.096  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5  2753.818 ±   0.708  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5  4162.080 ± 818.652  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  3824.308 ±   0.621  us/op


Length 9-16 before
Benchmark                                 Mode  Cnt     Score     Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5  4193.783 ±  22.731  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5  3819.967 ±  61.053  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5  5780.135 ± 104.966  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5  5694.717 ±  87.426  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5  6741.276 ±   1.112  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  6439.345 ± 161.295  us/op

Length 9-16 after
Benchmark                                 Mode  Cnt     Score    Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5  2937.688 ±  0.074  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5  2842.832 ±  0.038  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5  5274.417 ±  0.912  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5  4611.007 ±  0.592  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5  6778.782 ± 28.918  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  6455.762 ± 10.674  us/op

Length 32-39 before
Benchmark                                 Mode  Cnt      Score    Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5   5519.248 ±  1.799  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5   7204.390 ± 72.663  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5   7891.681 ±  4.859  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5   9830.466 ±  0.800  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5  10087.074 ±  1.976  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  11383.347 ±  1.712  us/op

Length 32-39 after
Benchmark                                 Mode  Cnt      Score    Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5   5445.432 ±  1.396  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5   5856.414 ±  0.996  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5   7864.556 ±  1.408  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5   9274.953 ± 30.892  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5   9841.792 ±  0.721  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  10750.615 ±  1.252  us/op

Length 1025-1032 before
Benchmark                                 Mode  Cnt       Score       Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5  103655.644 ± 15794.521  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5   90908.990 ±   120.387  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5  155515.192 ±   233.650  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5  148312.632 ±    59.342  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5  134281.945 ±    20.829  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  138580.479 ±   137.336  us/op

Length 1025-1032 after
Benchmark                                 Mode  Cnt       Score      Error  Units
JMHSample_97_ArrayEqual.byte_equal        avgt    5  102232.913 ± 1950.542  us/op
JMHSample_97_ArrayEqual.byte_not_equal    avgt    5   90179.625 ±  102.160  us/op
JMHSample_97_ArrayEqual.char_equal        avgt    5  152515.169 ±  167.507  us/op
JMHSample_97_ArrayEqual.char_not_equal    avgt    5  140293.463 ±  198.916  us/op
JMHSample_97_ArrayEqual.string_equal      avgt    5  141776.676 ±   42.597  us/op
JMHSample_97_ArrayEqual.string_not_equal  avgt    5  130141.577 ±   29.875  us/op

Regards
Hui

On 19 February 2016 at 20:36, Aleksey Shipilev <aleksey.shipilev at oracle.com>
wrote:

> Hi Hui,
>
> On 02/19/2016 03:13 PM, Hui Shi wrote:
> > webrev: http://cr.openjdk.java.net/~hshi/8149733/webrev/
>
> Not savvy with AArch64 assembly, but it does not look bad.
>
> My other comments are superficial:
>
>  * Desperately needs spell-checking: "implenetaions", "implemenation",
> "eqauls", "comapre"
>
>  * Inconsistent naming, e.g. "... = wordSize/step_size;"
>
>  * "if (is_string_equal == false) {"
>
>  * "if (exact_log >0 )"
>
>  * Shouldn't be:
>
>         4533     ldrw(cnt1, Address(ary1, length_offset));
>         4534     ldrw(tmp2, Address(ary2, length_offset));
>         4535     cmp(cnt1, tmp2);
>
>   spelled like:
>
>         4533 ldrw(cnt1, Address(ary1, length_offset));
>         4534 ldrw(cnt2, Address(ary2, length_offset));
>         4535 cmp(cnt1, cnt2);
>
>  * Would be nice to keep the comments like "// 0-7 bytes left, cnt1 =
> #bytes left - 4"
>
>  * Why TAIL01 block is predicated on (step_size == 1) now?
>
> > Test case: http://cr.openjdk.java.net/~hshi/8149733/TestArrayEqual.java
>
> I think you should really, really, really use JMH for these benchmarks:
>  http://openjdk.java.net/projects/code-tools/jmh/
>
> It would also provide you an easy access to generated code profiling,
> with -prof perfasm. It is usually pretty clear from that output if your
> generated code needs even more tuneups.
>
> Cheers,
> -Aleksey
>
>

From edward.nevill at gmail.com  Mon Feb 22 20:32:10 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Mon, 22 Feb 2016 20:32:10 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE
	CAS instructions
Message-ID: <1456173130.2735.8.camel@mint>

Hi,

Please review the following webrev

http://cr.openjdk.java.net/~enevill/8150394/webrev.0/

This adds support for the CAS instructions in armv8.1.

The use of the instructions is enabled/disabled by the use of -XX:+/-UseLSE. This is enabled automatically if detected in the hwcap. If UseLSE is enabled on a CPU which does not support these instructions a warning is issued but the instructions are enabled in any case. This is to allow use of the LSE extensions on 8.1 systems which are running older kernels.
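
The flag resolution described above can be sketched as follows. This is an illustration only, written in Java rather than HotSpot C++; the class, method and parameter names and the warning text are hypothetical, and it assumes that an explicit -XX:-UseLSE always disables the instructions.

```java
public class UseLseSketch {
    // Outcome of resolving the flag: the final setting plus an optional warning.
    static final class Decision {
        final boolean useLSE;
        final String warning;    // null when there is nothing to report

        Decision(boolean useLSE, String warning) {
            this.useLSE = useLSE;
            this.warning = warning;
        }
    }

    // commandLine: null if the flag was not given, otherwise the explicit value.
    // hwcapReportsAtomics: whether the kernel's hwcap advertises the LSE atomics.
    static Decision resolve(Boolean commandLine, boolean hwcapReportsAtomics) {
        if (commandLine == null) {
            // Not specified: follow the hardware capability bits.
            return new Decision(hwcapReportsAtomics, null);
        }
        if (commandLine && !hwcapReportsAtomics) {
            // Explicitly requested but not detected: warn, yet still enable,
            // so 8.1 systems on older kernels can use the instructions.
            return new Decision(true, "UseLSE requested but not reported by hwcap");
        }
        return new Decision(commandLine, null);
    }

    public static void main(String[] args) {
        Decision d = resolve(Boolean.TRUE, false);
        System.out.println(d.useLSE + " " + d.warning);
    }
}
```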

Tested before and after with jcstress (default mode). In both cases there was 1 failure which is due to a missing Unsafe method and always occurs with jdk9.

Thanks for the review,
Ed.


From aleksey.shipilev at oracle.com  Tue Feb 23 10:01:28 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Tue, 23 Feb 2016 13:01:28 +0300
Subject: [aarch64-port-dev ] RFR: 8149733: AArch64: refactor
 char_array_equals/byte_array_equals/string_equals
In-Reply-To: <CAF1YaiDT7hy9tMzr0aTrMY6fp-q=EfpqDWKbB-4_PeGPiROrrg@mail.gmail.com>
References: <CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
	<56C70C42.5020309@oracle.com>
	<CAF1YaiDT7hy9tMzr0aTrMY6fp-q=EfpqDWKbB-4_PeGPiROrrg@mail.gmail.com>
Message-ID: <56CC2DF8.60806@oracle.com>

On 02/22/2016 02:56 PM, Hui Shi wrote:
> Thanks Aleksey & Andrew!
> 
> Patch is updated in http://cr.openjdk.java.net/~hshi/8149733/webrev2/ ,
> it adds on
> 1. Fix misc spelling and format issues
> 2. Use cnt2 for array length compare, comment that cnt2 can't be used
> after length compare
> 3. Add more comments for tail handling

Still not getting this part:

4526   cmp(ary1, ary2);
4527   mov(result, false);
4528   br(Assembler::EQ, SAME);

Should the mov be *after* the branch?

Also, "if (is_string_equal == false) {" should be "if (!is_string_equals)".

> JMH test in
> http://cr.openjdk.java.net/~hshi/8149733/webrev2/JMHSample_97_ArrayEqual.java
> . Run with java -jar ../benchmarks.jar '.*JMHSample_97*' -w 5 -wi 3 -i 5
> -r 10 -f 0

Um, -f 0 is bad: you contaminate the profiles. Also, the benchmark could
be made much more idiomatic, solving a few other benchmarking pitfalls:
  http://cr.openjdk.java.net/~shade/scratch/ByteArrayEquals.java

You might want to re-run with your updated code.

Cheers,
-Aleksey


From aph at redhat.com  Tue Feb 23 13:28:40 2016
From: aph at redhat.com (Andrew Haley)
Date: Tue, 23 Feb 2016 13:28:40 +0000
Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor
 char_array_equals/byte_array_equals/string_equals
In-Reply-To: <tencent_2277B0CC985690AB6C8090A6@qq.com>
References: <56CC2DF8.60806@oracle.com>
	<CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
	<56C70C42.5020309@oracle.com> <CAF1YaiDT7hy9tMzr0aTrMY6fp-q=
	<tencent_2277B0CC985690AB6C8090A6@qq.com>
Message-ID: <56CC5E88.7040309@redhat.com>

On 02/23/2016 11:33 AM, hui.shi wrote:
> thanks! I will update and rerun JMH.

I want to make some changes.  Please wait until then.

Thanks,

Andrew.


From aph at redhat.com  Tue Feb 23 16:17:52 2016
From: aph at redhat.com (Andrew Haley)
Date: Tue, 23 Feb 2016 16:17:52 +0000
Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor
 char_array_equals/byte_array_equals/string_equals
In-Reply-To: <tencent_2277B0CC985690AB6C8090A6@qq.com>
References: <56CC2DF8.60806@oracle.com>
	<CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
	<56C70C42.5020309@oracle.com> <CAF1YaiDT7hy9tMzr0aTrMY6fp-q=
	<tencent_2277B0CC985690AB6C8090A6@qq.com>
Message-ID: <56CC8630.2020708@redhat.com>

My version is at

http://cr.openjdk.java.net/~aph/8149733/

The changes I made are:

        I rewrote most of the comments because I couldn't understand
        them.  I intend no criticism, and I understand that English
        isn't the language of your birth.  Please tell me if you can
        understand my comments.

	"generic_array_equals" -> "arrays_equals"
        Reason: it's not generic, it's only bytes and chars.
        Also, this is what x86_64 calls the same routine.

        "ary1" -> "a"
        Reason: "ary" just looks odd.  Also, these are the names in the
	java code.

        "cmp; br.nz" -> "eor, bnz"
        Reason: Don't clobber flags for no reason.

        There's no need to check for the same arrays if we're
        comparing strings.

Otherwise, the code is the same.  I haven't much tested this, but it
should give the same performance.  Please test it, and tell me if I've
broken anything.

Thanks,

Andrew.

From aph at redhat.com  Wed Feb 24 10:58:23 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 24 Feb 2016 10:58:23 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1
	LSE CAS instructions
In-Reply-To: <1456173130.2735.8.camel@mint>
References: <1456173130.2735.8.camel@mint>
Message-ID: <56CD8CCF.1030404@redhat.com>

On 22/02/16 20:32, Edward Nevill wrote:
> http://cr.openjdk.java.net/~enevill/8150394/webrev.0/
> 
> This adds support for the CAS instructions in armv8.1.

The C2 code for aarch64_enc_cmpxchg* is missing.

It's quite tricky to refactor to allow LSE instructions.  I'd add
a wordsize parameter to the cas instruction, like this:

#define INSN(NAME, a, r)                                                \
  void NAME(operand_size sz, Register Rs, Register Rt, Register Rn) {   \
    assert(Rs != Rn && Rs != Rt, "unpredictable instruction");          \
    compare_and_swap(Rs, Rt, Rn, sz, 1, a, r);                          \
  }
  INSN(cas, 0, 0)

And this gets rid of a ton of instruction definitions: we only need
CAS{A,L,AL}.

Pass the operand size down to MacroAssembler::cmpxchgw:

  enc_class aarch64_enc_cmpxchgw(memory mem, iRegINoSp oldval, iRegINoSp newval) %{
    MacroAssembler _masm(&cbuf);
    guarantee($mem$$index == -1 && $mem$$disp == 0, "impossible encoding");
    __ cmpxchg(Assembler::word, $mem$$base$$Register, $oldval$$Register,
               $newval$$Register,
               &Assembler::ldxrw, &MacroAssembler::cmpw, &Assembler::stlxrw);
  %}

void MacroAssembler::cmpxchgw(operand_size sz, Register oldv,
                              Register newv, Register addr, Register tmp,
			      Label &succeed, Label *fail) {

  if (UseLSE) {
    ...

It'll be necessary to pass a memory barrier flag too.

Andrew.

From aph at redhat.com  Wed Feb 24 12:49:30 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 24 Feb 2016 12:49:30 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1
	LSE CAS instructions
In-Reply-To: <56CD8CCF.1030404@redhat.com>
References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com>
Message-ID: <56CDA6DA.3080901@redhat.com>

One more thing: with 8148146 we have new entry points and C2 nodes
for WeakCompareAndSwapX.  We'll need to add a "bool weak" parameter
to MacroAssembler::cmpxchgw.  I suppose it's OK for this to be done
in a later commit.
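
As an aside on the weak/strong distinction, here is a minimal Java sketch
(requires Java 9 or later). It is only an analogy for the generated code: the
class and method names are hypothetical, and AtomicInteger stands in for the
memory word that the ldxr/stlxr loop or the CAS instruction would operate on.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasSketch {
    // Strong CAS built from a weak CAS, similar in spirit to an ldxr/stlxr
    // retry loop: a weak CAS may fail spuriously, so retry for as long as
    // the cell still holds the expected value.
    static boolean strongFromWeak(AtomicInteger cell, int expected, int update) {
        while (cell.get() == expected) {
            if (cell.weakCompareAndSetVolatile(expected, update)) {
                return true;
            }
            // Spurious failure or a concurrent change: the loop re-checks.
        }
        return false;                 // the value really differs from expected
    }

    // Strong CAS as a single operation, analogous to one CAS instruction.
    static boolean singleShot(AtomicInteger cell, int expected, int update) {
        return cell.compareAndSet(expected, update);
    }

    public static void main(String[] args) {
        AtomicInteger cell = new AtomicInteger(5);
        System.out.println(strongFromWeak(cell, 5, 6));
        System.out.println(singleShot(cell, 5, 7));
        System.out.println(cell.get());
    }
}
```

A caller that already sits in its own retry loop can use the weak form
directly and skip the inner loop, which is what a "weak" parameter would
allow.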

Andrew.


From aph at redhat.com  Wed Feb 24 12:59:16 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 24 Feb 2016 12:59:16 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1
 LSE CAS instructions
In-Reply-To: <56CDA6DA.3080901@redhat.com>
References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com>
	<56CDA6DA.3080901@redhat.com>
Message-ID: <56CDA924.6050106@redhat.com>

On 02/24/2016 12:49 PM, Andrew Haley wrote:
> One more thing: with 8148146 we have new entry points and C2 nodes
> for WeakCompareAndSwapX.  We'll need to add a "bool weak" parameter
> to MacroAssembler::cmpxchgw.  I suppose it's OK for this to be done
> in a later commit.

Forget that: hs-comp is currently broken because of Unsafe changes.  I'm
going to make it build again and push.  Your changes can go on top of
that.

Andrew.


From hui.shi at linaro.org  Wed Feb 24 13:02:48 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Wed, 24 Feb 2016 21:02:48 +0800
Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor
	char_array_equals/byte_array_equals/string_equals
In-Reply-To: <56CC8630.2020708@redhat.com>
References: <56CC2DF8.60806@oracle.com>
	<CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
	<56C70C42.5020309@oracle.com>
	<tencent_2277B0CC985690AB6C8090A6@qq.com>
	<56CC8630.2020708@redhat.com>
Message-ID: <CAF1YaiB-zDg5JzbYrqo_F5OoRZfRL0nnw27n8X=7DjRT7W=r7g@mail.gmail.com>

Thanks Andrew! Your comments read much better, and performance doesn't
change when running the JMHSample_97_ArrayEqual.java
<http://cr.openjdk.java.net/~hshi/8149733/webrev2/JMHSample_97_ArrayEqual.java>
test.

Latest webrev: http://cr.openjdk.java.net/~hshi/8149733/webrev3/
Several small naming and formatting issues:
1. "ary1" -> "a" in method declaration

2. Use tmp1 instead of rscratch1 directly
+    ldrw(cnt1, Address(a1, length_offset));
+    ldrw(cnt2, Address(a2, length_offset));
+    eorw(rscratch1, cnt1, cnt2);
+    cbnzw(rscratch1, DONE);

3. Blank after "!"
+  if (! is_string) {
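
For readers without the webrev to hand, the control flow under discussion can be sketched in portable C++. This is only an analogue of the assembler stub, not a translation of it: the function name follows Andrew's "arrays_equals" renaming, but the signature, the `elem_size` parameter and the null handling are my assumptions. It shows the identity check being skipped for strings and the length comparison done with an exclusive-or, which in the stub avoids touching the condition flags.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical C++ analogue of the refactored stub. One routine serves byte
// arrays, char arrays and strings; only the element size differs.
inline bool arrays_equals(const void* a1, std::size_t cnt1,
                          const void* a2, std::size_t cnt2,
                          std::size_t elem_size, bool is_string) {
  if (!is_string) {
    // The same-array shortcut is only done for arrays, not for strings.
    if (a1 == a2) return true;
    if (a1 == nullptr || a2 == nullptr) return false;  // assumed null handling
  }
  // Mirrors "eorw tmp, cnt1, cnt2; cbnzw tmp, DONE": a non-zero XOR means
  // the lengths differ.
  if ((cnt1 ^ cnt2) != 0) return false;
  if (cnt1 == 0) return true;  // nothing to compare
  return std::memcmp(a1, a2, cnt1 * elem_size) == 0;
}
```

A call such as `arrays_equals(x, 3, y, 3, sizeof(char16_t), false)` then covers the char array case, and passing an element size of 1 covers the byte array case.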


The following are results with Aleksey's updated test case (-w 5 -wi 3 -i3 -r
10); the first four groups are the base run with base string lengths 0, 8, 31
and 1024. With the patch, performance doesn't show the same improvement as in
the earlier test. Only the small-length string equals tests still show an
obvious improvement.

grep -A 6 "^Benchmark" base.result
Benchmark                         (baselength)  (size)  Mode  Cnt   Score   Error  Units
JMH_ArrayEquals.byte_equal                   0     500  avgt    9  15.563 ± 0.005  us/op
JMH_ArrayEquals.byte_not_equal               0     500  avgt    9  16.425 ± 0.167  us/op
JMH_ArrayEquals.char_equal                   0     500  avgt    9  15.635 ± 0.294  us/op
JMH_ArrayEquals.char_not_equal               0     500  avgt    9  15.557 ± 0.377  us/op
JMH_ArrayEquals.string_equal                 0     500  avgt    9  22.307 ± 0.063  us/op
JMH_ArrayEquals.string_not_equal             0     500  avgt    9  21.368 ± 0.025  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt   Score   Error  Units
JMH_ArrayEquals.byte_equal                   8     500  avgt    9  16.058 ± 0.012  us/op
JMH_ArrayEquals.byte_not_equal               8     500  avgt    9  16.910 ± 0.574  us/op
JMH_ArrayEquals.char_equal                   8     500  avgt    9  17.094 ± 0.008  us/op
JMH_ArrayEquals.char_not_equal               8     500  avgt    9  17.114 ± 0.156  us/op
JMH_ArrayEquals.string_equal                 8     500  avgt    9  25.033 ± 0.074  us/op
JMH_ArrayEquals.string_not_equal             8     500  avgt    9  24.968 ± 0.244  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt   Score   Error  Units
JMH_ArrayEquals.byte_equal                  31     500  avgt    9  18.821 ± 0.091  us/op
JMH_ArrayEquals.byte_not_equal              31     500  avgt    9  19.763 ± 0.002  us/op
JMH_ArrayEquals.char_equal                  31     500  avgt    9  24.210 ± 0.033  us/op
JMH_ArrayEquals.char_not_equal              31     500  avgt    9  27.400 ± 0.382  us/op
JMH_ArrayEquals.string_equal                31     500  avgt    9  29.825 ± 0.098  us/op
JMH_ArrayEquals.string_not_equal            31     500  avgt    9  31.918 ± 0.100  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                1024     500  avgt    9  188.613 ± 7.386  us/op
JMH_ArrayEquals.byte_not_equal            1024     500  avgt    9  193.399 ± 4.448  us/op
JMH_ArrayEquals.char_equal                1024     500  avgt    9  316.324 ± 9.976  us/op
JMH_ArrayEquals.char_not_equal            1024     500  avgt    9  341.307 ± 1.082  us/op
JMH_ArrayEquals.string_equal              1024     500  avgt    9  324.059 ± 2.352  us/op
JMH_ArrayEquals.string_not_equal          1024     500  avgt    9  326.954 ± 1.121  us/op


grep -A 6 "^Benchmark" opt.result
Benchmark                         (baselength)  (size)  Mode  Cnt   Score   Error  Units
JMH_ArrayEquals.byte_equal                   0     500  avgt    9  15.923 ± 0.132  us/op
JMH_ArrayEquals.byte_not_equal               0     500  avgt    9  15.996 ± 0.336  us/op
JMH_ArrayEquals.char_equal                   0     500  avgt    9  16.001 ± 0.127  us/op
JMH_ArrayEquals.char_not_equal               0     500  avgt    9  15.361 ± 0.004  us/op
JMH_ArrayEquals.string_equal                 0     500  avgt    9  21.083 ± 0.337  us/op
JMH_ArrayEquals.string_not_equal             0     500  avgt    9  19.887 ± 0.479  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt   Score   Error  Units
JMH_ArrayEquals.byte_equal                   8     500  avgt    9  16.574 ± 0.148  us/op
JMH_ArrayEquals.byte_not_equal               8     500  avgt    9  16.596 ± 0.719  us/op
JMH_ArrayEquals.char_equal                   8     500  avgt    9  17.874 ± 0.431  us/op
JMH_ArrayEquals.char_not_equal               8     500  avgt    9  17.831 ± 0.284  us/op
JMH_ArrayEquals.string_equal                 8     500  avgt    9  24.279 ± 0.033  us/op
JMH_ArrayEquals.string_not_equal             8     500  avgt    9  22.850 ± 0.444  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt   Score   Error  Units
JMH_ArrayEquals.byte_equal                  31     500  avgt    9  19.010 ± 0.006  us/op
JMH_ArrayEquals.byte_not_equal              31     500  avgt    9  19.962 ± 0.071  us/op
JMH_ArrayEquals.char_equal                  31     500  avgt    9  25.038 ± 0.108  us/op
JMH_ArrayEquals.char_not_equal              31     500  avgt    9  27.268 ± 0.063  us/op
JMH_ArrayEquals.string_equal                31     500  avgt    9  29.366 ± 0.103  us/op
JMH_ArrayEquals.string_not_equal            31     500  avgt    9  31.357 ± 0.047  us/op
--
Benchmark                         (baselength)  (size)  Mode  Cnt    Score   Error  Units
JMH_ArrayEquals.byte_equal                1024     500  avgt    9  190.034 ± 4.067  us/op
JMH_ArrayEquals.byte_not_equal            1024     500  avgt    9  192.504 ± 4.675  us/op
JMH_ArrayEquals.char_equal                1024     500  avgt    9  313.925 ± 8.476  us/op
JMH_ArrayEquals.char_not_equal            1024     500  avgt    9  342.520 ± 7.915  us/op
JMH_ArrayEquals.string_equal              1024     500  avgt    9  326.392 ± 2.009  us/op
JMH_ArrayEquals.string_not_equal          1024     500  avgt    9  328.526 ± 3.617  us/op


Regards
Hui

On 24 February 2016 at 00:17, Andrew Haley <aph at redhat.com> wrote:

> My version is at
>
> http://cr.openjdk.java.net/~aph/8149733/
>
> The changes I made are:
>
>         I rewrote most of the comments because I couldn't understand
>         them.  I intend no criticism, and I understand that English
>         isn't the language of your birth.  Please tell me if you can
>         understand my comments.
>
>         "generic_array_equals" -> "arrays_equals"
>         Reason: it's not generic, it's only bytes and chars.
>         Also, this is what x86_64 calls the same routine.
>
>         "ary1" -> "a"
>         Reason: "ary" just looks odd.  Also, these are the names in the
>         java code.
>
>         "cmp; br.nz" -> "eor, bnz"
>         Reason: Don't clobber flags for no reason.
>
>         There's no need to check for the same arrays if we're
>         comparing strings.
>
> Otherwise, the code is the same.  I haven't much tested this, but it
> should give the same performance.  Please test it, and tell me if I've
> broken anything.
>
> Thanks,
>
> Andrew.
>

From aleksey.shipilev at oracle.com  Wed Feb 24 14:24:42 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Wed, 24 Feb 2016 17:24:42 +0300
Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor
 char_array_equals/byte_array_equals/string_equals
In-Reply-To: <CAF1YaiB-zDg5JzbYrqo_F5OoRZfRL0nnw27n8X=7DjRT7W=r7g@mail.gmail.com>
References: <56CC2DF8.60806@oracle.com>
	<CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
	<56C70C42.5020309@oracle.com> <tencent_2277B0CC985690AB6C8090A6@qq.com>
	<56CC8630.2020708@redhat.com>
	<CAF1YaiB-zDg5JzbYrqo_F5OoRZfRL0nnw27n8X=7DjRT7W=r7g@mail.gmail.com>
Message-ID: <56CDBD2A.9050002@oracle.com>

On 02/24/2016 04:02 PM, Hui Shi wrote:
> Thanks Andrew! Your comments read much better, and performance doesn't
> change when running the JMHSample_97_ArrayEqual.java
> <http://cr.openjdk.java.net/~hshi/8149733/webrev2/JMHSample_97_ArrayEqual.java> test.
>
> Latest webrev: http://cr.openjdk.java.net/~hshi/8149733/webrev3/

Good.

> The following are results with Aleksey's updated test case (-w 5 -wi 3 -i3 -r
> 10); the first four groups are the base run with base string lengths 0, 8, 31
> and 1024. With the patch, performance doesn't show the same improvement as in
> the earlier test. Only the small-length string equals tests still show an
> obvious improvement.

...and that's okay for refactoring.

Cheers,
-Aleksey


From adinn at redhat.com  Wed Feb 24 16:50:58 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Wed, 24 Feb 2016 16:50:58 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1
 LSE CAS instructions
In-Reply-To: <56CD8CCF.1030404@redhat.com>
References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com>
Message-ID: <56CDDF72.9080304@redhat.com>


On 24/02/16 10:58, Andrew Haley wrote:
> On 22/02/16 20:32, Edward Nevill wrote:
>> http://cr.openjdk.java.net/~enevill/8150394/webrev.0/
>>
>> This adds support for the CAS instructions in armv8.1.
> 
> The C2 code for aarch64_enc_cmpxchg* is missing.
> 
> It's quite tricky to refactor to allow LSE instructions.  I'd add
> a wordsize parameter to the cas instruction, like this:
> 
> #define INSN(NAME, a, r)                                                \
>   void NAME(operand_size sz, Register Rs, Register Rt, Register Rn) {   \
>     assert(Rs != Rn && Rs != Rt, "unpredictable instruction");          \
>     compare_and_swap(Rs, Rt, Rn, sz, 1, a, r);                          \
>   }
>   INSN(cas, 0, 0)
> 
> And this gets rid of a ton of instruction definitions: we only need
> CAS{A,L,AL}.
> 
> Pass the operand size down to MacroAssembler::cmpxchgw:
> 
>   enc_class aarch64_enc_cmpxchgw(memory mem, iRegINoSp oldval, iRegINoSp newval) %{
>     MacroAssembler _masm(&cbuf);
>     guarantee($mem$$index == -1 && $mem$$disp == 0, "impossible encoding");
>     __ cmpxchg(Assembler::word, $mem$$base$$Register, $oldval$$Register,
>                $newval$$Register,
>                &Assembler::ldxrw, &MacroAssembler::cmpw, &Assembler::stlxrw);
>   %}
> 
> void MacroAssembler::cmpxchgw(operand_size sz, Register oldv,
>                               Register newv, Register addr, Register tmp,
> 			      Label &succeed, Label *fail) {
> 
>   if (UseLSE) {
>     ...
> 
> It'll be necessary to pass a memory barrier flag too.

You mean to deal with the difference between aarch64_enc_cmpxchg and
aarch64_enc_cmpxchg_acq? The former uses ldxr and is employed for CAS
when UseBarriersForVolatile is true. The latter uses ldaxr and is
employed when we optimize CAS because UseBarriersForVolatile is false.
We need to use the relevant flavour of casxx inside cmpxchg for each of
these two encodings.

I was also going to recommend using LSE in cmpxchg but I was not sure
exactly how it would need to work. The lock code does not loop when the
stlxr fails (it branches to cas_failed). However the CAS code loops back
to retry the load. If cmpxchg is rewritten to use casal (or casl) does
it not still need to loop?

Also, what does casxx allow us to do to implement the weaker variants of
the new unsafe CAS API other than to include or exclude the acquire? Is
there a variant of CAS operations which could use casa rather than
casal? or even just cas?
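
As a way of framing both questions, here is an analogy in portable C++ rather than in HotSpot assembler. It is only a sketch, and the helper names `cas_via_weak` and `cas_single` are made up for illustration. `compare_exchange_weak` may fail spuriously, much as a store-exclusive can, so a strong CAS built from it has to retry; `compare_exchange_strong` has no such failure mode, which is how I understand a single CAS instruction to behave. The `std::memory_order` argument plays the role of the instruction suffix: relaxed, acquire, release and acq_rel correspond roughly to cas, casa, casl and casal.

```cpp
#include <atomic>

// Strong CAS built from a weak one. The loop mirrors an ldxr/stlxr sequence:
// the store may fail even though the comparison succeeded, so retry.
template <typename T>
bool cas_via_weak(std::atomic<T>& cell, T expected, T desired,
                  std::memory_order order) {
  const T wanted = expected;
  while (!cell.compare_exchange_weak(expected, desired, order)) {
    // On failure `expected` holds the value actually observed.
    if (expected != wanted) {
      return false;  // genuine mismatch: give up
    }
    // Otherwise the failure was spurious: go round again.
  }
  return true;
}

// Single-shot CAS: no retry loop is needed for correctness.
template <typename T>
bool cas_single(std::atomic<T>& cell, T expected, T desired,
                std::memory_order order) {
  return cell.compare_exchange_strong(expected, desired, order);
}
```

Under this analogy a weak CAS entry point would simply drop the retry loop, and the relaxed, acquire and release orderings are where the plain, A and L forms of the instruction could be used.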

regards,


Andrew Dinn
-----------


From aph at redhat.com  Wed Feb 24 16:53:49 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 24 Feb 2016 16:53:49 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1
 LSE CAS instructions
In-Reply-To: <56CDDF72.9080304@redhat.com>
References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com>
	<56CDDF72.9080304@redhat.com>
Message-ID: <56CDE01D.6080702@redhat.com>

On 02/24/2016 04:50 PM, Andrew Dinn wrote:
> You mean to deal with the difference between aarch64_enc_cmpxchg and
> aarch64_enc_cmpxchg_acq?

We now have CAS for non-ordered memory too.

I'm preparing a patch as we speak.  It should become clearer.

Andrew.


From gnu.andrew at redhat.com  Thu Feb 25 00:34:08 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Wed, 24 Feb 2016 19:34:08 -0500 (EST)
Subject: [aarch64-port-dev ] [PATCH] [jdk8u] jvm.cfg parsing broken,
	resulting in broken JDK
In-Reply-To: <262219393.26885369.1456334677753.JavaMail.zimbra@redhat.com>
Message-ID: <1220398795.26965429.1456360448570.JavaMail.zimbra@redhat.com>

Webrev: http://cr.openjdk.java.net/~andrew/aarch64-8/sync/jdk.webrev.02/

This change:

http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/2940c1ead99bd7635

introduced a change to jvm.cfg parsing local to the 8u port. Then, this change:

http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/9399aa7ef558

removed part of that patch, breaking jvm.cfg parsing:

"Error: missing `client' JVM at `/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-6.b15.fc24.aarch64/openjdk/build/jdk8.build/images/j2sdk-image/jre/lib/aarch64/client/libjvm.so'." [0]

This webrev removes the rest of 2940c1ead99bd7635 and replaces
jvm.cfg with the version used in OpenJDK 9, allowing aarch64/jdk8u
to build again.

Ok to push?

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c14
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From gnu.andrew at redhat.com  Thu Feb 25 00:41:12 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Wed, 24 Feb 2016 19:41:12 -0500 (EST)
Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template which
 breaks builds with GCC 6
In-Reply-To: <1527617114.26965440.1456360493395.JavaMail.zimbra@redhat.com>
Message-ID: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com>

Webrev: http://cr.openjdk.java.net/~andrew/aarch64-8/gcc6/webrev.01/

The min template:

template <class T> static const T& min (const T& a, const T& b)

causes the build to fail with GCC 6 [0], where the previous default C++
standard (-std=gnu++98) has to be specified explicitly because the compiler's
default has changed.

/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b15.fc24.aarch64/openjdk/hotspot/src/share/vm/utilities/globalDefinitions.hpp:1113:18: warning: variable templates only available with -std=c++14 or -std=gnu++14
 #define min(a,b) Do_not_use_min_use_MIN2_instead
                  ^
/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b15.fc24.aarch64/openjdk/hotspot/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp:197:36: note: in expansion of macro 'min'
 template <class T> static const T& min (const T& a, const T& b) {
                                    ^~~
/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b15.fc24.aarch64/openjdk/hotspot/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp:198:3: error: expected ';' before 'return'
   return (a > b) ? b : a;
   ^~~~~~
/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b15.fc24.aarch64/openjdk/hotspot/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp:199:1: error: expected declaration before '}' token
 }
 ^

The template appears to be unused and removing it allows the build
to succeed with GCC 6.
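
The failure mode in the log above can be reproduced in miniature. The sketch below is an illustration only: the macro is copied from the compiler output, while `MIN2_sketch` is a made-up name standing in for HotSpot's own MIN2 helper.

```cpp
// Poisoning macro, as shown in the compiler output.
#define min(a, b) Do_not_use_min_use_MIN2_instead

// With that macro in scope, a declaration such as
//   template <class T> static const T& min (const T& a, const T& b) { ... }
// is rewritten by the preprocessor into
//   template <class T> static const T& Do_not_use_min_use_MIN2_instead { ... }
// which is no longer a function template, hence the errors at "return".

// A helper with a different name is untouched by the macro.
template <class T>
static const T& MIN2_sketch(const T& a, const T& b) {
  return (a > b) ? b : a;
}

// Drop the macro again so that it cannot leak out of this sketch.
#undef min
```

Since the original template has no callers, deleting it is the smaller change; the renamed helper is shown only to make the macro expansion visible.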

Ok to push?

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c1
-- 
Andrew :)



From adinn at redhat.com  Thu Feb 25 08:42:36 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 25 Feb 2016 08:42:36 +0000
Subject: [aarch64-port-dev ] [PATCH] [jdk8u] jvm.cfg parsing broken,
 resulting in broken JDK
In-Reply-To: <1220398795.26965429.1456360448570.JavaMail.zimbra@redhat.com>
References: <1220398795.26965429.1456360448570.JavaMail.zimbra@redhat.com>
Message-ID: <56CEBE7C.8070808@redhat.com>

On 25/02/16 00:34, Andrew Hughes wrote:
> Webrev:
> http://cr.openjdk.java.net/~andrew/aarch64-8/sync/jdk.webrev.02/
> 
> This change:
> 
> http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/2940c1ead99bd7635
>
>  introduced a change to jvm.cfg parsing local to the 8u port. Then,
> this change:
> 
> http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/9399aa7ef558
> 
> removed part of that patch, breaking jvm.cfg parsing:
> 
> "Error: missing `client' JVM at
> `/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-6.b15.fc24.aarch64/openjdk/build/jdk8.build/images/j2sdk-image/jre/lib/aarch64/client/libjvm.so'."
> [0]
> 
> This webrev removes the rest of 2940c1ead99bd7635 and replaces 
> jvm.cfg with the version used in OpenJDK 9, allowing aarch64/jdk8u to
> build again.
> 
> Ok to push?
> 
> [0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c14

Yes, this looks right to me. It also explains why my latest build is
having problems when I run without -server. So, please push.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From adinn at redhat.com  Thu Feb 25 08:43:16 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 25 Feb 2016 08:43:16 +0000
Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template
 which breaks builds with GCC 6
In-Reply-To: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com>
References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com>
Message-ID: <56CEBEA4.4050301@redhat.com>

On 25/02/16 00:41, Andrew Hughes wrote:
> Webrev: http://cr.openjdk.java.net/~andrew/aarch64-8/gcc6/webrev.01/
> . . .
> Ok to push?

Yes, please.

regards,


Andrew Dinn
-----------

From aph at redhat.com  Thu Feb 25 09:25:14 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 09:25:14 +0000
Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template
 which breaks builds with GCC 6
In-Reply-To: <56CEBEA4.4050301@redhat.com>
References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com>
	<56CEBEA4.4050301@redhat.com>
Message-ID: <56CEC87A.5070600@redhat.com>

On 25/02/16 08:43, Andrew Dinn wrote:
> Yes, please.

GCC 6 changes should be made to JDK 9 upstream and backported.
JDK8u is not really for development.

Andrew.


From aph at redhat.com  Thu Feb 25 09:58:31 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 09:58:31 +0000
Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template
 which breaks builds with GCC 6
In-Reply-To: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com>
References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com>
Message-ID: <56CED047.4060604@redhat.com>

On 25/02/16 00:41, Andrew Hughes wrote:
> Ok to push?
> 
> [0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c1

No.  This should go to JDK 9 and get backported.  Only JDK8-
specific patches get reviewed on 8.

Andrew.


From edward.nevill at gmail.com  Thu Feb 25 10:06:26 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 25 Feb 2016 10:06:26 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1
	LSE CAS instructions
In-Reply-To: <56CD8CCF.1030404@redhat.com>
References: <1456173130.2735.8.camel@mint>  <56CD8CCF.1030404@redhat.com>
Message-ID: <1456394786.1383.18.camel@mint>

On Wed, 2016-02-24 at 10:58 +0000, Andrew Haley wrote:
> On 22/02/16 20:32, Edward Nevill wrote:

> And this gets rid of a ton of instruction definitions: we only need
> CAS{A,L,AL}.
> 
> Pass the operand size down to MacroAssembler::cmpxchgw:
> 
>   enc_class aarch64_enc_cmpxchgw(memory mem, iRegINoSp oldval, iRegINoSp newval) %{
>     MacroAssembler _masm(&cbuf);
>     guarantee($mem$$index == -1 && $mem$$disp == 0, "impossible encoding");
>     __ cmpxchg(Assembler::word, $mem$$base$$Register, $oldval$$Register,
>                $newval$$Register,
>                &Assembler::ldxrw, &MacroAssembler::cmpw, &Assembler::stlxrw);
>   %}
> 
> void MacroAssembler::cmpxchgw(operand_size sz, Register oldv,
>                               Register newv, Register addr, Register tmp,
> 			      Label &succeed, Label *fail) {
> 
>   if (UseLSE) {
>     ...
> 
> It'll be necessary to pass a memory barrier flag too.

Hi,

Is this something like what you had in mind?

http://cr.openjdk.java.net/~enevill/8150394/webrev.1/

WRT WeakCompareAndSwap I think it would be better if that went in as a separate change as we will have to backport this to jdk8 and doing it as one change means unpicking it later.

Tested with jcstress with and without -XX:+UseLSE

All the best,
Ed.


From aph at redhat.com  Thu Feb 25 10:11:44 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 10:11:44 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
Message-ID: <56CED360.1000000@redhat.com>

The jdk8 development tree at aarch64/jdk8 has been in use for some
time.  I don't think we need it any more.  aarch64/jdk8u tracks
upstream jdk8u far more closely: it differs from upstream only in the
minimum number of places needed to get AArch64 to work.

I'm proposing to close aarch64/jdk8 to all updates.  aarch64/jdk8u
will be used for all commits. jdk8 is interesting from a historical
point of view, so it will be made read only.

The rules for committing to aarch64/jdk8u are:

1.  All non-AArch64-specific patches come from jdk8u.  If you want to
change anything non-AArch64 submit it to jdk8u.

2.  All AArch64-specific patches, if they are relevant to jdk9, must
be submitted there first and back-ported to jdk8u.

Andrew.

From aph at redhat.com  Thu Feb 25 10:16:45 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 10:16:45 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1
	LSE CAS instructions
In-Reply-To: <1456394786.1383.18.camel@mint>
References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com>
	<1456394786.1383.18.camel@mint>
Message-ID: <56CED48D.9090702@redhat.com>

On 25/02/16 10:06, Edward Nevill wrote:
> Is this something like what you had in mind?
> 
> http://cr.openjdk.java.net/~enevill/8150394/webrev.1/

Something like.  I'll integrate what I've got with this and post
it soon.

Andrew.


From edward.nevill at gmail.com  Thu Feb 25 10:25:40 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 25 Feb 2016 10:25:40 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CED360.1000000@redhat.com>
References: <56CED360.1000000@redhat.com>
Message-ID: <1456395940.7333.2.camel@mint>

On Thu, 2016-02-25 at 10:11 +0000, Andrew Haley wrote:
> The jdk8 development tree at aarch64/jdk8 has been in use for some
> time.  I don't think we need it any more.  aarch64/jdk8u tracks
> upstream jdk8u far more closely: it differs from upstream only in the
> minimum number of places needed to get AArch64 to work.

Perhaps to close out the jdk8 tree it might be good to backport the following patch

http://openjdk.linaro.org/releases/1602/jdk8/patches/8592.patch

This is a really nasty, critical bug and people are still building from this tree,

Regards,
Ed.


From aph at redhat.com  Thu Feb 25 10:45:46 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 10:45:46 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <1456395940.7333.2.camel@mint>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
Message-ID: <56CEDB5A.3090200@redhat.com>

On 02/25/2016 10:25 AM, Edward Nevill wrote:
> On Thu, 2016-02-25 at 10:11 +0000, Andrew Haley wrote:
>> The jdk8 development tree at aarch64/jdk8 has been in use for some
>> time.  I don't think we need it any more.  aarch64/jdk8u tracks
>> upstream jdk8u far more closely: it differs from upstream only in the
>> minimum number of places needed to get AArch64 to work.
> 
> Perhaps to close out the jdk8 tree it might be good to backport the following patch
> 
> http://openjdk.linaro.org/releases/1602/jdk8/patches/8592.patch
> 
> This is a really nasty, critical bug and people are still building from this tree,

Andrew Dinn is supposed to be doing this.

Andrew.


From adinn at redhat.com  Thu Feb 25 11:37:00 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 25 Feb 2016 11:37:00 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <1456395940.7333.2.camel@mint>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
Message-ID: <56CEE75C.9080102@redhat.com>


On 25/02/16 10:25, Edward Nevill wrote:
> On Thu, 2016-02-25 at 10:11 +0000, Andrew Haley wrote:
>> The jdk8 development tree at aarch64/jdk8 has been in use for some
>> time.  I don't think we need it any more.  aarch64/jdk8u tracks
>> upstream jdk8u far more closely: it differs from upstream only in the
>> minimum number of places needed to get AArch64 to work.
> 
> Perhaps to close out the jdk8 tree it might be good to backport the following patch
> 
> http://openjdk.linaro.org/releases/1602/jdk8/patches/8592.patch
> 
> This is a really nasty, critical bug and people are still building from this tree,

I'm in the process of backporting all missing jdk8 hotspot patches into
jdk8u. Unfortunately, one of the patches applied cleanly but made the
changes in the wrong place (all those 0x1f and 0x3f substitutions for
shifts were carefully misaligned by hg patch into the wrong cases). I
have just started testing a newly patched build which corrects for this
failure.
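
For background on why the two masks must not be swapped: Java defines an int shift to use only the low five bits of the shift count and a long shift to use the low six bits. The sketch below models that rule in C++; the function names are made up for illustration and this is not the code from the patch.

```cpp
#include <cstdint>

// Java "int << n": only the low 5 bits of n are used (mask 0x1f).
inline int32_t java_ishl(int32_t value, int32_t count) {
  return static_cast<int32_t>(static_cast<uint32_t>(value) << (count & 0x1f));
}

// Java "long << n": only the low 6 bits of n are used (mask 0x3f).
inline int64_t java_lshl(int64_t value, int32_t count) {
  return static_cast<int64_t>(static_cast<uint64_t>(value) << (count & 0x3f));
}
```

A 32-bit shift masked with 0x3f instead would let counts of 32 to 63 through unchanged, so an expression like `1 << 33` would no longer behave as a shift by one.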

regards,


Andrew Dinn
-----------

From adinn at redhat.com  Thu Feb 25 11:53:23 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 25 Feb 2016 11:53:23 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CEE75C.9080102@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com>
Message-ID: <56CEEB33.2060400@redhat.com>

On 25/02/16 11:37, Andrew Dinn wrote:
> I'm in the process of backporting all missing jdk8 hotspot patches into
> jdk8u. Unfortunately, one of the patches applied cleanly but made the
> changes in the wrong place (all those 0x1f and 0x3f substitutions for
> shifts were carefully misaligned by hg patch into the wrong cases). I
> have just started testing a newly patched build which corrects for this
> failure.

I have successfully backported all the jdk8 hotspot patches which were
not yet included in jdk8u. It seemed appropriate for all of them to go
in (including test fixes). The list of patches is included below. It's
basically everything from aarch64/jdk8 hotspot revision 8559 to revision
8590 excluding revision 8577 which had already been cherry-picked for
prior inclusion.

All patches were created by exporting the jdk8u revision. I only had two
problems, applying revisions 8571 (more32bitshifts.patch) and 8576
(largecodecache.patch). Both were caused by the out of order
cherry-pick. I tweaked both cases by hand and diffed with the
corresponding version and with head to make sure that we ended up with
the correct code.

The resulting jdk8u tree builds and runs these basic smoke tests

  java Hello
  javac Hello.java
  netbeans (edit, build and run sample project)

Shall I push these changes now? Or do you want to vet some of the patches?

regards,


Andrew Dinn
-----------

Patches Backported from jdk8 to jdk8u
[listed in order of application]


8134322.patch
revid: 8559
Fix several errors in C2 biased locking implementation


8136524.patch
revid: 8560
test/compiler/runtime/7196199/Test7196199.java fails


8136596.patch
revid: 8561
Remove MemBarRelease when final field's allocation is NoEscape or ArgEscape


8136615.patch
revid: 8562
elide DecodeN when followed by CmpP 0


8136165.patch
revid: 8563
Tidy up compiled native calls


8138641.patch
revid: 8564
Disable C2 peephole by default for aarch64


8138575.patch
revid: 8565
Improve generated code for profile counters


8139674.patch
revid: 8566
guarantee failure in TestOptionsWithRanges.java


8131645.patch
revid:
crash on Cavium when using G1


volcas.patch
revid: 8568
Backport optimization of volatile puts/gets and CAS to use ldar/stlr


8131645-correction.patch
revid: 8569
Fix thinko when backporting 8131645. Table ends up being allocated twice.


8140611.patch
revid: 8570
jtreg test jdk/tools/pack200/UnpackerMemoryTest.java SEGVs


more32bitshifts.patch
revid: 8571
Some 32 bit shifts still being anded with 0x3f instead of 0x1f.

Applied without reporting an error but all the offsets got shifted
causing 0x1f and 0x3f to be substituted in the wrong places. Had to
redo this part of the patch by hand until it looked like the jdk8
version. Also, checked by eyeball that all L instructions used 0x3f
and all I instructions used 0x1f.

8135157.patch
revid: 8572
DMB elimination in AArch64 C2 synchronization implementation


8138966.patch
revid: 8573
Intermittent SEGV running ParallelGC


8143067.patch
revid: 8574
guarantee failure in javac


8143285.patch
revid: 8575
Missing load acquire when checking if ConstantPoolCacheEntry is resolved


largecodecache.patch
revid: 8576
Add support for large code cache

Failed to apply as is because of a conflict with a cherry-picked patch
applied out of order (Remove AArch64-specific code in
generateOptoStub.cpp). The failure relates to 2 problems. The out of
order patch corrected a typo in the comment text used to contextualize
the 2nd hunk in this patch. It also made an incompatible change to the
computation of the instruction count i.e. there is a real merge
conflict here when the patch is applied out of order. So, the
cherry-picked patch must itself have been tweaked to be applied out of
order.


largecodecache-correction.patch
revid: 8578
Fix client build after addition of large code cache support


8146286.patch
revid: 8579
guarantee failures with large code cache sizes on jtreg test
java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java


8143584.patch
revid: 8580
Load constant pool tag and class status with load acquire


8144028.patch
revid: 8581
Use AArch64 bit-test instructions in C2


8144587.patch
revid: 8582
generate vectorized MLA/MLS instructions


8145438.patch
revid: 8583
Guarantee failures since 8144028: Use AArch64 bit-test instructions in C2


8144582.patch
revid: 8584
AArch64 does not generate correct branch profile data


8144201.patch
revid: 8585
jdk/test/com/sun/net/httpserver/Test6a.java fails with
--enable-unlimited-crypto


8146678.patch
revid: 8586
assertion failure: call instruction in an infinite loop


8146843.patch
revid: 8587
add scheduling support for FP and vector instructions


8146709.patch
revid: 8588
Incorrect use of ADRP for byte_map_base


8147805.patch
revid: 8589
C1 segmentation fault due to inline Unsafe.getAndSetObject


8148240.patch
revid: 8590
random infrequent null pointer exceptions in javac

From aph at redhat.com  Thu Feb 25 11:57:41 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 11:57:41 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CEEB33.2060400@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
Message-ID: <56CEEC35.2020101@redhat.com>

On 02/25/2016 11:53 AM, Andrew Dinn wrote:
> Shall I push these changes now? Or do you want to vet some of the patches?

Are any of them outside AArch64-specific directories?

Andrew.


From adinn at redhat.com  Thu Feb 25 12:26:18 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 25 Feb 2016 12:26:18 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CEEC35.2020101@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com>
Message-ID: <56CEF2EA.7090701@redhat.com>


On 25/02/16 11:57, Andrew Haley wrote:
> On 02/25/2016 11:53 AM, Andrew Dinn wrote:
>> Shall I push these changes now? Or do you want to vet some of the patches?
> 
> Are any of them outside AArch64-specific directories?

Yes, in quite a few cases -- but they all appear to be backports of
changes also made upstream. See below for details. n.b. revision ids are
for aarch64/jdk8/hotspot tree.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)


8136596.patch
revid: 8561
8136596: Remove MemBarRelease when final field's allocation is NoEscape
or ArgEscape

this changes 3 files. the first is src/share/vm/opto/callnode.hpp

@@ -894,6 +894,18 @@

   // Convenience for initialization->maybe_set_complete(phase)
   bool maybe_set_complete(PhaseGVN* phase);
+
+  // Return true if allocation doesn't escape thread, its escape state
+  // needs be noEscape or ArgEscape. InitializeNode._does_not_escape
+  // is true when its allocation's escape state is noEscape or
+  // ArgEscape. In case allocation's InitializeNode is NULL, check
+  // AlllocateNode._is_non_escaping flag.
+  // AlllocateNode._is_non_escaping is true when its escape state is
+  // noEscape.
+  bool does_not_escape_thread() {
+    InitializeNode* init = NULL;
+    return _is_non_escaping || (((init = initialization()) != NULL) && init->does_not_escape());
+  }
 };

 //------------------------------AllocateArray---------------------------------


the second is src/share/vm/opto/macro.cpp

@@ -1385,7 +1385,8 @@
     // MemBarStoreStore so that stores that initialize this object
     // can't be reordered with a subsequent store that makes this
     // object accessible by other threads.
-    if (init == NULL || (!init->is_complete_with_arraycopy() && !init->does_not_escape())) {
+    if (!alloc->does_not_escape_thread() &&
+        (init == NULL || !init->is_complete_with_arraycopy())) {
       if (init == NULL || init->req() < InitializeNode::RawStores) {
         // No InitializeNode or no stores captured by zeroing
         // elimination. Simply add the MemBarStoreStore after object

the third is src/share/vm/opto/memnode.cpp

@@ -3065,7 +3065,7 @@
       // Final field stores.
       Node* alloc = AllocateNode::Ideal_allocation(in(MemBarNode::Precedent), phase);
       if ((alloc != NULL) && alloc->is_Allocate() &&
-          alloc->as_Allocate()->_is_non_escaping) {
+          alloc->as_Allocate()->does_not_escape_thread()) {
         // The allocated object does not escape.
         eliminate = true;
       }


8131645.patch
revid:
8131645: crash on Cavium when using G1

this changes src/share/vm/gc_implementation/g1/g1CodeCacheRemSet.cpp

@@ -200,6 +200,9 @@

 void G1CodeRootSet::allocate_small_table() {
   _table = new CodeRootSetTable(SmallSize);
+  CodeRootSetTable* temp = new CodeRootSetTable(SmallSize);
+
+  OrderAccess::release_store_ptr(&_table, temp);
 }

 void CodeRootSetTable::purge_list_append(CodeRootSetTable* table) {


volcas.patch
revid: 8568
Backport optimization of volatile puts/gets and CAS to use ldar/stlr

this changes src/share/vm/opto/graphKit.cpp

@@ -3803,7 +3803,7 @@

   // Smash zero into card
   if( !UseConcMarkSweepGC ) {
-    __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::release);
+    __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::unordered);
   } else {
     // Specialized path for CM store barrier
     __ storeCM(__ ctrl(), card_adr, zero, oop_store, adr_idx, bt, adr_type);


8131645-correction.patch
revid: 8569
Fix thinko when backporting 8131645. Table ends up being allocated twice.

@@ -199,7 +199,6 @@
 }

 void G1CodeRootSet::allocate_small_table() {
-  _table = new CodeRootSetTable(SmallSize);
   CodeRootSetTable* temp = new CodeRootSetTable(SmallSize);

   OrderAccess::release_store_ptr(&_table, temp);
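
The pattern the corrected code follows can be sketched in standard C++
as below. This is an illustration only: Table and RootSet are
hypothetical stand-ins, and std::atomic is used in place of HotSpot's
OrderAccess, which is an assumption about equivalent intent rather than
the actual implementation.

```cpp
#include <atomic>

// Hypothetical stand-in for CodeRootSetTable; not the HotSpot class.
struct Table {
  int size;
  explicit Table(int s) : size(s) {}
};

// Hypothetical stand-in for G1CodeRootSet.
struct RootSet {
  std::atomic<Table*> table{nullptr};

  // Build the table fully in a local, then publish it with a single
  // release store. Assigning the shared field first and then allocating
  // again (the thinko) allocates twice and leaks the first table.
  void allocate_small_table(int small_size) {
    Table* temp = new Table(small_size);
    table.store(temp, std::memory_order_release);
  }

  // Readers pair the release store with an acquire load, so they never
  // observe the pointer without the initialised contents behind it.
  int size_or_zero() const {
    Table* t = table.load(std::memory_order_acquire);
    return t != nullptr ? t->size : 0;
  }

  ~RootSet() { delete table.load(std::memory_order_relaxed); }
};
```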


8138966.patch
revid: 8573
8138966: Intermittent SEGV running ParallelGC

this changes
src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp


@@ -348,7 +348,7 @@
     HeapWord*            _partial_obj_addr;
     region_sz_t          _partial_obj_size;
     region_sz_t volatile _dc_and_los;
-    bool                 _blocks_filled;
+    bool        volatile _blocks_filled;

 #ifdef ASSERT
     size_t               _blocks_filled_count;   // Number of block table fills.
@@ -499,7 +499,9 @@
 inline bool
 ParallelCompactData::RegionData::blocks_filled() const
 {
-  return _blocks_filled;
+  bool result = _blocks_filled;
+  OrderAccess::acquire();
+  return result;
 }

 #ifdef ASSERT
@@ -513,6 +515,7 @@
 inline void
 ParallelCompactData::RegionData::set_blocks_filled()
 {
+  OrderAccess::release();
   _blocks_filled = true;
   // Debug builds count the number of times the table was filled.
   DEBUG_ONLY(Atomic::inc_ptr(&_blocks_filled_count));
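
The shape of this fix (a release barrier before setting the flag, an
acquire barrier after reading it) can be sketched with standard C++
fences. Region and block_table are hypothetical names, and mapping
OrderAccess::release()/acquire() onto std::atomic_thread_fence is an
assumption made for illustration.

```cpp
#include <atomic>

// Hypothetical stand-in for ParallelCompactData::RegionData.
struct Region {
  int block_table = 0;                     // data guarded by the flag
  std::atomic<bool> blocks_filled{false};

  void fill_and_set(int value) {
    block_table = value;
    // Release fence: the write above may not be reordered after the
    // flag store below.
    std::atomic_thread_fence(std::memory_order_release);
    blocks_filled.store(true, std::memory_order_relaxed);
  }

  bool is_filled() const {
    bool result = blocks_filled.load(std::memory_order_relaxed);
    // Acquire fence: reads issued after this point may not be reordered
    // before the flag load above.
    std::atomic_thread_fence(std::memory_order_acquire);
    return result;
  }
};
```

A reader that sees is_filled() return true can then rely on
block_table holding the value written before the flag was set.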

largecodecache.patch
revid: 8576
Add support for large code cache

this makes changes to two files.

firstly src/share/vm/runtime/arguments.cpp

@@ -1137,9 +1137,8 @@
   }
   // Increase the code cache size - tiered compiles a lot more.
   if (FLAG_IS_DEFAULT(ReservedCodeCacheSize)) {
-    FLAG_SET_DEFAULT(ReservedCodeCacheSize, ReservedCodeCacheSize * 5);
-    // The maximum B/BL offset range on AArch64 is 128MB
-    AARCH64_ONLY(FLAG_SET_DEFAULT(ReservedCodeCacheSize, MIN2(ReservedCodeCacheSize, 128*M)));
+    FLAG_SET_DEFAULT(ReservedCodeCacheSize,
+                     MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5));
   }
   if (!UseInterpreter) { // -Xcomp
     Tier3InvokeNotifyFreqLog = 0;
@@ -2476,11 +2475,11 @@
                 "Invalid ReservedCodeCacheSize=%dK. Must be at least %uK.\n", ReservedCodeCacheSize/K,
                 min_code_cache_size/K);
     status = false;
-  } else if (ReservedCodeCacheSize > 2*G) {
-    // Code cache size larger than MAXINT is not supported.
+  } else if (ReservedCodeCacheSize > CODE_CACHE_SIZE_LIMIT) {
+    // Code cache size larger than CODE_CACHE_SIZE_LIMIT is not supported.
     jio_fprintf(defaultStream::error_stream(),
                 "Invalid ReservedCodeCacheSize=%dM. Must be at most %uM.\n", ReservedCodeCacheSize/M,
-                (2*G)/M);
+                CODE_CACHE_SIZE_LIMIT/M);
     status = false;
   }


and also src/share/vm/utilities/globalDefinitions.hpp

@@ -414,6 +414,11 @@
   ProfileRTM = 0x0  // Use RTM with abort ratio calculation
 };

+// The maximum size of the code cache.  Can be overridden by targets.
+#define CODE_CACHE_SIZE_LIMIT (2*G)
+// Allow targets to reduce the default size of the code cache.
+#define CODE_CACHE_DEFAULT_LIMIT CODE_CACHE_SIZE_LIMIT
+
 #ifdef TARGET_ARCH_x86
 # include "globalDefinitions_x86.hpp"
 #endif
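
How the two limits interact can be sketched as follows. This is an
illustration under assumptions: the helper names are hypothetical, and
the 128M per-target cap used in the usage note is taken from the line
the patch removes from arguments.cpp, not from the AArch64 override
itself (which is not shown here).

```cpp
#include <algorithm>
#include <cstdint>

const uint64_t M = 1024 * 1024;
const uint64_t G = 1024 * M;

// Shared hard limit, matching the value in the globalDefinitions.hpp hunk.
const uint64_t kCodeCacheSizeLimit = 2 * G;

// Hypothetical helper: with tiered compilation the default is five
// times the base size, capped by the per-target default limit.
inline uint64_t tiered_default(uint64_t reserved, uint64_t default_limit) {
  return std::min(default_limit, reserved * 5);
}

// Hypothetical helper mirroring the range check in arguments.cpp.
inline bool size_is_valid(uint64_t requested) {
  return requested <= kCodeCacheSizeLimit;
}
```

With an assumed base size of 48M, tiered_default(48 * M, 128 * M)
yields 128M for a target that lowers the default limit, while a target
that keeps the shared 2G limit gets 240M.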


8145438.patch
revid: 8583
8145438: Guarantee failures since 8144028: Use AArch64 bit-test
instructions in C2

this makes a small change to src/share/vm/adlc/formssel.cpp

@@ -1239,7 +1239,8 @@
       !is_short_branch() &&     // Don't match another short branch variant
       reduce_result() != NULL &&
       strcmp(reduce_result(), short_branch->reduce_result()) == 0 &&
-      _matrule->equivalent(AD.globalNames(), short_branch->_matrule)) {
+      _matrule->equivalent(AD.globalNames(), short_branch->_matrule) &&
+      equivalent_predicates(this, short_branch)) {
     // The instructions are equivalent.

     // Now verify that both instructions have the same parameters and

From aph at redhat.com  Thu Feb 25 13:38:32 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 13:38:32 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CEF2EA.7090701@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
Message-ID: <56CF03D8.8070101@redhat.com>

On 02/25/2016 12:26 PM, Andrew Dinn wrote:
> 
> 
> On 25/02/16 11:57, Andrew Haley wrote:
>> On 02/25/2016 11:53 AM, Andrew Dinn wrote:
>>> Shall I push these changes now? Or do you want to vet some of the patches?
>>
>> Are any of them outside AArch64-specific directories?
> 
> Yes, in quite a few cases -- but they all appear to be backports of
> changes also made upstream. See below for details. n.b. revision ids are
> for aarch64/jdk8/hotspot tree.
> 
> 
> 8136596.patch
> revid: 8561
> 8136596: Remove MemBarRelease when final field's allocation is NoEscape
> or ArgEscape

This is a minor optimization, not submitted to jdk8u.  It's also
somewhat risky.  OK if you make it AARCH64_ONLY.

> 8131645.patch
> revid: 8567
> 8131645: crash on Cavium when using G1

OK: serious crasher bug fix.  Ed, has this been submitted to jdk8u?
It should be because it's not AArch64-specific.

> volcas.patch
> revid: 8568
> Backport optimization of volatile puts/gets and CAS to use ldar/stlr

It's probably safe, but there have been significant reworkings in
this area.

Maybe make this AARCH64_ONLY ?  Please have a look at the generated
code when running G1.

> 8131645-correction.patch
> revid: 8569
> Fix thinko when backporting 8131645. Table ends up being allocated twice.

Yes, needed for 8131645.

> 8138966.patch
> revid: 8573
> 8138966: Intermittent SEGV running ParallelGC

Yes, serious crasher bug.  This is already in jdk8u.
http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/110735ab93ec

Please Ask Andrew Hughes about this one: it should already be in.

> largecodecache.patch
> revid: 8576
> Add support for large code cache
> 
> this makes changes to two files.

The non-AArch64-specific parts of this patch are not used by anything
so should not be included.  The rest is OK.

> 8145438.patch
> revid: 8583
> 8145438: Guarantee failures since 8144028: Use AArch64 bit-test
> instructions in C2
> 
> this makes a small change to src/share/vm/adlc/formssel.cpp

This is OK.  It can't be submitted to jdk8u upstream because
it's only needed for AArch64.  Make it AARCH64_ONLY, just for
safety.

Andrew.


From adinn at redhat.com  Thu Feb 25 14:20:51 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 25 Feb 2016 14:20:51 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CF03D8.8070101@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com>
Message-ID: <56CF0DC2.8080104@redhat.com>

On 25/02/16 13:38, Andrew Haley wrote:
 . . .

>> 8136596.patch
>> revid: 8561
>> 8136596: Remove MemBarRelease when final field's allocation is NoEscape
>> or ArgEscape
> 
> This is a minor optimization, not submitted to jdk8u.  It's also
> somewhat risky.  OK if you make it AARCH64_ONLY.

Ok, will do.

>> volcas.patch
>> revid: 8568
>> Backport optimization of volatile puts/gets and CAS to use ldar/stlr
> 
> It's probably safe, but there have been significant reworkings in
> this area.
> 
> Maybe make this AARCH64_ONLY ?  Please have a look at the generated
> code when running G1.

This shared change reverts part of a modification you had previously
made to the shared code in an earlier attempt to implement optimization
of volatile puts on AArch64. So, applying this patch merely restores the
status quo as regards the shared code that is currently in both jdk8 and
upstream jdk8u.

Also, the same reversion was applied and is still present in jdk9. It
has no bearing on any of the reworkings that have happened since the
original patch was added to jdk9 and I don't envisage it having any such
effect.

So. I don't think there is any reason to make this AARCH64_ONLY.

>> 8138966.patch
>> revid: 8573
>> 8138966: Intermittent SEGV running ParallelGC
> 
> Yes, serious crasher bug.  This is already in jdk8u.
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/110735ab93ec
> 
> Please Ask Andrew Hughes about this one: it should already be in.

We already agreed that I would include this as part of my patch.

>> largecodecache.patch
>> revid: 8576
>> Add support for large code cache
>>
>> this makes changes to two files.
> 
> The non-AArch64-specific parts of this patch are not used by anything
> so should not be included.  The rest is OK.

Ok, I will rework this to include only the AArch64-specific code.

>> 8145438.patch
>> revid: 8583
>> 8145438: Guarantee failures since 8144028: Use AArch64 bit-test
>> instructions in C2
>>
>> this makes a small change to src/share/vm/adlc/formssel.cpp
> 
> This is OK.  It can't be submitted to jdk8u upstream because
> it's only needed for AArch64.  Make it AARCH64_ONLY, just for
> safety.

ok, will do.

I'll build and test with the revised patches and report (this time
with a webrev) when I have managed to get it to pass basic smoke tests.

regards,


Andrew Dinn
-----------


From aph at redhat.com  Thu Feb 25 14:25:30 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 14:25:30 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CF0DC2.8080104@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com>
Message-ID: <56CF0EDA.9070808@redhat.com>

On 02/25/2016 02:20 PM, Andrew Dinn wrote:
> On 25/02/16 13:38, Andrew Haley wrote:
>  . . .
> 
>>> volcas.patch
>>> revid: 8568
>>> Backport optimization of volatile puts/gets and CAS to use ldar/stlr
>>
>> It's probably safe, but there have been significant reworkings in
>> this area.
>>
>> Maybe make this AARCH64_ONLY ?  Please have a look at the generated
>> code when running G1.
> 
> This shared change reverts part of a modification you had previously
> made to the shared code in an earlier attempt to implement optimization
> of volatile puts on AArch64. So, applying this patch merely restores the
> status quo as regards the shared code that is currently in both jdk8 and
> upstream jdk8u.
>
> Also, the same reversion was applied and is still present in jdk9. It
> has no bearing on any of the reworkings that have happened since the
> original patch was added to jdk9 and I don't envisage it having any such
> effect.
> 
> So. I don't think there is any reason to make this AARCH64_ONLY.

OK.  I'm a bit mystified by the history of this one, but as long
as in the end we don't diverge from upstream jdk8u I'm happy.

>>> 8138966.patch
>>> revid: 8573
>>> 8138966: Intermittent SEGV running ParallelGC
>>
>> Yes, serious crasher bug.  This is already in jdk8u.
>> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/110735ab93ec
>>
>> Please Ask Andrew Hughes about this one: it should already be in.
> 
> We already agreed that I would include this as part of my patch.

OK.

Andrew.


From aph at redhat.com  Thu Feb 25 14:49:54 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 14:49:54 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1
	LSE CAS instructions
In-Reply-To: <1456394786.1383.18.camel@mint>
References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com>
	<1456394786.1383.18.camel@mint>
Message-ID: <56CF1492.1000400@redhat.com>

Here's what I've got, merging my changes for VarHandles with yours
for LSE CAS:

http://cr.openjdk.java.net/~aph/aarch64-lse-cas/

Please test it.

Thanks,

Andrew.

From aph at redhat.com  Thu Feb 25 15:06:45 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 15:06:45 +0000
Subject: [aarch64-port-dev ] RFR: 8150652: Remove unused code in AArch64
	back end
Message-ID: <56CF1885.1060600@redhat.com>

Defining min in this way breaks compilation if min is already a #define,
which it is on some compilers.

http://cr.openjdk.java.net/~aph/8150652/
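
To illustrate the failure mode only (the webrev simply removes the
unused code, and nothing below is taken from it): when a header defines
min as a function-like macro, a plain function definition named min is
macro-expanded and no longer compiles. The macro here is a stand-in for
whatever the affected toolchain defines, and the helper name smaller is
made up.

```cpp
// Stand-in for a toolchain header that defines min as a function-like
// macro (an assumption for illustration).
#define min(a, b) (((a) < (b)) ? (a) : (b))

// A plain definition such as
//   static inline int min(int a, int b) { ... }
// would now be macro-expanded and fail to compile. Parenthesising the
// name suppresses expansion, because a function-like macro is only
// expanded when its name is followed directly by '('.
static inline int (min)(int a, int b) { return a < b ? a : b; }

// Calls the function, not the macro.
inline int smaller(int a, int b) { return (min)(a, b); }
```

Removing an unused definition avoids the clash altogether, which is
simpler than working around the macro.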

Andrew.

From adinn at redhat.com  Thu Feb 25 17:35:54 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 25 Feb 2016 17:35:54 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CF0EDA.9070808@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com>
	<56CF0EDA.9070808@redhat.com>
Message-ID: <56CF3B7A.9080301@redhat.com>

Ok, here is the webrev for the final set of patches.

  http://cr.openjdk.java.net/~adinn/jdk8u-aarch64-update/webrev.00

I built this on both AArch64 and x86_64 and ran the usual smoke tests
successfully on both

  java Hello
  javac Hello.java
  netbeans (clean, build and run sample project)

Note that the patch to rectify the jvm.cfg config and the code which
processes it is still missing, so javac and netbeans require -J-server
to make them work properly. Andrew Hughes has a patch queued to fix this.

Ok to push the changes in the webrev?

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From edward.nevill at gmail.com  Thu Feb 25 17:44:46 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 25 Feb 2016 17:44:46 +0000
Subject: [aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1
	LSE CAS instructions
In-Reply-To: <56CF1492.1000400@redhat.com>
References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com>
	<1456394786.1383.18.camel@mint>  <56CF1492.1000400@redhat.com>
Message-ID: <1456422286.21810.2.camel@mint>

On Thu, 2016-02-25 at 14:49 +0000, Andrew Haley wrote:
> Here's what I've got, merging my changes for VarHandles with yours
> for LSE CAS:
> 
> http://cr.openjdk.java.net/~aph/aarch64-lse-cas/
> 
> Please test it.
> 

Hi,

Clean run through jcstress with -XX:+UseLSE. Also clean on some partners tests with and without -XX:+UseLSE.

Looks fine,
Ed.


From aph at redhat.com  Thu Feb 25 17:56:38 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 17:56:38 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CF03D8.8070101@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com>
Message-ID: <56CF4056.6010000@redhat.com>

On 02/25/2016 01:38 PM, Andrew Haley wrote:
>> largecodecache.patch
>> revid: 8576
>> Add support for large code cache
>>
>> this makes changes to two files.
> The non-AArch64-specific parts of this patch are not used by anything
> so should not be included.  The rest is OK.

Oh sorry, I messed that up.

CODE_CACHE_DEFAULT_LIMIT *is* used, and the shared bits are OK.

Andrew.


From gnu.andrew at redhat.com  Thu Feb 25 18:55:38 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Thu, 25 Feb 2016 13:55:38 -0500 (EST)
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CF0EDA.9070808@redhat.com>
References: <56CED360.1000000@redhat.com> <56CEE75C.9080102@redhat.com>
	<56CEEB33.2060400@redhat.com> <56CEEC35.2020101@redhat.com>
	<56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com>
	<56CF0DC2.8080104@redhat.com> <56CF0EDA.9070808@redhat.com>
Message-ID: <100280441.402786.1456426538535.JavaMail.zimbra@redhat.com>


----- Original Message -----
> On 02/25/2016 02:20 PM, Andrew Dinn wrote:
> > On 25/02/16 13:38, Andrew Haley wrote:
> >  . . .
> > 
> >>> volcas.patch
> >>> revid: 8568
> >>> Backport optimization of volatile puts/gets and CAS to use ldar/stlr
> >>
> >> It's probably safe, but there have been significant reworkings in
> >> this area.
> >>
> >> Maybe make this AARCH64_ONLY ?  Please have a look at the generated
> >> code when running G1.
> > 
> > This shared change reverts part of a modification you had previously
> > made to the shared code in an earlier attempt to implement optimization
> > of volatile puts on AArch64. So, applying this patch merely restores the
> > status quo as regards the shared code that is currently in both jdk8 and
> > upstream jdk8u.
> >
> > Also, the same reversion was applied and is still present in jdk9. It
> > has no bearing on any of the reworkings that have happened since the
> > original patch was added to jdk9 and I don't envisage it having any such
> > effect.
> > 
> > So. I don't think there is any reason to make this AARCH64_ONLY.
> 
> OK.  I'm a bit mystified by the history of this one, but as long
> as in the end we don't diverge from upstream jdk8u I'm happy.
> 
> >>> 8138966.patch
> >>> revid: 8573
> >>> 8138966: Intermittent SEGV running ParallelGC
> >>
> >> Yes, serious crasher bug.  This is already in jdk8u.
> >> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/110735ab93ec
> >>
> >> Please Ask Andrew Hughes about this one: it should already be in.
> > 
> > We already agreed that I would include this as part of my patch.
> 
> OK.
> 

I believe that was 8147805, not 8138966.

According to https://bugs.openjdk.java.net/browse/JDK-8138966,
this fix is in 8u76, so we'll pick it up when we merge 8u76 in April.
I don't see any problem with including it earlier though, if it's
a serious issue. Mercurial is intelligent enough to see that we've
already applied the change.

> Andrew.
> 
> 

-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From gnu.andrew at redhat.com  Thu Feb 25 18:58:06 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Thu, 25 Feb 2016 13:58:06 -0500 (EST)
Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template
 which breaks builds with GCC 6
In-Reply-To: <56CED047.4060604@redhat.com>
References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com>
	<56CED047.4060604@redhat.com>
Message-ID: <1948009987.404665.1456426686320.JavaMail.zimbra@redhat.com>

----- Original Message -----
> On 25/02/16 00:41, Andrew Hughes wrote:
> > Ok to push?
> > 
> > [0] https://bugzilla.redhat.com/show_bug.cgi?id=1307224#c1
> 
> No.  This should go to JDK 9 and get backported.  Only JDK8-
> specific patches get reviewed on 8.
> 

Ok, the issue there is testing a build of OpenJDK 9 with
GCC 6 on AArch64. I'll look into it.

> Andrew.
> 
> 

-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From christian.thalinger at oracle.com  Thu Feb 25 19:45:57 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 25 Feb 2016 09:45:57 -1000
Subject: [aarch64-port-dev ] RFR: 8150652: Remove unused code in AArch64
	back end
In-Reply-To: <56CF1885.1060600@redhat.com>
References: <56CF1885.1060600@redhat.com>
Message-ID: <7BB0CCFA-51C0-4C04-8BB4-53A3A8B1D25C@oracle.com>

Looks good.

> On Feb 25, 2016, at 5:06 AM, Andrew Haley <aph at redhat.com> wrote:
> 
> Defining min in this way breaks compilation if min is already a #define,
> which it is on some compilers.
> 
> http://cr.openjdk.java.net/~aph/8150652/
> 
> Andrew.


From aph at redhat.com  Thu Feb 25 20:35:17 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Feb 2016 20:35:17 +0000
Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template
 which breaks builds with GCC 6
In-Reply-To: <1948009987.404665.1456426686320.JavaMail.zimbra@redhat.com>
References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com>
	<56CED047.4060604@redhat.com>
	<1948009987.404665.1456426686320.JavaMail.zimbra@redhat.com>
Message-ID: <56CF6585.1010304@redhat.com>

On 02/25/2016 06:58 PM, Andrew Hughes wrote:
> Ok, the issue there is testing a build of OpenJDK 9 with
> GCC 6 on AArch64. I'll look into it.

I already submitted it for review.  It's approved.

Andrew.


From gnu.andrew at redhat.com  Fri Feb 26 04:41:22 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Fri, 26 Feb 2016 04:41:22 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jdk: Completey revert
	2940c1ead99bd7635 and	sync jvm.cfg with OpenJDK 9 version.
Message-ID: <201602260441.u1Q4fMp3021155@aojmv0008.oracle.com>

Changeset: b39ade4fa554
Author:    andrew
Date:      2016-02-26 04:40 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/b39ade4fa554

Completey revert 2940c1ead99bd7635 and sync jvm.cfg with OpenJDK 9 version.

! src/share/bin/java.c
! src/share/bin/java.h
! src/solaris/bin/aarch64/jvm.cfg


From adinn at redhat.com  Fri Feb 26 08:46:44 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 26 Feb 2016 08:46:44 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CF4056.6010000@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF4056.6010000@redhat.com>
Message-ID: <56D010F4.8070903@redhat.com>

On 25/02/16 17:56, Andrew Haley wrote:
> On 02/25/2016 01:38 PM, Andrew Haley wrote:
>>> largecodecache.patch
>>> revid: 8576
>>> Add support for large code cache
>>>
>>> this makes changes to two files.
>> The non-AArch64-specific parts of this patch are not used by anything
>> so should not be included.  The rest is OK.
> 
> Oh sorry, I messed that up.
> 
> CODE_CACHE_DEFAULT_LIMIT *is* used, and the shared bits are OK.

Well, I knew Ed added it for a reason :-) I just assumed you were happy
to stick with the generic 2G limit.

If I redo the patches with this restored is it then ok to push (assuming
it passes basic tests) or do you want to see another webrev?

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From aph at redhat.com  Fri Feb 26 09:14:33 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 26 Feb 2016 09:14:33 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56D010F4.8070903@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF4056.6010000@redhat.com>
	<56D010F4.8070903@redhat.com>
Message-ID: <56D01779.10309@redhat.com>

On 26/02/16 08:46, Andrew Dinn wrote:
> On 25/02/16 17:56, Andrew Haley wrote:
>> On 02/25/2016 01:38 PM, Andrew Haley wrote:
>>>> largecodecache.patch
>>>> revid: 8576
>>>> Add support for large code cache
>>>>
>>>> this makes changes to two files.
>>> The non-AArch64-specific parts of this patch are not used by anything
>>> so should not be included.  The rest is OK.
>>
>> Oh sorry, I messed that up.
>>
>> CODE_CACHE_DEFAULT_LIMIT *is* used, and the shared bits are OK.
> 
> Well, I knew Ed added it for a reason :-) I just assumed you were happy
> to stick with the generic 2G limit.
> 
> If I redo the patches with this restored is it then ok to push (assuming
> it passes basic tests) or do you want to see another webrev?

Yes, I think so.  Then we have to do some fairly serious testing, on
AArch64 and to make sure we haven't broken x86 and others.

Thanks,

Andrew.


From adinn at redhat.com  Fri Feb 26 10:03:08 2016
From: adinn at redhat.com (adinn at redhat.com)
Date: Fri, 26 Feb 2016 10:03:08 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/hotspot: 31 new
	changesets
Message-ID: <201602261003.u1QA3850029592@aojmv0008.oracle.com>

Changeset: 98e4d7b5ff2b
Author:    adinn
Date:      2015-08-26 17:13 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/98e4d7b5ff2b

8134322: AArch64: Fix several errors in C2 biased locking implementation
Summary: Several errors in C2 biased locking require fixing
Reviewed-by: kvn
Contributed-by: hui.shi at linaro.org

! src/cpu/aarch64/vm/aarch64.ad

Changeset: b212413cdaef
Author:    enevill
Date:      2015-09-15 12:59 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/b212413cdaef

8136524: aarch64: test/compiler/runtime/7196199/Test7196199.java fails
Summary: Fix safepoint handlers to save 128 bits on vector poll
Reviewed-by: kvn
Contributed-by: felix.yang at linaro.org

! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp

Changeset: 641806b9d29d
Author:    roland
Date:      2016-02-25 09:43 -0500
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/641806b9d29d

8136596: Remove aarch64: MemBarRelease when final field's allocation is NoEscape or ArgEscape
Summary: elide MemBar when AllocateNode _is_non_escaping
Reviewed-by: kvn, roland
Contributed-by: hui.shi at linaro.org

! src/share/vm/opto/callnode.hpp
! src/share/vm/opto/macro.cpp
! src/share/vm/opto/memnode.cpp

Changeset: caab2df44238
Author:    enevill
Date:      2015-09-16 13:50 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/caab2df44238

8136615: aarch64: elide DecodeN when followed by CmpP 0
Summary: remove DecodeN when comparing a narrow oop with 0
Reviewed-by: kvn, adinn

! src/cpu/aarch64/vm/aarch64.ad

Changeset: e499a51eaef1
Author:    aph
Date:      2015-09-28 16:18 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/e499a51eaef1

8136165: AARCH64: Tidy up compiled native calls
Summary: Do some cleaning
Reviewed-by: roland, kvn, enevill

! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp

Changeset: 82141dab8ec8
Author:    aph
Date:      2015-09-30 13:23 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/82141dab8ec8

8138641: Disable C2 peephole by default for aarch64
Reviewed-by: roland
Contributed-by: felix.yang at linaro.org

! src/cpu/aarch64/vm/c2_globals_aarch64.hpp

Changeset: 8d382116b8d0
Author:    aph
Date:      2015-09-29 17:01 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/8d382116b8d0

8138575: Improve generated code for profile counters
Reviewed-by: kvn

! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp

Changeset: fa47c6788466
Author:    enevill
Date:      2015-10-15 15:33 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/fa47c6788466

8139674: aarch64: guarantee failure in TestOptionsWithRanges.java
Summary: Fix negative overflow in instruction field
Reviewed-by: kvn, roland, adinn, aph

! src/cpu/aarch64/vm/interp_masm_aarch64.cpp

Changeset: c63eff2bbad8
Author:    ecaspole
Date:      2015-09-21 10:36 -0400
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/c63eff2bbad8

8131645: [ARM64] crash on Cavium when using G1
Summary: Add a fence when creating the CodeRootSetTable so the readers do not see invalid memory.
Reviewed-by: aph, tschatzl

! src/share/vm/gc_implementation/g1/g1CodeCacheRemSet.cpp

Changeset: 17b38ca19e23
Author:    adinn
Date:      2015-10-08 11:06 -0400
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/17b38ca19e23

Backport optimization of volatile puts/gets and CAS to use ldar/stlr

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/globals_aarch64.hpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
! src/cpu/aarch64/vm/vm_version_aarch64.cpp
! src/share/vm/opto/graphKit.cpp

Changeset: 4470d1a7ab47
Author:    enevill
Date:      2015-10-28 17:47 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/4470d1a7ab47

Fix thinko when backporting 8131645. Table ends up being allocated twice.

! src/share/vm/gc_implementation/g1/g1CodeCacheRemSet.cpp

Changeset: d29561a8480e
Author:    enevill
Date:      2015-10-28 17:51 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/d29561a8480e

8140611: aarch64: jtreg test jdk/tools/pack200/UnpackerMemoryTest.java SEGVs
Summary: Fix register usage on calling native synchronized methods
Reviewed-by: kvn, adinn

! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp

Changeset: c6c45e635f58
Author:    enevill
Date:      2016-02-25 05:44 -0500
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/c6c45e635f58

Some 32 bit shifts still being anded with 0x3f instead of 0x1f.

! src/cpu/aarch64/vm/aarch64.ad

Changeset: 0d26ab01110c
Author:    aph
Date:      2015-09-08 14:08 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/0d26ab01110c

8135157: DMB elimination in AArch64 C2 synchronization implementation
Summary: Reduce memory barrier usage in C2 fast lock and unlock.
Reviewed-by: kvn
Contributed-by: wei.tang at linaro.org, aph at redhat.com

! src/cpu/aarch64/vm/aarch64.ad

Changeset: 9b02e63a10cf
Author:    aph
Date:      2015-11-04 13:38 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/9b02e63a10cf

8138966: Intermittent SEGV running ParallelGC
Summary: Add necessary memory fences so that the parallel threads are unable to observe partially filled block tables.
Reviewed-by: tschatzl

! src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp

Changeset: 69461ddc6e21
Author:    enevill
Date:      2015-11-19 15:15 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/69461ddc6e21

8143067: aarch64: guarantee failure in javac
Summary: Fix adrp going out of range during code relocation
Reviewed-by: aph, kvn

! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp

Changeset: 2a885c3fa856
Author:    hshi
Date:      2015-11-24 09:02 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/2a885c3fa856

8143285: aarch64: Missing load acquire when checking if ConstantPoolCacheEntry is resolved
Reviewed-by: roland, aph

! src/cpu/aarch64/vm/interp_masm_aarch64.cpp

Changeset: df9fe5e4b123
Author:    enevill
Date:      2016-02-26 03:44 -0500
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/df9fe5e4b123

Add support for large code cache

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/assembler_aarch64.cpp
! src/cpu/aarch64/vm/assembler_aarch64.hpp
! src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp
! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp
! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.hpp
! src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp
! src/cpu/aarch64/vm/compiledIC_aarch64.cpp
! src/cpu/aarch64/vm/globalDefinitions_aarch64.hpp
! src/cpu/aarch64/vm/globals_aarch64.hpp
! src/cpu/aarch64/vm/icBuffer_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
! src/cpu/aarch64/vm/methodHandles_aarch64.cpp
! src/cpu/aarch64/vm/nativeInst_aarch64.cpp
! src/cpu/aarch64/vm/nativeInst_aarch64.hpp
! src/cpu/aarch64/vm/relocInfo_aarch64.cpp
! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp
! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp
! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp
! src/cpu/aarch64/vm/vtableStubs_aarch64.cpp
! src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
! src/share/vm/runtime/arguments.cpp
! src/share/vm/utilities/globalDefinitions.hpp

Changeset: fdd053ca3236
Author:    enevill
Date:      2016-01-05 17:40 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/fdd053ca3236

Fix client build after addition of large code cache support

! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/vm_version_aarch64.cpp

Changeset: ebff70c35409
Author:    enevill
Date:      2015-12-29 16:47 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/ebff70c35409

8146286: aarch64: guarantee failures with large code cache sizes on jtreg test java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java
Summary: patch trampoline calls with special case bl to itself which does not cause guarantee failure
Reviewed-by: aph

! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/relocInfo_aarch64.cpp

Changeset: a8e2e5e2062b
Author:    hshi
Date:      2015-11-26 15:37 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/a8e2e5e2062b

8143584: Load constant pool tag and class status with load acquire
Reviewed-by: roland, aph

! src/cpu/aarch64/vm/templateTable_aarch64.cpp

Changeset: ab88ec370d76
Author:    aph
Date:      2015-11-25 18:13 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/ab88ec370d76

8144028: Use AArch64 bit-test instructions in C2
Reviewed-by: kvn

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
+ test/compiler/codegen/8144028/BitTests.java

Changeset: 30d91d32bb56
Author:    fyang
Date:      2015-12-07 21:23 +0800
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/30d91d32bb56

8144587: aarch64: generate vectorized MLA/MLS instructions
Summary: Add support for MLA/MLS (vector) instructions
Reviewed-by: roland

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/assembler_aarch64.hpp

Changeset: eea9d73ceecb
Author:    aph
Date:      2015-12-15 19:18 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/eea9d73ceecb

8145438: Guarantee failures since 8144028: Use AArch64 bit-test instructions in C2
Summary: Implement short and long versions of bit test instructions.
Reviewed-by: kvn

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp
! src/cpu/aarch64/vm/interp_masm_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
! src/share/vm/adlc/formssel.cpp

Changeset: 797f2d436722
Author:    aph
Date:      2015-12-16 11:35 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/797f2d436722

8144582: AArch64 does not generate correct branch profile data
Reviewed-by: kvn

! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp

Changeset: eed0f8fbe256
Author:    fyang
Date:      2015-12-07 21:14 +0800
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/eed0f8fbe256

8144201: aarch64: jdk/test/com/sun/net/httpserver/Test6a.java fails with --enable-unlimited-crypto
Summary: Fix typo in stub generate_cipherBlockChaining_decryptAESCrypt
Reviewed-by: roland

! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp

Changeset: 33f03ea2712b
Author:    enevill
Date:      2016-01-08 11:39 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/33f03ea2712b

8146678: aarch64: assertion failure: call instruction in an infinite loop
Summary: Remove assertion
Reviewed-by: aph

! src/cpu/aarch64/vm/relocInfo_aarch64.cpp

Changeset: 041044bfded5
Author:    enevill
Date:      2016-01-12 14:55 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/041044bfded5

8146843: aarch64: add scheduling support for FP and vector instructions
Summary: add pipeline classes for FP/vector pipeline
Reviewed-by: aph

! src/cpu/aarch64/vm/aarch64.ad

Changeset: f087cd606b4c
Author:    aph
Date:      2016-01-19 17:52 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/f087cd606b4c

8146709: AArch64: Incorrect use of ADRP for byte_map_base
Reviewed-by: roland

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp

Changeset: d3cd1699e84a
Author:    hshi
Date:      2016-01-20 04:56 -0800
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/d3cd1699e84a

8147805: aarch64: C1 segmentation fault due to inline Unsafe.getAndSetObject
Summary: In Aarch64 LIR_Assembler.atomic_op, keep stored data reference register in decompressed forms as it may be used later
Reviewed-by: aph
Contributed-by: hui.shi at linaro.org, felix.yang at linaro.org

! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp

Changeset: f9b6277551dc
Author:    enevill
Date:      2016-01-26 14:04 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/f9b6277551dc

8148240: aarch64: random infrequent null pointer exceptions in javac
Summary: Disable fp as an allocatable register
Reviewed-by: aph

! src/cpu/aarch64/vm/aarch64.ad


From adinn at redhat.com  Fri Feb 26 10:10:06 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 26 Feb 2016 10:10:06 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56D01779.10309@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF4056.6010000@redhat.com>
	<56D010F4.8070903@redhat.com> <56D01779.10309@redhat.com>
Message-ID: <56D0247E.5060504@redhat.com>

On 26/02/16 09:14, Andrew Haley wrote:
> On 26/02/16 08:46, Andrew Dinn wrote:
>> If I redo the patches with this restored is it then ok to push (assuming
>> it passes basic tests) or do you want to see another webrev?
> 
> Yes, I think so.  Then we have to do some fairly serious testing, on
> AArch64 and to make sure we haven't broken x86 and others.

Ok, pushed.

n.b.

  this tree passed basic smoke tests on both AArch64 and x86_64.

  Andrew Hughes' push to the jdk tree has removed the need to specify
-J-server for javac and netbeans.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From hui.shi at linaro.org  Fri Feb 26 14:28:01 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Fri, 26 Feb 2016 22:28:01 +0800
Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor
	char_array_equals/byte_array_equals/string_equals
In-Reply-To: <56CDBD2A.9050002@oracle.com>
References: <56CC2DF8.60806@oracle.com>
	<CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
	<56C70C42.5020309@oracle.com>
	<tencent_2277B0CC985690AB6C8090A6@qq.com>
	<56CC8630.2020708@redhat.com>
	<CAF1YaiB-zDg5JzbYrqo_F5OoRZfRL0nnw27n8X=7DjRT7W=r7g@mail.gmail.com>
	<56CDBD2A.9050002@oracle.com>
Message-ID: <CAF1YaiCHdRbUZqq_Pt6of0s2d1QMtO8tPQ5t2Ey_7y1bkcaVVg@mail.gmail.com>

Thanks Aleksey!

Can I have another review for this patch?

Regards
Hui

On 24 February 2016 at 22:24, Aleksey Shipilev <aleksey.shipilev at oracle.com>
wrote:

> On 02/24/2016 04:02 PM, Hui Shi wrote:
> > Thanks Andrew! Your comment looks much better, and performance doesn't
> > change when running the JMHSample_97_ArrayEqual.java test
> > <http://cr.openjdk.java.net/%7Ehshi/8149733/webrev2/JMHSample_97_ArrayEqual.java>.
> >
> > latest webrev http://cr.openjdk.java.net/~hshi/8149733/webrev3/
>
> Good.
>
> > The following is the result with Aleksey's updated test case (-w 5 -wi 3
> > -i3 -r 10); the first 4 groups are for the base run with base string
> > lengths 0, 8, 31, 1024. Performance with the patch doesn't show the same
> > improvement as the earlier test. Only the small-length string equality
> > tests still show an obvious improvement.
>
> ...and that's okay for refactoring.
>
> Cheers,
> -Aleksey
>
>
>

From aph at redhat.com  Fri Feb 26 14:37:57 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 26 Feb 2016 14:37:57 +0000
Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor
 char_array_equals/byte_array_equals/string_equals
In-Reply-To: <CAF1YaiCHdRbUZqq_Pt6of0s2d1QMtO8tPQ5t2Ey_7y1bkcaVVg@mail.gmail.com>
References: <56CC2DF8.60806@oracle.com>
	<CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
	<56C70C42.5020309@oracle.com> <tencent_2277B0CC985690AB6C8090A6@qq.com>
	<56CC8630.2020708@redhat.com>
	<CAF1YaiB-zDg5JzbYrqo_F5OoRZfRL0nnw27n8X=7DjRT7W=r7g@mail.gmail.com>
	<56CDBD2A.9050002@oracle.com>
	<CAF1YaiCHdRbUZqq_Pt6of0s2d1QMtO8tPQ5t2Ey_7y1bkcaVVg@mail.gmail.com>
Message-ID: <56D06345.7010309@redhat.com>

On 02/26/2016 02:28 PM, Hui Shi wrote:
> Can I have another review for this patch?

If you insist.  OK.

Andrew.


From gnu.andrew at redhat.com  Fri Feb 26 17:46:06 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Fri, 26 Feb 2016 12:46:06 -0500 (EST)
Subject: [aarch64-port-dev ] [PATCH] [jdk8u] Remove unused template
 which breaks builds with GCC 6
In-Reply-To: <56CF6585.1010304@redhat.com>
References: <1629682443.26965872.1456360872278.JavaMail.zimbra@redhat.com>
	<56CED047.4060604@redhat.com>
	<1948009987.404665.1456426686320.JavaMail.zimbra@redhat.com>
	<56CF6585.1010304@redhat.com>
Message-ID: <440404060.902512.1456508766393.JavaMail.zimbra@redhat.com>


----- Original Message -----
> On 02/25/2016 06:58 PM, Andrew Hughes wrote:
> > Ok, the issue there is testing a build of OpenJDK 9 with
> > GCC 6 on AArch64. I'll look into it.
> 
> I already submitted it for review.  It's approved.
> 
> Andrew.
> 
> 

Yes, I saw this after I replied. Thanks, that saved me a lot of hassle!
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From hui.shi at linaro.org  Sat Feb 27 11:40:29 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Sat, 27 Feb 2016 19:40:29 +0800
Subject: [aarch64-port-dev ] Re: RFR: 8149733: AArch64: refactor
	char_array_equals/byte_array_equals/string_equals
In-Reply-To: <56D06345.7010309@redhat.com>
References: <56CC2DF8.60806@oracle.com>
	<CAF1YaiBJaYRhS8u0dAOJq=dV2Q-g2+UvYSm-SfijCzYnVTKCvQ@mail.gmail.com>
	<56C70C42.5020309@oracle.com>
	<tencent_2277B0CC985690AB6C8090A6@qq.com>
	<56CC8630.2020708@redhat.com>
	<CAF1YaiB-zDg5JzbYrqo_F5OoRZfRL0nnw27n8X=7DjRT7W=r7g@mail.gmail.com>
	<56CDBD2A.9050002@oracle.com>
	<CAF1YaiCHdRbUZqq_Pt6of0s2d1QMtO8tPQ5t2Ey_7y1bkcaVVg@mail.gmail.com>
	<56D06345.7010309@redhat.com>
Message-ID: <CAF1YaiCPLwMxAb_M4=1fp3LSFvH-QD+j65e8YTNsJ=Bth0c6LQ@mail.gmail.com>

Thanks!

On 26 February 2016 at 22:37, Andrew Haley <aph at redhat.com> wrote:

> On 02/26/2016 02:28 PM, Hui Shi wrote:
> > Can I have another review for this patch?
>
> If you insist.  OK.
>
> Andrew.
>
>

From gnu.andrew at redhat.com  Mon Feb 29 06:47:16 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Mon, 29 Feb 2016 06:47:16 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u: Added tag
	aarch64-jdk8u72-b16 for changeset	92af9369869f
Message-ID: <201602290647.u1T6lGdh015594@aojmv0008.oracle.com>

Changeset: 86030362b0c5
Author:    andrew
Date:      2016-02-29 06:45 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/rev/86030362b0c5

Added tag aarch64-jdk8u72-b16 for changeset 92af9369869f

! .hgtags


From gnu.andrew at redhat.com  Mon Feb 29 06:47:24 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Mon, 29 Feb 2016 06:47:24 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/corba: Added tag
	aarch64-jdk8u72-b16 for	changeset d5a3087d60ee
Message-ID: <201602290647.u1T6lOXf015687@aojmv0008.oracle.com>

Changeset: c44425453bfa
Author:    andrew
Date:      2016-02-29 06:45 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/corba/rev/c44425453bfa

Added tag aarch64-jdk8u72-b16 for changeset d5a3087d60ee

! .hgtags


From gnu.andrew at redhat.com  Mon Feb 29 06:47:31 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Mon, 29 Feb 2016 06:47:31 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jaxp: Added tag
	aarch64-jdk8u72-b16 for	changeset 6769d8017f5d
Message-ID: <201602290647.u1T6lVOZ015800@aojmv0008.oracle.com>

Changeset: 99056017b4e3
Author:    andrew
Date:      2016-02-29 06:45 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/jaxp/rev/99056017b4e3

Added tag aarch64-jdk8u72-b16 for changeset 6769d8017f5d

! .hgtags


From gnu.andrew at redhat.com  Mon Feb 29 06:47:38 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Mon, 29 Feb 2016 06:47:38 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jaxws: Added tag
	aarch64-jdk8u72-b16 for	changeset 1ecc978053bf
Message-ID: <201602290647.u1T6lcdn015928@aojmv0008.oracle.com>

Changeset: eeb105ae870d
Author:    andrew
Date:      2016-02-29 06:45 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/jaxws/rev/eeb105ae870d

Added tag aarch64-jdk8u72-b16 for changeset 1ecc978053bf

! .hgtags


From gnu.andrew at redhat.com  Mon Feb 29 06:47:46 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Mon, 29 Feb 2016 06:47:46 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/langtools: Added tag
	aarch64-jdk8u72-b16 for	changeset b63515578554
Message-ID: <201602290647.u1T6lkxl016020@aojmv0008.oracle.com>

Changeset: 109a626b4431
Author:    andrew
Date:      2016-02-29 06:45 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/langtools/rev/109a626b4431

Added tag aarch64-jdk8u72-b16 for changeset b63515578554

! .hgtags


From gnu.andrew at redhat.com  Mon Feb 29 06:47:53 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Mon, 29 Feb 2016 06:47:53 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/hotspot: Added tag
	aarch64-jdk8u72-b16 for	changeset f9b6277551dc
Message-ID: <201602290647.u1T6lrQR016088@aojmv0008.oracle.com>

Changeset: 4c440540c962
Author:    andrew
Date:      2016-02-29 06:45 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/4c440540c962

Added tag aarch64-jdk8u72-b16 for changeset f9b6277551dc

! .hgtags


From gnu.andrew at redhat.com  Mon Feb 29 06:48:00 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Mon, 29 Feb 2016 06:48:00 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/jdk: Added tag
	aarch64-jdk8u72-b16 for	changeset b39ade4fa554
Message-ID: <201602290648.u1T6m0vx016165@aojmv0008.oracle.com>

Changeset: 9331bfc2d798
Author:    andrew
Date:      2016-02-29 06:45 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/jdk/rev/9331bfc2d798

Added tag aarch64-jdk8u72-b16 for changeset b39ade4fa554

! .hgtags


From gnu.andrew at redhat.com  Mon Feb 29 06:48:07 2016
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Mon, 29 Feb 2016 06:48:07 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u/nashorn: Added tag
	aarch64-jdk8u72-b16 for	changeset 8eb47ddad851
Message-ID: <201602290648.u1T6m7n2016248@aojmv0008.oracle.com>

Changeset: af05959dd44b
Author:    andrew
Date:      2016-02-29 06:45 +0000
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8u/nashorn/rev/af05959dd44b

Added tag aarch64-jdk8u72-b16 for changeset 8eb47ddad851

! .hgtags


From adinn at redhat.com  Mon Feb 29 16:04:14 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 29 Feb 2016 16:04:14 +0000
Subject: [aarch64-port-dev ] RFR: AArch64 backport to Icedtea7 of 8146709
Message-ID: <56D46BFE.90509@redhat.com>

I have backported the patch for 8146709 (Incorrect use of ADRP for
byte_map_base) from aarch64/jdk8u to icedtea7-forest. Here's the webrev:

  http://cr.openjdk.java.net/~adinn/8146709/webrev.00/

This appears to be the cause of the problem running specjvm referred to
in bugzilla 1310061

  https://bugzilla.redhat.com/show_bug.cgi?id=1310061

Before the patch I saw the error described in the BZ. After the patch it
no longer occurs.

The patched code passes basic smoke tests. Can I get an ok from someone
else before I push this?

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From adinn at redhat.com  Mon Feb 29 16:05:08 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 29 Feb 2016 16:05:08 +0000
Subject: [aarch64-port-dev ] RFR: AArch64 backport to Icedtea7 of 8146709
In-Reply-To: <56D46BFE.90509@redhat.com>
References: <56D46BFE.90509@redhat.com>
Message-ID: <56D46C34.2040708@redhat.com>


On 29/02/16 16:04, Andrew Dinn wrote:
> I have backported the patch for 8146709 (Incorrect use of ADRP for
> byte_map_base) from aarch64/jdk8u to icedtea7-forest. Here's the webrev:
> 
>   http://cr.openjdk.java.net/~adinn/8146709/webrev.00/
> 
> This appears to be the cause of the problem running specjvm referred to
> in bugzilla 1310061
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=1310061
> 
> Before the patch I saw the error described in the BZ. After the patch it
> no longer occurs.
> 
> The patched code passes basic smoke tests. Can I get an ok from someone
> else before I push this?

Oops, forgot to link the webrev:

 http://cr.openjdk.java.net/~adinn/8146709/webrev.00/

> regards,
> 
> 
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in UK and Wales under Company Registration No. 3798903
> Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
> Argiry (US)

From gnu.andrew at redhat.com  Mon Feb 29 16:14:56 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Mon, 29 Feb 2016 11:14:56 -0500 (EST)
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56CF0DC2.8080104@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com>
Message-ID: <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com>

snip...

> 
> >> largecodecache.patch
> >> revid: 8576
> >> Add support for large code cache
> >>
> >> this makes changes to two files.
> > 
> > The non-AArch64-specific parts of this patch are not used by anything
> > so should not be included.  The rest is OK.
> 
> Ok, I will rework this to include only the AArch64-specific code.
> 

The other changes appear to have been included and the build on s390 is
now broken as a result (mismatch between the types of
CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro).
Can I revert the changes to these two files?
(src/share/vm/utilities/globalDefinitions.hpp & src/share/vm/runtime/arguments.cpp)

Thanks,
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From adinn at redhat.com  Mon Feb 29 16:19:18 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 29 Feb 2016 16:19:18 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com>
	<943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com>
Message-ID: <56D46F86.2090108@redhat.com>

On 29/02/16 16:14, Andrew Hughes wrote:
> snip...
> 
>>
>>>> largecodecache.patch
>>>> revid: 8576
>>>> Add support for large code cache
>>>>
>>>> this makes changes to two files.
>>>
>>> The non-AArch64-specific parts of this patch are not used by anything
>>> so should not be included.  The rest is OK.
>>
>> Ok, I will rework this to include only the AArch64-specific code.
>>
> 
> The other changes appear to have been included and the build on s390 is
> now broken as a result (mismatch between the types of
> CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro).
> Can I revert the changes to these two files?
> (src/share/vm/utilities/globalDefinitions.hpp & src/share/vm/runtime/arguments.cpp)

Andrew Haley followed up with a note explaining that these changes are
needed on AArch64 (which is why I put them back in again). I think it
might be better to fix the breakage to the PPC code. Perhaps Andrew
Haley can comment?

regards,


Andrew Dinn
-----------


From gnu.andrew at redhat.com  Mon Feb 29 16:37:47 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Mon, 29 Feb 2016 11:37:47 -0500 (EST)
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56D46F86.2090108@redhat.com>
References: <56CED360.1000000@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com>
	<943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com>
	<56D46F86.2090108@redhat.com>
Message-ID: <1056751202.1475582.1456763867497.JavaMail.zimbra@redhat.com>

----- Original Message -----
> On 29/02/16 16:14, Andrew Hughes wrote:
> > snip...
> > 
> >>
> >>>> largecodecache.patch
> >>>> revid: 8576
> >>>> Add support for large code cache
> >>>>
> >>>> this makes changes to two files.
> >>>
> >>> The non-AArch64-specific parts of this patch are not used by anything
> >>> so should not be included.  The rest is OK.
> >>
> >> Ok, I will rework this to include only the AArch64-specific code.
> >>
> > 
> > The other changes appear to have been included and the build on s390 is
> > now broken as a result (mismatch between the types of
> > CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro).
> > Can I revert the changes to these two files?
> > (src/share/vm/utilities/globalDefinitions.hpp &
> > src/share/vm/runtime/arguments.cpp)
> 
> Andrew Haley followed up with a note explaining that these changes are
> needed on AArch64 (which is why I put them back in again). I think it
> might be better to fix the breakage to the PPC code. Perhaps Andrew
> Haley can comment?
> 

* s390.

The change was there for AArch64 only before.

-    FLAG_SET_DEFAULT(ReservedCodeCacheSize,
-                     MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5));
+    FLAG_SET_DEFAULT(ReservedCodeCacheSize, ReservedCodeCacheSize * 5);
+    // The maximum B/BL offset range on AArch64 is 128MB
+    AARCH64_ONLY(FLAG_SET_DEFAULT(ReservedCodeCacheSize, MIN2(ReservedCodeCacheSize, 128*M)));

Maybe it's just a case that the first setting needs to be not AArch64
and the AArch64 one needs updating to the new version? e.g.

-    FLAG_SET_DEFAULT(ReservedCodeCacheSize,
-                     MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5));
+    NOT_AARCH64(FLAG_SET_DEFAULT(ReservedCodeCacheSize, ReservedCodeCacheSize * 5));
+    AARCH64_ONLY(FLAG_SET_DEFAULT(ReservedCodeCacheSize,
+				  MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5)));

We then don't change the behaviour on other architectures.
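
As a rough illustration of how the NOT_AARCH64/AARCH64_ONLY pair keeps the
two settings apart, here is a minimal standalone sketch. The SKETCH_* names,
tiered_default() and the 128M limit constant are hypothetical stand-ins for
this example only, not the HotSpot definitions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical switch standing in for the build's AArch64 define.
// Uncomment to follow the AArch64 branch of the sketch.
// #define SKETCH_AARCH64

#ifdef SKETCH_AARCH64
#define SKETCH_AARCH64_ONLY(code) code
#define SKETCH_NOT_AARCH64(code)
#else
#define SKETCH_AARCH64_ONLY(code)
#define SKETCH_NOT_AARCH64(code) code
#endif

const std::size_t M = 1024 * 1024;
// Assumed limit for the sketch (the thread mentions 128 * M on AArch64).
const std::size_t SKETCH_CODE_CACHE_LIMIT = 128 * M;

// Returns the scaled default for a given reserved size.
// Exactly one of the two statements below survives preprocessing.
inline std::size_t tiered_default(std::size_t reserved) {
  std::size_t result = reserved;
  // Other architectures: plain scaling, as in unmodified 8u.
  SKETCH_NOT_AARCH64(result = reserved * 5;)
  // AArch64 only: scale, then clamp to the limit.
  SKETCH_AARCH64_ONLY(result = (reserved * 5 < SKETCH_CODE_CACHE_LIMIT)
                                   ? reserved * 5
                                   : SKETCH_CODE_CACHE_LIMIT;)
  return result;
}
```

With SKETCH_AARCH64 left undefined, tiered_default(48 * M) gives 240 * M;
with it defined, the same call is clamped to 128 * M.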

> regards,
> 
> 
> Andrew Dinn
> -----------
> 
> 

-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From adinn at redhat.com  Mon Feb 29 16:37:52 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 29 Feb 2016 16:37:52 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56D46F86.2090108@redhat.com>
References: <56CED360.1000000@redhat.com> <1456395940.7333.2.camel@mint>
	<56CEE75C.9080102@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com>
	<943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com>
	<56D46F86.2090108@redhat.com>
Message-ID: <56D473E0.7010905@redhat.com>

On 29/02/16 16:19, Andrew Dinn wrote:
> On 29/02/16 16:14, Andrew Hughes wrote:
>> The other changes appear to have been included and the build on s390 is
>> now broken as a result (mismatch between the types of
>> CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro).
>> Can I revert the changes to these two files?
>> (src/share/vm/utilities/globalDefinitions.hpp & src/share/vm/runtime/arguments.cpp)
> 
> Andrew Haley followed up with a note explaining that these changes are
> needed on AArch64 (which is why I put them back in again). I think it
> might be better to fix the breakage to the PPC code. Perhaps Andrew
> Haley can comment?

Sorry, that should have read breakage to the PPC build -- since the
error appears to be in code that is part of the patch.

I am not sure I follow what is happening here. ReservedCodeCacheSize is
an int. CODE_CACHE_DEFAULT_LIMIT defaults to CODE_CACHE_SIZE_LIMIT which
is defined as (2 * G). AArch64 redefines it to (128 * M). G and M are
both of type size_t. Do we just need a cast when we pass the arguments
to min2?

Andrew Hughes, can you provide more details on the error?

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From gnu.andrew at redhat.com  Mon Feb 29 16:44:01 2016
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Mon, 29 Feb 2016 11:44:01 -0500 (EST)
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <56D473E0.7010905@redhat.com>
References: <56CED360.1000000@redhat.com> <56CEEC35.2020101@redhat.com>
	<56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com>
	<56CF0DC2.8080104@redhat.com>
	<943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com>
	<56D46F86.2090108@redhat.com> <56D473E0.7010905@redhat.com>
Message-ID: <968027110.1478028.1456764241065.JavaMail.zimbra@redhat.com>


----- Original Message -----
> On 29/02/16 16:19, Andrew Dinn wrote:
> > On 29/02/16 16:14, Andrew Hughes wrote:
> >> The other changes appear to have been included and the build on s390 is
> >> now broken as a result (mismatch between the types of
> >> CODE_CACHE_DEFAULT_LIMIT and ReservedCodeCacheSize * 5 in the min2 macro).
> >> Can I revert the changes to these two files?
> >> (src/share/vm/utilities/globalDefinitions.hpp &
> >> src/share/vm/runtime/arguments.cpp)
> > 
> > Andrew Haley followed up with a note explaining that these changes are
> > needed on AArch64 (which is why I put them back in again). I think it
> > might be better to fix the breakage to the PPC code. Perhaps Andrew
> > Haley can comment?
> 
> Sorry, that should have read breakage to the PPC build -- since the
> error appears to be in code that is part of the patch.
> 
> I am not sure I follow what is happening here. ReservedCodeCacheSize is
> an int. CODE_CACHE_DEFAULT_LIMIT defaults to CODE_CACHE_SIZE_LIMIT which
> is defined as (2 * G). AArch64 redefines it to (128 * M). G and M are
> both of type size_t. Do we just need a cast when we pass the arguments
> to min2?
> 

In short, yes.

It's s390. On s390, size_t is a long unsigned int, while the right-hand
side, ReservedCodeCacheSize * 5, is a uintx:

/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b16.el7.s390/openjdk/hotspot/src/share/vm/runtime/arguments.cpp:1141:78: error: no matching function for call to 'MIN2(long unsigned int, uintx)'
                      MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5));

We have a lot of cases of this on s390 which we have to fix, and getting
that upstream has been an uphill task, with them throwing rocks down at us
all the time.

We can fix it with a cast, but here I don't think this change should even be
made on non-AArch64, as it's a divergence from 8u. See the suggestion in my
previous e-mail.
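
For anyone following along, here is a minimal sketch of why the compiler rejects the call and what the cast would do. It is not the HotSpot source: MIN2 below is a simplified stand-in that, like the real one, takes two arguments of a single template type, and uintx_like, LIMIT and capped_size are hypothetical names. It assumes a platform such as s390 where uintx resolves to a type other than size_t.

```cpp
#include <cstddef>

// Simplified stand-in for MIN2: both arguments must deduce the same type T.
template <class T> inline T MIN2(T a, T b) { return (a < b) ? a : b; }

// Hypothetical stand-in for a uintx that is a different type from size_t.
typedef unsigned int uintx_like;

const size_t M = 1024 * 1024;
// Hypothetical stand-in for CODE_CACHE_DEFAULT_LIMIT (128 * M on AArch64).
const size_t LIMIT = 128 * M;

uintx_like capped_size(uintx_like reserved) {
  // MIN2(LIMIT, reserved * 5) does not compile when size_t and uintx_like
  // differ: T would have to be both types at once, so deduction fails.
  // Casting one side makes both arguments the same type.
  return MIN2((uintx_like)LIMIT, reserved * 5);
}
```

The cast only addresses the type mismatch; it does not change the question of whether the cap should apply on non-AArch64 at all.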
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From aph at redhat.com  Mon Feb 29 16:46:47 2016
From: aph at redhat.com (Andrew Haley)
Date: Mon, 29 Feb 2016 16:46:47 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <1056751202.1475582.1456763867497.JavaMail.zimbra@redhat.com>
References: <56CED360.1000000@redhat.com> <56CEEB33.2060400@redhat.com>
	<56CEEC35.2020101@redhat.com> <56CEF2EA.7090701@redhat.com>
	<56CF03D8.8070101@redhat.com> <56CF0DC2.8080104@redhat.com>
	<943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com>
	<56D46F86.2090108@redhat.com>
	<1056751202.1475582.1456763867497.JavaMail.zimbra@redhat.com>
Message-ID: <56D475F7.3040901@redhat.com>

On 02/29/2016 04:37 PM, Andrew Hughes wrote:
> e.g.
> 
> -    FLAG_SET_DEFAULT(ReservedCodeCacheSize,
> -                     MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5));
> +    NOT_AARCH64(FLAG_SET_DEFAULT(ReservedCodeCacheSize, ReservedCodeCacheSize * 5));
> +    AARCH64_ONLY(FLAG_SET_DEFAULT(ReservedCodeCacheSize,
> +				  MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5)));
> 
> We then don't change the behaviour on other architectures.

Yes.  That's safer all around.
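
A rough sketch of how the paired macros keep the two behaviours apart, assuming they are defined in the usual conditional-compilation style (the definitions below are written out for illustration rather than copied from the tree, and default_code_cache is a hypothetical helper, not a HotSpot function):

```cpp
// Illustrative definitions: exactly one of the two macros expands to its
// argument, depending on whether AARCH64 is defined at build time.
#ifdef AARCH64
#define AARCH64_ONLY(code) code
#define NOT_AARCH64(code)
#else
#define AARCH64_ONLY(code)
#define NOT_AARCH64(code) code
#endif

// Hypothetical helper mirroring the shape of the suggested change.
unsigned long default_code_cache(unsigned long reserved, unsigned long limit) {
  unsigned long scaled = reserved * 5;
  // Other architectures: unchanged behaviour, no cap applied.
  NOT_AARCH64(return scaled;)
  // AArch64 only: cap the scaled size at the limit.
  AARCH64_ONLY(return scaled < limit ? scaled : limit;)
}
```

Built without AARCH64 defined, the MIN2-style branch is never compiled, so the type mismatch seen on s390 cannot arise there.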

Andrew.


From adinn at redhat.com  Mon Feb 29 16:47:41 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 29 Feb 2016 16:47:41 +0000
Subject: [aarch64-port-dev ] Freeze aarch64/jdk8
In-Reply-To: <968027110.1478028.1456764241065.JavaMail.zimbra@redhat.com>
References: <56CED360.1000000@redhat.com> <56CEEC35.2020101@redhat.com>
	<56CEF2EA.7090701@redhat.com> <56CF03D8.8070101@redhat.com>
	<56CF0DC2.8080104@redhat.com>
	<943810353.1464090.1456762496076.JavaMail.zimbra@redhat.com>
	<56D46F86.2090108@redhat.com> <56D473E0.7010905@redhat.com>
	<968027110.1478028.1456764241065.JavaMail.zimbra@redhat.com>
Message-ID: <56D4762D.6030700@redhat.com>

On 29/02/16 16:44, Andrew Hughes wrote:
> In short, yes.
> 
> It's s390. On s390, size_t is a long unsigned int, while the right-hand
> side, ReservedCodeCacheSize * 5, is a uintx:
> 
> /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.72-5.b16.el7.s390/openjdk/hotspot/src/share/vm/runtime/arguments.cpp:1141:78: error: no matching function for call to 'MIN2(long unsigned int, uintx)'
>                       MIN2(CODE_CACHE_DEFAULT_LIMIT, ReservedCodeCacheSize * 5));
> 
> We have a lot of cases of this on s390 which we have to fix, and getting
> that upstream has been an uphill task, with them throwing rocks down at us
> all the time.
> 
> We can fix it with a cast, but here I don't think this change should even be
> made on non-AArch64, as it's a divergence from 8u. See the suggestion in my
> previous e-mail.

Oh. yeeeurch! Yes, your suggestion looks like a much better idea than
trying to fix it any other way.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From aph at redhat.com  Mon Feb 29 16:52:31 2016
From: aph at redhat.com (Andrew Haley)
Date: Mon, 29 Feb 2016 16:52:31 +0000
Subject: [aarch64-port-dev ] RFR: AArch64 backport to Icedtea7 of 8146709
In-Reply-To: <56D46C34.2040708@redhat.com>
References: <56D46BFE.90509@redhat.com> <56D46C34.2040708@redhat.com>
Message-ID: <56D4774F.5060607@redhat.com>

On 02/29/2016 04:05 PM, Andrew Dinn wrote:
> Oops, forgot to link the webrev:
> 
>  http://cr.openjdk.java.net/~adinn/8146709/webrev.00/

OK.

Thanks,

Andrew.