From adinn at redhat.com  Mon Jun  1 11:00:51 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 01 Jun 2015 12:00:51 +0100
Subject: [aarch64-port-dev ] Fwd: /hg/icedtea7-forest/hotspot: 11 new
	changesets
In-Reply-To: <hg.8e04e38c3fa8.1433156009.4837603537117320258@icedtea.classpath.org>
References: <hg.8e04e38c3fa8.1433156009.4837603537117320258@icedtea.classpath.org>
Message-ID: <556C3B63.10703@redhat.com>

I have backported several outstanding AArch64 JDK8 and JDK9 hotspot
fixes to the icedtea7-forest repo as per the commit message below.

I (successfully) smoke-tested the resulting build on both AArch64 and
x86 (the latter because the changes included a small number of minor
changes to shared code) by running netbeans and specjvm.

regards,


Andrew Dinn
-----------


-------- Forwarded Message --------
Subject: /hg/icedtea7-forest/hotspot: 11 new changesets
Date: Mon, 01 Jun 2015 10:53:29 +0000
From: adinn at icedtea.classpath.org
To: distro-pkg-dev at openjdk.java.net

changeset 8e04e38c3fa8 in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=8e04e38c3fa8
author: aph
date: Thu May 28 10:16:54 2015 -0400

	8069593: Changes to JavaThread::_thread_state must use acquire and release


changeset 548020488783 in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=548020488783
author: aph
date: Tue Mar 03 17:56:33 2015 +0000

	8074349: AARCH64: C2 generates poor code for some byte and character stores
	Summary: Use iRegIorL2I as src input for char and byte stores.
	Reviewed-by: kvn


changeset c8f1b01693ba in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=c8f1b01693ba
author: aph
date: Thu May 28 10:25:15 2015 -0400

	8075045: AARCH64: Stack banging should use store rather than load
	Summary: Change stack bangs to use a store rather than a load


changeset 0bea9494c9cb in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=0bea9494c9cb
author: enevill
date: Wed May 27 15:03:26 2015 +0100

	Add copyright to aarch64_ad.m4


changeset 63723278c978 in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=63723278c978
author: aph
date: Fri May 29 09:31:52 2015 -0400

	8075443: AARCH64: Missed L2I optimizations in C2
	Summary: Use iRegIOrL2I for input operands whenever it makes sense.


changeset 84fa299120ce in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=84fa299120ce
author: aph
date: Fri May 29 09:45:44 2015 -0400

	8075930: AARCH64: Use FP Register in C2
	Summary: modify to allow C2 to allocate FP (R29) as a general register


changeset 137f1ed67e92 in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=137f1ed67e92
author: aph
date: Fri May 29 10:38:35 2015 -0400

	8076467: AARCH64: assertion fail with -XX:+UseG1GC
	Summary: Don't call encoding unless bool is true.


changeset 3f4d11cdefe1 in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=3f4d11cdefe1
author: enevill
date: Fri May 29 11:03:49 2015 -0400

	8079203: AARCH64: Need to cater for different partner implementations
	Summary: Parse /proc/cpuinfo to derive implementation specific info


changeset a74b6b4d0bde in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=a74b6b4d0bde
author: enevill
date: Wed May 27 15:40:40 2015 +0100

	8080586: aarch64: hotspot test compiler/codegen/7184394/TestAESMain.java fails
	Summary: Return correct length in generate_cipherBlockChaining_encryptAESCrypt


changeset 1795197a987f in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=1795197a987f
author: adinn
date: Fri May 29 11:20:12 2015 -0400

	8075324: Costs of memory operands in aarch64.ad are inconsistent
	Summary: Made cost of 'indOffI' consistent to the other memory operands.


changeset c96991560be1 in /hg/icedtea7-forest/hotspot
details:
http://icedtea.classpath.org/hg/icedtea7-forest/hotspot?cmd=changeset;node=c96991560be1
author: thartmann
date: Mon Mar 23 10:15:53 2015 +0100

	8075136: Unnecessary sign extension for byte array access
	Summary: Added C2 matching rules to remove unnecessary sign extension for byte array access.
	Reviewed-by: roland, kvn, aph, adinn


diffstat:

 src/cpu/aarch64/vm/aarch64.ad                      |  251 ++++++++++++--------
 src/cpu/aarch64/vm/aarch64_ad.m4                   |   51 +++-
 src/cpu/aarch64/vm/assembler_aarch64.hpp           |    2 +-
 src/cpu/aarch64/vm/frame_aarch64.inline.hpp        |   12 -
 src/cpu/aarch64/vm/interp_masm_aarch64.hpp         |    2 +
 src/cpu/aarch64/vm/register_aarch64.hpp            |    5 +-
 src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp       |    9 +-
 src/cpu/aarch64/vm/stubGenerator_aarch64.cpp       |    4 +-
 src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp |   11 +-
 src/cpu/aarch64/vm/vm_version_aarch64.cpp          |   40 +++-
 src/cpu/aarch64/vm/vm_version_aarch64.hpp          |   34 ++
 src/cpu/x86/vm/x86_64.ad                           |   61 ++++-
 src/share/vm/runtime/thread.hpp                    |    2 +-
 13 files changed, 341 insertions(+), 143 deletions(-)

diffs (truncated from 1325 to 500 lines):

diff -r c0ca0821c737 -r c96991560be1 src/cpu/aarch64/vm/aarch64.ad
--- a/src/cpu/aarch64/vm/aarch64.ad	Wed Apr 29 12:23:48 2015 -0700
+++ b/src/cpu/aarch64/vm/aarch64.ad	Mon Mar 23 10:15:53 2015 +0100
@@ -447,7 +447,7 @@
     R26
  /* R27, */			// heapbase
  /* R28, */			// thread
- /* R29, */			// fp
+    R29,   			// fp
  /* R30, */			// lr
  /* R31 */			// sp
 );
@@ -481,7 +481,7 @@
     R26, R26_H,
  /* R27, R27_H,	*/		// heapbase
  /* R28, R28_H, */		// thread
- /* R29, R29_H, */		// fp
+    R29, R29_H,   		// fp
  /* R30, R30_H, */		// lr
  /* R31, R31_H */		// sp
 );
@@ -1728,7 +1728,7 @@
 }

 const RegMask Matcher::method_handle_invoke_SP_save_mask() {
-  return RegMask();
+  return FP_REG_mask();
 }

 // helper for encoding java_to_runtime calls on sim
@@ -1811,6 +1811,8 @@
     case INDINDEXSCALEDI2L:
     case INDINDEXSCALEDOFFSETI2LN:
     case INDINDEXSCALEDI2LN:
+    case INDINDEXOFFSETI2L:
+    case INDINDEXOFFSETI2LN:
       scale = Address::sxtw(size);
       break;
     default:
@@ -2126,16 +2128,22 @@
   enc_class aarch64_enc_stlrb(iRegI src, memory mem) %{
     MOV_VOLATILE(as_Register($src$$reg), $mem$$base, $mem$$index,
$mem$$scale, $mem$$disp,
 		 rscratch1, stlrb);
+    if (VM_Version::cpu_cpuFeatures() & VM_Version::CPU_DMB_ATOMICS)
+      __ dmb(__ ISH);
   %}

   enc_class aarch64_enc_stlrh(iRegI src, memory mem) %{
     MOV_VOLATILE(as_Register($src$$reg), $mem$$base, $mem$$index,
$mem$$scale, $mem$$disp,
 		 rscratch1, stlrh);
+    if (VM_Version::cpu_cpuFeatures() & VM_Version::CPU_DMB_ATOMICS)
+      __ dmb(__ ISH);
   %}

   enc_class aarch64_enc_stlrw(iRegI src, memory mem) %{
     MOV_VOLATILE(as_Register($src$$reg), $mem$$base, $mem$$index,
$mem$$scale, $mem$$disp,
 		 rscratch1, stlrw);
+    if (VM_Version::cpu_cpuFeatures() & VM_Version::CPU_DMB_ATOMICS)
+      __ dmb(__ ISH);
   %}


@@ -2226,6 +2234,8 @@
     }
     MOV_VOLATILE(src_reg, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp,
 		 rscratch1, stlr);
+    if (VM_Version::cpu_cpuFeatures() & VM_Version::CPU_DMB_ATOMICS)
+      __ dmb(__ ISH);
   %}

   enc_class aarch64_enc_fstlrs(vRegF src, memory mem) %{
@@ -2236,6 +2246,8 @@
     }
     MOV_VOLATILE(rscratch2, $mem$$base, $mem$$index, $mem$$scale,
$mem$$disp,
 		 rscratch1, stlrw);
+    if (VM_Version::cpu_cpuFeatures() & VM_Version::CPU_DMB_ATOMICS)
+      __ dmb(__ ISH);
   %}

   enc_class aarch64_enc_fstlrd(vRegD src, memory mem) %{
@@ -2246,6 +2258,8 @@
     }
     MOV_VOLATILE(rscratch2, $mem$$base, $mem$$index, $mem$$scale,
$mem$$disp,
 		 rscratch1, stlr);
+    if (VM_Version::cpu_cpuFeatures() & VM_Version::CPU_DMB_ATOMICS)
+      __ dmb(__ ISH);
   %}

   // synchronized read/update encodings
@@ -4285,6 +4299,20 @@
   %}
 %}

+operand indIndexOffsetI2L(iRegP reg, iRegI ireg, immLU12 off)
+%{
+  constraint(ALLOC_IN_RC(ptr_reg));
+  match(AddP (AddP reg (ConvI2L ireg)) off);
+  op_cost(INSN_COST);
+  format %{ "$reg, $ireg, $off I2L" %}
+  interface(MEMORY_INTER) %{
+    base($reg);
+    index($ireg);
+    scale(0x0);
+    disp($off);
+  %}
+%}
+
 operand indIndexScaledOffsetI2L(iRegP reg, iRegI ireg, immIScale scale,
immLU12 off)
 %{
   constraint(ALLOC_IN_RC(ptr_reg));
@@ -4345,7 +4373,7 @@
 %{
   constraint(ALLOC_IN_RC(ptr_reg));
   match(AddP reg off);
-  op_cost(INSN_COST);
+  op_cost(0);
   format %{ "[$reg, $off]" %}
   interface(MEMORY_INTER) %{
     base($reg);
@@ -4415,6 +4443,21 @@
   %}
 %}

+operand indIndexOffsetI2LN(iRegN reg, iRegI ireg, immLU12 off)
+%{
+  predicate(Universe::narrow_oop_shift() == 0);
+  constraint(ALLOC_IN_RC(ptr_reg));
+  match(AddP (AddP (DecodeN reg) (ConvI2L ireg)) off);
+  op_cost(INSN_COST);
+  format %{ "$reg, $ireg, $off I2L\t# narrow" %}
+  interface(MEMORY_INTER) %{
+    base($reg);
+    index($ireg);
+    scale(0x0);
+    disp($off);
+  %}
+%}
+
 operand indIndexScaledOffsetI2LN(iRegN reg, iRegI ireg, immIScale
scale, immLU12 off)
 %{
   predicate(Universe::narrow_oop_shift() == 0);
@@ -4673,8 +4716,8 @@
 // memory is used to define read/write location for load/store
 // instruction defs. we can turn a memory op into an Address

-opclass memory(indirect, indIndexScaledOffsetI,  indIndexScaledOffsetL,
indIndexScaledOffsetI2L, indIndexScaled, indIndexScaledI2L, indIndex,
indOffI, indOffL,
-	       indirectN, indIndexScaledOffsetIN,  indIndexScaledOffsetLN,
indIndexScaledOffsetI2LN, indIndexScaledN, indIndexScaledI2LN,
indIndexN, indOffIN, indOffLN);
+opclass memory(indirect, indIndexScaledOffsetI, indIndexScaledOffsetL,
indIndexOffsetI2L, indIndexScaledOffsetI2L, indIndexScaled,
indIndexScaledI2L, indIndex, indOffI, indOffL,
+               indirectN, indIndexScaledOffsetIN,
indIndexScaledOffsetLN, indIndexOffsetI2LN, indIndexScaledOffsetI2LN,
indIndexScaledN, indIndexScaledI2LN, indIndexN, indOffIN, indOffLN);

 // iRegIorL2I is used for src inputs in rules for 32 bit int (I)
 // operations. it allows the src to be either an iRegI or a (ConvL2I
@@ -5616,7 +5659,7 @@
 %}

 // Store Byte
-instruct storeB(iRegI src, memory mem)
+instruct storeB(iRegIorL2I src, memory mem)
 %{
   match(Set mem (StoreB mem src));

@@ -5642,7 +5685,7 @@
 %}

 // Store Char/Short
-instruct storeC(iRegI src, memory mem)
+instruct storeC(iRegIorL2I src, memory mem)
 %{
   match(Set mem (StoreC mem src));

@@ -5943,7 +5986,7 @@
 //
============================================================================
 // Zero Count Instructions

-instruct countLeadingZerosI(iRegINoSp dst, iRegI src) %{
+instruct countLeadingZerosI(iRegINoSp dst, iRegIorL2I src) %{
   match(Set dst (CountLeadingZerosI src));

   ins_cost(INSN_COST);
@@ -5967,7 +6010,7 @@
   ins_pipe( ialu_reg );
 %}

-instruct countTrailingZerosI(iRegINoSp dst, iRegI src) %{
+instruct countTrailingZerosI(iRegINoSp dst, iRegIorL2I src) %{
   match(Set dst (CountTrailingZerosI src));

   ins_cost(INSN_COST * 2);
@@ -6539,7 +6582,7 @@
 // which throws a ShouldNotHappen. So, we have to provide two flavours
 // of each rule, one for a cmpOp and a second for a cmpOpU (sigh).

-instruct cmovI_reg_reg(cmpOp cmp, rFlagsReg cr, iRegINoSp dst, iRegI
src1, iRegI src2) %{
+instruct cmovI_reg_reg(cmpOp cmp, rFlagsReg cr, iRegINoSp dst,
iRegIorL2I src1, iRegIorL2I src2) %{
   match(Set dst (CMoveI (Binary cmp cr) (Binary src1 src2)));

   ins_cost(INSN_COST * 2);
@@ -6555,7 +6598,7 @@
   ins_pipe(icond_reg_reg);
 %}

-instruct cmovUI_reg_reg(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst, iRegI
src1, iRegI src2) %{
+instruct cmovUI_reg_reg(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst,
iRegIorL2I src1, iRegIorL2I src2) %{
   match(Set dst (CMoveI (Binary cmp cr) (Binary src1 src2)));

   ins_cost(INSN_COST * 2);
@@ -6580,7 +6623,7 @@
 // we ought only to be able to cull one of these variants as the ideal
 // transforms ought always to order the zero consistently (to left/right?)

-instruct cmovI_zero_reg(cmpOp cmp, rFlagsReg cr, iRegINoSp dst, immI0
zero, iRegI src2) %{
+instruct cmovI_zero_reg(cmpOp cmp, rFlagsReg cr, iRegINoSp dst, immI0
zero, iRegIorL2I src2) %{
   match(Set dst (CMoveI (Binary cmp cr) (Binary zero src2)));

   ins_cost(INSN_COST * 2);
@@ -6596,7 +6639,7 @@
   ins_pipe(icond_reg);
 %}

-instruct cmovUI_zero_reg(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst,
immI0 zero, iRegI src2) %{
+instruct cmovUI_zero_reg(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst,
immI0 zero, iRegIorL2I src2) %{
   match(Set dst (CMoveI (Binary cmp cr) (Binary zero src2)));

   ins_cost(INSN_COST * 2);
@@ -6612,7 +6655,7 @@
   ins_pipe(icond_reg);
 %}

-instruct cmovI_reg_zero(cmpOp cmp, rFlagsReg cr, iRegINoSp dst, iRegI
src1, immI0 zero) %{
+instruct cmovI_reg_zero(cmpOp cmp, rFlagsReg cr, iRegINoSp dst,
iRegIorL2I src1, immI0 zero) %{
   match(Set dst (CMoveI (Binary cmp cr) (Binary src1 zero)));

   ins_cost(INSN_COST * 2);
@@ -6628,7 +6671,7 @@
   ins_pipe(icond_reg);
 %}

-instruct cmovUI_reg_zero(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst,
iRegI src1, immI0 zero) %{
+instruct cmovUI_reg_zero(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst,
iRegIorL2I src1, immI0 zero) %{
   match(Set dst (CMoveI (Binary cmp cr) (Binary src1 zero)));

   ins_cost(INSN_COST * 2);
@@ -7080,7 +7123,7 @@
   ins_pipe(ialu_reg_reg);
 %}

-instruct addI_reg_imm(iRegINoSp dst, iRegI src1, immIAddSub src2) %{
+instruct addI_reg_imm(iRegINoSp dst, iRegIorL2I src1, immIAddSub src2) %{
   match(Set dst (AddI src1 src2));

   ins_cost(INSN_COST);
@@ -7127,7 +7170,7 @@
 instruct addP_reg_reg_ext(iRegPNoSp dst, iRegP src1, iRegIorL2I src2) %{
   match(Set dst (AddP src1 (ConvI2L src2)));

-  ins_cost(INSN_COST);
+  ins_cost(1.9 * INSN_COST);
   format %{ "add $dst, $src1, $src2, sxtw\t# ptr" %}

   ins_encode %{
@@ -7473,7 +7516,7 @@
   ins_pipe(idiv_reg_reg);
 %}

-instruct signExtract(iRegINoSp dst, iRegI src, immI_31 div1, immI_31
div2) %{
+instruct signExtract(iRegINoSp dst, iRegIorL2I src, immI_31 div1,
immI_31 div2) %{
   match(Set dst (URShiftI (RShiftI src div1) div2));
   ins_cost(INSN_COST);
   format %{ "lsrw $dst, $src, $div1" %}
@@ -7483,7 +7526,7 @@
   ins_pipe(ialu_reg_shift);
 %}

-instruct div2Round(iRegINoSp dst, iRegI src, immI_31 div1, immI_31 div2) %{
+instruct div2Round(iRegINoSp dst, iRegIorL2I src, immI_31 div1, immI_31
div2) %{
   match(Set dst (AddI src (URShiftI (RShiftI src div1) div2)));
   ins_cost(INSN_COST);
   format %{ "addw $dst, $src, LSR $div1" %}
@@ -7793,7 +7836,7 @@
   ins_pipe(ialu_reg);
 %}
 instruct regI_not_reg(iRegINoSp dst,
-                         iRegI src1, immI_M1 m1,
+                         iRegIorL2I src1, immI_M1 m1,
                          rFlagsReg cr) %{
   match(Set dst (XorI src1 m1));
   ins_cost(INSN_COST);
@@ -7810,10 +7853,27 @@
 %}

 instruct AndI_reg_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2, immI_M1 m1,
+                         iRegIorL2I src1, iRegIorL2I src2, immI_M1 m1,
                          rFlagsReg cr) %{
   match(Set dst (AndI src1 (XorI src2 m1)));
   ins_cost(INSN_COST);
+  format %{ "bicw  $dst, $src1, $src2" %}
+
+  ins_encode %{
+    __ bicw(as_Register($dst$$reg),
+              as_Register($src1$$reg),
+              as_Register($src2$$reg),
+              Assembler::LSL, 0);
+  %}
+
+  ins_pipe(ialu_reg_reg);
+%}
+
+instruct AndL_reg_not_reg(iRegLNoSp dst,
+                         iRegL src1, iRegL src2, immL_M1 m1,
+                         rFlagsReg cr) %{
+  match(Set dst (AndL src1 (XorL src2 m1)));
+  ins_cost(INSN_COST);
   format %{ "bic  $dst, $src1, $src2" %}

   ins_encode %{
@@ -7826,15 +7886,15 @@
   ins_pipe(ialu_reg_reg);
 %}

-instruct AndL_reg_not_reg(iRegLNoSp dst,
-                         iRegL src1, iRegL src2, immL_M1 m1,
+instruct OrI_reg_not_reg(iRegINoSp dst,
+                         iRegIorL2I src1, iRegIorL2I src2, immI_M1 m1,
                          rFlagsReg cr) %{
-  match(Set dst (AndL src1 (XorL src2 m1)));
-  ins_cost(INSN_COST);
-  format %{ "bic  $dst, $src1, $src2" %}
-
-  ins_encode %{
-    __ bic(as_Register($dst$$reg),
+  match(Set dst (OrI src1 (XorI src2 m1)));
+  ins_cost(INSN_COST);
+  format %{ "ornw  $dst, $src1, $src2" %}
+
+  ins_encode %{
+    __ ornw(as_Register($dst$$reg),
               as_Register($src1$$reg),
               as_Register($src2$$reg),
               Assembler::LSL, 0);
@@ -7843,10 +7903,10 @@
   ins_pipe(ialu_reg_reg);
 %}

-instruct OrI_reg_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2, immI_M1 m1,
+instruct OrL_reg_not_reg(iRegLNoSp dst,
+                         iRegL src1, iRegL src2, immL_M1 m1,
                          rFlagsReg cr) %{
-  match(Set dst (OrI src1 (XorI src2 m1)));
+  match(Set dst (OrL src1 (XorL src2 m1)));
   ins_cost(INSN_COST);
   format %{ "orn  $dst, $src1, $src2" %}

@@ -7860,15 +7920,15 @@
   ins_pipe(ialu_reg_reg);
 %}

-instruct OrL_reg_not_reg(iRegLNoSp dst,
-                         iRegL src1, iRegL src2, immL_M1 m1,
+instruct XorI_reg_not_reg(iRegINoSp dst,
+                         iRegIorL2I src1, iRegIorL2I src2, immI_M1 m1,
                          rFlagsReg cr) %{
-  match(Set dst (OrL src1 (XorL src2 m1)));
-  ins_cost(INSN_COST);
-  format %{ "orn  $dst, $src1, $src2" %}
-
-  ins_encode %{
-    __ orn(as_Register($dst$$reg),
+  match(Set dst (XorI m1 (XorI src2 src1)));
+  ins_cost(INSN_COST);
+  format %{ "eonw  $dst, $src1, $src2" %}
+
+  ins_encode %{
+    __ eonw(as_Register($dst$$reg),
               as_Register($src1$$reg),
               as_Register($src2$$reg),
               Assembler::LSL, 0);
@@ -7877,10 +7937,10 @@
   ins_pipe(ialu_reg_reg);
 %}

-instruct XorI_reg_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2, immI_M1 m1,
+instruct XorL_reg_not_reg(iRegLNoSp dst,
+                         iRegL src1, iRegL src2, immL_M1 m1,
                          rFlagsReg cr) %{
-  match(Set dst (XorI m1 (XorI src2 src1)));
+  match(Set dst (XorL m1 (XorL src2 src1)));
   ins_cost(INSN_COST);
   format %{ "eon  $dst, $src1, $src2" %}

@@ -7894,25 +7954,8 @@
   ins_pipe(ialu_reg_reg);
 %}

-instruct XorL_reg_not_reg(iRegLNoSp dst,
-                         iRegL src1, iRegL src2, immL_M1 m1,
-                         rFlagsReg cr) %{
-  match(Set dst (XorL m1 (XorL src2 src1)));
-  ins_cost(INSN_COST);
-  format %{ "eon  $dst, $src1, $src2" %}
-
-  ins_encode %{
-    __ eon(as_Register($dst$$reg),
-              as_Register($src1$$reg),
-              as_Register($src2$$reg),
-              Assembler::LSL, 0);
-  %}
-
-  ins_pipe(ialu_reg_reg);
-%}
-
 instruct AndI_reg_URShift_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, immI_M1 src4, rFlagsReg cr) %{
   match(Set dst (AndI src1 (XorI(URShiftI src2 src3) src4)));
   ins_cost(1.9 * INSN_COST);
@@ -7948,7 +7991,7 @@
 %}

 instruct AndI_reg_RShift_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, immI_M1 src4, rFlagsReg cr) %{
   match(Set dst (AndI src1 (XorI(RShiftI src2 src3) src4)));
   ins_cost(1.9 * INSN_COST);
@@ -7984,7 +8027,7 @@
 %}

 instruct AndI_reg_LShift_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, immI_M1 src4, rFlagsReg cr) %{
   match(Set dst (AndI src1 (XorI(LShiftI src2 src3) src4)));
   ins_cost(1.9 * INSN_COST);
@@ -8020,7 +8063,7 @@
 %}

 instruct XorI_reg_URShift_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, immI_M1 src4, rFlagsReg cr) %{
   match(Set dst (XorI src4 (XorI(URShiftI src2 src3) src1)));
   ins_cost(1.9 * INSN_COST);
@@ -8056,7 +8099,7 @@
 %}

 instruct XorI_reg_RShift_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, immI_M1 src4, rFlagsReg cr) %{
   match(Set dst (XorI src4 (XorI(RShiftI src2 src3) src1)));
   ins_cost(1.9 * INSN_COST);
@@ -8092,7 +8135,7 @@
 %}

 instruct XorI_reg_LShift_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, immI_M1 src4, rFlagsReg cr) %{
   match(Set dst (XorI src4 (XorI(LShiftI src2 src3) src1)));
   ins_cost(1.9 * INSN_COST);
@@ -8128,7 +8171,7 @@
 %}

 instruct OrI_reg_URShift_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, immI_M1 src4, rFlagsReg cr) %{
   match(Set dst (OrI src1 (XorI(URShiftI src2 src3) src4)));
   ins_cost(1.9 * INSN_COST);
@@ -8164,7 +8207,7 @@
 %}

 instruct OrI_reg_RShift_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, immI_M1 src4, rFlagsReg cr) %{
   match(Set dst (OrI src1 (XorI(RShiftI src2 src3) src4)));
   ins_cost(1.9 * INSN_COST);
@@ -8200,7 +8243,7 @@
 %}

 instruct OrI_reg_LShift_not_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, immI_M1 src4, rFlagsReg cr) %{
   match(Set dst (OrI src1 (XorI(LShiftI src2 src3) src4)));
   ins_cost(1.9 * INSN_COST);
@@ -8236,7 +8279,7 @@
 %}

 instruct AndI_reg_URShift_reg(iRegINoSp dst,
-                         iRegI src1, iRegI src2,
+                         iRegIorL2I src1, iRegIorL2I src2,
                          immI src3, rFlagsReg cr) %{
   match(Set dst (AndI src1 (URShiftI src2 src3)));


From Alexander.Alexeev at caviumnetworks.com  Mon Jun  1 14:23:24 2015
From: Alexander.Alexeev at caviumnetworks.com (Alexeev, Alexander)
Date: Mon, 1 Jun 2015 14:23:24 +0000
Subject: [aarch64-port-dev ] UseSHA flag is a bit inconsistent on AArch64
Message-ID: <SN1PR07MB14724F20D23D6FA81E7E3D1999B60@SN1PR07MB1472.namprd07.prod.outlook.com>

Hello

I noticed a couple of inconsistencies related to the UseSHA flag on AArch64.


1.       The comment for this flag in globals.hpp says "Control whether SHA instructions can be used on SPARC".

2.       Two rules for the flag defined in the test suite are broken (although the rules are defined for SPARC):

a.       The UseSHA option should be disabled when all UseSHA*Intrinsics are disabled.

b.      The UseSHA option should be disabled when all UseSHA*Intrinsics are disabled, even if the UseSHA flag is passed to the JVM.

Proposed fixes for both issues are below.


--- CUT HERE ---

--- old/src/cpu/aarch64/vm/vm_version_aarch64.cpp   2015-06-01 14:19:20.854027000 +0000
+++ new/src/cpu/aarch64/vm/vm_version_aarch64.cpp   2015-06-01 14:19:20.664027000 +0000
@@ -228,6 +228,9 @@
       warning("SHA512 instruction (for SHA-384 and SHA-512) is not available on this CPU.");
       FLAG_SET_DEFAULT(UseSHA512Intrinsics, false);
     }
+    if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) {
+      FLAG_SET_DEFAULT(UseSHA, false);
+    }
   }

   // This machine allows unaligned memory accesses
--- old/src/share/vm/runtime/globals.hpp    2015-06-01 14:19:21.594027000 +0000
+++ new/src/share/vm/runtime/globals.hpp    2015-06-01 14:19:21.374027000 +0000
@@ -639,7 +639,8 @@
           "Control whether AES instructions can be used on x86/x64")        \
                                                                             \
   product(bool, UseSHA, false,                                              \
-          "Control whether SHA instructions can be used on SPARC")          \
+          "Control whether SHA instructions can be used "                   \
+          "on SPARC and AArch64")                                           \
                                                                             \
   product(size_t, LargePageSizeInBytes, 0,                                  \
           "Large page size (0 to let VM choose the page size)")             \

--- CUT HERE ---
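
The rule that the vm_version_aarch64.cpp hunk enforces can be sketched in isolation as follows. This is only an illustrative model: the ShaFlags struct and normalize_sha_flags function are hypothetical stand-ins, whereas the real code operates on JVM globals through FLAG_SET_DEFAULT.

```cpp
#include <cassert>

// Hypothetical stand-in for the four JVM flags involved.
struct ShaFlags {
    bool UseSHA;
    bool UseSHA1Intrinsics;
    bool UseSHA256Intrinsics;
    bool UseSHA512Intrinsics;
};

// If every UseSHA*Intrinsics flag is off, UseSHA is switched off as well,
// regardless of the value it had before. Otherwise it is left untouched.
inline ShaFlags normalize_sha_flags(ShaFlags f) {
    if (!(f.UseSHA1Intrinsics || f.UseSHA256Intrinsics || f.UseSHA512Intrinsics)) {
        f.UseSHA = false;
    }
    return f;
}
```

With UseSHA requested but all three intrinsics disabled, the normalized result has UseSHA off, which is what rules (a) and (b) expect; with at least one intrinsic enabled, UseSHA keeps its value.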

Wbr,
Alexander

From Alexander.Alexeev at caviumnetworks.com  Mon Jun  1 18:38:53 2015
From: Alexander.Alexeev at caviumnetworks.com (Alexeev, Alexander)
Date: Mon, 1 Jun 2015 18:38:53 +0000
Subject: [aarch64-port-dev ] RFR: AARCH64: Fix for failed sha tests
	(compiler/intrinsics/sha)
Message-ID: <SN1PR07MB14725973FE3CE770944C89CF99B60@SN1PR07MB1472.namprd07.prod.outlook.com>

Hello

This patch [1] resolves issues with the SHA-related tests (hotspot/test/compiler/intrinsics/sha) that fail on aarch64. The names of the tests are listed below.
The problem is that the tests have no cases for aarch64, so the architecture falls into the "OtherCPU" category, which is assumed not to support SHA intrinsics.

Adding test cases for AArch64 solves that.
For the "supported" and "specific unsupported" test cases, the implementation simply reuses the existing SPARC versions, renaming the classes and adding predicates for aarch64.
For the "unsupported" version, a dedicated aarch64 test case is created, since the SPARC version doesn't check the warnings for the -XX:+SHA... options, and aarch64 can do that, and does.
"supported" means the CPU supports SHA instructions and the intrinsics are available.
"unsupported" means the CPU doesn't have SHA support.

Failed tests:
compiler/intrinsics/sha/cli/TestUseSHA1IntrinsicsOptionOnUnsupportedCPU.java
compiler/intrinsics/sha/cli/TestUseSHA256IntrinsicsOptionOnUnsupportedCPU.java
compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java
compiler/intrinsics/sha/sanity/TestSHA1Intrinsics.java
compiler/intrinsics/sha/sanity/TestSHA1MultiBlockIntrinsics.java
compiler/intrinsics/sha/sanity/TestSHA256Intrinsics.java
compiler/intrinsics/sha/sanity/TestSHA256MultiBlockIntrinsics.java

After applying the proposed patch, a new test starts failing, and it will keep failing until the inconsistency in the UseSHA flag behavior is resolved. Details, with a fix, were sent earlier to the aarch64-port mailing list in a separate message.
Failing test: compiler/intrinsics/sha/cli/TestUseSHAOptionOnSupportedCPU.java

Jtreg hotspot tests
Before:
Test results: passed: 818; failed: 34; error: 3
After:
Test results: passed: 824; failed: 28; error: 3

The tests weren't executed on SPARC since that architecture is unavailable.


Please, review and sponsor if approved.

[1] http://www.googledrive.com/host/0B5VQvD5uJjDQfmt3OXBJWHFTc2RQN0RsNUpNZ1ZDUFdPSno3VW12eUJnR0Y0TWszNHpaSEU


From edward.nevill at gmail.com  Mon Jun  1 19:06:03 2015
From: edward.nevill at gmail.com (edward.nevill at gmail.com)
Date: Mon, 01 Jun 2015 19:06:03 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: 2 new changesets
Message-ID: <201506011906.t51J63KH029334@aojmv0008.oracle.com>

Changeset: 685e10e5d557
Author:    thartmann
Date:      2015-03-23 10:13 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/685e10e5d557

8075324: Costs of memory operands in aarch64.ad are inconsistent
Summary: Made cost of 'indOffI' consistent to the other memory operands.
Reviewed-by: roland, aph, adinn

! src/cpu/aarch64/vm/aarch64.ad

Changeset: 471988878307
Author:    thartmann
Date:      2015-03-23 10:15 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/471988878307

8075136: Unnecessary sign extension for byte array access
Summary: Added C2 matching rules to remove unnecessary sign extension for byte array access.
Reviewed-by: roland, kvn, aph, adinn

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/x86/vm/x86_64.ad


From edward.nevill at linaro.org  Tue Jun  2 09:26:35 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Tue, 02 Jun 2015 10:26:35 +0100
Subject: [aarch64-port-dev ] RFR: 8081669: aarch64: JTreg TestStable tests
	failing
Message-ID: <1433237195.16770.13.camel@mylittlepony.linaroharston>

Hi,

The following webrev

http://cr.openjdk.java.net/~enevill/8081669/webrev.00/

fixes a number of TestStable tests.

This patch was contributed by alexander.alexeev at caviumnetworks.com

The following are the test failures that are fixed by this patch

compiler/stable/TestStableByte.java
compiler/stable/TestStableBoolean.java
compiler/stable/TestStableChar.java
compiler/stable/TestStableFloat.java
compiler/stable/TestStableObject.java
compiler/stable/TestStableDouble.java
compiler/stable/TestStableInt.java
compiler/stable/TestStableLong.java
compiler/stable/TestStableShort.java

The problem is that the method 'get' in StableConfiguration is supposed to return true if the method is server compiled, false otherwise.

On aarch64 it is always returning true, even when the method is client compiled. The reason for this is that aarch64 differs from other ports in that it always deopts on patching.

This means that the method 'get' deopts immediately when compiled with -Xcomp because it hits an unresolved method call. This means that the method is now executing in the interpreter.

When the method 'get' is executing in the interpreter, it uses the value of java.vm.name to determine whether the method would be server compiled if it were to be compiled. This ends up returning true on aarch64, because it is a server compiler.

However in the case where we force it not to server compile by using -XX:+TieredCompilation and -XX:TieredStopAtLevel=1 (as in the TestStable tests) this will be incorrect.

The solution is to introduce a simple (null) method 'get1' which will never be deopted (because there is never anything to patch) and to use this as the basis for deciding whether we are server compiling or not.

This is more fully explained in the inline comment in the patch.
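
As a rough model of the decision described above (not the actual Java test code), the check can be thought of as a function of the probe method's compilation level and the VM name. The function name is hypothetical; the level numbers follow HotSpot's tiered scheme (0 = interpreter, 1 to 3 = client compiler, 4 = server compiler).

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the "are we server compiling?" check.
// probe_level: compilation level of the probe method when it is queried.
// vm_name:     value of the java.vm.name property.
inline bool is_server_compiling(int probe_level, const std::string& vm_name) {
    if (probe_level > 0) {
        // The probe is running compiled code: trust its actual tier.
        return probe_level == 4;
    }
    // The probe is running in the interpreter: fall back to the VM name.
    return vm_name.find("Server") != std::string::npos;
}
```

Under this model, a probe that has been deoptimized reports level 0 and falls back to the VM name, so it answers true on a server VM even when -XX:TieredStopAtLevel=1 is in force. A probe that stays compiled at level 1 answers false, which is the behaviour the fix relies on.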

Please review and if appropriate I will push.

All the best,
Ed.


From aph at redhat.com  Tue Jun  2 10:45:02 2015
From: aph at redhat.com (Andrew Haley)
Date: Tue, 02 Jun 2015 11:45:02 +0100
Subject: [aarch64-port-dev ] RFR: 8081669: aarch64: JTreg TestStable
 tests failing
In-Reply-To: <1433237195.16770.13.camel@mylittlepony.linaroharston>
References: <1433237195.16770.13.camel@mylittlepony.linaroharston>
Message-ID: <556D892E.50805@redhat.com>

On 06/02/2015 10:26 AM, Edward Nevill wrote:
> Please review and if appropriate I will push.

That sounds correct to me, but I need a JDK9 reviewer.

Andrew.


From aph at redhat.com  Tue Jun  2 12:56:41 2015
From: aph at redhat.com (Andrew Haley)
Date: Tue, 02 Jun 2015 13:56:41 +0100
Subject: [aarch64-port-dev ] RFR: 8079565: aarch64: Add vectorization
 support for aarch64
In-Reply-To: <1432658017.17486.32.camel@mylittlepony.linaroharston>
References: <1432658017.17486.32.camel@mylittlepony.linaroharston>
Message-ID: <556DA809.9080305@redhat.com>

On 05/26/2015 05:33 PM, Edward Nevill wrote:

> The following webrev
> 
> http://cr.openjdk.java.net/~enevill/8079565/webrev.00/

Looks like a great start, thanks.

Can we have a JDK9 reviewer, please?

Thanks,
Andrew.


From vladimir.x.ivanov at oracle.com  Tue Jun  2 14:17:17 2015
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 02 Jun 2015 17:17:17 +0300
Subject: [aarch64-port-dev ] RFR: 8081669: aarch64: JTreg TestStable
	tests failing
In-Reply-To: <1433237195.16770.13.camel@mylittlepony.linaroharston>
References: <1433237195.16770.13.camel@mylittlepony.linaroharston>
Message-ID: <556DBAED.5030701@oracle.com>

Looks good.

Best regards,
Vladimir Ivanov

On 6/2/15 12:26 PM, Edward Nevill wrote:
> Hi,
>
> The following webrev
>
> http://cr.openjdk.java.net/~enevill/8081669/webrev.00/
>
> fixes a number of TestStable tests.
>
> This patch was contributed by alexander.alexeev at caviumnetworks.com
>
> The following are the test failures that are fixed by this patch
>
> compiler/stable/TestStableByte.java
> compiler/stable/TestStableBoolean.java
> compiler/stable/TestStableChar.java
> compiler/stable/TestStableFloat.java
> compiler/stable/TestStableObject.java
> compiler/stable/TestStableDouble.java
> compiler/stable/TestStableInt.java
> compiler/stable/TestStableLong.java
> compiler/stable/TestStableShort.java
>
> The problem is that the method 'get' in StableConfiguration is supposed to return true if the method is server compiled, false otherwise.
>
> On aarch64 it is always returning true, even when the method is client compiled. The reason for this is that aarch64 differs from other ports in that it always deopts on patching.
>
> This means that the method 'get' deopts immediately when compiled with -Xcomp because it hits an unresolved method call. This means that the method is now executing in the interpreter.
>
> When the method 'get' is executing in the interpreter, it uses the value of java.vm.name to determine whether the method would be server compiled if it were to be compiled. This ends up returning true on aarch64, because it is a server compiler.
>
> However in the case where we force it not to server compile by using -XX:+TieredCompilation and -XX:TieredStopAtLevel=1 (as in the TestStable tests) this will be incorrect.
>
> The solution is to introduce a simple (null) method 'get1' which will never be deopted (because there is never anything to patch) and uses this as the basis for deciding whether we are server compiling or not.
>
> This is more fully explained in the inline comment in the patch.
>
> Please review and if appropriate I will push.
>
> All the best,
> Ed.
>
>

From vladimir.x.ivanov at oracle.com  Tue Jun  2 16:50:55 2015
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 02 Jun 2015 19:50:55 +0300
Subject: [aarch64-port-dev ] RFR: 8081669: aarch64: JTreg TestStable
	tests failing
In-Reply-To: <556DBAED.5030701@oracle.com>
References: <1433237195.16770.13.camel@mylittlepony.linaroharston>
	<556DBAED.5030701@oracle.com>
Message-ID: <556DDEEF.4060703@oracle.com>

The only concern I have is that I don't see Alexander on OCA list [1].
In order to proceed with the fix, he should sign OCA first.

Best regards,
Vladimir Ivanov

[1] http://www.oracle.com/technetwork/community/oca-486395.html


From Alexander.Alexeev at caviumnetworks.com  Tue Jun  2 16:55:29 2015
From: Alexander.Alexeev at caviumnetworks.com (Alexeev, Alexander)
Date: Tue, 2 Jun 2015 16:55:29 +0000
Subject: [aarch64-port-dev ] RFR: 8081669: aarch64: JTreg
 TestStable	tests failing
In-Reply-To: <556DDEEF.4060703@oracle.com>
References: <1433237195.16770.13.camel@mylittlepony.linaroharston>
	<556DBAED.5030701@oracle.com> <556DDEEF.4060703@oracle.com>
Message-ID: <SN1PR07MB14725CAEBF21812FB63D3D5D99B50@SN1PR07MB1472.namprd07.prod.outlook.com>

Vladimir, I am contributing on behalf of Cavium Inc.
Cavium is actually on the list. Is that enough?

Regards,
Alexander


From edward.nevill at gmail.com  Tue Jun  2 17:11:42 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Tue, 02 Jun 2015 18:11:42 +0100
Subject: [aarch64-port-dev ] RFR: 8081669: aarch64: JTreg TestStable
 tests failing
In-Reply-To: <556DDEEF.4060703@oracle.com>
References: <1433237195.16770.13.camel@mylittlepony.linaroharston>
	<556DBAED.5030701@oracle.com>  <556DDEEF.4060703@oracle.com>
Message-ID: <1433265102.1852.5.camel@mint>

Hi Vladimir,

- Alexander is employed as a contractor by Cavium
- Cavium have signed the OCA
- Alexander is using his Cavium email address

I have checked this with Dalibor Topic (on cc) and my understanding was
that this was sufficient to allow Alexander to contribute.

All the best,
Ed.

On Tue, 2015-06-02 at 19:50 +0300, Vladimir Ivanov wrote:
> The only concern I have is that I don't see Alexander on OCA list [1].
> In order to proceed with the fix, he should sign OCA first.
> 
> Best regards,
> Vladimir Ivanov
> 
> [1] http://www.oracle.com/technetwork/community/oca-486395.html


From vladimir.x.ivanov at oracle.com  Tue Jun  2 17:16:55 2015
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 02 Jun 2015 20:16:55 +0300
Subject: [aarch64-port-dev ] RFR: 8081669: aarch64: JTreg TestStable
 tests failing
In-Reply-To: <1433265102.1852.5.camel@mint>
References: <1433237195.16770.13.camel@mylittlepony.linaroharston>	
	<556DBAED.5030701@oracle.com> <556DDEEF.4060703@oracle.com>
	<1433265102.1852.5.camel@mint>
Message-ID: <556DE507.2060403@oracle.com>

Edward, Alexander, thanks for the clarifications!

My bad, missed Cavium in the OCA list.

Best regards,
Vladimir Ivanov

On 6/2/15 8:11 PM, Edward Nevill wrote:
> Hi Vladimir,
>
> - Alexander is employed as a contractor by Cavium
> - Cavium have signed the OCA
> - Alexander is using his Cavium email address
>
> I have checked this with Dalibor Topic (on cc) and my understanding was
> that this was sufficient to allow Alexander to contribute.
>
> All the best,
> Ed.
>
> On Tue, 2015-06-02 at 19:50 +0300, Vladimir Ivanov wrote:
>> The only concern I have is that I don't see Alexander on OCA list [1].
>> In order to proceed with the fix, he should sign OCA first.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] http://www.oracle.com/technetwork/community/oca-486395.html

From edward.nevill at linaro.org  Wed Jun  3 08:51:47 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Wed, 03 Jun 2015 09:51:47 +0100
Subject: [aarch64-port-dev ] RFR: 8081790: SHA tests fail
Message-ID: <1433321507.32688.13.camel@mylittlepony.linaroharston>

Hi,

The following webrev

http://cr.openjdk.java.net/~enevill/8081790/webrev.00/

fixes a number of SHA test failures on aarch64.

This patch was contributed by alexander.alexeev at caviumnetworks.com

Currently the following JTReg/hotspot SHA tests fail on aarch64

FAILED: compiler/intrinsics/sha/cli/TestUseSHA1IntrinsicsOptionOnUnsupportedCPU.java 
FAILED: compiler/intrinsics/sha/cli/TestUseSHA256IntrinsicsOptionOnUnsupportedCPU.java 
FAILED: compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java
FAILED: compiler/intrinsics/sha/sanity/TestSHA1Intrinsics.java 
FAILED: compiler/intrinsics/sha/sanity/TestSHA1MultiBlockIntrinsics.java 
FAILED: compiler/intrinsics/sha/sanity/TestSHA256MultiBlockIntrinsics.java 
FAILED: compiler/intrinsics/sha/sanity/TestSHA256Intrinsics.java

The reason for the test failures is that the test suite is configured on the assumption that Sparc is the only arch which supports SHA in hw (and therefore supports the -XX:+UseSHA options).

The webrev adds tests for aarch64.

The following files have also been renamed as they were inappropriately named.

R test/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForSupportedSparcCPU.java
R test/compiler/intrinsics/sha/cli/testcases/UseSHAIntrinsicsSpecificTestCaseForUnsupportedSparcCPU.java
R test/compiler/intrinsics/sha/cli/testcases/UseSHASpecificTestCaseForSupportedSparcCPU.java
R test/compiler/intrinsics/sha/cli/testcases/UseSHASpecificTestCaseForUnsupportedSparcCPU.java

These now become

A test/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForSupportedCPU.java
A test/compiler/intrinsics/sha/cli/testcases/UseSHAIntrinsicsSpecificTestCaseForUnsupportedCPU.java
A test/compiler/intrinsics/sha/cli/testcases/UseSHASpecificTestCaseForSupportedCPU.java
A test/compiler/intrinsics/sha/cli/testcases/UseSHASpecificTestCaseForUnsupportedCPU.java

(ie. the 'Sparc' has been dropped from the filename as Sparc is no longer the only arch which supports SHA).
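
The shape of the change can be sketched as follows: a check that used to mean "is this Sparc?" becomes "is this any arch that may support SHA?". This is only an illustration. The class and predicate names are made up for the sketch, and it matches on the os.arch string, whereas the real test library's predicate classes are not reproduced here.

```java
import java.util.function.Predicate;

// Hypothetical sketch; the real tests use the test library's own predicate classes.
final class ShaSupportSketch {
    // Assumed arch-name checks, for illustration only.
    static final Predicate<String> IS_SPARC = arch -> arch.startsWith("sparc");
    static final Predicate<String> IS_AARCH64 = arch -> arch.equals("aarch64");

    // Before the change only IS_SPARC was consulted; now either arch qualifies.
    static final Predicate<String> MAY_SUPPORT_SHA = IS_SPARC.or(IS_AARCH64);

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        System.out.println(arch + " may support SHA intrinsics: " + MAY_SUPPORT_SHA.test(arch));
    }
}
```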

Tested with JTReg/hotspot

Before: Test results: passed: 840; failed: 10; error: 5
After:  Test results: passed: 847; failed: 3; error: 5

Please review,

Thanks,
Ed.


From edward.nevill at gmail.com  Wed Jun  3 14:32:19 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 03 Jun 2015 15:32:19 +0100
Subject: [aarch64-port-dev ] RFR: 8081669: aarch64: JTreg TestStable
 tests failing
In-Reply-To: <556DE507.2060403@oracle.com>
References: <1433237195.16770.13.camel@mylittlepony.linaroharston>
	<556DBAED.5030701@oracle.com>  <556DDEEF.4060703@oracle.com>
	<1433265102.1852.5.camel@mint> <556DE507.2060403@oracle.com>
Message-ID: <1433341939.2009.1.camel@mylittlepony.linaroharston>

On Tue, 2015-06-02 at 20:16 +0300, Vladimir Ivanov wrote:
> Edward, Alexander, thanks for the clarifications!
> 
> My bad, missed Cavium in the OCA list.
> 
> Best regards,
> Vladimir Ivanov
> 

NP, thanks for the review. I have pushed the change.

All the best,
Ed.


From edward.nevill at linaro.org  Thu Jun  4 10:37:55 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Thu, 4 Jun 2015 11:37:55 +0100
Subject: [aarch64-port-dev ] RFR: 8081289: aarch64: add support for
	RewriteFrequentPairs in interpreter
In-Reply-To: <1432732880.11287.10.camel@mylittlepony.linaroharston>
References: <1432732880.11287.10.camel@mylittlepony.linaroharston>
Message-ID: <CAEf2cjcx36RiiKFjd2t--JHvf5MbcrRnO8tL6WoVbdPrS2W+tA@mail.gmail.com>

Hi,

Just a polite ping. I submitted this patch for review by a JDK9 reviewer
over a week ago and there has been no response.

This patch was contributed by Alexander Alexeev who is a new contributor to
OpenJDK.

The patch affects only _aarch64 files and both Alexander and I have verified
it by running JTreg hotspot with -Xint.

Thanks for your help,
Ed.

On 27 May 2015 at 14:21, Edward Nevill <edward.nevill at linaro.org> wrote:

> Hi,
>
> The following webrev adds support for RewriteFrequentPairs to the template
> interpreter for aarch64.
>
> http://cr.openjdk.java.net/~enevill/8081289/webrev.00
>
> This was contributed by Alexander Alexeev (
> alexander.alexeev at caviumnetworks.com)
>
> This gives a small improvement to the interpreter on aarch64, and brings
> it in line with all the other ports (x86, sparc, ppc, zero) which all
> support RewriteFrequentPairs.
>
> I have done some performance measurement using -Xint with some micro
> benchmarks and I see a small improvement on each.
>
> java dhrystone: +9%
> embedded caffeinemark: +4%
> grinderbench: +1%
> dacapo (avrora): +1%
>
> Tested with hotspot jtreg:-
>
> Original: Test results: passed: 787; failed: 24; error: 44
> With patch: Test results: passed: 785; failed: 24; error: 46
>
> The difference in the # of errors is due to timeouts because we are
> running -Xint.
>
> Please review and if OK I will push.
>
> All the best,
> Ed.
>
>
>

From roland.westrelin at oracle.com  Thu Jun  4 10:41:12 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Thu, 4 Jun 2015 12:41:12 +0200
Subject: [aarch64-port-dev ] RFR: 8081289: aarch64: add support for
	RewriteFrequentPairs in interpreter
In-Reply-To: <1432732880.11287.10.camel@mylittlepony.linaroharston>
References: <1432732880.11287.10.camel@mylittlepony.linaroharston>
Message-ID: <73430178-4F90-4AC4-B2A9-7591909750ED@oracle.com>

> http://cr.openjdk.java.net/~enevill/8081289/webrev.00

That looks good to me.

Roland.

From roland.westrelin at oracle.com  Thu Jun  4 11:29:27 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Thu, 4 Jun 2015 13:29:27 +0200
Subject: [aarch64-port-dev ] RFR: 8079565: aarch64: Add vectorization
	support for aarch64
In-Reply-To: <1432658017.17486.32.camel@mylittlepony.linaroharston>
References: <1432658017.17486.32.camel@mylittlepony.linaroharston>
Message-ID: <1EE4C04D-7286-4ABF-B3FA-3343DC006BEF@oracle.com>

> http://cr.openjdk.java.net/~enevill/8079565/webrev.00/

The platform specific changes look good.
I don't have an opinion on these:

-      if (TraceSuperWord && Verbose) {
+      if (TraceSuperWord) {

I never used TraceSuperWord. If someone has an opinion, it's time to speak up I guess.

Roland.

From edward.nevill at linaro.org  Thu Jun  4 12:28:51 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Thu, 4 Jun 2015 13:28:51 +0100
Subject: [aarch64-port-dev ] RFR: 8079565: aarch64: Add vectorization
	support for aarch64
In-Reply-To: <1EE4C04D-7286-4ABF-B3FA-3343DC006BEF@oracle.com>
References: <1432658017.17486.32.camel@mylittlepony.linaroharston>
	<1EE4C04D-7286-4ABF-B3FA-3343DC006BEF@oracle.com>
Message-ID: <CAEf2cjeH7YDfv2d-sdRtJrSVaP1d8ZjjaCpZt+iPHBiA9D3=vg@mail.gmail.com>

On 4 June 2015 at 12:29, Roland Westrelin <roland.westrelin at oracle.com>
wrote:

> > http://cr.openjdk.java.net/~enevill/8079565/webrev.00/
>
> The platform specific changes look good.
> I don't have an opinion on these:
>
> -      if (TraceSuperWord && Verbose) {
> +      if (TraceSuperWord) {
>
> I never used TraceSuperWord. If someone has an opinion, it's time to speak up
> I guess.


Hi Roland,

Thanks for spotting this.

This was an accidental change. I made it non verbose for my own debugging
purposes but forgot to revert it.

New webrev

http://cr.openjdk.java.net/~enevill/8079565/webrev.01

Only difference is reverting the changes to src/share/vm/opto/superword.cpp

There are now no changes to shared code, only _aarch64 files.

OK to push?
Ed.

From roland.westrelin at oracle.com  Thu Jun  4 13:43:54 2015
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Thu, 4 Jun 2015 15:43:54 +0200
Subject: [aarch64-port-dev ] RFR: 8079565: aarch64: Add vectorization
	support for aarch64
In-Reply-To: <CAEf2cjeH7YDfv2d-sdRtJrSVaP1d8ZjjaCpZt+iPHBiA9D3=vg@mail.gmail.com>
References: <1432658017.17486.32.camel@mylittlepony.linaroharston>
	<1EE4C04D-7286-4ABF-B3FA-3343DC006BEF@oracle.com>
	<CAEf2cjeH7YDfv2d-sdRtJrSVaP1d8ZjjaCpZt+iPHBiA9D3=vg@mail.gmail.com>
Message-ID: <0D7936CD-B5BD-4C8C-92BF-0108D3BD7CF0@oracle.com>

> OK to push?

Yes.

Roland.


From edward.nevill at linaro.org  Tue Jun  9 17:10:55 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Tue, 09 Jun 2015 18:10:55 +0100
Subject: [aarch64-port-dev ] RFR: 8086087: aarch64: add support for 64 bit
	vectors
Message-ID: <1433869855.11860.20.camel@mylittlepony.linaroharston>

Hi,

http://cr.openjdk.java.net/~enevill/8086087/webrev/

This adds support for 64 bit vectors on aarch64. Previously the vector code only supported 128 bit vectors.

32 bit vectors are not supported directly, as aarch64 has no support for 32 bit vectors. However, the above webrev permits 32 bit vectors by placing them in a 64 bit vector.

I have tested this with JTreg hotspot and get the same results before and after the change, viz,

Test results: passed: 845; failed: 12; error: 6

I have also benchmarked the Test*Vect tests from 6340864 in the hotspot test suite. The following are the average results I get on one of our partners' HW (lower number is better).

TestByteVect:   128-bit (11.77), 64-bit (4.36)
TestShortVect:  128-bit (5.02),  64-bit (5.22)
TestIntVect:    128-bit (7.81),  64-bit (7.70)
TestLongVect:   128-bit (11.67), 64-bit (11.71)
TestFloatVect:  128-bit (16.75), 64-bit (17.29)
TestDoubleVect: 128-bit (32.37), 64-bit (32.43)

So the only test which shows an improvement is TestByteVect which shows a 2.7x speedup. The other tests are the same within the bounds of experimental error.
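
For anyone who wants to check the arithmetic, the ratios can be recomputed from the averages above with a few lines of Java. The class and method names are made up for this sketch; the numbers are copied from the table, and the speedup is taken as the 128-bit time divided by the 64-bit time since lower is better.

```java
// Recomputes 128-bit / 64-bit time ratios from the averages quoted above.
final class VectorSpeedup {
    static final String[] TESTS = {
        "TestByteVect", "TestShortVect", "TestIntVect",
        "TestLongVect", "TestFloatVect", "TestDoubleVect"
    };

    // {128-bit average, 64-bit average}, lower is better.
    static final double[][] TIMES = {
        {11.77, 4.36}, {5.02, 5.22}, {7.81, 7.70},
        {11.67, 11.71}, {16.75, 17.29}, {32.37, 32.43}
    };

    // A ratio above 1.0 means the 64-bit vector run was faster.
    static double speedup(double time128, double time64) {
        return time128 / time64;
    }

    public static void main(String[] args) {
        for (int i = 0; i < TESTS.length; i++) {
            System.out.printf("%-15s %.2fx%n", TESTS[i], speedup(TIMES[i][0], TIMES[i][1]));
        }
    }
}
```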

The reason TestByteVect shows such an improvement is that with 128 bit vectors it is not being vectorized at all because the loop is not unrolled sufficiently to allow it to be vectorized, whereas with 64 bit vectors it is.

Please review and let me know if this is OK to push.

Ed.


From edward.nevill at linaro.org  Wed Jun 10 12:58:51 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Wed, 10 Jun 2015 13:58:51 +0100
Subject: [aarch64-port-dev ] RFR: 8085805: aarch64: AdvancedThresholdPolicy
 lacks tuning of InlineSmallCode size
Message-ID: <1433941131.11860.74.camel@mylittlepony.linaroharston>

Hi,

http://cr.openjdk.java.net/~enevill/8085805/webrev/

adds tuning of InlineSmallCode for aarch64.

src/share/vm/runtime/advancedThresholdPolicy.cpp contains the following code which tunes the value of InlineSmallCode for X86 and SPARC.

  // Some inlining tuning
#ifdef X86
  if (FLAG_IS_DEFAULT(InlineSmallCode)) {
    FLAG_SET_DEFAULT(InlineSmallCode, 2000);
  }
#endif

#ifdef SPARC
  if (FLAG_IS_DEFAULT(InlineSmallCode)) {
    FLAG_SET_DEFAULT(InlineSmallCode, 2500);
  }
#endif

  set_increase_threshold_at_ratio();
  set_start_time(os::javaTimeMillis());
}

This webrev proposes changing this so that InlineSmallCode is increased to 2500 on aarch64 rather than the default of 1000. The change is simply to add AARCH64 to the conditional, i.e.

#if defined SPARC || defined AARCH64
  if (FLAG_IS_DEFAULT(InlineSmallCode)) {
    FLAG_SET_DEFAULT(InlineSmallCode, 2500);
  }
#endif

This change request was triggered by one of our partners reporting a 6x improvement in one benchmark when the size of InlineSmallCode is increased.

I have done some testing to find the optimal size of InlineSmallCode for aarch64. The following shows the performance of various benchmarks against different sizes of InlineSmallCode.

InlineSmallCode    100   1000   1500   2000   2500   3000   5000

Grinderbench    440543 589792 595603 659213 665973 664663 667865
Stringtest       65182  65304  65211 339946 329314 326831 296886
SpecJVM2008       76.4   90.8   90.9   91.9   89.6   89.2   88.3

The optimal value seems to be about 2000/2500. I have elected for the slightly higher value.
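
To make the table easier to compare, the scores can be normalised against the default of 1000. The sketch below does that in Java. The class and method names are made up for illustration, the numbers are copied from the table, and it assumes that a higher score is better for all three benchmarks (the table does not state units).

```java
// Normalises the benchmark scores above against the InlineSmallCode=1000 column.
final class InlineSmallCodeScores {
    static final int[] SIZES = {100, 1000, 1500, 2000, 2500, 3000, 5000};
    static final double[] GRINDERBENCH = {440543, 589792, 595603, 659213, 665973, 664663, 667865};
    static final double[] STRINGTEST = {65182, 65304, 65211, 339946, 329314, 326831, 296886};
    static final double[] SPECJVM2008 = {76.4, 90.8, 90.9, 91.9, 89.6, 89.2, 88.3};

    // Index of the default value (1000) in SIZES.
    static final int DEFAULT_INDEX = 1;

    // Returns each score divided by the score at the default size.
    static double[] relativeToDefault(double[] scores) {
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            out[i] = scores[i] / scores[DEFAULT_INDEX];
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] all = {GRINDERBENCH, STRINGTEST, SPECJVM2008};
        String[] names = {"Grinderbench", "Stringtest", "SpecJVM2008"};
        for (int b = 0; b < all.length; b++) {
            double[] rel = relativeToDefault(all[b]);
            StringBuilder line = new StringBuilder(String.format("%-13s", names[b]));
            for (int i = 0; i < SIZES.length; i++) {
                line.append(String.format(" %d:%.2f", SIZES[i], rel[i]));
            }
            System.out.println(line);
        }
    }
}
```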

Tested with JTreg/hotspot. In both cases, before and after the patch

Test results: passed: 845; failed: 12; error: 6

Please review,
Thanks,
Ed.


From Alexander.Alexeev at caviumnetworks.com  Wed Jun 10 13:14:20 2015
From: Alexander.Alexeev at caviumnetworks.com (Alexeev, Alexander)
Date: Wed, 10 Jun 2015 13:14:20 +0000
Subject: [aarch64-port-dev ] mov between vector and GP register
Message-ID: <SN1PR07MB1472100CB8B83BAC8E08568C99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>

Hello

I would like to clarify why the two moves below are declared as private in macroAssembler_aarch64.hpp.
What would be the correct approach to use them in an ins_encode definition in aarch64.ad?

assembler_aarch64.hpp:
  // Move from general purpose register
  //   mov  Vd.T[index], Rn
  void mov(FloatRegister Vd, SIMD_Arrangement T, int index, Register Xn) {

...
  // Move to general purpose register
  //   mov  Rd, Vn.T[index]
  void mov(Register Xd, FloatRegister Vn, SIMD_Arrangement T, int index) {


Thanks,
Alexander

From Alexander.Alexeev at caviumnetworks.com  Wed Jun 10 14:06:03 2015
From: Alexander.Alexeev at caviumnetworks.com (Alexeev, Alexander)
Date: Wed, 10 Jun 2015 14:06:03 +0000
Subject: [aarch64-port-dev ] population count intrinsic performance
Message-ID: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>

Hello

I've implemented a preliminary version of popCountI (the intrinsic for java.lang.Integer.bitCount).
For some reason performance became worse than it was before with the Hacker's Delight version of the algorithm. Pure benchmarking of the assembly code showed that the new version should, in contrast, be more efficient (2 cycles shorter).
SIMD - 13 cycles
HD  (baseline)  - 15 cycles

For evaluation in Java I used JMH

           Benchmark                 Mode  Cnt   Score    Error  Units
SIMD       BitCount.bitCountInteger  avgt    5  16.008 ± 0.016  ns/op
Baseline   BitCount.bitCountInteger  avgt    5  11.131 ± 0.069  ns/op


So I wonder what could cause such a reversal. Could the reason be in the JVM infrastructure and how intrinsics are inlined versus JITed code?
Any ideas are appreciated.


instruct popCountI(iRegINoSp dst,  iRegIorL2I src) %{
  match(Set dst (PopCountI src));
  ins_cost(INSN_COST * 13);

  format %{ "popCountI TODO\n\t" %}
  ins_encode %{
      __ mov(vscratch1, __ T1D, 0, as_Register($src$$reg));
      __ cnt(vscratch2, __ T8B, vscratch1);
      __ addv(vscratch1, __ T8B, vscratch2);
      __ mov(as_Register($dst$$reg), vscratch1, __ T1D, 0);
  %}

  ins_pipe(ialu_reg);
%}
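
For reference, the two approaches being compared can be modelled in plain Java as below. This is only a functional model to show that both compute the same result; it says nothing about cycle counts. The class and method names are made up for the sketch, and the SIMD model assumes the upper 32 bits of the source register are zero when it is moved into the vector register.

```java
// Functional model of the two bit-count strategies; not a performance model.
final class PopCountModel {
    // Hacker's Delight style bit count (the baseline pure-Java algorithm).
    static int hackersDelight(int i) {
        i = i - ((i >>> 1) & 0x55555555);
        i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
        i = (i + (i >>> 4)) & 0x0f0f0f0f;
        i = i + (i >>> 8);
        i = i + (i >>> 16);
        return i & 0x3f;
    }

    // Models mov + cnt (T8B) + addv (T8B): count bits in each of 8 byte lanes,
    // then sum the lanes. Assumes the upper 32 bits of the 64-bit lane are zero.
    static int simdModel(int src) {
        long lanes = src & 0xFFFFFFFFL;
        int sum = 0;
        for (int lane = 0; lane < 8; lane++) {
            int b = (int) ((lanes >>> (8 * lane)) & 0xFF);
            sum += perByteCount(b);
        }
        return sum;
    }

    // Bit count of a single byte value (0..255), clearing the lowest set bit each step.
    static int perByteCount(int b) {
        int n = 0;
        while (b != 0) {
            b &= b - 1;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        int v = 0x12345678;
        System.out.println(hackersDelight(v) + " " + simdModel(v) + " " + Integer.bitCount(v));
    }
}
```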


Benchmark JMH (just one routine, the rest is as usual)

    @Benchmark
    public void bitCountInteger(final Blackhole bh) {
        bh.consume(java.lang.Integer.bitCount(0));
    }


Thanks,
Alexander

From edward.nevill at gmail.com  Wed Jun 10 14:24:28 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 10 Jun 2015 15:24:28 +0100
Subject: [aarch64-port-dev ] population count intrinsic performance
In-Reply-To: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
References: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
Message-ID: <1433946268.11860.93.camel@mylittlepony.linaroharston>

On Wed, 2015-06-10 at 14:06 +0000, Alexeev, Alexander wrote:

> 
> 
> instruct popCountI(iRegINoSp dst,  iRegIorL2I src) %{
>   match(Set dst (PopCountI src));
>   ins_cost(INSN_COST * 13);
> 
>   format %{ "popCountI TODO\n\t" %}
>   ins_encode %{
>       __ mov(vscratch1, __ T1D, 0, as_Register($src$$reg));
>       __ cnt(vscratch2, __ T8B, vscratch1);
>       __ addv(vscratch1, __ T8B, vscratch2);
>       __ mov(as_Register($dst$$reg), vscratch1, __ T1D, 0);
>   %}
> 
>   ins_pipe(ialu_reg);
> %}

What are 'vscratch1' & 'vscratch2'? Could you send the complete patch so I can try this out?

Thanks,
Ed.


From vladimir.kozlov at oracle.com  Wed Jun 10 18:01:25 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 10 Jun 2015 11:01:25 -0700
Subject: [aarch64-port-dev ] RFR: 8085805: aarch64:
 AdvancedThresholdPolicy lacks tuning of InlineSmallCode size
In-Reply-To: <1433941131.11860.74.camel@mylittlepony.linaroharston>
References: <1433941131.11860.74.camel@mylittlepony.linaroharston>
Message-ID: <55787B75.8010009@oracle.com>

Looks good to me.

Thanks,
Vladimir

On 6/10/15 5:58 AM, Edward Nevill wrote:
> Hi,
>
> http://cr.openjdk.java.net/~enevill/8085805/webrev/
>
> adds tuning of InlineSmallCode for aarch64.
>
> src/share/vm/runtime/advancedThresholdPolicy.cpp contains the following code which tunes the value of InlineSmallCode for X86 and SPARC.
>
>    // Some inlining tuning
> #ifdef X86
>    if (FLAG_IS_DEFAULT(InlineSmallCode)) {
>      FLAG_SET_DEFAULT(InlineSmallCode, 2000);
>    }
> #endif
>
> #ifdef SPARC
>    if (FLAG_IS_DEFAULT(InlineSmallCode)) {
>      FLAG_SET_DEFAULT(InlineSmallCode, 2500);
>    }
> #endif
>
>    set_increase_threshold_at_ratio();
>    set_start_time(os::javaTimeMillis());
> }
>
> This webrev proposes changing this so that InlineSmallCode is increased to 2500 on aarch64 rather than the default of 1000. The change is simply to add AARCH64 to the conditional. IE.
>
> #if defined SPARC || defined AARCH64
>    if (FLAG_IS_DEFAULT(InlineSmallCode)) {
>      FLAG_SET_DEFAULT(InlineSmallCode, 2500);
>    }
> #endif
>
> This change request was triggered by one of our partners reporting a 6x improvement in one benchmark when the size of InlineSmallCode is increased.
>
> I have done some testing to find the optimal value of InlineSmallCode for aarch64. The following shows the performance of various benchmarks against different values of InlineSmallCode.
>
> InlineSmallCode    100   1000   1500   2000   2500   3000   5000
>
> Grinderbench    440543 589792 595603 659213 665973 664663 667865
> Stringtest       65182  65304  65211 339946 329314 326831 296886
> SpecJVM2008       76.4   90.8   90.9   91.9   89.6   89.2   88.3
>
> The optimal value seems to be about 2000/2500. I have elected for the slightly higher value.
>
> Tested with JTreg/hotspot. The results are the same before and after the patch:
>
> Test results: passed: 845; failed: 12; error: 6
>
> Please review,
> Thanks,
> Ed.
>
>
>

From Alexander.Alexeev at caviumnetworks.com  Wed Jun 10 18:41:01 2015
From: Alexander.Alexeev at caviumnetworks.com (Alexeev, Alexander)
Date: Wed, 10 Jun 2015 18:41:01 +0000
Subject: [aarch64-port-dev ] population count intrinsic performance
In-Reply-To: <1433946268.11860.93.camel@mylittlepony.linaroharston>
References: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
	<1433946268.11860.93.camel@mylittlepony.linaroharston>
Message-ID: <SN1PR07MB14722173C28F6212E10DD50499BD0@SN1PR07MB1472.namprd07.prod.outlook.com>

Ed, I removed those 'vscratch1' & 'vscratch2' as redundant.
Patch is below.

Regards,
Alexander

--- CUT HERE ---
diff -r 11af3990d56c src/cpu/aarch64/vm/aarch64.ad
--- a/src/cpu/aarch64/vm/aarch64.ad	Thu Jun 04 18:50:05 2015 -0700
+++ b/src/cpu/aarch64/vm/aarch64.ad	Wed Jun 10 18:12:27 2015 +0000
@@ -7402,6 +7402,40 @@
   ins_pipe(ialu_reg);
 %}
 
+//---------- Population Count Instructions -------------------------------------
+//
+
+instruct popCountI(iRegINoSp dst,  iRegIorL2I src) %{
+  match(Set dst (PopCountI src));
+  ins_cost(INSN_COST * 13);
+
+  format %{ "TODO popCountI\n\t" %}
+  ins_encode %{
+    __ mov(v0, __ T1D, 0, as_Register($src$$reg));
+    __ cnt(v1, __ T8B, v0);
+    __ addv(v0, __ T8B, v1);
+    __ mov(as_Register($dst$$reg), v0, __ T1D, 0);
+  %}
+
+  ins_pipe(ialu_reg);
+%}
+
+// Note: Long.bitCount(long) returns an int.
+instruct popCountL(iRegINoSp dst, iRegL src) %{
+  match(Set dst (PopCountL src));
+  ins_cost(INSN_COST * 13);
+
+  format %{ "TODO popCountL\n\t" %}
+  ins_encode %{
+    __ mov(v0, __ T1D, 0, as_Register($src$$reg));
+    __ cnt(v1, __ T8B, v0);
+    __ addv(v0, __ T8B, v1);
+    __ mov(as_Register($dst$$reg), v0, __ T1D, 0);
+  %}
+
+  ins_pipe(ialu_reg);
+%}
+
 // ============================================================================
 // MemBar Instruction
 
diff -r 11af3990d56c src/cpu/aarch64/vm/assembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/assembler_aarch64.hpp	Thu Jun 04 18:50:05 2015 -0700
+++ b/src/cpu/aarch64/vm/assembler_aarch64.hpp	Wed Jun 10 18:12:27 2015 +0000
@@ -2050,6 +2050,9 @@
   INSN(negr,  1, 0b100000101110);
   INSN(notr,  1, 0b100000010110);
   INSN(addv,  0, 0b110001101110);
+  INSN(cls,   0, 0b100000010010);
+  INSN(clz,   1, 0b100000010010);
+  INSN(cnt,   0, 0b100000010110);
 
 #undef INSN
 
diff -r 11af3990d56c src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp	Thu Jun 04 18:50:05 2015 -0700
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp	Wed Jun 10 18:12:27 2015 +0000
@@ -36,6 +36,7 @@
 class MacroAssembler: public Assembler {
   friend class LIR_Assembler;
 
+ public:
   using Assembler::mov;
   using Assembler::movi;

--- CUT HERE ---



From edward.nevill at gmail.com  Wed Jun 10 20:34:48 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 10 Jun 2015 21:34:48 +0100
Subject: [aarch64-port-dev ] population count intrinsic performance
In-Reply-To: <SN1PR07MB14722173C28F6212E10DD50499BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
References: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
	<1433946268.11860.93.camel@mylittlepony.linaroharston>
	<SN1PR07MB14722173C28F6212E10DD50499BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
Message-ID: <1433968488.1036.39.camel@mint>

On Wed, 2015-06-10 at 18:41 +0000, Alexeev, Alexander wrote:
> Ed, I removed those 'vscratch1' & 'vscratch2' as redundant.
> Patch is below.

Ah, I see. vscratch1 & vscratch2 are just two v registers you carved out
as scratch registers from the vector register set.

But you need to let the register allocator know!

>  
> +//---------- Population Count Instructions -------------------------------------
> +//
> +
> +instruct popCountI(iRegINoSp dst,  iRegIorL2I src) %{
> +  match(Set dst (PopCountI src));
> +  ins_cost(INSN_COST * 13);
> +
> +  format %{ "TODO popCountI\n\t" %}
> +  ins_encode %{
> +    __ mov(v0, __ T1D, 0, as_Register($src$$reg));
> +    __ cnt(v1, __ T8B, v0);
> +    __ addv(v0, __ T8B, v1);
> +    __ mov(as_Register($dst$$reg), v0, __ T1D, 0);
> +  %}

So here, registers v0 and v1 might already be allocated so you cannot
just use them. Also, I don't understand why you need v0 and v1.

I think what you want is something like

instruct popCountI(iRegINoSp dst,  iRegIorL2I src, vRegD tmp) %{
  match(Set dst (PopCountI src));
  ins_cost(INSN_COST * 13);
  effect(TEMP tmp);
  format ...
  ins_encode %{
    __ mov(tmp, __ T1D, 0, as_Register($src$$reg));
    __ cnt(tmp, __ T8B, tmp);
    __ addv(tmp, __ T8B, tmp);
    __ mov(as_Register($dst$$reg), tmp, __ T1D, 0);
%}

(I haven't tried this, just typed it into the email, so there may be
typos).
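
For readers unfamiliar with the NEON idiom, the mov / cnt / addv / mov sequence above can be modeled in plain Java. This is only a sketch: the class and method names are invented for illustration and are not part of HotSpot. It assumes that cnt with the T8B arrangement counts bits per byte lane and that addv sums the eight byte lanes.

```java
// Hypothetical model of the mov / cnt / addv / mov sequence; names are illustrative only.
final class SimdPopCountModel {
    // Models "cnt Vd.8B, Vn.8B": each of the 8 byte lanes is replaced by its own bit count.
    static long cntPerByte(long v) {
        long out = 0;
        for (int lane = 0; lane < 8; lane++) {
            long b = (v >>> (lane * 8)) & 0xFFL;
            out |= ((long) Long.bitCount(b)) << (lane * 8);
        }
        return out;
    }

    // Models "addv Bd, Vn.8B": the 8 byte lanes are summed into a single byte.
    static int addvBytes(long v) {
        int sum = 0;
        for (int lane = 0; lane < 8; lane++) {
            sum += (int) ((v >>> (lane * 8)) & 0xFFL);
        }
        return sum & 0xFF; // the maximum possible sum is 64, so it always fits in a byte
    }

    // The whole sequence: move to a vector register, count per byte, add across lanes, move back.
    static int popCount64(long src) {
        return addvBytes(cntPerByte(src));
    }

    public static void main(String[] args) {
        long sample = 0x0123456789ABCDEFL;
        System.out.println(popCount64(sample) == Long.bitCount(sample));
    }
}
```

Since the per-lane counts never exceed 8, the lane sum cannot overflow a byte, which is why a single addv is enough.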

> +
> +  ins_pipe(ialu_reg);
> +%}

I think this should be

  ins_pipe(pipe_class_default)

for consistency with all the other SIMD instructions for which we
haven't implemented pipeline scheduling.


> +
>  // ============================================================================
>  // MemBar Instruction
>  
> diff -r 11af3990d56c src/cpu/aarch64/vm/assembler_aarch64.hpp
> --- a/src/cpu/aarch64/vm/assembler_aarch64.hpp	Thu Jun 04 18:50:05 2015 -0700
> +++ b/src/cpu/aarch64/vm/assembler_aarch64.hpp	Wed Jun 10 18:12:27 2015 +0000
> @@ -2050,6 +2050,9 @@
>    INSN(negr,  1, 0b100000101110);
>    INSN(notr,  1, 0b100000010110);
>    INSN(addv,  0, 0b110001101110);
> +  INSN(cls,   0, 0b100000010010);
> +  INSN(clz,   1, 0b100000010010);
> +  INSN(cnt,   0, 0b100000010110);
>  
>  #undef INSN
>  
> diff -r 11af3990d56c src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
> --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp	Thu Jun 04 18:50:05 2015 -0700
> +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp	Wed Jun 10 18:12:27 2015 +0000
> @@ -36,6 +36,7 @@
>  class MacroAssembler: public Assembler {
>    friend class LIR_Assembler;
>  
> + public:
>    using Assembler::mov;
>    using Assembler::movi;

Looks fine, I think these should be public.


All the best,
Ed.


From Alexander.Alexeev at caviumnetworks.com  Thu Jun 11 08:10:30 2015
From: Alexander.Alexeev at caviumnetworks.com (Alexeev, Alexander)
Date: Thu, 11 Jun 2015 08:10:30 +0000
Subject: [aarch64-port-dev ] population count intrinsic performance
In-Reply-To: <1433968488.1036.39.camel@mint>
References: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
	<1433946268.11860.93.camel@mylittlepony.linaroharston>
	<SN1PR07MB14722173C28F6212E10DD50499BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
	<1433968488.1036.39.camel@mint>
Message-ID: <DM2PR07MB14651AEA35D54F11C7AE211A99BC0@DM2PR07MB1465.namprd07.prod.outlook.com>


> But you need to let the register allocator know!
This is the main reason why I called this patch preliminary and it was a mistake to neglect that.
Now it is clear.

After applying the recommended changes, the results for both versions are the same.
 
Baseline:
Benchmark                 Mode  Cnt   Score   Error  Units
BitCount.bitCountInteger  avgt    5  11.004 ± 0.000  ns/op
BitCount.bitCountLong     avgt    5  11.005 ± 0.000  ns/op

SIMD version:
Benchmark                 Mode  Cnt   Score   Error  Units
BitCount.bitCountInteger  avgt    5  11.004 ± 0.001  ns/op
BitCount.bitCountLong     avgt    5  11.004 ± 0.000  ns/op

Updated patch is below.

--- CUT HERE ---
diff -r 93cc4d7535ce src/cpu/aarch64/vm/aarch64.ad
--- a/src/cpu/aarch64/vm/aarch64.ad	Wed Jun 10 12:29:07 2015 +0000
+++ b/src/cpu/aarch64/vm/aarch64.ad	Thu Jun 11 07:28:28 2015 +0000
@@ -7402,6 +7402,42 @@
   ins_pipe(ialu_reg);
 %}
 
+//---------- Population Count Instructions -------------------------------------
+//
+
+instruct popCountI(iRegINoSp dst,  iRegIorL2I src, vRegD tmp) %{
+  match(Set dst (PopCountI src));
+  effect(TEMP tmp);
+  ins_cost(INSN_COST * 13);
+
+  format %{ "TODO popCountI\n\t" %}
+  ins_encode %{
+    __ mov($tmp$$FloatRegister, __ T1D, 0, as_Register($src$$reg));
+    __ cnt($tmp$$FloatRegister, __ T8B, $tmp$$FloatRegister);
+    __ addv($tmp$$FloatRegister, __ T8B, $tmp$$FloatRegister);
+    __ mov(as_Register($dst$$reg), $tmp$$FloatRegister, __ T1D, 0);
+  %}
+
+  ins_pipe(pipe_class_default);
+%}
+
+// Note: Long.bitCount(long) returns an int.
+instruct popCountL(iRegINoSp dst, iRegL src, vRegD tmp) %{
+  match(Set dst (PopCountL src));
+  effect(TEMP tmp);
+  ins_cost(INSN_COST * 13);
+
+  format %{ "TODO popCountL\n\t" %}
+  ins_encode %{
+    __ mov($tmp$$FloatRegister, __ T1D, 0, as_Register($src$$reg));
+    __ cnt($tmp$$FloatRegister, __ T8B, $tmp$$FloatRegister);
+    __ addv($tmp$$FloatRegister, __ T8B, $tmp$$FloatRegister);
+    __ mov(as_Register($dst$$reg), $tmp$$FloatRegister, __ T1D, 0);
+  %}
+
+  ins_pipe(pipe_class_default);
+%}
+
 // ============================================================================
 // MemBar Instruction
 
diff -r 93cc4d7535ce src/cpu/aarch64/vm/assembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/assembler_aarch64.hpp	Wed Jun 10 12:29:07 2015 +0000
+++ b/src/cpu/aarch64/vm/assembler_aarch64.hpp	Thu Jun 11 07:28:28 2015 +0000
@@ -2050,6 +2050,9 @@
   INSN(negr,  1, 0b100000101110);
   INSN(notr,  1, 0b100000010110);
   INSN(addv,  0, 0b110001101110);
+  INSN(cls,   0, 0b100000010010);
+  INSN(clz,   1, 0b100000010010);
+  INSN(cnt,   0, 0b100000010110);
 
 #undef INSN
 
diff -r 93cc4d7535ce src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp	Wed Jun 10 12:29:07 2015 +0000
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp	Thu Jun 11 07:28:28 2015 +0000
@@ -36,6 +36,7 @@
 class MacroAssembler: public Assembler {
   friend class LIR_Assembler;
 
+ public:
   using Assembler::mov;
   using Assembler::movi;
--- CUT HERE ---

Regards,
Alexander

From edward.nevill at gmail.com  Thu Jun 11 16:20:24 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 11 Jun 2015 17:20:24 +0100
Subject: [aarch64-port-dev ] population count intrinsic performance
In-Reply-To: <DM2PR07MB14651AEA35D54F11C7AE211A99BC0@DM2PR07MB1465.namprd07.prod.outlook.com>
References: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
	<1433946268.11860.93.camel@mylittlepony.linaroharston>
	<SN1PR07MB14722173C28F6212E10DD50499BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
	<1433968488.1036.39.camel@mint>
	<DM2PR07MB14651AEA35D54F11C7AE211A99BC0@DM2PR07MB1465.namprd07.prod.outlook.com>
Message-ID: <1434039624.7052.50.camel@mint>

On Thu, 2015-06-11 at 08:10 +0000, Alexeev, Alexander wrote:
> +
> +instruct popCountI(iRegINoSp dst,  iRegIorL2I src, vRegD tmp) %{
> +  match(Set dst (PopCountI src));
> +  effect(TEMP tmp);
> +  ins_cost(INSN_COST * 13);
> +
> +  format %{ "TODO popCountI\n\t" %}
> +  ins_encode %{
> +    __ mov($tmp$$FloatRegister, __ T1D, 0, as_Register($src$$reg));
> +    __ cnt($tmp$$FloatRegister, __ T8B, $tmp$$FloatRegister);
> +    __ addv($tmp$$FloatRegister, __ T8B, $tmp$$FloatRegister);
> +    __ mov(as_Register($dst$$reg), $tmp$$FloatRegister, __ T1D, 0);
> +  %}

I think there may be a problem with the way 'src' is used here. You are
assuming that the top 32 bits of src are 0. However, this may not be the
case if, for example, src is the result of an elided L2I conversion.

See the following comment in aarch64.ad

// iRegIorL2I is used for src inputs in rules for 32 bit int (I)
// operations. it allows the src to be either an iRegI or a (ConvL2I
// iRegL). in the latter case the l2i normally planted for a ConvL2I
// can be elided because the 32-bit instruction will just employ the
// lower 32 bits anyway.

Now, what I am not clear on is whether, if you just use iRegI here
rather than iRegIorL2I, you are guaranteed that the top 32 bits are 0.
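
The concern can be illustrated with a small Java model. This is a sketch under the assumption that the 64-bit register holding an int may carry stale bits in its upper half; the class and method names are hypothetical.

```java
// Hypothetical illustration of the upper-32-bit concern; not HotSpot code.
final class UpperBitsDemo {
    // Models the proposed popCountI encoding: all 64 bits of the source register are counted.
    static int countWholeRegister(long reg) {
        return Long.bitCount(reg);
    }

    // Models a 32-bit-clean encoding: only the low word contributes to the result.
    static int countLowWord(long reg) {
        return Long.bitCount(reg & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        // A long whose low word is 1 but whose high word is all ones,
        // as could be left behind by an elided L2I conversion.
        long reg = 0xFFFFFFFF00000001L;
        System.out.println("Integer.bitCount((int) reg) = " + Integer.bitCount((int) reg));
        System.out.println("whole register              = " + countWholeRegister(reg));
        System.out.println("low word only               = " + countLowWord(reg));
    }
}
```

Only the low-word version agrees with Integer.bitCount((int) reg), so the encoding has to either clear the upper half first or be given an input that is known to be clean.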

All the best,
Ed.


From aph at redhat.com  Thu Jun 11 16:24:16 2015
From: aph at redhat.com (Andrew Haley)
Date: Thu, 11 Jun 2015 17:24:16 +0100
Subject: [aarch64-port-dev ] population count intrinsic performance
In-Reply-To: <1434039624.7052.50.camel@mint>
References: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>	<1433946268.11860.93.camel@mylittlepony.linaroharston>	<SN1PR07MB14722173C28F6212E10DD50499BD0@SN1PR07MB1472.namprd07.prod.outlook.com>	<1433968488.1036.39.camel@mint>	<DM2PR07MB14651AEA35D54F11C7AE211A99BC0@DM2PR07MB1465.namprd07.prod.outlook.com>
	<1434039624.7052.50.camel@mint>
Message-ID: <5579B630.4060406@redhat.com>

On 06/11/2015 05:20 PM, Edward Nevill wrote:
> Now, what I am not clear on, is whether if you just use iRegI here
> rather than iRegIorL2I you are guaranteed that the top 32 bits are 0.

If you can't use movw then src should be an iRegI.

Also, this:

__ mov($tmp$$FloatRegister, __ T1D, 0, as_Register($src$$reg));

could be

__ mov($tmp$$FloatRegister, __ T1D, 0, $src$$Register);

Andrew.


From Alexander.Alexeev at caviumnetworks.com  Thu Jun 11 16:54:31 2015
From: Alexander.Alexeev at caviumnetworks.com (Alexeev, Alexander)
Date: Thu, 11 Jun 2015 16:54:31 +0000
Subject: [aarch64-port-dev ] population count intrinsic performance
In-Reply-To: <5579B630.4060406@redhat.com>
References: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
	<1433946268.11860.93.camel@mylittlepony.linaroharston>
	<SN1PR07MB14722173C28F6212E10DD50499BD0@SN1PR07MB1472.namprd07.prod.outlook.com>
	<1433968488.1036.39.camel@mint>
	<DM2PR07MB14651AEA35D54F11C7AE211A99BC0@DM2PR07MB1465.namprd07.prod.outlook.com>
	<1434039624.7052.50.camel@mint> <5579B630.4060406@redhat.com>
Message-ID: <DM2PR07MB1465CCEC29A6EC0A9629AAFA99BC0@DM2PR07MB1465.namprd07.prod.outlook.com>

> > Now, what I am not clear on, is whether if you just use iRegI here
> > rather than iRegIorL2I you are guaranteed that the top 32 bits are 0.
> 
> If you can't use movw then src should be an iRegI.
Agreed.

>__ mov($tmp$$FloatRegister, __ T1D, 0, $src$$Register);
I've fixed that.

What would be the next step?

From aph at redhat.com  Thu Jun 11 17:00:09 2015
From: aph at redhat.com (Andrew Haley)
Date: Thu, 11 Jun 2015 18:00:09 +0100
Subject: [aarch64-port-dev ] population count intrinsic performance
In-Reply-To: <DM2PR07MB1465CCEC29A6EC0A9629AAFA99BC0@DM2PR07MB1465.namprd07.prod.outlook.com>
References: <SN1PR07MB1472DFAD684C23F54715FCCC99BD0@SN1PR07MB1472.namprd07.prod.outlook.com>	<1433946268.11860.93.camel@mylittlepony.linaroharston>	<SN1PR07MB14722173C28F6212E10DD50499BD0@SN1PR07MB1472.namprd07.prod.outlook.com>	<1433968488.1036.39.camel@mint>	<DM2PR07MB14651AEA35D54F11C7AE211A99BC0@DM2PR07MB1465.namprd07.prod.outlook.com>
	<1434039624.7052.50.camel@mint> <5579B630.4060406@redhat.com>
	<DM2PR07MB1465CCEC29A6EC0A9629AAFA99BC0@DM2PR07MB1465.namprd07.prod.outlook.com>
Message-ID: <5579BE99.1090600@redhat.com>

On 06/11/2015 05:54 PM, Alexeev, Alexander wrote:
>>> Now, what I am not clear on, is whether if you just use iRegI here
>>> rather than iRegIorL2I you are guaranteed that the top 32 bits are 0.
>>
>> If you can't use movw then src should be an iRegI.
> Agreed.
> 
>> __ mov($tmp$$FloatRegister, __ T1D, 0, $src$$Register);
> I've fixed that.
> 
> What would be the next step?

Post it as an RFR to hotspot-dev

Andrew.


From edward.nevill at linaro.org  Fri Jun 12 08:49:03 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Fri, 12 Jun 2015 09:49:03 +0100
Subject: [aarch64-port-dev ] RFR: 8081790: SHA tests fail
In-Reply-To: <1433321507.32688.13.camel@mylittlepony.linaroharston>
References: <1433321507.32688.13.camel@mylittlepony.linaroharston>
Message-ID: <CAEf2cjdiCQ=VCppXHGWwo3zgqVUU-bBQSQWmZfaO37wt0A5=Pw@mail.gmail.com>

Hi,

Sorry to bother you. The following was posted for review 9 days ago but there
has been no response.

This is an aarch64-only change to resolve 7 jtreg/hotspot failures.

Could a JDK9 reviewer please take a look at this?

Thanks,
Ed.

On 3 June 2015 at 09:51, Edward Nevill <edward.nevill at linaro.org> wrote:

> Hi,
>
> The following webrev
>
> http://cr.openjdk.java.net/~enevill/8081790/webrev.00/
>
> fixes a number of SHA test failures on aarch64.
>
> This patch was contributed by alexander.alexeev at caviumnetworks.com
>
> Currently the following JTReg/hotspot SHA tests fail on aarch64:
>
> FAILED: compiler/intrinsics/sha/cli/TestUseSHA1IntrinsicsOptionOnUnsupportedCPU.java
> FAILED: compiler/intrinsics/sha/cli/TestUseSHA256IntrinsicsOptionOnUnsupportedCPU.java
> FAILED: compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java
> FAILED: compiler/intrinsics/sha/sanity/TestSHA1Intrinsics.java
> FAILED: compiler/intrinsics/sha/sanity/TestSHA1MultiBlockIntrinsics.java
> FAILED: compiler/intrinsics/sha/sanity/TestSHA256MultiBlockIntrinsics.java
> FAILED: compiler/intrinsics/sha/sanity/TestSHA256Intrinsics.java
>
> The reason for the test failures is that the test suite is configured on
> the assumption that Sparc is the only arch which supports SHA in hw (and
> therefore supports the -XX:+UseSHA options).
>
> The webrev adds tests for aarch64.
>
> The following files have also been renamed as they were inappropriately
> named.
>
> R test/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForSupportedSparcCPU.java
> R test/compiler/intrinsics/sha/cli/testcases/UseSHAIntrinsicsSpecificTestCaseForUnsupportedSparcCPU.java
> R test/compiler/intrinsics/sha/cli/testcases/UseSHASpecificTestCaseForSupportedSparcCPU.java
> R test/compiler/intrinsics/sha/cli/testcases/UseSHASpecificTestCaseForUnsupportedSparcCPU.java
>
> These now become
>
> A test/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForSupportedCPU.java
> A test/compiler/intrinsics/sha/cli/testcases/UseSHAIntrinsicsSpecificTestCaseForUnsupportedCPU.java
> A test/compiler/intrinsics/sha/cli/testcases/UseSHASpecificTestCaseForSupportedCPU.java
> A test/compiler/intrinsics/sha/cli/testcases/UseSHASpecificTestCaseForUnsupportedCPU.java
>
> (i.e. the 'Sparc' has been dropped from the filename as Sparc is no longer
> the only arch which supports SHA).
>
> Tested with JTReg/hotspot
>
> Before: Test results: passed: 840; failed: 10; error: 5
> After:  Test results: passed: 847; failed: 3; error: 5
>
> Please review,
>
> Thanks,
> Ed.
>
>
>

From david.holmes at oracle.com  Fri Jun 12 08:59:51 2015
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 12 Jun 2015 18:59:51 +1000
Subject: [aarch64-port-dev ] RFR: 8081790: SHA tests fail
In-Reply-To: <CAEf2cjdiCQ=VCppXHGWwo3zgqVUU-bBQSQWmZfaO37wt0A5=Pw@mail.gmail.com>
References: <1433321507.32688.13.camel@mylittlepony.linaroharston>
	<CAEf2cjdiCQ=VCppXHGWwo3zgqVUU-bBQSQWmZfaO37wt0A5=Pw@mail.gmail.com>
Message-ID: <557A9F87.3010902@oracle.com>

Hi Ed,

On 12/06/2015 6:49 PM, Edward Nevill wrote:
> Hi,
>
> Sorry to bother. The following was posted for review 9 days ago but there
> has been no response.
>
> This is an aarch64 only change to resolve 7 jtreg/hotspot failures.
>
> Could a JDK9 reviewer please take a look at this,

The test changes are shared code so this needs someone from the compiler 
team to review and sponsor.

Thanks,
David


From edward.nevill at gmail.com  Fri Jun 12 09:42:45 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Fri, 12 Jun 2015 10:42:45 +0100
Subject: [aarch64-port-dev ] RFR: 8081790: SHA tests fail
In-Reply-To: <557A9F87.3010902@oracle.com>
References: <1433321507.32688.13.camel@mylittlepony.linaroharston>
	<CAEf2cjdiCQ=VCppXHGWwo3zgqVUU-bBQSQWmZfaO37wt0A5=Pw@mail.gmail.com>
	<557A9F87.3010902@oracle.com>
Message-ID: <1434102165.7052.61.camel@mint>

On Fri, 2015-06-12 at 18:59 +1000, David Holmes wrote:
> Hi Ed,
> 
> On 12/06/2015 6:49 PM, Edward Nevill wrote:
> > Hi,
> >
> > Sorry to bother. The following was posted for review 9 days ago but there
> > has been no response.
> >
> > This is an aarch64 only change to resolve 7 jtreg/hotspot failures.
> >
> > Could a JDK9 reviewer please take a look at this,
> 
> The test changes are shared code so this needs someone from the compiler 
> team to review and sponsor.

Yes, thanks for pointing this out.

Could someone from the compiler team please review and sponsor?

Ed.



From vladimir.kozlov at oracle.com  Fri Jun 12 17:12:47 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 12 Jun 2015 10:12:47 -0700
Subject: [aarch64-port-dev ] RFR: 8081790: SHA tests fail
In-Reply-To: <557A9F87.3010902@oracle.com>
References: <1433321507.32688.13.camel@mylittlepony.linaroharston>	<CAEf2cjdiCQ=VCppXHGWwo3zgqVUU-bBQSQWmZfaO37wt0A5=Pw@mail.gmail.com>
	<557A9F87.3010902@oracle.com>
Message-ID: <557B130F.1030209@oracle.com>

Changes look fine to me. I will sponsor it.

Thanks,
Vladimir


From goetz.lindenmaier at sap.com  Mon Jun 15 15:30:16 2015
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 15 Jun 2015 15:30:16 +0000
Subject: [aarch64-port-dev ] Ping: RFR(M): 8086069: Adapt runtime calls to
 recent intrinsics to pass	ints as long
Message-ID: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>

Hi,

Could someone please have a look at this change?

I looked at whether I can push the functionality down to make_runtime_call().
This would simplify matters a lot. But as the TypeFunc is hashed, I cannot
change it any more in make_runtime_call().

@aarch-people: I saw you have CCallingConventionRequiresIntsAsLongs set.
Could you please check whether this breaks your intrinsics, e.g., multiplyToLen?
(We ensure in sharedRuntime_ppc.cpp, c_calling_convention() that no INT types
end up there.)

Thanks,
  Goetz

-----Original Message-----
From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz
Sent: Tuesday, 9 June 2015 17:18
To: HotSpot Developers
Subject: RFR(M): 8086069: Adapt runtime calls to recent intrinsics to pass ints as long

Hi,

We are working on porting the recently* added intrinsics to PPC.  As these use
runtime calls, the calls must obey the platform ABI, which requires that ints
are passed as longs.

We made a similar change in "8024342: PPC64 (part 111): Support for C calling
conventions that require 64-bit ints."  It adapts the calls if
CCallingConventionRequiresIntsAsLongs is set.

This change adapts the calls to multiplyToLen, CRC32, AES, SHA accordingly.
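
To illustrate why the widening matters, here is a small Java model of what a callee reading a full 64-bit argument register would see. It is a sketch only: the names are hypothetical, and it assumes the ABI expects int arguments to arrive sign-extended to 64 bits.

```java
// Hypothetical model of passing an int in a 64-bit argument register; not HotSpot code.
final class IntAsLongModel {
    // What the callee reads if the caller only writes the low word and
    // leaves whatever was previously in the upper half of the register.
    static long staleUpper(int arg, long previousRegisterContents) {
        return (previousRegisterContents & 0xFFFFFFFF00000000L) | (arg & 0xFFFFFFFFL);
    }

    // What the callee reads if the caller widens first: Java's int-to-long
    // conversion sign-extends, matching the assumed ABI requirement.
    static long widened(int arg) {
        return (long) arg;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(staleUpper(5, 0xDEADBEEF00000000L)));
        System.out.println(Long.toHexString(widened(5)));
        System.out.println(Long.toHexString(widened(-1)));
    }
}
```

A length argument of 5 that arrives with stale upper bits would look like a huge value to a callee that uses the whole register, which is the failure mode the conversion to long avoids.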

Please review this change.  I also need a sponsor, please.
http://cr.openjdk.java.net/~goetz/webrevs/8086069-call_conv/webrev.01/

Best regards,
  Goetz


* i.e., added after making our initial port

From vladimir.kozlov at oracle.com  Mon Jun 15 16:16:35 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 15 Jun 2015 09:16:35 -0700
Subject: [aarch64-port-dev ] RFR: aarch64: Bit count intrinsic
	implementation for aarch64
In-Reply-To: <SN1PR07MB147229C1954C9F1B41195AC499B80@SN1PR07MB1472.namprd07.prod.outlook.com>
References: <SN1PR07MB147291B1183A83557B6BA03999B80@SN1PR07MB1472.namprd07.prod.outlook.com>	<557ECAA9.1030706@redhat.com>
	<SN1PR07MB147229C1954C9F1B41195AC499B80@SN1PR07MB1472.namprd07.prod.outlook.com>
Message-ID: <557EFA63.50203@oracle.com>

Andrew, please help Alexeev to file a JBS bug and publish a webrev on cr.openjdk. We can't review or accept patches which
are not on the cr server.

There are several hotspot tests which check the bitCount intrinsics (for example,
compiler/codegen/6378821/Test6378821.java), but not over the full range of values.

Note, jdk tests should be run with -Xcomp to make sure methods are compiled and intrinsics are used.
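
A minimal C++ sketch of what a wider-range bit count check could look like. It is not the jtreg test mentioned above, and every function name here is made up for illustration. It compares a simple reference count against the well-known branch-free (SWAR) formulation over a strided sweep of the 32-bit range plus a few boundary values:

```cpp
#include <cstdint>

// Reference bit count: clear the lowest set bit until none remain.
inline int naive_bit_count(uint32_t x) {
  int n = 0;
  while (x != 0) { x &= x - 1; ++n; }
  return n;
}

// Branch-free SWAR bit count (the classic Hacker's Delight formulation).
inline int swar_bit_count(uint32_t x) {
  x = x - ((x >> 1) & 0x55555555u);
  x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
  x = (x + (x >> 4)) & 0x0f0f0f0fu;
  x = x + (x >> 8);
  x = x + (x >> 16);
  return static_cast<int>(x & 0x3fu);
}

// Compare both implementations on boundary values and on every
// stride-th value of the 32-bit range; returns the number of mismatches.
// The stride must be non-zero.
inline int count_mismatches(uint32_t stride) {
  int bad = 0;
  const uint32_t edges[] = {0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu};
  for (uint32_t e : edges)
    if (naive_bit_count(e) != swar_bit_count(e)) ++bad;
  for (uint64_t v = 0; v <= 0xffffffffull; v += stride) {
    const uint32_t w = static_cast<uint32_t>(v);
    if (naive_bit_count(w) != swar_bit_count(w)) ++bad;
  }
  return bad;
}
```

A small prime stride keeps the sweep cheap while still touching values with set bits spread across the whole word.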

Thanks,
Vladimir

On 6/15/15 6:51 AM, Alexeev, Alexander wrote:
>
>> Do any of those tests actually test popcount?
>
> Relevant JDK tests also all pass
> jdk/test/java/lang/Long/*
> jdk/test/java/lang/Integer/*
>
> Regards,
> Alexander
>

From aph at redhat.com  Mon Jun 15 16:21:10 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 15 Jun 2015 17:21:10 +0100
Subject: [aarch64-port-dev ] AArch64: vectorization fails RSA crypto tests
Message-ID: <557EFB76.5050308@redhat.com>

java.math.BigInteger::add([I[I)[I gets miscompiled.  There is a

	ldr	q16, [x17,x10,lsl #4]

which should be a

	ldr	q16, [x17,x10]

Andrew.


diff -r 6217fd2c767b src/cpu/aarch64/vm/assembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/assembler_aarch64.hpp  Fri Jun 12 16:09:45 2015 +0100
+++ b/src/cpu/aarch64/vm/assembler_aarch64.hpp  Mon Jun 15 17:16:58 2015 +0100
@@ -491,6 +491,11 @@
         i->rf(_index, 16);
         i->f(_ext.option(), 15, 13);
         unsigned size = i->get(31, 30);
+        if (i->get(26, 26) && i->get(23, 23)) {
+          // SIMD Q Type - Size = 128 bits
+          assert(size == 0, "bad size");
+          size = 0b100;
+        }
         if (size == 0) // It's a byte
           i->f(_ext.shift() >= 0, 12);
         else {
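
A minimal C++ sketch of the size decoding the patch above relies on, as I read it: for a register-offset load/store, bits 31:30 give the access size, except that a SIMD/FP access (bit 26 set) with bit 23 set is a 128-bit Q access even though the size field is 0. The helper names are illustrative only and are not part of assembler_aarch64.hpp:

```cpp
#include <cstdint>

// Extract bits [msb:lsb] of a 32-bit instruction word (field width < 32).
inline uint32_t bits(uint32_t insn, int msb, int lsb) {
  return (insn >> lsb) & ((1u << (msb - lsb + 1)) - 1u);
}

// log2 of the access size in bytes for a register-offset load/store.
inline unsigned access_size_log2(uint32_t insn) {
  const unsigned size = bits(insn, 31, 30);
  const bool is_simd = bits(insn, 26, 26) != 0;  // SIMD/FP register file
  const bool opc_hi  = bits(insn, 23, 23) != 0;  // upper opc bit
  // Q register: 16 bytes. The patch additionally asserts size == 0 here.
  if (is_simd && opc_hi) return 4;
  return size;
}

// A scaled index is only encodable when the shift equals log2(access size);
// a shift of 0 means an unscaled index.
inline bool shift_is_legal(uint32_t insn, unsigned shift) {
  return shift == 0 || shift == access_size_log2(insn);
}
```

Without the special case a Q access looks like a byte access (size field 0), which is how the scaled form can be emitted where the unscaled one was intended.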

From aph at redhat.com  Mon Jun 15 16:22:50 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 15 Jun 2015 17:22:50 +0100
Subject: [aarch64-port-dev ] RFR: aarch64: Bit count intrinsic
	implementation for aarch64
In-Reply-To: <557EFA63.50203@oracle.com>
References: <SN1PR07MB147291B1183A83557B6BA03999B80@SN1PR07MB1472.namprd07.prod.outlook.com>	<557ECAA9.1030706@redhat.com>
	<SN1PR07MB147229C1954C9F1B41195AC499B80@SN1PR07MB1472.namprd07.prod.outlook.com>
	<557EFA63.50203@oracle.com>
Message-ID: <557EFBDA.5050404@redhat.com>

On 06/15/2015 05:16 PM, Vladimir Kozlov wrote:
> Andrew, please, help Alexeev to file JBS bug and publish webrev on cr.openjdk. We can't review or accept patches which 
> are not on cr. server.

Sure, I'll do that.  Given that he now has done the paperwork, is
there anything to prevent him from having an account on cr.openjdk?

Andrew.


From edward.nevill at linaro.org  Mon Jun 15 20:24:59 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Mon, 15 Jun 2015 21:24:59 +0100
Subject: [aarch64-port-dev ] AArch64: vectorization fails RSA crypto
	tests
In-Reply-To: <557EFB76.5050308@redhat.com>
References: <557EFB76.5050308@redhat.com>
Message-ID: <CAEf2cjcvy3PVuGFurKfoZnWsCAWX8rZ9kQqpVScwaNOfH_EBFA@mail.gmail.com>

On 15 June 2015 at 17:21, Andrew Haley <aph at redhat.com> wrote:

>
>
> diff -r 6217fd2c767b src/cpu/aarch64/vm/assembler_aarch64.hpp
> --- a/src/cpu/aarch64/vm/assembler_aarch64.hpp  Fri Jun 12 16:09:45 2015
> +0100
> +++ b/src/cpu/aarch64/vm/assembler_aarch64.hpp  Mon Jun 15 17:16:58 2015
> +0100
> @@ -491,6 +491,11 @@
>          i->rf(_index, 16);
>          i->f(_ext.option(), 15, 13);
>          unsigned size = i->get(31, 30);
> +        if (i->get(26, 26) && i->get(23, 23)) {
> +          // SIMD Q Type - Size = 128 bits
> +          assert(size == 0, "bad size");
> +          size = 0b100;
> +        }
>          if (size == 0) // It's a byte
>            i->f(_ext.shift() >= 0, 12);
>          else {
>

Oops, sorry about that.

The following fixes the assertion failure in
java/math/BigInteger/BigIntegerTest.java

diff -r 6217fd2c767b src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp     Fri Jun 12 16:09:45
2015 +0100
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp     Mon Jun 15 20:20:12
2015 +0000
@@ -477,8 +477,11 @@
   //   T1D/T2D: invalid
   void mov(FloatRegister Vd, SIMD_Arrangement T, u_int32_t imm32) {
     assert(T != T1D && T != T2D, "invalid arrangement");
+    if (T == T8B || T == T16B) {
+      movi(Vd, T, imm32 & 0xff, 0);
+      return;
+    }
     u_int32_t nimm32 = ~imm32;
-    if (T == T8B || T == T16B) { imm32 &= 0xff; nimm32 &= 0xff; }
     if (T == T4H || T == T8H) { imm32 &= 0xffff; nimm32 &= 0xffff; }
     u_int32_t x = imm32;
     int movi_cnt = 0;

Would you like me to merge these two into a single patch, file a JBS
"regression" bug report and push a cr?

All the best,
Ed.

From dean.long at oracle.com  Tue Jun 16 00:07:34 2015
From: dean.long at oracle.com (Dean Long)
Date: Mon, 15 Jun 2015 17:07:34 -0700
Subject: [aarch64-port-dev ] Ping: RFR(M): 8086069: Adapt runtime calls
 to recent intrinsics to pass	ints as long
In-Reply-To: <57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>
	<57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>
Message-ID: <557F68C6.4050805@oracle.com>

This may be a dumb (but hopefully related) question, but why do we need 
to add top() for _LP64:

#ifdef _LP64
#define XTOP ,top() /*additional argument*/
#else  //_LP64
#define XTOP        /*no additional argument*/
#endif //_LP64

  make_runtime_call(RC_LEAF|RC_NO_FP,
                    OptoRuntime::fast_arraycopy_Type(),
                    StubRoutines::unsafe_arraycopy(),
                    "unsafe_arraycopy",
                    TypeRawPtr::BOTTOM,
                    src, dst, size XTOP);

And why only for "size", but not "src" and "dst"?

dl

On 6/15/2015 1:47 PM, John Rose wrote:
> This change surprises me.  Sometimes our machine-independent IR needs #ifdefs, Matcher queries, or flag tests to deal with platform stuff we haven't factorized properly.  In this case a flag test is "apologizing" for oddly-sized argument registers at the IR level.
>
> But TypeFuncs and the rest of the IR do not talk about such details of calling conventions.  A C2 type is only about the information content  of arguments and return values, not their register bindings.  The lower level function SharedRuntime::c_calling_convention determines exact bindings of values to argument and return value registers, using the type VMRegPair.  It is likely that there is some awkwardness in assigning a *pair* of regs (representing a single 64-bit register) to carry a 32-bit value, but surely this is less complex, and more to the point, than hacking conversions from 32- to 64-bit values at the IR level.
>
> I would expect that, if your approach is to work, there should be an assert in SharedRuntime::c_calling_convention saying that the 32-bit types T_INT, etc., are *never* presented to the SR::ccc/VMRegPair layer of the code.  But, as it seems to me, it would be less disruptive to the overall design if SR::ccc can be presented with T_INT types, and be free to return an indication of which 64-bit register will carry that value.  The low-level move instructions which push data into those argument registers can be specialized to those target registers (in the AD file) if there is a need for filling up the 32 extra bits (sign or zero).
>
> HTH
> -- John
>


From dean.long at oracle.com  Tue Jun 16 03:57:05 2015
From: dean.long at oracle.com (Dean Long)
Date: Mon, 15 Jun 2015 20:57:05 -0700
Subject: [aarch64-port-dev ] Ping: RFR(M): 8086069: Adapt runtime calls
 to recent intrinsics to pass	ints as long
In-Reply-To: <D6F78D71-A1E1-41DA-97C1-25548B9C5D0C@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>
	<57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>
	<557F68C6.4050805@oracle.com>
	<D6F78D71-A1E1-41DA-97C1-25548B9C5D0C@oracle.com>
Message-ID: <557F9E91.7020603@oracle.com>

On 6/15/2015 5:26 PM, John Rose wrote:
> On Jun 15, 2015, at 5:07 PM, Dean Long <dean.long at oracle.com> wrote:
>>
>> This may be a dumb (but hopefully related) question, but why do we 
>> need to add top() for _LP64:
>>
>> #ifdef _LP64
>> #define XTOP ,top() /*additional argument*/
>> #else  //_LP64
>> #define XTOP        /*no additional argument*/
>> #endif //_LP64
>>
>>   make_runtime_call(RC_LEAF|RC_NO_FP,
>>                     OptoRuntime::fast_arraycopy_Type(),
>>                     StubRoutines::unsafe_arraycopy(),
>>                     "unsafe_arraycopy",
>>                     TypeRawPtr::BOTTOM,
>>                     src, dst, size XTOP);
>>
>> And why only for "size", but not "src" and "dst"?
>
> That is one of the awkward places we jam in LP64-specific code.
> Java has no size_t type; the closest it gets is "long".
> But the compiler uses Java types to set up runtime stub call signatures.
> So if we want the compiler to pass a size_t value to a stub, it has to
> pass a long on LP64 and int on ILP32.
> (There's no need for an int-in-a-long here, since size_t is never 32 
> bits on an int-in-a-long machine.)
> To complete the embarrassment, the compiler has an internal convention 
> of always representing the twin word for longs and doubles (see JVMS).
> Net result is that if you want to ask for a size_t in the JIT on LP64, 
> you have to #ifdef in a jlong, and pass a "top" twin word.
> -- John

Thanks for the explanation.  It sounds like we are modeling the abstract 
Java stack representation of long and double, and this wouldn't be
easy to change, because I see things like "TypeFunc::Parms + 1" and 
"argument(2)" that would need to change before this could go away.

dl

From goetz.lindenmaier at sap.com  Tue Jun 16 07:23:01 2015
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 16 Jun 2015 07:23:01 +0000
Subject: [aarch64-port-dev ] Ping: RFR(M): 8086069: Adapt runtime calls
 to recent intrinsics to pass	ints as long
In-Reply-To: <57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>
	<57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC2CFF9F65@DEWDFEMB12A.global.corp.sap>

Hi John,

thanks for looking at this change!

The PPC ABI says that int arguments must properly be extended to 64-bit:
http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.9.html#PARAM-PASS
"Simple integer types (char, short, int, long, enum) are mapped to a single doubleword. Values shorter than a doubleword are sign or zero extended as necessary."

We achieve this by adding an I2L node for arguments narrower than 64 bits.
To ensure proper typing of the IR, we have to adapt the function type and parameter list
accordingly.
Obviously, we have to deal with the fact that longs occupy 2 slots. That's not nice,
but currently necessary.

The assertion you mention is in sharedRuntime_ppc.cpp:738. 

The approach works, it's just not implemented for the new intrinsics.
Also, I was looking for a generic solution, where I don't have to adapt
each runtime call added to the parser.

Sure, we could issue sign-extending instructions along with the call node
when emitting to the code buffer.  But if we add the conversion early, during parsing, the
nodes are subject to optimization.
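
A minimal C++ sketch of what the sign extension amounts to for a signed 32-bit argument. It is an illustration only, not code from the webrev, and both helper names are hypothetical:

```cpp
#include <cstdint>

// Widen a 32-bit signed int to 64 bits, as an int-to-long conversion does.
inline int64_t widen_i2l(int32_t v) {
  return static_cast<int64_t>(v);
}

// The full 64-bit register image the callee would see after widening:
// the upper 32 bits are copies of the sign bit.
inline uint64_t register_image(int32_t v) {
  return static_cast<uint64_t>(widen_i2l(v));
}
```

If the upper half were left undefined instead, a callee compiled against an ABI that expects extended values could read garbage when it uses the full doubleword.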

Best regards,
  Goetz.


-----Original Message-----
From: John Rose [mailto:john.r.rose at oracle.com] 
Sent: Montag, 15. Juni 2015 22:47
To: Lindenmaier, Goetz
Cc: HotSpot Developers; aarch64-port-dev at openjdk.java.net
Subject: Re: Ping: RFR(M): 8086069: Adapt runtime calls to recent intrinsics to pass ints as long

This change surprises me.  Sometimes our machine-independent IR needs #ifdefs, Matcher queries, or flag tests to deal with platform stuff we haven't factorized properly.  In this case a flag test is "apologizing" for oddly-sized argument registers at the IR level.

But TypeFuncs and the rest of the IR do not talk about such details of calling conventions.  A C2 type is only about the information content  of arguments and return values, not their register bindings.  The lower level function SharedRuntime::c_calling_convention determines exact bindings of values to argument and return value registers, using the type VMRegPair.  It is likely that there is some awkwardness in assigning a *pair* of regs (representing a single 64-bit register) to carry a 32-bit value, but surely this is less complex, and more to the point, than hacking conversions from 32- to 64-bit values at the IR level.

I would expect that, if your approach is to work, there should be an assert in SharedRuntime::c_calling_convention saying that the 32-bit types T_INT, etc., are *never* presented to the SR::ccc/VMRegPair layer of the code.  But, as it seems to me, it would be less disruptive to the overall design if SR::ccc can be presented with T_INT types, and be free to return an indication of which 64-bit register will carry that value.  The low-level move instructions which push data into those argument registers can be specialized to those target registers (in the AD file) if there is a need for filling up the 32 extra bits (sign or zero).

HTH
-- John



From john.r.rose at oracle.com  Mon Jun 15 20:47:17 2015
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 15 Jun 2015 13:47:17 -0700
Subject: [aarch64-port-dev ] Ping: RFR(M): 8086069: Adapt runtime calls
	to recent intrinsics to pass	ints as long
In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>
References: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>
Message-ID: <57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>

This change surprises me.  Sometimes our machine-independent IR needs #ifdefs, Matcher queries, or flag tests to deal with platform stuff we haven't factorized properly.  In this case a flag test is "apologizing" for oddly-sized argument registers at the IR level.

But TypeFuncs and the rest of the IR do not talk about such details of calling conventions.  A C2 type is only about the information content  of arguments and return values, not their register bindings.  The lower level function SharedRuntime::c_calling_convention determines exact bindings of values to argument and return value registers, using the type VMRegPair.  It is likely that there is some awkwardness in assigning a *pair* of regs (representing a single 64-bit register) to carry a 32-bit value, but surely this is less complex, and more to the point, than hacking conversions from 32- to 64-bit values at the IR level.

I would expect that, if your approach is to work, there should be an assert in SharedRuntime::c_calling_convention saying that the 32-bit types T_INT, etc., are *never* presented to the SR::ccc/VMRegPair layer of the code.  But, as it seems to me, it would be less disruptive to the overall design if SR::ccc can be presented with T_INT types, and be free to return an indication of which 64-bit register will carry that value.  The low-level move instructions which push data into those argument registers can be specialized to those target registers (in the AD file) if there is a need for filling up the 32 extra bits (sign or zero).

HTH
-- John

On Jun 15, 2015, at 8:30 AM, Lindenmaier, Goetz <goetz.lindenmaier at sap.com> wrote:
> 
> Hi,
> 
> Could someone please have a look at this change?
> 
> I had a look whether I can push the functionality down to make_runtime_call().
> This would simplify matters a lot. But as the TypeFunc is hashed, I can not
> change it any more in make_runtime_call().
> 
> @aarch-people: I saw you have CCallingConventionRequiresIntsAsLongs set.
> Could you please check whether this breaks your intrinsics, e.g., multiplyToLen?
> (We assure in sharedRuntime_ppc.cpp, c_calling_convention() that no INT types
> end up there.)
> 
> Thanks,
>  Goetz
> 
> -----Original Message-----
> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz
> Sent: Dienstag, 9. Juni 2015 17:18
> To: HotSpot Developers
> Subject: RFR(M): 8086069: Adapt runtime calls to recent intrinsics to pass ints as long
> 
> Hi,
> 
> we are working on porting the recently* added intrinsics to PPC.  As these use
> runtime calls, the calls must obey to the platform ABI, which requires that ints
> are passed as longs.
> 
> We made a similar change in "8024342: PPC64 (part 111): Support for C calling
> conventions that require 64-bit ints."  It adapts the calls if
> CCallingConventionRequiresIntsAsLongs is set.
> 
> This change adapts the calls to multiplyToLen, CRC32, AES, SHA accordingly.
> 
> Please review this change.  I please need a sponsor.
> http://cr.openjdk.java.net/~goetz/webrevs/8086069-call_conv/webrev.01/
> 
> Best regards,
>  Goetz
> 
> 
> * i.e., added after making our initial port


From john.r.rose at oracle.com  Tue Jun 16 00:26:36 2015
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 15 Jun 2015 17:26:36 -0700
Subject: [aarch64-port-dev ] Ping: RFR(M): 8086069: Adapt runtime calls
	to recent intrinsics to pass	ints as long
In-Reply-To: <557F68C6.4050805@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>
	<57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>
	<557F68C6.4050805@oracle.com>
Message-ID: <D6F78D71-A1E1-41DA-97C1-25548B9C5D0C@oracle.com>

On Jun 15, 2015, at 5:07 PM, Dean Long <dean.long at oracle.com> wrote:
> 
> This may be a dumb (but hopefully related) question, but why do we need to add top() for _LP64:
> 
> #ifdef _LP64
> #define XTOP ,top() /*additional argument*/
> #else  //_LP64
> #define XTOP        /*no additional argument*/
> #endif //_LP64
> 
>   make_runtime_call(RC_LEAF|RC_NO_FP,
>                     OptoRuntime::fast_arraycopy_Type(),
>                     StubRoutines::unsafe_arraycopy(),
>                     "unsafe_arraycopy",
>                     TypeRawPtr::BOTTOM,
>                     src, dst, size XTOP);
> 
> And why only for "size", but not "src" and "dst"?


That is one of the awkward places we jam in LP64-specific code.
Java has no size_t type; the closest it gets is "long".
But the compiler uses Java types to set up runtime stub call signatures.
So if we want the compiler to pass a size_t value to a stub, it has to pass a long on LP64 and int on ILP32.
(There's no need for an int-in-a-long here, since size_t is never 32 bits on an int-in-a-long machine.)
To complete the embarrassment, the compiler has an internal convention of always representing the twin word for longs and doubles (see JVMS).
Net result is that if you want to ask for a size_t in the JIT on LP64, you have to #ifdef in a jlong, and pass a "top" twin word.
-- John
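
A minimal C++ sketch of the slot counting described above, assuming the two-slot convention for long and double. The enum only borrows HotSpot's type names and the helpers are hypothetical, not HotSpot code:

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-ins for a few basic types; only the names are borrowed.
enum class BasicType { T_INT, T_LONG, T_FLOAT, T_DOUBLE, T_ADDRESS };

// Under the two-slot convention, long and double take a value slot
// plus a "twin" (top) slot; everything else takes one slot.
inline std::size_t slots_for(BasicType t) {
  return (t == BasicType::T_LONG || t == BasicType::T_DOUBLE) ? 2 : 1;
}

// Total number of argument slots a call with this signature must supply.
inline std::size_t total_slots(const std::vector<BasicType>& sig) {
  std::size_t n = 0;
  for (BasicType t : sig) n += slots_for(t);
  return n;
}
```

With a (src, dst, size) signature, modelling size as T_LONG yields four slots, which matches the extra top() argument on LP64; modelling it as T_INT yields three, which matches the empty XTOP on ILP32.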

From john.r.rose at oracle.com  Tue Jun 16 04:13:02 2015
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 15 Jun 2015 21:13:02 -0700
Subject: [aarch64-port-dev ] Ping: RFR(M): 8086069: Adapt runtime calls
	to recent intrinsics to pass	ints as long
In-Reply-To: <557F9E91.7020603@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>
	<57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>
	<557F68C6.4050805@oracle.com>
	<D6F78D71-A1E1-41DA-97C1-25548B9C5D0C@oracle.com>
	<557F9E91.7020603@oracle.com>
Message-ID: <BB5CF2A1-BE93-49E3-90F3-F1EA07A07F96@oracle.com>

On Jun 15, 2015, at 8:57 PM, Dean Long <dean.long at oracle.com> wrote:
> 
> Thanks for the explanation.  It sounds like we are modeling the abstract Java stack representation of long and double, and this wouldn't be
> easy to change, because I see things like "TypeFunc::Parms + 1" and "argument(2)" that would need to change before this could go away.

Indeed.  Slot pairs are a mess, an optimization (or concession) for platforms that no longer matter.  (Primitives might look like that in a few years.)  Some messes in HotSpot stem (IMO) from excessive attention to the bytecode syntax, designing a managed execution engine around the oddities of a code format.

In an ideal world, I would like to isolate, deprecate, and eventually remove the "evil twin" slots, since they no longer have any meaning (except maybe on some 32-bit systems).  Doing it at all levels will be hard, except in the context of some other breaking change.  But it could be done locally in the JVM, removing the notion of twin slots from modules that don't have an absolute need to work with them.  JITs shouldn't have to know about them, IMO; maybe not even the interpreter (though that would involve a renumbering prepass).

When we get value types, we may be able to make such a change, even to the bytecode syntax itself.  Or perhaps we will perpetuate the "evil twin" convention, but apply it to all value types (plus long and double).  Or perhaps (happy thought) we can make every value/ref/prim occupy one stack slot, in some bytecode of the future.

-- John

From aph at redhat.com  Tue Jun 16 07:58:37 2015
From: aph at redhat.com (Andrew Haley)
Date: Tue, 16 Jun 2015 08:58:37 +0100
Subject: [aarch64-port-dev ] AArch64: vectorization fails RSA crypto
	tests
In-Reply-To: <CAEf2cjcvy3PVuGFurKfoZnWsCAWX8rZ9kQqpVScwaNOfH_EBFA@mail.gmail.com>
References: <557EFB76.5050308@redhat.com>
	<CAEf2cjcvy3PVuGFurKfoZnWsCAWX8rZ9kQqpVScwaNOfH_EBFA@mail.gmail.com>
Message-ID: <557FD72D.9060904@redhat.com>

On 15/06/15 21:24, Edward Nevill wrote:
> Would you like me to merge these two into a single patch, file a JBS
> "regression" bug report and push a cr?

Yes please,

Andrew.


From aph at redhat.com  Tue Jun 16 15:44:06 2015
From: aph at redhat.com (Andrew Haley)
Date: Tue, 16 Jun 2015 16:44:06 +0100
Subject: [aarch64-port-dev ] AArch64: vectorization fails RSA crypto
	tests
In-Reply-To: <CAEf2cjcvy3PVuGFurKfoZnWsCAWX8rZ9kQqpVScwaNOfH_EBFA@mail.gmail.com>
References: <557EFB76.5050308@redhat.com>
	<CAEf2cjcvy3PVuGFurKfoZnWsCAWX8rZ9kQqpVScwaNOfH_EBFA@mail.gmail.com>
Message-ID: <55804446.7010100@redhat.com>

On 06/15/2015 09:24 PM, Edward Nevill wrote:
> Would you like me to merge these two into a single patch, file a JBS
> "regression" bug report and push a cr?

While you're at it: mov(FloatRegister, SIMD_Arrangement, u_int32_t) is
a bit too large to be in a header file.

Thx,
Andrew.


From tangwei6 at huawei.com  Thu Jun 18 07:08:29 2015
From: tangwei6 at huawei.com (Tangwei (Euler))
Date: Thu, 18 Jun 2015 07:08:29 +0000
Subject: [aarch64-port-dev ] failed to build JDK9
Message-ID: <C8D1E566CC4CA845813731FB6FD5C530010FB73D@SZXEMI503-MBX.china.huawei.com>

Hi All,
  I cloned the latest openJDK9 for aarch64 on Ubuntu, and configure failed with the following error message.
Does anyone know how to solve this issue? The message suggests installing libfreetype6-dev,
but the library is already installed.

configure: Could not compile and link with freetype. This might be a 32/64-bit mismatch.
configure: Using FREETYPE_CFLAGS=-I/usr/include/freetype2   and FREETYPE_LIBS=-lfreetype
configure: error: Can not continue without freetype. You might be able to fix this by running 'sudo apt-get install libfreetype6-dev'.
configure exiting with result code 1


Regards!
wei

From Alexander.Alexeev at caviumnetworks.com  Thu Jun 18 07:40:47 2015
From: Alexander.Alexeev at caviumnetworks.com (Alexeev, Alexander)
Date: Thu, 18 Jun 2015 07:40:47 +0000
Subject: [aarch64-port-dev ] failed to build JDK9
In-Reply-To: <C8D1E566CC4CA845813731FB6FD5C530010FB73D@SZXEMI503-MBX.china.huawei.com>
References: <C8D1E566CC4CA845813731FB6FD5C530010FB73D@SZXEMI503-MBX.china.huawei.com>
Message-ID: <SN1PR07MB1472935634858166D064B86B99A50@SN1PR07MB1472.namprd07.prod.outlook.com>

Hello Wei 

Did you try the --debug-configure flag? It might provide some information on the source of the problem.
Check that the library and include files are available on the specified paths.

Regards,
Alexander

> -----Original Message-----
> From: aarch64-port-dev [mailto:aarch64-port-dev-
> bounces at openjdk.java.net] On Behalf Of Tangwei (Euler)
> Sent: Thursday, June 18, 2015 10:08 AM
> To: aarch64-port-dev at openjdk.java.net
> Subject: [aarch64-port-dev ] failed to build JDK9
> 
> Hi All,
>   I cloned the latest openJDK9 for aarch64 on Ubuntu and failed to configure
> with following error message.
> Anyone knows how to solve this issue? From the message, it suggested to
> install libfreetype6-dev, but the library has already been installed.
> 
> configure: Could not compile and link with freetype. This might be a 32/64-bit
> mismatch.
> configure: Using FREETYPE_CFLAGS=-I/usr/include/freetype2   and
> FREETYPE_LIBS=-lfreetype
> configure: error: Can not continue without freetype. You might be able to fix
> this by running 'sudo apt-get install libfreetype6-dev'.
> configure exiting with result code 1
> 
> 
> Regards!
> wei

From edward.nevill at linaro.org  Thu Jun 18 09:35:37 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Thu, 18 Jun 2015 10:35:37 +0100
Subject: [aarch64-port-dev ] failed to build JDK9
In-Reply-To: <C8D1E566CC4CA845813731FB6FD5C530010FB73D@SZXEMI503-MBX.china.huawei.com>
References: <C8D1E566CC4CA845813731FB6FD5C530010FB73D@SZXEMI503-MBX.china.huawei.com>
Message-ID: <CAEf2cjeatJySrottypakLz+4P2RbTcGiC8bGR6srN4rph9aAxw@mail.gmail.com>

Hi,

I can successfully build jdk9 on ubuntu 14.04.

You may like to try initially building openjdk-7 as follows to ensure all
the dependencies are correct for openjdk-7.

apt-get source openjdk-7-jdk
cd openjdk-7-7u51-2.4.6 (name may vary slightly)
dpkg-buildpackage 2>&1 | tee ../log

If any dependencies are missing this will tell you exactly what packages to
install.

Once the dependencies are correct for openjdk-7 then you can retry jdk9.

Note that in order to build jdk9 you will need to have jdk8 installed.

You can download a pre-built binary from http://openjdk.linaro.org (follow
the releases tab).

All the best,
Edward Nevill


On 18 June 2015 at 08:08, Tangwei (Euler) <tangwei6 at huawei.com> wrote:

> Hi All,
>   I cloned the latest openJDK9 for aarch64 on Ubuntu and failed to
> configure with following error message.
> Anyone knows how to solve this issue? From the message, it suggested to
> install libfreetype6-dev,
> but the library has already been installed.
>
> configure: Could not compile and link with freetype. This might be a
> 32/64-bit mismatch.
> configure: Using FREETYPE_CFLAGS=-I/usr/include/freetype2   and
> FREETYPE_LIBS=-lfreetype
> configure: error: Can not continue without freetype. You might be able to
> fix this by running 'sudo apt-get install libfreetype6-dev'.
> configure exiting with result code 1
>
>
> Regards!
> wei
>

From tangwei6 at huawei.com  Thu Jun 18 14:36:17 2015
From: tangwei6 at huawei.com (Tangwei (Euler))
Date: Thu, 18 Jun 2015 14:36:17 +0000
Subject: [aarch64-port-dev ] failed to build JDK9
In-Reply-To: <CAEf2cjeatJySrottypakLz+4P2RbTcGiC8bGR6srN4rph9aAxw@mail.gmail.com>
References: <C8D1E566CC4CA845813731FB6FD5C530010FB73D@SZXEMI503-MBX.china.huawei.com>
	<CAEf2cjeatJySrottypakLz+4P2RbTcGiC8bGR6srN4rph9aAxw@mail.gmail.com>
Message-ID: <C8D1E566CC4CA845813731FB6FD5C530010FB80A@SZXEMI503-MBX.china.huawei.com>

Forgot to mention: I was trying to cross-compile for aarch64 on an x64 platform.  My configure command line is below.
The same command line worked for openJDK8 before.  According to the configure log, the directories containing the freetype2 headers and libfreetype.so
for aarch64 need to be specified.

./configure --enable-option-checking=fatal  --openjdk-target=aarch64-linux-gnu --enable-unlimited-crypto
--with-zlib=system --with-stdc++lib=dynamic CC=aarch64-linux-gnu-gcc  CXX=aarch64-linux-gnu-g++

My issue is solved by adding two options below to configure command:

--with-freetype-include=aarch64-toolchain/sysroot/usr/include/freetype2/
--with-freetype-lib=aarch64-toolchain/sysroot/usr/lib/

Thanks a lot for everyone's kind help!

Regards!
wei

From: Edward Nevill [mailto:edward.nevill at linaro.org]
Sent: Thursday, June 18, 2015 5:36 PM
To: Tangwei (Euler)
Cc: aarch64-port-dev at openjdk.java.net
Subject: Re: [aarch64-port-dev ] failed to build JDK9

Hi,
I can successfully build jdk9 on ubuntu 14.04.
You may like to try building openjdk-7 first, as follows, to ensure all the dependencies are correct.
apt-get source openjdk-7-jdk
cd openjdk-7-7u51-2.4.6 (name may vary slightly)
dpkg-buildpackage 2>&1 | tee ../log
If any dependencies are missing, this will tell you exactly which packages to install.

Once the dependencies are correct for openjdk-7 you can retry jdk9.
Note that in order to build jdk9 you will need to have jdk8 installed.
You can download a pre-built binary from http://openjdk.linaro.org (follow the releases tab).
All the best,
Edward Nevill


On 18 June 2015 at 08:08, Tangwei (Euler) <tangwei6 at huawei.com> wrote:
Hi All,
  I cloned the latest OpenJDK 9 for aarch64 on Ubuntu and configure failed with the following error message.
Does anyone know how to solve this issue? The message suggests installing libfreetype6-dev,
but the library has already been installed.

configure: Could not compile and link with freetype. This might be a 32/64-bit mismatch.
configure: Using FREETYPE_CFLAGS=-I/usr/include/freetype2   and FREETYPE_LIBS=-lfreetype
configure: error: Can not continue without freetype. You might be able to fix this by running 'sudo apt-get install libfreetype6-dev'.
configure exiting with result code 1


Regards!
wei


From tangwei6 at huawei.com  Thu Jun 18 15:00:23 2015
From: tangwei6 at huawei.com (Tangwei (Euler))
Date: Thu, 18 Jun 2015 15:00:23 +0000
Subject: [aarch64-port-dev ] abort when running jdk9 on ARM64
In-Reply-To: <CAEf2cjeatJySrottypakLz+4P2RbTcGiC8bGR6srN4rph9aAxw@mail.gmail.com>
References: <C8D1E566CC4CA845813731FB6FD5C530010FB73D@SZXEMI503-MBX.china.huawei.com>
	<CAEf2cjeatJySrottypakLz+4P2RbTcGiC8bGR6srN4rph9aAxw@mail.gmail.com>
Message-ID: <C8D1E566CC4CA845813731FB6FD5C530010FB82C@SZXEMI503-MBX.china.huawei.com>

Hi All,
  I can build openjdk9 successfully, but the program aborts when I just run 'java' without any options.
The stack trace follows. Has anyone met the same issue before?

Stack: [0x000003ffa5340000,0x000003ffa5540000],  sp=0x000003ffa553e1c0,  free space=2040k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x8fb4e4]  VMError::report_and_die()+0x130
V  [libjvm.so+0x3df4a0]  report_vm_error(char const*, int, char const*, char const*)+0x68
V  [libjvm.so+0x73066c]  Monitor::wait(bool, long, bool)+0x22c
V  [libjvm.so+0x773d10]  os::create_thread(Thread*, os::ThreadType, unsigned long)+0x1a8
V  [libjvm.so+0x491efc]  GCTaskThread::GCTaskThread(GCTaskManager*, unsigned int, unsigned int)+0x60
V  [libjvm.so+0x491320]  GCTaskManager::initialize()+0x338
V  [libjvm.so+0x7905cc]  ParallelScavengeHeap::initialize()+0x334
V  [libjvm.so+0x8c2a94]  Universe::initialize_heap()+0x11c
V  [libjvm.so+0x8c2c60]  universe_init()+0x34
V  [libjvm.so+0x4f8348]  init_globals()+0x54
V  [libjvm.so+0x8a6d98]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2ac
V  [libjvm.so+0x56d688]  JNI_CreateJavaVM+0x78
C  [libjli.so+0x2a64]  JavaMain+0x8c
C  [libpthread.so.0+0x7c50]  start_thread+0xb0
C  [libc.so.6+0xdac60]  thread_start+0x30


Regards!
wei



From edward.nevill at gmail.com  Thu Jun 18 15:19:25 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 18 Jun 2015 16:19:25 +0100
Subject: [aarch64-port-dev ] abort when running jdk9 on ARM64
In-Reply-To: <C8D1E566CC4CA845813731FB6FD5C530010FB82C@SZXEMI503-MBX.china.huawei.com>
References: <C8D1E566CC4CA845813731FB6FD5C530010FB73D@SZXEMI503-MBX.china.huawei.com>
	<CAEf2cjeatJySrottypakLz+4P2RbTcGiC8bGR6srN4rph9aAxw@mail.gmail.com>
	<C8D1E566CC4CA845813731FB6FD5C530010FB82C@SZXEMI503-MBX.china.huawei.com>
Message-ID: <1434640765.8420.4.camel@mint>

Hi wei,

You could try downloading the latest jdk9 prebuilt binary from http://openjdk.linaro.org. Here is a link

http://openjdk.linaro.org/releases/jdk9-server-release-1505.tar.xz

If this works then it is likely to be a problem with your build.

All the best,
Ed.

On Thu, 2015-06-18 at 15:00 +0000, Tangwei (Euler) wrote:
> Hi All,
>   I can build openjdk9 successfully, but the program aborts when I just run 'java' without any options.
> The stack trace follows. Has anyone met the same issue before?
> 
> Stack: [0x000003ffa5340000,0x000003ffa5540000],  sp=0x000003ffa553e1c0,  free space=2040k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x8fb4e4]  VMError::report_and_die()+0x130
> V  [libjvm.so+0x3df4a0]  report_vm_error(char const*, int, char const*, char const*)+0x68
> V  [libjvm.so+0x73066c]  Monitor::wait(bool, long, bool)+0x22c
> V  [libjvm.so+0x773d10]  os::create_thread(Thread*, os::ThreadType, unsigned long)+0x1a8
> V  [libjvm.so+0x491efc]  GCTaskThread::GCTaskThread(GCTaskManager*, unsigned int, unsigned int)+0x60
> V  [libjvm.so+0x491320]  GCTaskManager::initialize()+0x338
> V  [libjvm.so+0x7905cc]  ParallelScavengeHeap::initialize()+0x334
> V  [libjvm.so+0x8c2a94]  Universe::initialize_heap()+0x11c
> V  [libjvm.so+0x8c2c60]  universe_init()+0x34
> V  [libjvm.so+0x4f8348]  init_globals()+0x54
> V  [libjvm.so+0x8a6d98]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2ac
> V  [libjvm.so+0x56d688]  JNI_CreateJavaVM+0x78
> C  [libjli.so+0x2a64]  JavaMain+0x8c
> C  [libpthread.so.0+0x7c50]  start_thread+0xb0
> C  [libc.so.6+0xdac60]  thread_start+0x30
> 
> 
> Regards!
> wei


From christian.thalinger at oracle.com  Thu Jun 18 16:02:20 2015
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 18 Jun 2015 09:02:20 -0700
Subject: [aarch64-port-dev ] Ping: RFR(M): 8086069: Adapt runtime calls
	to recent intrinsics to pass	ints as long
In-Reply-To: <BB5CF2A1-BE93-49E3-90F3-F1EA07A07F96@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>
	<57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>
	<557F68C6.4050805@oracle.com>
	<D6F78D71-A1E1-41DA-97C1-25548B9C5D0C@oracle.com>
	<557F9E91.7020603@oracle.com>
	<BB5CF2A1-BE93-49E3-90F3-F1EA07A07F96@oracle.com>
Message-ID: <754964BF-CC0D-4324-8AAC-11799847BFEE@oracle.com>


> On Jun 15, 2015, at 9:13 PM, John Rose <john.r.rose at oracle.com> wrote:
> 
> On Jun 15, 2015, at 8:57 PM, Dean Long <dean.long at oracle.com> wrote:
>> 
>> Thanks for the explanation.  It sounds like we are modeling the abstract Java stack representation of long and double, and this wouldn't be
>> easy to change, because I see things like "TypeFunc::Parms + 1" and "argument(2)" that would need to change before this could go away.
> 
> Indeed.  Slot pairs are a mess, an optimization (or concession) for platforms that no longer matter.  (Primitives might look like that in a few years.)  Some messes in HotSpot stem (IMO) from excessive attention to the bytecode syntax, designing a managed execution engine around the oddities of a code format.
> 
> In an ideal world, I would like to isolate, deprecate, and eventually remove the "evil twin" slots, since they no longer have any meaning (except maybe on some 32-bit systems).  Doing it at all levels will be hard, except in the context of some other breaking change.  But it could be done locally in the JVM, removing the notion of twin slots from modules that don't have an absolute need to work with them.  JITs shouldn't have to know about them, IMO; maybe not even the interpreter (though that would involve a renumbering prepass).
> 
> When we get value types, we may be able to make such a change, even to the bytecode syntax itself.  Or perhaps we will perpetuate the "evil twin" convention, but apply it to all value types (plus long and double).  Or perhaps (happy thought) we can make every value/ref/prim occupy one stack slot, in some bytecode of the future.

I'm all for happy thoughts :-)

> 
> -- John
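
As an illustration of the slot-pair convention discussed above, here is a
minimal Python sketch (not HotSpot code). It assigns local-variable slot
numbers to the arguments of a static method from its JVM descriptor, with
long (J) and double (D) taking two slots each and everything else taking one.
The function name and the simplified descriptor parser are assumptions made
for this example only.

```python
def argument_slots(descriptor):
    """Return [(type, first_slot)] for a static method descriptor.

    long (J) and double (D) occupy two consecutive slots; every other
    type, including references and arrays, occupies one.
    Simplified parser, for illustration only.
    """
    params = descriptor[descriptor.index("(") + 1:descriptor.index(")")]
    slots = []
    slot = 0
    i = 0
    while i < len(params):
        start = i
        while params[i] == "[":      # skip array dimensions
            i += 1
        if params[i] == "L":         # object type runs up to the ';'
            i = params.index(";", i)
        i += 1
        kind = params[start:i]
        slots.append((kind, slot))
        # The second half of a long/double is the "twin" slot.
        slot += 2 if kind in ("J", "D") else 1
    return slots


print(argument_slots("(IJLjava/lang/String;D[JI)V"))
```

Note how the argument after a long or double is not at "previous index + 1",
which is the kind of bookkeeping the "TypeFunc::Parms + 1" and "argument(2)"
remarks refer to.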


From goetz.lindenmaier at sap.com  Mon Jun 22 12:54:30 2015
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 22 Jun 2015 12:54:30 +0000
Subject: [aarch64-port-dev ] Ping: RFR(M): 8086069: Adapt runtime calls
	to recent intrinsics	to pass	ints as long
In-Reply-To: <754964BF-CC0D-4324-8AAC-11799847BFEE@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC2CFF9D17@DEWDFEMB12A.global.corp.sap>
	<57EF8CC4-3CB7-488E-89D4-5AE5EA3C99AA@oracle.com>
	<557F68C6.4050805@oracle.com>
	<D6F78D71-A1E1-41DA-97C1-25548B9C5D0C@oracle.com>
	<557F9E91.7020603@oracle.com>
	<BB5CF2A1-BE93-49E3-90F3-F1EA07A07F96@oracle.com>
	<754964BF-CC0D-4324-8AAC-11799847BFEE@oracle.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC2CFFC4DE@DEWDFEMB12A.global.corp.sap>

Hi,

I would like to ping again about my change.

As a reminder, it only extends the existing mechanism guarded
by CCallingConventionRequiresIntsAsLongs to multiplyToLen, CRC32,
AES and SHA.

http://cr.openjdk.java.net/~goetz/webrevs/8086069-call_conv/webrev.01/
Could somebody please review this change?  I also need a sponsor.

Thanks and best regards,
  Goetz.




From edward.nevill at gmail.com  Mon Jun 22 13:23:21 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Mon, 22 Jun 2015 14:23:21 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for PopCount
	in C2
Message-ID: <1434979401.21282.31.camel@mint>

Hi,

Aarch64 currently does not support the PopCountI and PopCountL nodes in aarch64.ad

The following webrev adds support for these using the SIMD instructions 'cnt' and 'addv'
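
For readers unfamiliar with the two instructions, the following Python sketch
models the idea: 'cnt' with the 8B arrangement counts the set bits in each of
the eight byte lanes of a 64-bit vector register, and 'addv' sums the lanes.
It is only a model of the arithmetic, not the code in the webrev, and the
function names are made up for this example.

```python
def cnt_8b(value64):
    """Model of CNT Vd.8B, Vn.8B: per-byte population counts."""
    lanes = [(value64 >> (8 * i)) & 0xFF for i in range(8)]
    return [bin(lane).count("1") for lane in lanes]


def addv_8b(lanes):
    """Model of ADDV Bd, Vn.8B: sum across the eight byte lanes."""
    return sum(lanes) & 0xFF


def popcount64(value64):
    """Population count of a 64-bit value via the two modelled steps."""
    return addv_8b(cnt_8b(value64 & 0xFFFFFFFFFFFFFFFF))


print(cnt_8b(0x0103))
print(popcount64(0x8000000000000001))
```

Because the count runs over all eight bytes, whatever sits in the upper four
bytes contributes to the result, which matters for the int case.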

http://cr.openjdk.java.net/~enevill/8129426/webrev.01/

This patch was contributed by alexander.alexeev at caviumnetworks.com

The patch only modifies aarch64 specific files.

I have merged the patch in and tested it with JTreg / hotspot with the following results.

Original: Test results: passed: 847; failed: 13; error: 6
Revised: Test results: passed: 848; failed: 12; error: 6

The single additional failure in the original is the test

FAILED: compiler/intrinsics/squaretolen/TestSquareToLen.java

which is an intermittent failure in the original.

I have benchmarked the patch on four different partner platforms. The average improvement was 2.6X for PopCountI and 2.5X for PopCountL.

Please review and if OK I will push,

Thanks,
Ed.


From aph at redhat.com  Mon Jun 22 14:04:10 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 22 Jun 2015 15:04:10 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <1434979401.21282.31.camel@mint>
References: <1434979401.21282.31.camel@mint>
Message-ID: <558815DA.8020500@redhat.com>

On 06/22/2015 02:23 PM, Edward Nevill wrote:
> Aarch64 currently does not support the PopCountI and PopCountL nodes in aarch64.ad

> 
> Please review and if OK I will push,

Shouldn't mov in the IregI case be movw?  And iRegI be iRegIorL2I?

I'm guessing that C2 won't do the MOVs itself if you specify the
instruction as vRegD src.

Andrew.


From adinn at redhat.com  Mon Jun 22 14:34:05 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 22 Jun 2015 15:34:05 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <558815DA.8020500@redhat.com>
References: <1434979401.21282.31.camel@mint> <558815DA.8020500@redhat.com>
Message-ID: <55881CDD.2080009@redhat.com>


On 22/06/15 15:04, Andrew Haley wrote:
> On 06/22/2015 02:23 PM, Edward Nevill wrote:
>> Aarch64 currently does not support the PopCountI and PopCountL nodes in aarch64.ad
> 
>>
>> Please review and if OK I will push,
> 
> Shouldn't mov in the IregI case be movw?  And iRegI be iRegIorL2I?

Agreed on both counts.

Strictly, for the PopCountI encoding /both/ mov operations -- i.e. to
and fro -- should be a movw (I assume that means passing enum tag T1F in
place of T1D?). However, using mov for the restore to $dst is safe as we
know the top 32 bits will be zero.

> I'm guessing that C2 won't do the MOVs itself if you specify the
> instruction as vRegD src.

I believe you guess right.

regards,


Andrew Dinn
-----------

From edward.nevill at gmail.com  Mon Jun 22 14:59:42 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Mon, 22 Jun 2015 15:59:42 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <558815DA.8020500@redhat.com>
References: <1434979401.21282.31.camel@mint>  <558815DA.8020500@redhat.com>
Message-ID: <1434985182.21282.34.camel@mint>

On Mon, 2015-06-22 at 15:04 +0100, Andrew Haley wrote:
> On 06/22/2015 02:23 PM, Edward Nevill wrote:
> > Aarch64 currently does not support the PopCountI and PopCountL nodes in aarch64.ad
> 
> > 
> > Please review and if OK I will push,
> 
> Shouldn't mov in the IregI case be movw?  And iRegI be iRegIorL2I?

No. It needs 0s in the top 32 bits.

The reason is that the following CNT instruction is only available in 8B or 16B forms.

It was iRegIorL2I before, I changed it to IregI because of this problem.

Regards,
Ed.


From aph at redhat.com  Mon Jun 22 15:04:29 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 22 Jun 2015 16:04:29 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <1434985182.21282.34.camel@mint>
References: <1434979401.21282.31.camel@mint> <558815DA.8020500@redhat.com>
	<1434985182.21282.34.camel@mint>
Message-ID: <558823FD.5080800@redhat.com>

On 06/22/2015 03:59 PM, Edward Nevill wrote:
> On Mon, 2015-06-22 at 15:04 +0100, Andrew Haley wrote:
>> On 06/22/2015 02:23 PM, Edward Nevill wrote:
>>> Aarch64 currently does not support the PopCountI and PopCountL nodes in aarch64.ad
>>
>>>
>>> Please review and if OK I will push,
>>
>> Shouldn't mov in the IregI case be movw?  And iRegI be iRegIorL2I?
> 
> No. It needs 0s in the top 32 bits.
> 
> The reason is that the following CNT instruction is only available in 8B or 16B forms.
> 
> It was iRegIorL2I before, I changed it to IregI because of this problem.

Well, you're asking for trouble.  We've tried to make sure that the
top half of an int register is always zero, but it's hard absolutely
to guarantee it in all cases.  Does movw to a vector register really
not clear the top 32 bits of the dest?

Andrew.


From adinn at redhat.com  Mon Jun 22 15:29:22 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 22 Jun 2015 16:29:22 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <558823FD.5080800@redhat.com>
References: <1434979401.21282.31.camel@mint>
	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>
	<558823FD.5080800@redhat.com>
Message-ID: <558829D2.4000503@redhat.com>

On 22/06/15 16:04, Andrew Haley wrote:
> On 06/22/2015 03:59 PM, Edward Nevill wrote:
>> On Mon, 2015-06-22 at 15:04 +0100, Andrew Haley wrote:
>>> On 06/22/2015 02:23 PM, Edward Nevill wrote:
>>>> Aarch64 currently does not support the PopCountI and PopCountL nodes in aarch64.ad
>>>
>>>>
>>>> Please review and if OK I will push,
>>>
>>> Shouldn't mov in the IregI case be movw?  And iRegI be iRegIorL2I?
>>
>> No. It needs 0s in the top 32 bits.
>>
>> The reason is that the following CNT instruction is only available in 8B or 16B forms.
>>
>> It was iRegIorL2I before, I changed it to IregI because of this problem.
> 
> Well, you're asking for trouble.  We've tried to make sure that the
> top half of an int register is always zero, but it's hard absolutely
> to guarantee it in all cases.  Does movw to a vector register really
> not clear the top 32 bits of the dest?

Aargh, after checking the manual (as a last resort) it appears that the
32 bit move carefully moves a 32 bit word into a 32 bit slot leaving the
rest of the slots unchanged -- MOV is documented as an alias for INS
which pretty much explains the semantics.

It seems that the scalar fmovw instruction does the same (it is
documented as overlapping the behaviour of the vector move instruction).

So, it seems Ed is right to use iRegI and rely on an l2i conversion to
zero the top word if the incoming value is long.
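
The behaviour described above can be modelled with a small Python sketch. It
treats the low 64 bits of the vector register as two 32-bit lanes and compares
a 32-bit lane insert (which leaves the other lane untouched) with a 64-bit move
from the general register. This is a toy model of the semantics as described
in this thread, not an emulator, and all names and values are invented for the
example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def ins_s_lane0(vreg64, gpr):
    """32-bit insert: replace lane 0, keep the upper lane unchanged."""
    return (vreg64 & ~MASK32 & MASK64) | (gpr & MASK32)


def mov_d(vreg64, gpr):
    """64-bit move: the whole 64-bit general register is copied."""
    return gpr & MASK64


def popcount_8b(vreg64):
    """cnt (8B) followed by addv: counts bits in all eight bytes."""
    return bin(vreg64 & MASK64).count("1")


stale = 0xDEADBEEF00000000   # hypothetical leftover contents of the vector register
x = 0x0000000000000007       # jint 7 with the top half of the register zero

print(popcount_8b(ins_s_lane0(stale, x)))   # also counts the stale upper lane
print(popcount_8b(mov_d(stale, x)))         # 3
```

With the 64-bit move the result is only correct if the top 32 bits of the
source register really are zero, which is the assumption under discussion.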

regards,


Andrew Dinn
-----------

From aph at redhat.com  Mon Jun 22 15:41:08 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 22 Jun 2015 16:41:08 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <558829D2.4000503@redhat.com>
References: <1434979401.21282.31.camel@mint>
	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>
	<558823FD.5080800@redhat.com> <558829D2.4000503@redhat.com>
Message-ID: <55882C94.7030505@redhat.com>

On 06/22/2015 04:29 PM, Andrew Dinn wrote:
> So, it seems Ed is right to use iRegI and rely on an l2i conversion to
> zero the top word if the incoming value is long.

I don't think that's safe.  I certainly don't think it's a good
tradeoff.  I think it'd be the only place in our entire code base
where we assume that the high bits of a jint are zero.  If it really
wants zeros in the top bits we'd better put them there.

Andrew.

From adinn at redhat.com  Mon Jun 22 15:50:36 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 22 Jun 2015 16:50:36 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <55882C94.7030505@redhat.com>
References: <1434979401.21282.31.camel@mint>
	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>
	<558823FD.5080800@redhat.com> <558829D2.4000503@redhat.com>
	<55882C94.7030505@redhat.com>
Message-ID: <55882ECC.8030602@redhat.com>

On 22/06/15 16:41, Andrew Haley wrote:
> On 06/22/2015 04:29 PM, Andrew Dinn wrote:
>> So, it seems Ed is right to use iRegI and rely on an l2i conversion to
>> zero the top word if the incoming value is long.
> 
> I don't think that's safe.  I certainly don't think it's a good
> tradeoff.  I think it'd be the only place in our entire code base
> where we assume that the high bits of a jint are zero.  If it really
> wants zeros in the top bits we'd better put them there.

Well, yes, but it really /ought/ to be safe.

Whenever we generate an iRegI dst output we should be using a foow
instruction and end up with the top 32 bits zero. So, wherever we
consume an iRegI src input we ought to be able to rely on it having top
bits zero. Either it was generated directly as an iRegI output or it was
generated as an iRegL output and passed in via an l2i conversion.

If that assumption fails anywhere then it will only fail because we used
a foo insn where we really needed a foow. I think we would be better to
let any such errors fail as quickly as possible, find the error and fix
the offending code to use foow. Your mileage may vary.

regards,


Andrew Dinn
-----------

From aph at redhat.com  Mon Jun 22 16:01:00 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 22 Jun 2015 17:01:00 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <55882ECC.8030602@redhat.com>
References: <1434979401.21282.31.camel@mint>
	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>
	<558823FD.5080800@redhat.com> <558829D2.4000503@redhat.com>
	<55882C94.7030505@redhat.com> <55882ECC.8030602@redhat.com>
Message-ID: <5588313C.1070409@redhat.com>

On 06/22/2015 04:50 PM, Andrew Dinn wrote:
> On 22/06/15 16:41, Andrew Haley wrote:
>> On 06/22/2015 04:29 PM, Andrew Dinn wrote:
>>> So, it seems Ed is right to use iRegI and rely on an l2i conversion to
>>> zero the top word if the incoming value is long.
>>
>> I don't think that's safe.  I certainly don't think it's a good
>> tradeoff.  I think it'd be the only place in our entire code base
>> where we assume that the high bits of a jint are zero.  If it really
>> wants zeros in the top bits we'd better put them there.
> 
> Well, yes, but it really /ought/ to be safe.
> 
> Whenever we generate an iRegI dst output we should be using a foow
> instruction and end up with the top 32 bits zero. So, wherever we
> consume an iRegI src input we ought to be able to rely on it having top
> bits zero. Either it was generated directly as an iRegI output or it was
> generated as an iRegL output and passed in via an l2i conversion.
> 
> If that assumption fails anywhere then it will only fail because we used
> a foo insn where we really needed a foow. I think we would be better to
> let any such errors fail as quickly as possible, find the error and fix
> the offending code to use foow.

And how would we even notice it, let alone find the error?

> Your mileage may vary.

Hmm.  So far we've been very conservative, making sure that we always
use the correct mode for inputs and the correct mode for outputs.  If
we're going to start making assumptions that top bits of int ops are
always zero we could always elide l2i to a no-op.  So far we have
resisted that, and with good reason IMO.

I wrote the deoptimization code and was pretty careful to do the right
thing, but also very reassured that it probably didn't matter.  I
don't think we can guarantee that nowhere do we have a sign extension
where there should be a zero extension.

Andrew.

From adinn at redhat.com  Mon Jun 22 16:14:18 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 22 Jun 2015 17:14:18 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <5588313C.1070409@redhat.com>
References: <1434979401.21282.31.camel@mint>
	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>
	<558823FD.5080800@redhat.com> <558829D2.4000503@redhat.com>
	<55882C94.7030505@redhat.com> <55882ECC.8030602@redhat.com>
	<5588313C.1070409@redhat.com>
Message-ID: <5588345A.4060708@redhat.com>

On 22/06/15 17:01, Andrew Haley wrote:
> On 06/22/2015 04:50 PM, Andrew Dinn wrote:
>> If that assumption fails anywhere then it will only fail because we used
>> a foo insn where we really needed a foow. I think we would be better to
>> let any such errors fail as quickly as possible, find the error and fix
>> the offending code to use foow.
> 
> And how would we even notice it, let alone find the error?

I agree it will not necessarily be easy to spot. But we know exactly
where to look (see below).

>> Your mileage may vary.
> 
> Hmm.  So far we've been very conservative, making sure that we always
> use the correct mode for inputs and the correct mode for outputs.  If
> we're going to start making assumptions that top bits of int ops are
> always zero we could always elide l2i to a no-op.  So far we have
> resisted that, and with good reason IMO.

No, that last statement is not at all correct. l2i is explicitly
inserted into the ideal graph when the compiler knows that a value
generated as long is being consumed as an int and so needs to be
truncated. We have explicitly avoided performing any truncation to
effect that l2i in every rule where we accept an input of iRegIorL2I. In
all such cases we have ensured that the instruction which consumes the
input is a foow not a foo. That's quite checkable by eyeball.

For this one case, we also need to be sure that every instruction which
generates an iRegI output uses a foow instruction which (according to
the manual) zeroes the top bits. That's also checkable by eyeball.

We also need to be sure that anything spilled as a 32 bit int is
restored as a 32 bit int with the top bits correspondingly zeroed.

> I wrote the deoptimization code and was pretty careful to do the right
> thing, but also very reassured that it probably didn't matter.  I
> don't think we can guarantee that nowhere do we have a sign extension
> where there should be a zero extension.

Well, you might not want to take this risk and instead add an explicit
zero of the upper half. But I think we need to be clear what risk we are
taking.

regards,


Andrew Dinn
-----------

From goetz.lindenmaier at sap.com  Tue Jun 23 07:00:26 2015
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 23 Jun 2015 07:00:26 +0000
Subject: [aarch64-port-dev ] Fix on aarch required after 8073165: Contended
 Locking fast exit bucket.
Message-ID: <4295855A5C1DE049A61835A1887419CC2CFFC7DF@DEWDFEMB12A.global.corp.sap>

Hi,

I think you need the fix below after the change in 8073165.
http://hg.openjdk.java.net/jdk9/dev/hotspot/rev/2abcd8a4896c
If you verify this, I will submit it along with the ppc fix.

Best regards,
  Goetz.

diff -r 5b5db3d68ab9 src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp
--- a/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp      Mon Jun 22 17:15:45 2015 +0200
+++ b/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp      Tue Jun 23 08:54:15 2015 +0200
@@ -2120,6 +2120,7 @@
       save_native_result(masm, ret_type, stack_slots);
     }

+    __ mov(c_rarg2, rthread);
     __ lea(c_rarg1, Address(sp, lock_slot_offset * VMRegImpl::stack_slot_size));
     __ mov(c_rarg0, obj_reg);

@@ -2128,7 +2129,7 @@
     __ ldr(r19, Address(rthread, in_bytes(Thread::pending_exception_offset())));
     __ str(zr, Address(rthread, in_bytes(Thread::pending_exception_offset())));

-    rt_call(masm, CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_unlocking_C), 2, 0, 1);
+    rt_call(masm, CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_unlocking_C), 3, 0, 1);

 #ifdef ASSERT
     {

From edward.nevill at linaro.org  Tue Jun 23 08:55:48 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Tue, 23 Jun 2015 09:55:48 +0100
Subject: [aarch64-port-dev ] RFR: 8081294: aarch64: fails to build on ubuntu
	wily
Message-ID: <1435049748.20837.6.camel@mylittlepony.linaroharston>

Hi,

One of our partners has reported that jdk9 fails to build for aarch64 on ubuntu 'wily'

The failing buildlog is here

https://launchpad.net/ubuntu/+source/openjdk-9/9~b64-1ubuntu1/+build/7441971

The following webrev fixes this

http://cr.openjdk.java.net/~enevill/8081294/webrev/

I have verified that this fixes the problem by cross compiling against a sysroot.

Our partner has also verified that this patch fixes the build.

As this is a change to shared code I will need someone to sponsor this and push it through JPRT.

Thanks for your help,
Ed
 

From aph at redhat.com  Tue Jun 23 10:10:20 2015
From: aph at redhat.com (Andrew Haley)
Date: Tue, 23 Jun 2015 11:10:20 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <5588345A.4060708@redhat.com>
References: <1434979401.21282.31.camel@mint>
	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>
	<558823FD.5080800@redhat.com> <558829D2.4000503@redhat.com>
	<55882C94.7030505@redhat.com> <55882ECC.8030602@redhat.com>
	<5588313C.1070409@redhat.com> <5588345A.4060708@redhat.com>
Message-ID: <5589308C.6000309@redhat.com>

On 22/06/15 17:14, Andrew Dinn wrote:
> On 22/06/15 17:01, Andrew Haley wrote:
>> On 06/22/2015 04:50 PM, Andrew Dinn wrote:
>>> If that assumption fails anywhere then it will only fail because we used
>>> a foo insn where we really needed a foow. I think we would be better to
>>> let any such errors fail as quickly as possible, find the error and fix
>>> the offending code to use foow.
>>
>> And how would we even notice it, let alone find the error?
> 
> I agree it will not necessarily be easy to spot. But we know exactly
> where to look (see below).
> 
>>> Your mileage may vary.
>>
>> Hmm.  So far we've been very conservative, making sure that we always
>> use the correct mode for inputs and the correct mode for outputs.  If
>> we're going to start making assumptions that top bits of int ops are
>> always zero we could always elide l2i to a no-op.  So far we have
>> resisted that, and with good reason IMO.
> 
> No, that last statement is not at all correct. l2i is explicitly
> inserted into the ideal graph when the compiler knows that a value
> generated as long is being consumed as an int and so needs to be
> truncated.

I'm sure that's true, but it's not really relevant to what I said.

> We also need to be sure that anything spilled as a 32 bit int is
> restored as a 32 bit int with the top bits correspondingly zeroed.

That too.  But C2 has to interact with C1, the interpreter, and all
the stubs, and all the intrinsics.

>> I wrote the deoptimization code and was pretty careful to do the right
>> thing, but also very reassured that it probably didn't matter.  I
>> don't think we can guarantee that nowhere do we have a sign extension
>> where there should be a zero extension.
> 
> Well, you might not want to take this risk and instead add an explicit
> zero of the upper half. But I think we need to be clear what risk we are
> taking.

It's this: if we don't explicitly zero the upper half we'll have to
audit all the code which might present a sign-extended value (instead
of a zero-extended one) in a register that's supposed to contain a
jint.
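
To make the risk concrete, here is a minimal sketch of the two
extensions. It assumes the consumer is a 64-bit population count that
reads the whole register; the helper names are made up for illustration
and are not HotSpot code.

```cpp
#include <cstdint>

// Model a 64-bit register holding a 32-bit Java int (jint).
// Zero-extended: the upper 32 bits are all zero.
inline uint64_t zero_extend(int32_t v) {
  return static_cast<uint64_t>(static_cast<uint32_t>(v));
}

// Sign-extended: the upper 32 bits are copies of the sign bit.
inline uint64_t sign_extend(int32_t v) {
  return static_cast<uint64_t>(static_cast<int64_t>(v));
}

// Hypothetical consumer that counts set bits across all 64 bits and
// therefore trusts the upper half to be zero.
inline int popcount64(uint64_t x) {
  int n = 0;
  while (x != 0) {
    x &= x - 1;  // clear the lowest set bit
    ++n;
  }
  return n;
}

// Defensive variant: explicitly zero the upper half before counting.
inline int popcount_jint(uint64_t reg) {
  return popcount64(reg & 0xffffffffULL);
}
```

With a register holding -1, the trusting version gives 32 for the
zero-extended value but 64 for the sign-extended one, while the
defensive version gives 32 in both cases.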

Andrew.

From edward.nevill at linaro.org  Tue Jun 23 10:12:12 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Tue, 23 Jun 2015 11:12:12 +0100
Subject: [aarch64-port-dev ] RFR: 8129551: some regressions introduced by
 addition of vectorisation code
Message-ID: <1435054332.5083.15.camel@mylittlepony.linaroharston>

Hi,

The following webrev

http://cr.openjdk.java.net/~enevill/8129551/webrev

fixes a number of regressions introduced in the addition of vectorisation for aarch64 as follows:-

java/math/BigInteger/BigIntegerTest.java

fails with an assertion failure when run with fastdebug or slowdebug builds

# Internal Error (/home/alexander/build-open-jdk/dev/jdk9/baseline/dev/hotspot/src/cpu/aarch64/vm/assembler_aarch64.hpp:2078), pid=8124, tid=0x0000007ec61eb1f0
# assert(op == 0 && 0 == 0) failed: must be MOVI

and also in test

java/math/BigInteger/ModPow.java

java.math.BigInteger::add gets miscompiled. There is a

        ldr q16, [x17,x10,lsl #4]

which should be a

        ldr q16, [x17,x10]
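
As a rough illustration of how the two addressing forms differ, the
sketch below computes the effective address for each. It assumes the
index register already holds a byte offset (the message does not say
why the shift is wrong), and the function name is invented here.

```cpp
#include <cstdint>

// Effective address of "ldr qN, [base, index, lsl #shift]".
// A shift of 0 corresponds to the unscaled form "[base, index]".
inline uint64_t effective_address(uint64_t base, uint64_t index,
                                  unsigned shift) {
  return base + (index << shift);
}
```

If x10 is a byte offset of 0x20, the unscaled form reads from
base + 0x20, whereas lsl #4 multiplies the offset by 16 and reads from
base + 0x200.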

I have also moved

void MacroAssembler::mov(FloatRegister Vd, SIMD_Arrangement T, u_int32_t imm32)

to macroAssembler_aarch64.cpp from macroAssembler_aarch64.hpp as it was getting much too large for a .hpp.

I have tested the original and webrev version with JTreg (hotspot, langtools & jdk) with the following results:-

Original:-

hotspot: Test results: passed: 849; failed: 11; error: 6
langtools: Test results: passed: 3,240; error: 2
jdk: Test results: passed: 6,103; failed: 568; error: 26

Revised:-

hotspot: Test results: passed: 849; failed: 11; error: 6
langtools: Test results: passed: 3,240; error: 2
jdk: Test results: passed: 6,108; failed: 567; error: 22

This changeset only affects aarch64 files.

Please review and if OK I will push,

Thanks,
Ed.


From aph at redhat.com  Tue Jun 23 10:18:09 2015
From: aph at redhat.com (Andrew Haley)
Date: Tue, 23 Jun 2015 11:18:09 +0100
Subject: [aarch64-port-dev ] RFR: 8129551: some regressions introduced
 by addition of vectorisation code
In-Reply-To: <1435054332.5083.15.camel@mylittlepony.linaroharston>
References: <1435054332.5083.15.camel@mylittlepony.linaroharston>
Message-ID: <55893261.7030501@redhat.com>

On 23/06/15 11:12, Edward Nevill wrote:
> Hi,
> 
> The following webrev
> 
> http://cr.openjdk.java.net/~enevill/8129551/webrev
> 
> fixes a number of regressions introduced in the addition of vectorisation for aarch64 as follows:-
> 
> java/math/BigInteger/BigIntegerTest.java
> 
> fails with an assertion failure when run with fastdebug or slowdebug builds
> 
> # Internal Error (/home/alexander/build-open-jdk/dev/jdk9/baseline/dev/hotspot/src/cpu/aarch64/vm/assembler_aarch64.hpp:2078), pid=8124, tid=0x0000007ec61eb1f0
> # assert(op == 0 && 0 == 0) failed: must be MOVI
> 
> and also in test
> 
> java/math/BigInteger/ModPow.java
> 
> java.math.BigInteger::add gets miscompiled. There is a
> 
>         ldr q16, [x17,x10,lsl #4]
> 
> which should be a
> 
>         ldr q16, [x17,x10]
> 
> I have also moved
> 
> void MacroAssembler::mov(FloatRegister Vd, SIMD_Arrangement T, u_int32_t imm32)
> 
> to macroAssembler_aarch64.cpp from macroAssembler_aarch64.hpp as it was getting much too large for a .hpp.
> 
> I have tested the original and webrev version with JTreg (hotspot, langtools & jdk) with the following results:-
> 
> Original:-
> 
> hotspot: Test results: passed: 849; failed: 11; error: 6
> langtools: Test results: passed: 3,240; error: 2
> jdk: Test results: passed: 6,103; failed: 568; error: 26
> 
> Revised:-
> 
> hotspot: Test results: passed: 849; failed: 11; error: 6
> langtools: Test results: passed: 3,240; error: 2
> jdk: Test results: passed: 6,108; failed: 567; error: 22
> 
> This changeset only affects aarch64 files.
> 
> Please review and if OK I will push,

Won't this fail to detect an overflow?

1418   if (T == T4H || T == T8H) { imm32 &= 0xffff; nimm32 &= 0xffff; }

Otherwise this looks OK to me.

Andrew.


From aph at redhat.com  Tue Jun 23 10:22:35 2015
From: aph at redhat.com (Andrew Haley)
Date: Tue, 23 Jun 2015 11:22:35 +0100
Subject: [aarch64-port-dev ] RFR [M] : 8087333,
 Optionally Pre-Generate the HotSpot Template Interpreter
In-Reply-To: <5589239E.9050600@oracle.com>
References: <557B1743.9040004@oracle.com>
	<55824D8D.2000003@oracle.com>	<5582F686.1090507@oracle.com>
	<5583C704.3090900@oracle.com>	<55845790.7040507@oracle.com>
	<5588F41F.70506@oracle.com> <5589239E.9050600@oracle.com>
Message-ID: <5589336B.90706@redhat.com>

On 23/06/15 10:15, Bertrand Delsart wrote:
> While investigating why I had added the .S files for this CR, I noticed 
> that aarch64 open port has some .S files checked-in but no rules to 
> compile them. To avoid triggering the compilation and linking of these 
> (obsolete?) files in the open aarch64 port, I had to modify the change 
> in vm.make.

Both of these .S files are used by simulator builds.  We probably don't
need them any more.

Andrew.


From adinn at redhat.com  Tue Jun 23 10:32:50 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 23 Jun 2015 11:32:50 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <5589308C.6000309@redhat.com>
References: <1434979401.21282.31.camel@mint>
	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>
	<558823FD.5080800@redhat.com> <558829D2.4000503@redhat.com>
	<55882C94.7030505@redhat.com> <55882ECC.8030602@redhat.com>
	<5588313C.1070409@redhat.com> <5588345A.4060708@redhat.com>
	<5589308C.6000309@redhat.com>
Message-ID: <558935D2.1020103@redhat.com>

On 23/06/15 11:10, Andrew Haley wrote:
> On 22/06/15 17:14, Andrew Dinn wrote:
>> . . .
>> Well, you might not want to take this risk and instead add an explicit
>> zero of the upper half. But I think we need to be clear what risk we are
>> taking.
> 
> It's this: if we don't explicitly zero the upper half we'll have to
> audit all the code which might present a sign-extended value (instead
> of a zero-extended one) in a register that's supposed to contain a
> jint.

Ok, let's play safe. If Ed tweaks the patch to zero the upper word we
can always revise that later if/when we decide we are feeling lucky.

regards,


Andrew Dinn
-----------

From bertrand.delsart at oracle.com  Tue Jun 23 12:49:52 2015
From: bertrand.delsart at oracle.com (Bertrand Delsart)
Date: Tue, 23 Jun 2015 14:49:52 +0200
Subject: [aarch64-port-dev ] RFR [M] : 8087333,
 Optionally Pre-Generate the HotSpot Template Interpreter
In-Reply-To: <5589336B.90706@redhat.com>
References: <557B1743.9040004@oracle.com>
	<55824D8D.2000003@oracle.com>	<5582F686.1090507@oracle.com>
	<5583C704.3090900@oracle.com>	<55845790.7040507@oracle.com>
	<5588F41F.70506@oracle.com> <5589239E.9050600@oracle.com>
	<5589336B.90706@redhat.com>
Message-ID: <558955F0.3070203@oracle.com>

On 23/06/2015 12:22, Andrew Haley wrote:
> On 23/06/15 10:15, Bertrand Delsart wrote:
>> While investigating why I had added the .S files for this CR, I noticed
>> that aarch64 open port has some .S files checked-in but no rules to
>> compile them. To avoid triggering the compilation and linking of these
>> (obsolete?) files in the open aarch64 port, I had to modify the change
>> in vm.make.
>
> Both of these .S files are used by simulator builds.  We probably don't
> need them any more.

Thanks Andrew,

Let me know whether you prefer my new extensible Src_Files_BASE based 
findsrc rule in hotspot/make/linux/makefiles/vm.make or whether I can 
hard code the fact that .S files, if present, should be compiled.

In the latter case, I can either delete your .S files or add them in 
Src_Files_EXCLUDE

See
http://cr.openjdk.java.net/~bdelsart/8087333/webrev.00-02/webrev/make/linux/makefiles/vm.make.udiff.html

(The addition of .S files for our extensions was moved to make/closed, 
using a simple/extensible "Src_Files_BASE += \*.S")

Regards,

Bertrand

>
> Andrew.
>


-- 
Bertrand Delsart,                     Grenoble Engineering Center
Oracle,         180 av. de l'Europe,          ZIRST de Montbonnot
38330 Montbonnot Saint Martin,                             FRANCE
bertrand.delsart at oracle.com             Phone : +33 4 76 18 81 23

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE: This email message is for the sole use of the intended
recipient(s) and may contain confidential and privileged
information. Any unauthorized review, use, disclosure or
distribution is prohibited. If you are not the intended recipient,
please contact the sender by reply email and destroy all copies of
the original message.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From adinn at redhat.com  Tue Jun 23 13:14:45 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 23 Jun 2015 14:14:45 +0100
Subject: [aarch64-port-dev ] Fix for 8122937: [JEP 245] Validate JVM
 Command-Line Flag Arguments breaks AArch64
Message-ID: <55895BC5.7010607@redhat.com>

I am trying to build the current hs-rt tree on AArch64 in order to test
against Andrew Haley's UseCondCardMark patch (JDK-8079315). That tree
now also includes the following fix for JDK-8122937:

  http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/5bbf25472731

The patch includes changes made to all the current cpus /except/ aarch64
which means that the aarch64 build now falls over with


/home/adinn/openjdk/hs-rt/hotspot/src/share/vm/runtime/globals_extension.hpp:242:28:
error: macro "ARCH_FLAGS" passed 7 arguments, but takes just 5
           IGNORE_CONSTRAINT)

I believe the only modification required is to add the extra 2 arguments
range and constraint to macro ARCH_FLAGS (as per most of the other
cpu-specific globals files). I am testing this now and will raise a JIRA
and submit a webrev if that is indeed all that is needed.
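
A minimal sketch of the arity mismatch, assuming the cpu-specific
macro is an X-macro that receives one callback macro per flag kind.
Only ARCH_FLAGS and IGNORE_CONSTRAINT appear in the error above; the
flag ExampleUnroll and the other helper macros are hypothetical.

```cpp
// Cpu-specific flag table taking seven callbacks. The last two
// (range, constraint) are the ones the shared code now passes.
#define ARCH_FLAGS(develop, product, diagnostic, experimental, \
                   notproduct, range, constraint)              \
  product(int, ExampleUnroll, 4, "hypothetical example flag")  \
  range(1, 16)

// Callbacks a caller might supply: declare product flags, drop the rest.
#define DECLARE_FLAG(type, name, value, doc) type name = value;
#define IGNORE_FLAG(type, name, value, doc)
#define IGNORE_RANGE(lo, hi)
#define IGNORE_CONSTRAINT(func)

// Seven-argument invocation. Against a five-parameter ARCH_FLAGS this
// line is what produces "passed 7 arguments, but takes just 5".
ARCH_FLAGS(IGNORE_FLAG, DECLARE_FLAG, IGNORE_FLAG, IGNORE_FLAG,
           IGNORE_FLAG, IGNORE_RANGE, IGNORE_CONSTRAINT)
```

The invocation expands to a single declaration, int ExampleUnroll = 4;,
because every callback except DECLARE_FLAG expands to nothing.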

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
(USA), Michael O'Neill (Ireland)

From edward.nevill at gmail.com  Tue Jun 23 13:51:00 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Tue, 23 Jun 2015 14:51:00 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <558935D2.1020103@redhat.com>
References: <1434979401.21282.31.camel@mint> <558815DA.8020500@redhat.com>
	<1434985182.21282.34.camel@mint> <558823FD.5080800@redhat.com>
	<558829D2.4000503@redhat.com> <55882C94.7030505@redhat.com>
	<55882ECC.8030602@redhat.com> <5588313C.1070409@redhat.com>
	<5588345A.4060708@redhat.com> <5589308C.6000309@redhat.com>
	<558935D2.1020103@redhat.com>
Message-ID: <1435067460.5083.21.camel@mylittlepony.linaroharston>

On Tue, 2015-06-23 at 11:32 +0100, Andrew Dinn wrote:
> On 23/06/15 11:10, Andrew Haley wrote:
> > On 22/06/15 17:14, Andrew Dinn wrote:
> >> . . .
> >> Well, you might not want to take this risk and instead add an explicit
> >> zero of the upper half. But I think we need to be clear what risk we are
> >> taking.
> > 
> > It's this: if we don't explicitly zero the upper half we'll have to
> > audit all the code which might present a sign-extended value (instead
> > of a zero-extended one) in a register that's supposed to contain a
> > jint.
> 
> Ok, let's play safe. If Ed tweaks the patch to zero the upper word we
> can always revise that later if/when we decide we are feeling lucky.
> 

OK. New webrev at

http://cr.openjdk.java.net/~enevill/8129426/webrev.02/

All the best,
Ed.


From aph at redhat.com  Tue Jun 23 14:30:07 2015
From: aph at redhat.com (Andrew Haley)
Date: Tue, 23 Jun 2015 15:30:07 +0100
Subject: [aarch64-port-dev ] RFR [M] : 8087333,
 Optionally Pre-Generate the HotSpot Template Interpreter
In-Reply-To: <558955F0.3070203@oracle.com>
References: <557B1743.9040004@oracle.com>
	<55824D8D.2000003@oracle.com>	<5582F686.1090507@oracle.com>
	<5583C704.3090900@oracle.com>	<55845790.7040507@oracle.com>
	<5588F41F.70506@oracle.com> <5589239E.9050600@oracle.com>
	<5589336B.90706@redhat.com> <558955F0.3070203@oracle.com>
Message-ID: <55896D6F.3010404@redhat.com>

On 06/23/2015 01:49 PM, Bertrand Delsart wrote:
> On 23/06/2015 12:22, Andrew Haley wrote:
>> On 23/06/15 10:15, Bertrand Delsart wrote:
>>> While investigating why I had added the .S files for this CR, I noticed
>>> that aarch64 open port has some .S files checked-in but no rules to
>>> compile them. To avoid triggering the compilation and linking of these
>>> (obsolete?) files in the open aarch64 port, I had to modify the change
>>> in vm.make.
>>
>> Both of these .S files are used by simulator builds.  We probably don't
>> need them any more.
> 
> Thanks Andrew,
> 
> Let me know whether you prefer my new extensible Src_Files_BASE based 
> findsrc rule in hotspot/make/linux/makefiles/vm.make or whether I can 
> hard code the fact that .S files, if present, should be compiled.
> 
> In the latter case, I can either delete your .S files or add them in 
> Src_Files_EXCLUDE
> 
> See
> http://cr.openjdk.java.net/~bdelsart/8087333/webrev.00-02/webrev/make/linux/makefiles/vm.make.udiff.html
> 
> (The addition of .S files for our extensions was moved to make/closed, 
> using a simple/extensible "Src_Files_BASE += \*.S")

Hmmm.  Ed, Andrew Dinn, do you understand the issue here?

Andrew.


From adinn at redhat.com  Tue Jun 23 14:41:22 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 23 Jun 2015 15:41:22 +0100
Subject: [aarch64-port-dev ] RFR: 8129584: Fix required for aarch64 after
	8122937
Message-ID: <55897012.6040109@redhat.com>

The following webrev against jdk9/hs-rt fixes AArch64 after it was
broken by JDK-8122937:

  http://cr.openjdk.java.net/~adinn/8129584/webrev/

It is an AArch64-only patch. Could AArch64 reviewers please check it is
ok? I'll also need someone to push it as I am not a committer. Thanks.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
(USA), Michael O'Neill (Ireland)

From bertrand.delsart at oracle.com  Tue Jun 23 15:05:18 2015
From: bertrand.delsart at oracle.com (Bertrand Delsart)
Date: Tue, 23 Jun 2015 17:05:18 +0200
Subject: [aarch64-port-dev ] RFR [M] : 8087333,
 Optionally Pre-Generate the HotSpot Template Interpreter
In-Reply-To: <55896D6F.3010404@redhat.com>
References: <557B1743.9040004@oracle.com>	<55824D8D.2000003@oracle.com>	<5582F686.1090507@oracle.com>	<5583C704.3090900@oracle.com>	<55845790.7040507@oracle.com>	<5588F41F.70506@oracle.com>
	<5589239E.9050600@oracle.com>	<5589336B.90706@redhat.com>
	<558955F0.3070203@oracle.com> <55896D6F.3010404@redhat.com>
Message-ID: <558975AE.2060509@oracle.com>

On 23/06/2015 16:30, Andrew Haley wrote:
> On 06/23/2015 01:49 PM, Bertrand Delsart wrote:
>> On 23/06/2015 12:22, Andrew Haley wrote:
>>> On 23/06/15 10:15, Bertrand Delsart wrote:
>>>> While investigating why I had added the .S files for this CR, I noticed
>>>> that aarch64 open port has some .S files checked-in but no rules to
>>>> compile them. To avoid triggering the compilation and linking of these
>>>> (obsolete?) files in the open aarch64 port, I had to modify the change
>>>> in vm.make.
>>>
>>> Both of these .S files are used by simulator builds.  We probably don't
>>> need them any more.
>>
>> Thanks Andrew,
>>
>> Let me know whether you prefer my new extensible Src_Files_BASE based
>> findsrc rule in hotspot/make/linux/makefiles/vm.make or whether I can
>> hard code the fact that .S files, if present, should be compiled.
>>
>> In the latter case, I can either delete your .S files or add them in
>> Src_Files_EXCLUDE
>>
>> See
>> http://cr.openjdk.java.net/~bdelsart/8087333/webrev.00-02/webrev/make/linux/makefiles/vm.make.udiff.html
>>
>> (The addition of .S files for our extensions was moved to make/closed,
>> using a simple/extensible "Src_Files_BASE += \*.S")
>
> Hmmm.  Ed, Andrew Dinn, do you understand the issue here?

Stated differently, I need to add support for finding and compiling .S 
files for this extension. In Hotspot, this is done thanks to:
- a rule to compile them
- 'findsrc', which builds the list of files to compile and link

During reviews, I realized there are .S files in your code. These files 
are currently ignored by openjdk builds. However, my initial 
modification was going to cause these files to be spotted by findsrc 
(and hence compiled and linked with your JVM).

To avoid compiling and linking them, I need either:
- to remove them
- or to ensure findsrc does not look for them (thanks to my
   proposed Src_Files_BASE mechanism)
- or to ensure findsrc excludes them (through Src_Files_EXCLUDE)

Hope this helps,

Bertrand.

>
> Andrew.
>
>


-- 
Bertrand Delsart,                     Grenoble Engineering Center
Oracle,         180 av. de l'Europe,          ZIRST de Montbonnot
38330 Montbonnot Saint Martin,                             FRANCE
bertrand.delsart at oracle.com             Phone : +33 4 76 18 81 23

From volker.simonis at gmail.com  Tue Jun 23 15:22:10 2015
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 23 Jun 2015 17:22:10 +0200
Subject: [aarch64-port-dev ] RFR: 8129584: Fix required for aarch64
	after 8122937
In-Reply-To: <55897012.6040109@redhat.com>
References: <55897012.6040109@redhat.com>
Message-ID: <CA+3eh11qz6dSWz47hHSUYJOvK9jYPZYOYZQPgjCxDPC+MOMY_A@mail.gmail.com>

Hi Andrew,

the change looks good!

I can push it for you.

Regards,
Volker


On Tue, Jun 23, 2015 at 4:41 PM, Andrew Dinn <adinn at redhat.com> wrote:
> The following webrev against jdk9/hs-rt fixes AArch64 after it was
> broken by JDK-8122937:
>
>   http://cr.openjdk.java.net/~adinn/8129584/webrev/
>
> It is an AArch64-only patch. Could AArch64 reviewers please check it is
> ok? I'll also need someone to push it as I am not a committer. Thanks.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in UK and Wales under Company Registration No. 3798903
> Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
> (USA), Michael O'Neill (Ireland)

From adinn at redhat.com  Tue Jun 23 15:24:48 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 23 Jun 2015 16:24:48 +0100
Subject: [aarch64-port-dev ] RFR: 8129584: Fix required for aarch64
	after 8122937
In-Reply-To: <CA+3eh11qz6dSWz47hHSUYJOvK9jYPZYOYZQPgjCxDPC+MOMY_A@mail.gmail.com>
References: <55897012.6040109@redhat.com>
	<CA+3eh11qz6dSWz47hHSUYJOvK9jYPZYOYZQPgjCxDPC+MOMY_A@mail.gmail.com>
Message-ID: <55897A40.4070908@redhat.com>

On 23/06/15 16:22, Volker Simonis wrote:
> the change looks good!
> 
> I can push it for you.

Thanks, Volker, that would be great. I think I still need one more
reviewer though. Can anyone else help here?

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
(USA), Michael O'Neill (Ireland)

From edward.nevill at linaro.org  Tue Jun 23 15:31:14 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Tue, 23 Jun 2015 16:31:14 +0100
Subject: [aarch64-port-dev ] Fix on aarch required after 8073165:
 Contended Locking fast exit bucket.
In-Reply-To: <4295855A5C1DE049A61835A1887419CC2CFFC7DF@DEWDFEMB12A.global.corp.sap>
References: <4295855A5C1DE049A61835A1887419CC2CFFC7DF@DEWDFEMB12A.global.corp.sap>
Message-ID: <1435073474.5083.28.camel@mylittlepony.linaroharston>

On Tue, 2015-06-23 at 07:00 +0000, Lindenmaier, Goetz wrote:
> Hi,
> 
> I think you need the fix below after the change in 8073165.
> http://hg.openjdk.java.net/jdk9/dev/hotspot/rev/2abcd8a4896c
> If you verify this, I will submit it along with the ppc fix.
> 
> Best regards,
>   Goetz.

Hi,

This change looks fine. I have merged in the patch, built it on aarch64 and run jtreg/hotspot on the result. The result before and after was

Test results: passed: 856; failed: 4; error: 6

Thanks for doing this,
Ed.

> 
> diff -r 5b5db3d68ab9 src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp
> --- a/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp      Mon Jun 22 17:15:45 2015 +0200
> +++ b/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp      Tue Jun 23 08:54:15 2015 +0200
> @@ -2120,6 +2120,7 @@
>        save_native_result(masm, ret_type, stack_slots);
>      }
> 
> +    __ mov(c_rarg2, rthread);
>      __ lea(c_rarg1, Address(sp, lock_slot_offset * VMRegImpl::stack_slot_size));
>      __ mov(c_rarg0, obj_reg);
> 
> @@ -2128,7 +2129,7 @@
>      __ ldr(r19, Address(rthread, in_bytes(Thread::pending_exception_offset())));
>      __ str(zr, Address(rthread, in_bytes(Thread::pending_exception_offset())));
> 
> -    rt_call(masm, CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_unlocking_C), 2, 0, 1);
> +    rt_call(masm, CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_unlocking_C), 3, 0, 1);
> 
> #ifdef ASSERT
>      {


From volker.simonis at gmail.com  Tue Jun 23 15:31:21 2015
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 23 Jun 2015 17:31:21 +0200
Subject: [aarch64-port-dev ] RFR: 8129584: Fix required for aarch64
	after 8122937
In-Reply-To: <55897A40.4070908@redhat.com>
References: <55897012.6040109@redhat.com>
	<CA+3eh11qz6dSWz47hHSUYJOvK9jYPZYOYZQPgjCxDPC+MOMY_A@mail.gmail.com>
	<55897A40.4070908@redhat.com>
Message-ID: <CA+3eh117TUuBgiJinb62dH5eDL=ejPxdCGCCfn=8hM3mfdePCQ@mail.gmail.com>

You're welcome. I've updated the copyright and once a second review
appears I'll push it.

Regards,
Volker


On Tue, Jun 23, 2015 at 5:24 PM, Andrew Dinn <adinn at redhat.com> wrote:
> On 23/06/15 16:22, Volker Simonis wrote:
>> the change looks good!
>>
>> I can push it for you.
>
> Thanks, Volker, that would be great. I think I still need one more
> reviewer though. Can anyone else help here?
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in UK and Wales under Company Registration No. 3798903
> Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
> (USA), Michael O'Neill (Ireland)

From vladimir.kozlov at oracle.com  Tue Jun 23 15:32:51 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 23 Jun 2015 08:32:51 -0700
Subject: [aarch64-port-dev ] RFR: 8129584: Fix required for aarch64
	after 8122937
In-Reply-To: <CA+3eh11qz6dSWz47hHSUYJOvK9jYPZYOYZQPgjCxDPC+MOMY_A@mail.gmail.com>
References: <55897012.6040109@redhat.com>
	<CA+3eh11qz6dSWz47hHSUYJOvK9jYPZYOYZQPgjCxDPC+MOMY_A@mail.gmail.com>
Message-ID: <55897C23.9090601@oracle.com>

+1. Reviewed.

Thanks,
Vladimir

On 6/23/15 8:22 AM, Volker Simonis wrote:
> Hi Andrew,
>
> the change looks good!
>
> I can push it for you.
>
> Regards,
> Volker
>
>
> On Tue, Jun 23, 2015 at 4:41 PM, Andrew Dinn <adinn at redhat.com> wrote:
>> The following webrev against jdk9/hs-rt fixes AArch64 after it was
>> broken by JDK-8122937:
>>
>>    http://cr.openjdk.java.net/~adinn/8129584/webrev/
>>
>> It is an AArch64-only patch. Could AArch64 reviewers please check it is
>> ok? I'll also need someone to push it as I am not a committer. Thanks.
>>
>> regards,
>>
>>
>> Andrew Dinn
>> -----------
>> Senior Principal Software Engineer
>> Red Hat UK Ltd
>> Registered in UK and Wales under Company Registration No. 3798903
>> Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
>> (USA), Michael O'Neill (Ireland)

From adinn at redhat.com  Tue Jun 23 17:00:35 2015
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 23 Jun 2015 18:00:35 +0100
Subject: [aarch64-port-dev ] RFR: 8129584: Fix required for aarch64
	after 8122937
In-Reply-To: <55897C23.9090601@oracle.com>
References: <55897012.6040109@redhat.com>
	<CA+3eh11qz6dSWz47hHSUYJOvK9jYPZYOYZQPgjCxDPC+MOMY_A@mail.gmail.com>
	<55897C23.9090601@oracle.com>
Message-ID: <558990B3.6050905@redhat.com>

On 23/06/15 16:32, Vladimir Kozlov wrote:
> +1. Reviewed.

Thanks, Vladimir.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
(USA), Michael O'Neill (Ireland)

From edward.nevill at linaro.org  Tue Jun 23 17:22:17 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Tue, 23 Jun 2015 18:22:17 +0100
Subject: [aarch64-port-dev ] RFR: 8129551: some regressions introduced
 by addition of vectorisation code
In-Reply-To: <55893261.7030501@redhat.com>
References: <1435054332.5083.15.camel@mylittlepony.linaroharston>
	<55893261.7030501@redhat.com>
Message-ID: <CAEf2cjfdDdjS-_jG3iH1uOMBJNPhL4qneH-ddh7HXMiXTO8VtQ@mail.gmail.com>

On 23 June 2015 at 11:18, Andrew Haley <aph at redhat.com> wrote:

> On 23/06/15 11:12, Edward Nevill wrote:
> > This changeset only affects aarch64 files.
> >
> > Please review and if OK I will push,
>
> Won't this fail to detect an overflow?
>
> 1418   if (T == T4H || T == T8H) { imm32 &= 0xffff; nimm32 &= 0xffff; }
>
> Otherwise this looks OK to me.
>
> Andrew.
>
>
Updated webrev with asserts

http://cr.openjdk.java.net/~enevill/8129551/webrev.01

Could a JDK9 Reviewer please review this,

Thanks,
Ed.

From vladimir.kozlov at oracle.com  Tue Jun 23 17:55:20 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 23 Jun 2015 10:55:20 -0700
Subject: [aarch64-port-dev ] RFR: 8129551: some regressions introduced
 by addition of vectorisation code
In-Reply-To: <CAEf2cjfdDdjS-_jG3iH1uOMBJNPhL4qneH-ddh7HXMiXTO8VtQ@mail.gmail.com>
References: <1435054332.5083.15.camel@mylittlepony.linaroharston>	<55893261.7030501@redhat.com>
	<CAEf2cjfdDdjS-_jG3iH1uOMBJNPhL4qneH-ddh7HXMiXTO8VtQ@mail.gmail.com>
Message-ID: <55899D88.8090904@oracle.com>

asserts work only in a debug VM, so I would keep the original imm32 &= 0xff and imm32 &= 0xffff.
I think you should also move the comments with the table to macroAssembler_aarch64.cpp.
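
A minimal sketch of combining the two, assuming the intent is to flag
an out-of-range immediate in debug builds and still truncate it in
product builds. LaneBits and the helper names are hypothetical
stand-ins, the standard assert stands in for HotSpot's own, and the
check treats any bit outside the lane as overflow, which may be
stricter than what the webrev does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the lane sizes under discussion;
// this is not HotSpot's SIMD_Arrangement type.
enum LaneBits { LANE_8 = 8, LANE_16 = 16, LANE_32 = 32 };

// Mask covering the low 'bits' bits of a 32-bit immediate.
inline uint32_t lane_mask(LaneBits bits) {
  return bits == LANE_32 ? 0xffffffffu : ((1u << bits) - 1u);
}

// True when imm32 has no set bits outside the lane.
inline bool fits_lane(uint32_t imm32, LaneBits bits) {
  return (imm32 & ~lane_mask(bits)) == 0;
}

// Debug builds stop on an out-of-range immediate. Builds with NDEBUG
// defined skip the assert but still return a truncated value.
inline uint32_t lane_imm(uint32_t imm32, LaneBits bits) {
  assert(fits_lane(imm32, bits) && "immediate overflows lane");
  return imm32 & lane_mask(bits);
}
```

Checking before masking matters: once imm32 &= 0xffff has run, a value
such as 0x12345 has become 0x2345 and the overflow can no longer be
seen.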

Thanks,
Vladimir

On 6/23/15 10:22 AM, Edward Nevill wrote:
> On 23 June 2015 at 11:18, Andrew Haley <aph at redhat.com> wrote:
>
>> On 23/06/15 11:12, Edward Nevill wrote:
>>> This changeset only affects aarch64 files.
>>>
>>> Please review and if OK I will push,
>>
>> Won't this fail to detect an overflow?
>>
>> 1418   if (T == T4H || T == T8H) { imm32 &= 0xffff; nimm32 &= 0xffff; }
>>
>> Otherwise this looks OK to me.
>>
>> Andrew.
>>
>>
> Updated webrev with asserts
>
> http://cr.openjdk.java.net/~enevill/8129551/webrev.01
>
> Could a JDK9 Reviewer please review this,
>
> Thanks,
> Ed.
>

From edward.nevill at linaro.org  Tue Jun 23 19:01:44 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Tue, 23 Jun 2015 20:01:44 +0100
Subject: [aarch64-port-dev ] RFR: 8129551: some regressions introduced
 by addition of vectorisation code
In-Reply-To: <55899D88.8090904@oracle.com>
References: <1435054332.5083.15.camel@mylittlepony.linaroharston>
	<55893261.7030501@redhat.com>
	<CAEf2cjfdDdjS-_jG3iH1uOMBJNPhL4qneH-ddh7HXMiXTO8VtQ@mail.gmail.com>
	<55899D88.8090904@oracle.com>
Message-ID: <CAEf2cjcfY3K16UpAEgTn4nfFLknkYu_E9nPPCz-zxkTGg-YkKQ@mail.gmail.com>

On 23 June 2015 at 18:55, Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:

> asserts work only in a debug VM, so I would keep the original imm32 &= 0xff and
> imm32 &= 0xffff.
> I think you should also move the comments with the table to
> macroAssembler_aarch64.cpp.


Done. New webrev at

http://cr.openjdk.java.net/~enevill/8129551/webrev.02

Does this look OK now?

Thanks,
Ed.

From vladimir.kozlov at oracle.com  Tue Jun 23 19:42:20 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 23 Jun 2015 12:42:20 -0700
Subject: [aarch64-port-dev ] RFR: 8129551: some regressions introduced
 by addition of vectorisation code
In-Reply-To: <CAEf2cjcfY3K16UpAEgTn4nfFLknkYu_E9nPPCz-zxkTGg-YkKQ@mail.gmail.com>
References: <1435054332.5083.15.camel@mylittlepony.linaroharston>	<55893261.7030501@redhat.com>	<CAEf2cjfdDdjS-_jG3iH1uOMBJNPhL4qneH-ddh7HXMiXTO8VtQ@mail.gmail.com>	<55899D88.8090904@oracle.com>
	<CAEf2cjcfY3K16UpAEgTn4nfFLknkYu_E9nPPCz-zxkTGg-YkKQ@mail.gmail.com>
Message-ID: <5589B69C.5050100@oracle.com>

Yes, looks good.

Thanks,
Vladimir

On 6/23/15 12:01 PM, Edward Nevill wrote:
>
>
> On 23 June 2015 at 18:55, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>
>     asserts work only in a debug VM, so I would keep the original imm32 &=
>     0xff and imm32 &= 0xffff.
>     I think you should also move the comments with the table to
>     macroAssembler_aarch64.cpp.
>
>
> Done. New webrev at
>
> http://cr.openjdk.java.net/~enevill/8129551/webrev.02
>
> Does this look OK now?
>
> Thanks,
> Ed.
>
>

From goetz.lindenmaier at sap.com  Wed Jun 24 06:48:22 2015
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 24 Jun 2015 06:48:22 +0000
Subject: [aarch64-port-dev ] Fix on aarch required after 8073165:
 Contended Locking fast exit bucket.
In-Reply-To: <1435073474.5083.28.camel@mylittlepony.linaroharston>
References: <4295855A5C1DE049A61835A1887419CC2CFFC7DF@DEWDFEMB12A.global.corp.sap>
	<1435073474.5083.28.camel@mylittlepony.linaroharston>
Message-ID: <4295855A5C1DE049A61835A1887419CC2CFFECF1@DEWDFEMB12A.global.corp.sap>

Hi Ed, 

thanks for testing this.
I'll prepare an official RFR including these changes.

Best regards,
  Goetz.

-----Original Message-----
From: Edward Nevill [mailto:edward.nevill at linaro.org] 
Sent: Dienstag, 23. Juni 2015 17:31
To: Lindenmaier, Goetz
Cc: aarch64-port-dev at openjdk.java.net
Subject: Re: [aarch64-port-dev ] Fix on aarch required after 8073165: Contended Locking fast exit bucket.

On Tue, 2015-06-23 at 07:00 +0000, Lindenmaier, Goetz wrote:
> Hi,
> 
> I think you need the fix below after the change in 8073165.
> http://hg.openjdk.java.net/jdk9/dev/hotspot/rev/2abcd8a4896c
> If you verify this, I will submit it along with the ppc fix.
> 
> Best regards,
>   Goetz.

Hi,

This change looks fine. I have merged in the patch, built it on aarch64 and run jtreg/hotspot on the result. The result before and after was

Test results: passed: 856; failed: 4; error: 6

Thanks for doing this,
Ed.

> 
> diff -r 5b5db3d68ab9 src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp
> --- a/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp      Mon Jun 22 17:15:45 2015 +0200
> +++ b/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp      Tue Jun 23 08:54:15 2015 +0200
> @@ -2120,6 +2120,7 @@
>        save_native_result(masm, ret_type, stack_slots);
>      }
> 
> +    __ mov(c_rarg2, rthread);
>      __ lea(c_rarg1, Address(sp, lock_slot_offset * VMRegImpl::stack_slot_size));
>      __ mov(c_rarg0, obj_reg);
> 
> @@ -2128,7 +2129,7 @@
>      __ ldr(r19, Address(rthread, in_bytes(Thread::pending_exception_offset())));
>      __ str(zr, Address(rthread, in_bytes(Thread::pending_exception_offset())));
> 
> -    rt_call(masm, CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_unlocking_C), 2, 0, 1);
> +    rt_call(masm, CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_unlocking_C), 3, 0, 1);
> 
> #ifdef ASSERT
>      {


From david.holmes at oracle.com  Wed Jun 24 09:48:39 2015
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 24 Jun 2015 19:48:39 +1000
Subject: [aarch64-port-dev ] RFR: 8081294: aarch64: fails to build on
	ubuntu wily
In-Reply-To: <1435049748.20837.6.camel@mylittlepony.linaroharston>
References: <1435049748.20837.6.camel@mylittlepony.linaroharston>
Message-ID: <558A7CF7.7080101@oracle.com>

Hi Ed,

On 23/06/2015 6:55 PM, Edward Nevill wrote:
> Hi,
>
> One of our partners has reported that jdk9 fails to build for aarch64 on ubuntu 'wily'
>
> The failing buildlog is here
>
> https://launchpad.net/ubuntu/+source/openjdk-9/9~b64-1ubuntu1/+build/7441971
>
> The following webrev fixes this
>
> http://cr.openjdk.java.net/~enevill/8081294/webrev/
>
> I have verified that this fixes the problem by cross compiling against a sysroot.
>
> Our partner has also verified that this patch fixes the build.
>
> As this is a change to shared code I will need someone to sponsor this and push it through JPRT.

I'll handle this for you. As it is trivial and only affects aarch64, my
review is sufficient.

Thanks,
David


> Thanks for your help,
> Ed
>
>

From edward.nevill at gmail.com  Wed Jun 24 13:36:51 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 24 Jun 2015 14:36:51 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <5589CB73.7020509@oracle.com>
References: <1434979401.21282.31.camel@mint> <558815DA.8020500@redhat.com>
	<1434985182.21282.34.camel@mint> <558823FD.5080800@redhat.com>
	<558829D2.4000503@redhat.com> <55882C94.7030505@redhat.com>
	<55882ECC.8030602@redhat.com> <5588313C.1070409@redhat.com>
	<5588345A.4060708@redhat.com> <5589308C.6000309@redhat.com>
	<558935D2.1020103@redhat.com>
	<1435067460.5083.21.camel@mylittlepony.linaroharston>
	<5589CB73.7020509@oracle.com>
Message-ID: <1435153011.13459.2.camel@mint>

On Tue, 2015-06-23 at 15:11 -0600, Alejandro E Murillo wrote:
> On 6/23/2015 7:51 AM, Edward Nevill wrote:
> > On Tue, 2015-06-23 at 11:32 +0100, Andrew Dinn wrote:
> >> On 23/06/15 11:10, Andrew Haley wrote:
> >>> On 22/06/15 17:14, Andrew Dinn wrote:
> >>>> . . .
> >>>> Well, you might not want to take this risk and instead add an explicit
> >>>> zero of the upper half. But I think we need to be clear what risk we are
> >>>> taking.
> >>> It's this: if we don't explicitly zero the upper half we'll have to
> >>> audit all the code which might present a sign-extended value (instead
> >>> of a zero-extended one) in a register that's supposed to contain a
> >>> jint.
> >> Ok, let's play safe. If Ed tweaks the patch to zero the upper word we
> >> can always revise that later if/when we decide we are feeling lucky.
> >>
> > OK. New webrev at
> >
> > http://cr.openjdk.java.net/~enevill/8129426/webrev.02/
> >
> > All the best,
> > Ed.
> >
> >
> Hi,
> to be consistent with similar integrations and to avoid potential 
> merging problems,
> going forward please work with the hs-rt repo for this kind of change,
> as Volker has been doing.

Hi Alejandro,

I have rebased the patch on hs-rt. New webrev

http://cr.openjdk.java.net/~enevill/8129426/webrev.03/

Does it look OK to push?

Thanks,
Ed.


From edward.nevill at gmail.com  Wed Jun 24 15:27:32 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 24 Jun 2015 16:27:32 +0100
Subject: [aarch64-port-dev ] RFR: 8086087: aarch64: add support for 64 bit
	vectors
Message-ID: <1435159652.13459.6.camel@mint>

Hi,

The following webrev based on the hs-rt repo

http://cr.openjdk.java.net/~enevill/8086087/webrev.01

Adds support for 64 bit vectors on aarch64. Previously the vector code only supported 128 bit vectors.

32 bit vectors are not supported directly as aarch64 has no support for 32 bit vectors, however the above webrev will permit 32 bit vectors but just place them in a 64 bit vector.

I have tested this with JTreg hotspot and get the same results before and after the change, viz,

Test results: passed: 845; failed: 12; error: 6

I have also benchmarked the Test*Vect tests from 6340864 in the hotspot test suite. The following are the average results I get on one of our partners' HW (lower number is better).

TestByteVect:  128-bit (11.77), 64-bit (4.36)
TestShortVect: 128-bit (5.02),  64-bit (5.22)
TestIntVect:   128-bit (7.81),  64-bit (7.70)
TestLongVect:  128-bit (11.67), 64-bit (11.71)
TestFloatVect: 128-bit (16.75), 64-bit (17.29)
TestDoubleVect:128-bit (32.37), 64-bit (32.43)

So the only test which shows an improvement is TestByteVect which shows a 2.7x speedup. The other tests are the same within the bounds of experimental error.

The reason TestByteVect shows such an improvement is that with 128 bit vectors it is not being vectorized at all because the loop is not unrolled sufficiently to allow it to be vectorized, whereas with 64 bit vectors it is.
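
To make the arithmetic behind the numbers above explicit, here is a small Java sketch. It is not part of the webrev, and the class and method names are made up for illustration. It computes the ratio of the 128-bit time to the 64-bit time for a row of the table, and the number of elements that fit in a vector of each width. A byte loop has to supply 16 elements per iteration to fill a 128 bit vector but only 8 to fill a 64 bit one, which is consistent with the unrolling explanation.

```java
// Illustrative sketch only: names are hypothetical, timings are copied from the table above.
class VectorBenchSketch {
    // Ratio of the 128-bit time to the 64-bit time; values above 1.0 mean 64-bit was faster.
    static double speedup(double time128, double time64) {
        return time128 / time64;
    }

    // Number of elements of elemBits each that fit in one vector of vectorBits.
    static int lanes(int vectorBits, int elemBits) {
        return vectorBits / elemBits;
    }

    public static void main(String[] args) {
        System.out.printf("TestByteVect speedup:  %.2f%n", speedup(11.77, 4.36));
        System.out.printf("TestShortVect speedup: %.2f%n", speedup(5.02, 5.22));
        System.out.println("bytes per 128 bit vector: " + lanes(128, 8));
        System.out.println("bytes per 64 bit vector:  " + lanes(64, 8));
    }
}
```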

Please review and let me know if this is OK to push?

Ed.


From vladimir.kozlov at oracle.com  Wed Jun 24 16:57:19 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 24 Jun 2015 09:57:19 -0700
Subject: [aarch64-port-dev ] RFR: 8086087: aarch64: add support for 64
	bit vectors
In-Reply-To: <1435159652.13459.6.camel@mint>
References: <1435159652.13459.6.camel@mint>
Message-ID: <558AE16F.9080704@oracle.com>

Hi, Ed

I am worried about 32 bit vectors. There could be a conflict somewhere in RA since min_vector_size will not match the minimum
vector register VecD size.

Can you split these changes into separate changesets? One to support VecD (64 bit) and another for 32 bit vectors.
If testing shows problems, we can then check more precisely which change caused them.

And this should be reviewed on compiler mailing list instead of runtime.

Thanks,
Vladimir

On 6/24/15 8:27 AM, Edward Nevill wrote:
> Hi,
>
> The following webrev based on the hs-rt repo
>
> http://cr.openjdk.java.net/~enevill/8086087/webrev.01
>
> Adds support for 64 bit vectors on aarch64. Previously the vector code only supported 128 bit vectors.
>
> 32 bit vectors are not supported directly as aarch64 has no support for 32 bit vectors, however the above webrev will permit 32 bit vectors but just place them in a 64 bit vector.
>
> I have tested this with JTreg hotspot and get the same results before and after the change, viz,
>
> Test results: passed: 845; failed: 12; error: 6
>
> I have also benchmarked the Test*Vect tests from 6340864 in the hotspot test suite. The following are the average results I get on one of our partners' HW (lower number is better).
>
> TestByteVect:  128-bit (11.77), 64-bit (4.36)
> TestShortVect: 128-bit (5.02),  64-bit (5.22)
> TestIntVect:   128-bit (7.81),  64-bit (7.70)
> TestLongVect:  128-bit (11.67), 64-bit (11.71)
> TestFloatVect: 128-bit (16.75), 64-bit (17.29)
> TestDoubleVect:128-bit (32.37), 64-bit (32.43)
>
> So the only test which shows an improvement is TestByteVect which shows a 2.7x speedup. The other tests are the same within the bounds of experimental error.
>
> The reason TestByteVect shows such an improvement is that with 128 bit vectors it is not being vectorized at all because the loop is not unrolled sufficiently to allow it to be vectorized, whereas with 64 bit vectors it is.
>
> Please review and let me know if this is OK to push?
>
> Ed.
>
>

From edward.nevill at gmail.com  Wed Jun 24 18:53:12 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 24 Jun 2015 19:53:12 +0100
Subject: [aarch64-port-dev ] RFR: 8086087: aarch64: add support for 64
	bit vectors
In-Reply-To: <558AE16F.9080704@oracle.com>
References: <1435159652.13459.6.camel@mint>  <558AE16F.9080704@oracle.com>
Message-ID: <1435171992.13459.22.camel@mint>

On Wed, 2015-06-24 at 09:57 -0700, Vladimir Kozlov wrote:
> Hi, Ed
> 
> I am worried about 32 bit vectors. There could be a conflict somewhere in RA since min_vector_size will not match the minimum
> vector register VecD size.
> 
> Can you split these changes into separate changesets? One to support VecD (64 bit) and another for 32 bit vectors.
> If testing shows problems, we can then check more precisely which change caused them.

Hi Vladimir,

Thanks for the review. I am generally happy that putting 32 bit values in 64 bit registers is OK. I initially did the 64 bit registers by putting them in 128 bit registers.

That worked OK, but there were 2 problems.

First when a register was spilled I had to spill 128 bits since I did not know the size at the point of the spill.

The second problem was with scalar reduction when doing an add across the vector, rather than a parallel vector operation. In this case it would get the wrong result if the top 64 bits were non zero.

This is why I generated a separate 64 bit vectorisation.

With 32 bit, spilling 64 bits instead of 32 bits does not matter, and scalar reduction operations do not exist for 32 bit (the minimum is 2I).
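
The reduction problem described above can be modelled in a few lines of Java. This is only a sketch: a 128 bit register is represented as four int lanes, the lane contents are invented, and addAcross is a hypothetical stand-in for an add-across-vector instruction. A 2I vector occupies lanes 0 and 1; if lanes 2 and 3 hold leftover values, reducing over all four lanes gives a different answer from reducing over the two that are actually in use.

```java
// Illustrative model only: the lane values and method name are made up for this example.
class ReductionWidthSketch {
    // Add the first `lanes` 32-bit lanes of a simulated vector register.
    static int addAcross(int[] reg, int lanes) {
        int sum = 0;
        for (int i = 0; i < lanes; i++) {
            sum += reg[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Lanes 0-1 are the live 2I vector; lanes 2-3 are stale upper-half contents.
        int[] reg = {10, 20, 7, -3};
        System.out.println("reduce over 2 lanes (64 bit view):  " + addAcross(reg, 2));
        System.out.println("reduce over 4 lanes (128 bit view): " + addAcross(reg, 4));
    }
}
```

With a separate 64 bit vector type the reduction only ever sees the two live lanes, so the stale upper half no longer matters.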

I will do as you suggest, and split it into two webrevs.

> 
> And this should be reviewed on compiler mailing list instead of runtime.

And should the changeset then be based on hs-comp and pushed to hs-comp?

All the best,
Ed.


From vladimir.kozlov at oracle.com  Wed Jun 24 18:58:04 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 24 Jun 2015 11:58:04 -0700
Subject: [aarch64-port-dev ] RFR: 8086087: aarch64: add support for 64
	bit vectors
In-Reply-To: <1435171992.13459.22.camel@mint>
References: <1435159652.13459.6.camel@mint> <558AE16F.9080704@oracle.com>
	<1435171992.13459.22.camel@mint>
Message-ID: <558AFDBC.1080705@oracle.com>

 > And should the changeset then be based on hs-comp and pushed to hs-comp?

Yes

On 6/24/15 11:53 AM, Edward Nevill wrote:
> On Wed, 2015-06-24 at 09:57 -0700, Vladimir Kozlov wrote:
>> Hi, Ed
>>
>> I am worried about 32 bit vectors. There could be a conflict somewhere in RA since min_vector_size will not match the minimum
>> vector register VecD size.
>>
>> Can you split these changes into separate changesets? One to support VecD (64 bit) and another for 32 bit vectors.
>> If testing shows problems, we can then check more precisely which change caused them.
>
> Hi Vladimir,
>
> Thanks for the review. I am generally happy that putting 32 bit values in 64 bit registers is OK. I initially did the 64 bit registers by putting them in 128 bit registers.
>
> That worked OK, but there were 2 problems.
>
> First when a register was spilled I had to spill 128 bits since I did not know the size at the point of the spill.
>
> The second problem was with scalar reduction when doing an add across the vector, rather than a parallel vector operation. In this case it would get the wrong result if the top 64 bits were non zero.
>
> This is why I generated a separate 64 bit vectorisation.
>
> With 32 bit, spilling 64 bits instead of 32 bits does not matter, and scalar reduction operations do not exist for 32 bit (the minimum is 2I).
>
> I will do as you suggest, and split it into two webrevs.
>
>>
>> And this should be reviewed on compiler mailing list instead of runtime.
>
> And should the changeset then be based on hs-comp and pushed to hs-comp?
>
> All the best,
> Ed.
>
>

From edward.nevill at gmail.com  Wed Jun 24 19:44:09 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 24 Jun 2015 20:44:09 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <558AFCD9.1090501@oracle.com>
References: <1434979401.21282.31.camel@mint> <558815DA.8020500@redhat.com>
	<1434985182.21282.34.camel@mint> <558823FD.5080800@redhat.com>
	<558829D2.4000503@redhat.com> <55882C94.7030505@redhat.com>
	<55882ECC.8030602@redhat.com> <5588313C.1070409@redhat.com>
	<5588345A.4060708@redhat.com> <5589308C.6000309@redhat.com>
	<558935D2.1020103@redhat.com>
	<1435067460.5083.21.camel@mylittlepony.linaroharston>
	<5589CB73.7020509@oracle.com> <1435153011.13459.2.camel@mint>
	<558AFCD9.1090501@oracle.com>
Message-ID: <1435175049.13459.26.camel@mint>

On Wed, 2015-06-24 at 12:54 -0600, Alejandro E Murillo wrote:
> 
> On 6/24/2015 7:36 AM, Edward Nevill wrote:
> > On Tue, 2015-06-23 at 15:11 -0600, Alejandro E Murillo wrote:
> >> On 6/23/2015 7:51 AM, Edward Nevill wrote:
> >>> On Tue, 2015-06-23 at 11:32 +0100, Andrew Dinn wrote:
> >>>> On 23/06/15 11:10, Andrew Haley wrote:
> >>>>> On 22/06/15 17:14, Andrew Dinn wrote:
> >>>>>> . . .
> >>>>>> Well, you might not want to take this risk and instead add an explicit
> >>>>>> zero of the upper half. But I think we need to be clear what risk we are
> >>>>>> taking.
> >>>>> It's this: if we don't explicitly zero the upper half we'll have to
> >>>>> audit all the code which might present a sign-extended value (instead
> >>>>> of a zero-extended one) in a register that's supposed to contain a
> >>>>> jint.
> >>>> Ok, let's play safe. If Ed tweaks the patch to zero the upper word we
> >>>> can always revise that later if/when we decide we are feeling lucky.
> >>>>
> >>> OK. New webrev at
> >>>
> >>> http://cr.openjdk.java.net/~enevill/8129426/webrev.02/
> >>>
> >>> All the best,
> >>> Ed.
> >>>
> >>>
> >> Hi,
> >> to be consistent with similar integrations and to avoid potential
> >> merging problems,
> >> going forward please work with the hs-rt repo for this kind of change,
> >> as Volker has been doing.
> > Hi Alejandro,
> >
> > I have rebased the patch on hs-rt. New webrev
> >
> > http://cr.openjdk.java.net/~enevill/8129426/webrev.03/
> >
> > Does it look OK to push?
> >
> > Thanks,
> > Ed.
> >
> Apologies, I thought you had already pushed that to jdk9/dev,
> but it turns out you had pushed 8129551, not this one.
> 
> If this is a follow-up to the previous push into jdk9/dev (8129551)
> or somewhat related, then it's probably better if you push
> this one to jdk9/dev as well, so as to avoid any possible conflict
> when we merge jdk9/dev with jdk9/hs. If they are completely
> independent then go ahead and push it to hs-rt, after review of course.

OK, I was confused when you suggested it should be pushed to hs-rt, since the change is adding PopCount to C2.

Should I base it on hs-comp and move the review over to hotspot-compiler-dev?

All the best,
Ed.


From alejandro.murillo at oracle.com  Wed Jun 24 18:54:17 2015
From: alejandro.murillo at oracle.com (Alejandro E Murillo)
Date: Wed, 24 Jun 2015 12:54:17 -0600
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <1435153011.13459.2.camel@mint>
References: <1434979401.21282.31.camel@mint> <558815DA.8020500@redhat.com>	
	<1434985182.21282.34.camel@mint> <558823FD.5080800@redhat.com>	
	<558829D2.4000503@redhat.com> <55882C94.7030505@redhat.com>	
	<55882ECC.8030602@redhat.com> <5588313C.1070409@redhat.com>	
	<5588345A.4060708@redhat.com> <5589308C.6000309@redhat.com>	
	<558935D2.1020103@redhat.com>	
	<1435067460.5083.21.camel@mylittlepony.linaroharston>	
	<5589CB73.7020509@oracle.com> <1435153011.13459.2.camel@mint>
Message-ID: <558AFCD9.1090501@oracle.com>


On 6/24/2015 7:36 AM, Edward Nevill wrote:
> On Tue, 2015-06-23 at 15:11 -0600, Alejandro E Murillo wrote:
>> On 6/23/2015 7:51 AM, Edward Nevill wrote:
>>> On Tue, 2015-06-23 at 11:32 +0100, Andrew Dinn wrote:
>>>> On 23/06/15 11:10, Andrew Haley wrote:
>>>>> On 22/06/15 17:14, Andrew Dinn wrote:
>>>>>> . . .
>>>>>> Well, you might not want to take this risk and instead add an explicit
>>>>>> zero of the upper half. But I think we need to be clear what risk we are
>>>>>> taking.
>>>>> It's this: if we don't explicitly zero the upper half we'll have to
>>>>> audit all the code which might present a sign-extended value (instead
>>>>> of a zero-extended one) in a register that's supposed to contain a
>>>>> jint.
>>>> Ok, let's play safe. If Ed tweaks the patch to zero the upper word we
>>>> can always revise that later if/when we decide we are feeling lucky.
>>>>
>>> OK. New webrev at
>>>
>>> http://cr.openjdk.java.net/~enevill/8129426/webrev.02/
>>>
>>> All the best,
>>> Ed.
>>>
>>>
>> Hi,
>> to be consistent with similar integrations and to avoid potential
>> merging problems,
>> going forward please work with the hs-rt repo for this kind of change,
>> as Volker has been doing.
> Hi Alejandro,
>
> I have rebased the patch on hs-rt. New webrev
>
> http://cr.openjdk.java.net/~enevill/8129426/webrev.03/
>
> Does it look OK to push?
>
> Thanks,
> Ed.
>
Apologies, I thought you had already pushed that to jdk9/dev,
but it turns out you had pushed 8129551, not this one.

If this is a follow-up to the previous push into jdk9/dev (8129551)
or somewhat related, then it's probably better if you push
this one to jdk9/dev as well, so as to avoid any possible conflict
when we merge jdk9/dev with jdk9/hs. If they are completely
independent then go ahead and push it to hs-rt, after review of course.

Thanks

-- 
Alejandro


From alejandro.murillo at oracle.com  Wed Jun 24 20:51:30 2015
From: alejandro.murillo at oracle.com (Alejandro E Murillo)
Date: Wed, 24 Jun 2015 14:51:30 -0600
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <1435175049.13459.26.camel@mint>
References: <1434979401.21282.31.camel@mint> <558815DA.8020500@redhat.com>	
	<1434985182.21282.34.camel@mint> <558823FD.5080800@redhat.com>	
	<558829D2.4000503@redhat.com> <55882C94.7030505@redhat.com>	
	<55882ECC.8030602@redhat.com> <5588313C.1070409@redhat.com>	
	<5588345A.4060708@redhat.com> <5589308C.6000309@redhat.com>	
	<558935D2.1020103@redhat.com>	
	<1435067460.5083.21.camel@mylittlepony.linaroharston>	
	<5589CB73.7020509@oracle.com> <1435153011.13459.2.camel@mint>	
	<558AFCD9.1090501@oracle.com> <1435175049.13459.26.camel@mint>
Message-ID: <558B1852.2030800@oracle.com>


On 6/24/2015 1:44 PM, Edward Nevill wrote:
> On Wed, 2015-06-24 at 12:54 -0600, Alejandro E Murillo wrote:
>> On 6/24/2015 7:36 AM, Edward Nevill wrote:
>>> On Tue, 2015-06-23 at 15:11 -0600, Alejandro E Murillo wrote:
>>>> On 6/23/2015 7:51 AM, Edward Nevill wrote:
>>>>> On Tue, 2015-06-23 at 11:32 +0100, Andrew Dinn wrote:
>>>>>> On 23/06/15 11:10, Andrew Haley wrote:
>>>>>>> On 22/06/15 17:14, Andrew Dinn wrote:
>>>>>>>> . . .
>>>>>>>> Well, you might not want to take this risk and instead add an explicit
>>>>>>>> zero of the upper half. But I think we need to be clear what risk we are
>>>>>>>> taking.
>>>>>>> It's this: if we don't explicitly zero the upper half we'll have to
>>>>>>> audit all the code which might present a sign-extended value (instead
>>>>>>> of a zero-extended one) in a register that's supposed to contain a
>>>>>>> jint.
>>>>>> Ok, let's play safe. If Ed tweaks the patch to zero the upper word we
>>>>>> can always revise that later if/when we decide we are feeling lucky.
>>>>>>
>>>>> OK. New webrev at
>>>>>
>>>>> http://cr.openjdk.java.net/~enevill/8129426/webrev.02/
>>>>>
>>>>> All the best,
>>>>> Ed.
>>>>>
>>>>>
>>>> Hi,
>>>> to be consistent with similar integrations and to avoid potential
>>>> merging problems,
>>>> going forward please work with the hs-rt repo for this kind of change,
>>>> as Volker has been doing.
>>> Hi Alejandro,
>>>
>>> I have rebased the patch on hs-rt. New webrev
>>>
>>> http://cr.openjdk.java.net/~enevill/8129426/webrev.03/
>>>
>>> Does it look OK to push?
>>>
>>> Thanks,
>>> Ed.
>>>
>> Apologies, I thought you had already pushed that to jdk9/dev,
>> but it turns out you had pushed 8129551, not this one.
>>
>> If this is a follow-up to the previous push into jdk9/dev (8129551)
>> or somewhat related, then it's probably better if you push
>> this one to jdk9/dev as well, so as to avoid any possible conflict
>> when we merge jdk9/dev with jdk9/hs. If they are completely
>> independent then go ahead and push it to hs-rt, after review of course.
> OK, I was confused when you suggested it should be pushed to hs-rt, since the change is adding PopCount to C2.
>
> Should I base it on hs-comp and move the review over to hotspot-compiler-dev?
>
> All the best,
> Ed.
>
>
As I said, I didn't double check and thought that was for the
push you had done into jdk9/dev.
Yes, this one looks more appropriate for hs-comp.
Depending on the change, going forward, push to the
appropriate hotspot group repo (hs-rt or hs-comp), not to jdk9/dev.

cheers

-- 
Alejandro


From aph at redhat.com  Thu Jun 25 09:24:49 2015
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Jun 2015 10:24:49 +0100
Subject: [aarch64-port-dev ] Scalar reduction
Message-ID: <558BC8E1.9050207@redhat.com>

Do you know if int scalar reduction is supposed to work yet?

This doesn't seem to be vectorized:

    int sum(int[] a) {
        int val = 0;
        for(int elem: a)
            val += elem;
        return val;
    }

but this is:

    int[] sum(int[] a, int[] b, int[] result) {
        for(int i = 0; i < a.length; i++)
            result [i] = a[i] + b[i];
        return result;
    }

Hmmmm, doesn't seem to work on x86 either.  Baffled.

Andrew.

From edward.nevill at gmail.com  Thu Jun 25 10:20:49 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 25 Jun 2015 11:20:49 +0100
Subject: [aarch64-port-dev ] Scalar reduction
In-Reply-To: <558BC8E1.9050207@redhat.com>
References: <558BC8E1.9050207@redhat.com>
Message-ID: <1435227649.11204.11.camel@mylittlepony.linaroharston>

On Thu, 2015-06-25 at 10:24 +0100, Andrew Haley wrote:
> Do you know if int scalar reduction is supposed to work yet?

Yes, the following shows an example

--- cut here ---
public class Sum
{
  public static void main(String[] args) {
    int[] a = new int[256*1024];
    int[] b = new int[256*1024];
    init(a,b);
    int total = 0;
    for(int j = 0; j < 2000; j++) {
      total = sum(a,b);
    }
    System.out.println("total = " + total);
  }

  public static void init(
    int[] a,
    int[] b)
  {
    for(int j = 0; j < 1; j++)
    {
      for(int i = 0; i < a.length; i++)
      {
        a[i] = i * 1 + j;
        b[i] = i * 1 - j;
      }
    }
  }

  public static int sum(
    int[] a,
    int[] b)
  {
    int total = 0;
    for(int i = 0; i < a.length; i++)
    {
      total += a[i] + b[i];
    }
    return total;
  }

}
--- cut here ---

This generates

  0x000003ff850eaa00: sbfiz     x11, x16, #2, #32  ;*iaload
                                                ; - Sum::sum at 13 (line 35)

  0x000003ff850eaa04: add       x12, x2, x11
  0x000003ff850eaa08: add       x11, x18, x11
  0x000003ff850eaa0c: ldr       q17, [x11,#16]
  0x000003ff850eaa10: ldr       q16, [x12,#16]
  0x000003ff850eaa14: sbfiz     x11, x16, #2, #32
  0x000003ff850eaa18: add       x12, x2, x11
  0x000003ff850eaa1c: add       x11, x18, x11
  0x000003ff850eaa20: ldr       q19, [x11,#32]
  0x000003ff850eaa24: ldr       q18, [x12,#32]
  0x000003ff850eaa28: add       v16.4s, v16.4s, v17.4s
  0x000003ff850eaa2c: add       v17.4s, v18.4s, v19.4s
  0x000003ff850eaa30: addv      s18, v16.4s        <<<<< SCALAR REDUCTION
  0x000003ff850eaa34: mov       w12, v18.s[0]
  0x000003ff850eaa38: add       w11, w12, w0
  0x000003ff850eaa3c: add       w16, w16, #0x8  ;*iinc
                                                ; - Sum::sum at 20 (line 33)

  0x000003ff850eaa40: addv      s16, v17.4s
  0x000003ff850eaa44: mov       w13, v16.s[0]
  0x000003ff850eaa48: add       w0, w13, w11    ;*iadd
                                                ; - Sum::sum at 18 (line 35)

  0x000003ff850eaa4c: cmp       w16, w10
  0x000003ff850eaa50: b.lt      0x000003ff850eaa00  ;*if_icmpge
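
Read back into Java, the loop above does roughly the following per iteration: two 4-lane vector adds of a[] and b[] (the add v.4s instructions), each followed by an add across the four lanes (addv) whose result is folded into the running total, with the index advancing by 8. The sketch below only models that shape; the class and method names are invented, and the tail loop for lengths that are not a multiple of 8 is an assumption about what the compiler emits after the main loop.

```java
// Illustrative model of the generated loop; not code from HotSpot.
class VectorSumSketch {
    // Plain scalar version, equivalent to Sum.sum above.
    static int scalarSum(int[] a, int[] b) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i] + b[i];
        }
        return total;
    }

    // Same computation arranged the way the assembly does it: 8 elements per iteration.
    static int vectorShapedSum(int[] a, int[] b) {
        int total = 0;
        int i = 0;
        for (; i + 8 <= a.length; i += 8) {
            for (int half = 0; half < 2; half++) {
                int base = i + half * 4;
                int[] lanes = new int[4];
                for (int l = 0; l < 4; l++) {
                    lanes[l] = a[base + l] + b[base + l]; // add v.4s
                }
                int reduced = 0;
                for (int l = 0; l < 4; l++) {
                    reduced += lanes[l];                  // addv s, v.4s
                }
                total += reduced;                         // scalar add into the total
            }
        }
        for (; i < a.length; i++) {                       // assumed scalar tail loop
            total += a[i] + b[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] a = new int[1003];
        int[] b = new int[1003];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
            b[i] = 3 * i - 7;
        }
        System.out.println(scalarSum(a, b) == vectorShapedSum(a, b));
    }
}
```

Because int addition wraps and is associative, the regrouping gives the same total as the scalar loop even when the sum overflows.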


> 
> This doesn't seem to be vectorized:
> 
>     int sum(int[] a) {
>         int val = 0;
>         for(int elem: a)
>             val += elem;
>         return val;
>     }

But yes, it seems rather bad that it doesn't get this.

I'll take a closer look,
Ed.


From aph at redhat.com  Thu Jun 25 10:23:29 2015
From: aph at redhat.com (Andrew Haley)
Date: Thu, 25 Jun 2015 11:23:29 +0100
Subject: [aarch64-port-dev ] Scalar reduction
In-Reply-To: <1435227649.11204.11.camel@mylittlepony.linaroharston>
References: <558BC8E1.9050207@redhat.com>
	<1435227649.11204.11.camel@mylittlepony.linaroharston>
Message-ID: <558BD6A1.9030502@redhat.com>

On 06/25/2015 11:20 AM, Edward Nevill wrote:
> But yes, it seems rather bad that it doesn't get this.

Ah, OK.  x86 doesn't work either.

I guess better to ask hs-comp.

Andrew.


From edward.nevill at gmail.com  Thu Jun 25 10:40:49 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 25 Jun 2015 11:40:49 +0100
Subject: [aarch64-port-dev ] RFR: 8086087: aarch64: add support for 64 bit
	vectors
Message-ID: <1435228849.11204.17.camel@mylittlepony.linaroharston>

Hi,

The following webrev adds support for 64 bit vectors (only) on aarch64

http://cr.openjdk.java.net/~enevill/8086087/webrev.02

Previously the vector code only supported 128 bit vectors.

32 bit vectors are not supported in this changeset but will be supported in a future changeset.

I have tested this with JTreg hotspot with the following results

Original: Test results: passed: 858; failed: 4; error: 6
Revised:  Test results: passed: 857; failed: 5; error: 6

The additional test failure is compiler/intrinsics/muladd/TestMulAdd.java which fails intermittently with both original and revised versions (I'll take a look at that next:-).

I have also benchmarked the Test*Vect tests from 6340864 in the hotspot test suite. The following are the average results I get on one of our partners' HW (lower number is better).

TestByteVect:  128-bit (11.77), 64-bit (4.36)
TestShortVect: 128-bit (5.02),  64-bit (5.22)
TestIntVect:   128-bit (7.81),  64-bit (7.70)
TestLongVect:  128-bit (11.67), 64-bit (11.71)
TestFloatVect: 128-bit (16.75), 64-bit (17.29)
TestDoubleVect:128-bit (32.37), 64-bit (32.43)

So the only test which shows an improvement is TestByteVect which shows a 2.7x speedup. The other tests are the same within the bounds of experimental error.

The reason TestByteVect shows such an improvement is that with 128 bit vectors it is not being vectorized at all because the loop is not unrolled sufficiently to allow it to be vectorized, whereas with 64 bit vectors it is.

Please review and let me know if this is OK to push?

Ed.

PS: For pushing an aarch64 specific change to hs-comp do I need 1 or 2 reviewers?


From vladimir.kozlov at oracle.com  Thu Jun 25 14:20:41 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 25 Jun 2015 07:20:41 -0700
Subject: [aarch64-port-dev ] RFR: 8086087: aarch64: add support for 64
	bit vectors
In-Reply-To: <1435228849.11204.17.camel@mylittlepony.linaroharston>
References: <1435228849.11204.17.camel@mylittlepony.linaroharston>
Message-ID: <558C0E39.2060000@oracle.com>

This looks good. Thank you, Ed.

Since the changes are big you need 2 reviewers. One official reviewer (me in this case) and one who is familiar with this code 
and is at least a committer (Andrew, for example).

Thanks,
Vladimir

On 6/25/15 3:40 AM, Edward Nevill wrote:
> Hi,
>
> The following webrev adds support for 64 bit vectors (only) on aarch64
>
> http://cr.openjdk.java.net/~enevill/8086087/webrev.02
>
> Previously the vector code only supported 128 bit vectors.
>
> 32 bit vectors are not supported in this changeset but will be supported in a future changeset.
>
> I have tested this with JTreg hotspot with the following results
>
> Original: Test results: passed: 858; failed: 4; error: 6
> Revised:  Test results: passed: 857; failed: 5; error: 6
>
> The additional test failure is compiler/intrinsics/muladd/TestMulAdd.java which fails intermittently with both original and revised versions (I'll take a look at that next:-).
>
> I have also benchmarked the Test*Vect tests from 6340864 in the hotspot test suite. The following are the average results I get on one of our partners' HW (lower number is better).
>
> TestByteVect:  128-bit (11.77), 64-bit (4.36)
> TestShortVect: 128-bit (5.02),  64-bit (5.22)
> TestIntVect:   128-bit (7.81),  64-bit (7.70)
> TestLongVect:  128-bit (11.67), 64-bit (11.71)
> TestFloatVect: 128-bit (16.75), 64-bit (17.29)
> TestDoubleVect:128-bit (32.37), 64-bit (32.43)
>
> So the only test which shows an improvement is TestByteVect which shows a 2.7x speedup. The other tests are the same within the bounds of experimental error.
>
> The reason TestByteVect shows such an improvement is that with 128 bit vectors it is not being vectorized at all because the loop is not unrolled sufficiently to allow it to be vectorized, whereas with 64 bit vectors it is.
>
> Please review and let me know if this is OK to push?
>
> Ed.
>
> PS: For pushing an aarch64 specific change to hs-comp do I need 1 or 2 reviewers?
>
>

From edward.nevill at linaro.org  Thu Jun 25 14:37:38 2015
From: edward.nevill at linaro.org (Edward Nevill)
Date: Thu, 25 Jun 2015 15:37:38 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for PopCount
	in C2
Message-ID: <1435243058.29000.4.camel@mylittlepony.linaroharston>

Hi,

Aarch64 currently does not support the PopCountI and PopCountL nodes in aarch64.ad

The following webrev adds support for these using the SIMD instructions 'cnt' and 'addv'
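
For readers unfamiliar with the two instructions, the idea can be modelled in Java as below. This is a sketch of the technique, not the code in the webrev: 'cnt' produces a bit count for each byte of the vector, and 'addv' adds those per-byte counts together. The class and method names are invented, and the zero extension of the 32 bit input is an assumption based on the earlier discussion about zeroing the upper word.

```java
// Illustrative model of popcount via per-byte counts plus an add across bytes.
class PopCountSketch {
    // Models cnt (count set bits in each of the 8 bytes) followed by addv (sum the counts).
    static int popCount64(long x) {
        int total = 0;
        for (int i = 0; i < 8; i++) {
            int oneByte = (int) (x >>> (8 * i)) & 0xFF; // extract byte i
            total += Integer.bitCount(oneByte);         // per-byte count, as cnt would give
        }
        return total;                                   // accumulated sum, as addv would give
    }

    // The 32 bit case: zero the upper half first so stale or sign bits are not counted.
    static int popCount32(int x) {
        return popCount64(x & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        int i = 0x0F0F00FF;
        long l = 0xFFFF0000FFFF0000L;
        System.out.println(popCount32(i) == Integer.bitCount(i));
        System.out.println(popCount64(l) == Long.bitCount(l));
    }
}
```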

http://cr.openjdk.java.net/~enevill/8129426/webrev.04

This patch was contributed by alexander.alexeev at caviumnetworks.com

The patch only modifies aarch64 specific files.

I have merged the patch in and tested it with JTreg / hotspot with the following results for both original and revised

Test results: passed: 858; failed: 4; error: 6

I have benchmarked the patch on four different partner platforms. The average improvement was 2.6X for PopCountI and 2.5X for PopCountL.

Please review,

Thanks,
Ed.


From edward.nevill at gmail.com  Thu Jun 25 14:44:04 2015
From: edward.nevill at gmail.com (Edward Nevill)
Date: Thu, 25 Jun 2015 15:44:04 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for PopCount
	in C2
Message-ID: <1435243444.29000.6.camel@mylittlepony.linaroharston>

Hi,

Aarch64 currently does not support the PopCountI and PopCountL nodes in aarch64.ad

The following webrev adds support for these using the SIMD instructions 'cnt' and 'addv'

http://cr.openjdk.java.net/~enevill/8129426/webrev.04

This patch was contributed by alexander.alexeev at caviumnetworks.com

The patch only modifies aarch64 specific files.

I have merged the patch in and tested it with JTreg / hotspot with the following results for both original and revised

Test results: passed: 858; failed: 4; error: 6

I have benchmarked the patch on four different partner platforms. The average improvement was 2.6X for PopCountI and 2.5X for PopCountL.

Please review,

Thanks,
Ed.


From vladimir.kozlov at oracle.com  Thu Jun 25 19:29:17 2015
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 25 Jun 2015 12:29:17 -0700
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
	PopCount in C2
In-Reply-To: <1435243444.29000.6.camel@mylittlepony.linaroharston>
References: <1435243444.29000.6.camel@mylittlepony.linaroharston>
Message-ID: <558C568D.5080506@oracle.com>

Looks good.

Thanks,
Vladimir

On 6/25/15 7:44 AM, Edward Nevill wrote:
> Hi,
>
> Aarch64 currently does not support the PopCountI and PopCountL nodes in aarch64.ad
>
> The following webrev adds support for these using the SIMD instructions 'cnt' and 'addv'
>
> http://cr.openjdk.java.net/~enevill/8129426/webrev.04
>
> This patch was contributed by alexander.alexeev at caviumnetworks.com
>
> The patch only modifies aarch64 specific files.
>
> I have merged the patch in and tested it with JTreg / hotspot with the following results for both original and revised
>
> Test results: passed: 858; failed: 4; error: 6
>
> I have benchmarked the patch on four different partner platforms. The average improvement was 2.6X for PopCountI and 2.5X for PopCountL.
>
> Please review,
>
> Thanks,
> Ed.
>
>

From aph at redhat.com  Mon Jun 29 09:07:10 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 29 Jun 2015 10:07:10 +0100
Subject: [aarch64-port-dev ] RFR: 8086087: aarch64: add support for 64
 bit vectors
In-Reply-To: <558C0E39.2060000@oracle.com>
References: <1435228849.11204.17.camel@mylittlepony.linaroharston>
	<558C0E39.2060000@oracle.com>
Message-ID: <55910ABE.8050809@redhat.com>

On 25/06/15 15:20, Vladimir Kozlov wrote:
> Since the changes are big you need 2 reviewers. One official reviewer (me in this case) and one who is familiar with this code 
> and is at least a committer (Andrew, for example).

Looks good.

Thanks,
Andrew.


From aph at redhat.com  Mon Jun 29 09:08:18 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 29 Jun 2015 10:08:18 +0100
Subject: [aarch64-port-dev ] RFR: 8129426: aarch64: add support for
 PopCount in C2
In-Reply-To: <1435243444.29000.6.camel@mylittlepony.linaroharston>
References: <1435243444.29000.6.camel@mylittlepony.linaroharston>
Message-ID: <55910B02.50604@redhat.com>

On 25/06/15 15:44, Edward Nevill wrote:
> Please review,

This is fine.

Thanks,
Andrew.


From aph at redhat.com  Mon Jun 29 12:15:21 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 29 Jun 2015 13:15:21 +0100
Subject: [aarch64-port-dev ] Sign-extending 32-bit operands in adapters
 [Was: RFR: 8129426: aarch64: add support for PopCount in C2]
In-Reply-To: <5589308C.6000309@redhat.com>
References: <1434979401.21282.31.camel@mint>	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>	<558823FD.5080800@redhat.com>
	<558829D2.4000503@redhat.com>	<55882C94.7030505@redhat.com>
	<55882ECC.8030602@redhat.com>	<5588313C.1070409@redhat.com>
	<5588345A.4060708@redhat.com> <5589308C.6000309@redhat.com>
Message-ID: <559136D9.5000801@redhat.com>

Here's a snippet from gen_i2c_adapter where we sign extend:

      if (!r_2->is_valid()) {
        // sign extend???
        __ ldrsw(rscratch2, Address(esp, ld_off));
        __ str(rscratch2, Address(sp, st_off));

but in another place we don't sign extend:

        // sign extend and use a full word?
        __ ldrw(r, Address(esp, ld_off));
      }

So, we sign extend when our argument is passed to compiled code in
memory, but zero extend when it is passed in a register.  The
confusion (and those comments) about what should happen seems to come
from the x86 code.  I think we've agreed that we should zero extend,
but I'm still far from convinced that we should ever use an input
operand in any mode other than its natural size.
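
To make the difference concrete, here is a small model of the two loads in plain C++. It is only a sketch: the helper names are invented for illustration, and it assumes the usual two's-complement conversion behaviour. ldrsw fills the upper 32 bits with copies of the sign bit, ldrw clears them, and a consumer that reads only the low 32 bits cannot tell the two apart.

```cpp
#include <cstdint>

// Widen a 32-bit slot to 64 bits by sign extension, as ldrsw does.
inline uint64_t widen_signed(uint32_t raw) {
    return static_cast<uint64_t>(
        static_cast<int64_t>(static_cast<int32_t>(raw)));
}

// Widen by zero extension, as ldrw (a load into a W register) does:
// the upper 32 bits of the X register are cleared.
inline uint64_t widen_zero(uint32_t raw) {
    return static_cast<uint64_t>(raw);
}

// A consumer that uses the operand at its natural 32-bit size sees the
// same value regardless of how the upper half was filled.
inline int32_t low_word(uint64_t reg) {
    return static_cast<int32_t>(static_cast<uint32_t>(reg));
}
```

For a Java int of -1, widen_signed gives 0xFFFFFFFFFFFFFFFF and widen_zero gives 0x00000000FFFFFFFF, yet low_word returns -1 for both. The two only disagree if something reads the full 64-bit register.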

Andrew.

From goetz.lindenmaier at sap.com  Mon Jun 29 12:46:42 2015
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 29 Jun 2015 12:46:42 +0000
Subject: [aarch64-port-dev ] Sign-extending 32-bit operands in adapters
 [Was: [aarch64-port-dev	] RFR: 8129426: aarch64: add support for PopCount
 in C2]
In-Reply-To: <559136D9.5000801@redhat.com>
References: <1434979401.21282.31.camel@mint>	<558815DA.8020500@redhat.com>
	<1434985182.21282.34.camel@mint>	<558823FD.5080800@redhat.com>
	<558829D2.4000503@redhat.com>	<55882C94.7030505@redhat.com>
	<55882ECC.8030602@redhat.com>	<5588313C.1070409@redhat.com>
	<5588345A.4060708@redhat.com> <5589308C.6000309@redhat.com>
	<559136D9.5000801@redhat.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC2D000CA6@DEWDFEMB12A.global.corp.sap>

Hi Andrew, 

I have an off-topic question that touches this issue:
You have CCallingConventionRequiresIntsAsLongs set to true.

We introduced this some time ago because we have to sign-extend all ints and place them
in long slots for the PPC C calling conventions.
If this is set, we do the cast in the frontend, where the i2l nodes can be optimized, and we
don't need to do it in the native wrapper.
Unfortunately, there are more and more intrinsics with explicitly constructed
calls.  We have to adapt these in the frontend, which requires rather ugly changes to shared code.

I am thinking about doing the i2l cast right in the native wrapper.  There, the cast will always be
necessary, i.e., it is never optimized away, but that is not really performance relevant.

So basically I could remove the code guarded by CCallingConventionRequiresIntsAsLongs,
except that you use it ...
But as I read the aarch64 code, it's not really necessary.  You pass ints in small slots anyway.
So do you rely on that code?

Best regards,
  Goetz.



From aph at redhat.com  Mon Jun 29 12:54:38 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 29 Jun 2015 13:54:38 +0100
Subject: [aarch64-port-dev ] Sign-extending 32-bit operands in adapters
 [Was: RFR: 8129426: aarch64: add support for PopCount in C2]
In-Reply-To: <4295855A5C1DE049A61835A1887419CC2D000CA6@DEWDFEMB12A.global.corp.sap>
References: <1434979401.21282.31.camel@mint>	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>	<558823FD.5080800@redhat.com>	<558829D2.4000503@redhat.com>	<55882C94.7030505@redhat.com>	<55882ECC.8030602@redhat.com>	<5588313C.1070409@redhat.com>	<5588345A.4060708@redhat.com>
	<5589308C.6000309@redhat.com> <559136D9.5000801@redhat.com>
	<4295855A5C1DE049A61835A1887419CC2D000CA6@DEWDFEMB12A.global.corp.sap>
Message-ID: <5591400E.9090702@redhat.com>

On 06/29/2015 01:46 PM, Lindenmaier, Goetz wrote:
> So basically I could remove the code guarded by CCallingConventionRequiresIntsAsLongs,
> except that you use it ...
> But as I read the aarch64 code, it's not really necessary.  You pass ints in small slots anyway.
> So do you rely on that code?

Could you please point me to exactly the code you are talking about
which we use?

Andrew.


From aph at redhat.com  Mon Jun 29 13:00:29 2015
From: aph at redhat.com (Andrew Haley)
Date: Mon, 29 Jun 2015 14:00:29 +0100
Subject: [aarch64-port-dev ] Sign-extending 32-bit operands in adapters
 [Was: RFR: 8129426: aarch64: add support for PopCount in C2]
In-Reply-To: <5591400E.9090702@redhat.com>
References: <1434979401.21282.31.camel@mint>	<558815DA.8020500@redhat.com>	<1434985182.21282.34.camel@mint>	<558823FD.5080800@redhat.com>	<558829D2.4000503@redhat.com>	<55882C94.7030505@redhat.com>	<55882ECC.8030602@redhat.com>	<5588313C.1070409@redhat.com>	<5588345A.4060708@redhat.com>	<5589308C.6000309@redhat.com>
	<559136D9.5000801@redhat.com>	<4295855A5C1DE049A61835A1887419CC2D000CA6@DEWDFEMB12A.global.corp.sap>
	<5591400E.9090702@redhat.com>
Message-ID: <5591416D.7000804@redhat.com>

On 06/29/2015 01:54 PM, Andrew Haley wrote:
> On 06/29/2015 01:46 PM, Lindenmaier, Goetz wrote:
>> So basically I could remove the code guarded by CCallingConventionRequiresIntsAsLongs,
>> except that you use it ...
>> But as I read the aarch64 code, it's not really necessary.  You pass ints in small slots anyway.
>> So do you rely on that code?
> 
> Could you please point me to exactly the code you are talking about
> which we use?

Or is it simply that we set
CCallingConventionRequiresIntsAsLongs = true?

We don't need to do that.

Andrew.


From goetz.lindenmaier at sap.com  Mon Jun 29 13:04:33 2015
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 29 Jun 2015 13:04:33 +0000
Subject: [aarch64-port-dev ] Sign-extending 32-bit operands in adapters
 [Was: RFR: 8129426: aarch64: add support for PopCount in C2]
In-Reply-To: <5591416D.7000804@redhat.com>
References: <1434979401.21282.31.camel@mint>	<558815DA.8020500@redhat.com>
	<1434985182.21282.34.camel@mint>	<558823FD.5080800@redhat.com>
	<558829D2.4000503@redhat.com>	<55882C94.7030505@redhat.com>
	<55882ECC.8030602@redhat.com>	<5588313C.1070409@redhat.com>
	<5588345A.4060708@redhat.com>	<5589308C.6000309@redhat.com>
	<559136D9.5000801@redhat.com>
	<4295855A5C1DE049A61835A1887419CC2D000CA6@DEWDFEMB12A.global.corp.sap>
	<5591400E.9090702@redhat.com> <5591416D.7000804@redhat.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC2D000CEC@DEWDFEMB12A.global.corp.sap>

Yes, right, that's what I mean.

So I'll remove it.

Best regards,
  Goetz.


From tangwei6 at huawei.com  Tue Jun 30 14:34:12 2015
From: tangwei6 at huawei.com (Tangwei (Euler))
Date: Tue, 30 Jun 2015 14:34:12 +0000
Subject: [aarch64-port-dev ] barrier issue in cmpxchgptr implementation
Message-ID: <C8D1E566CC4CA845813731FB6FD5C530010FCC50@SZXEMI503-MBX.china.huawei.com>

Hi All,
  I checked the MacroAssembler::cmpxchgptr implementation in the JVM and found that a load-acquire/store-release pair followed by a full memory barrier is used to simulate the behavior of X86 cmpxchg.
The code from that function is shown below for reference. In my opinion, cmpxchg must provide full bi-directional fence semantics. Can anyone explain why one
membar at the end is enough for cmpxchg? Or does the commented-out memory barrier at the beginning need to be added as well?

  // membar(AnyAny)  // is this needed?
retry_load:
  ldaxr(tmp, addr);
  cmp(tmp, oldv);
  br(Assembler::NE, nope);
  stlxr(tmp, newv, addr);
  cbzw(tmp, succeed);
  b(retry_load);
nope:
  membar(AnyAny);
  mov(oldv, tmp);

I checked the source of __cmpxchg_mb, and it seems two memory barriers are added around cmpxchg.

smp_mb();
ret = __cmpxchg(ptr, old, new, size);
smp_mb();

But after checking the GCC intrinsic with the following simple test case, I found that there is no barrier around the load-acquire/store-release pair.
I am a little confused now. Which instruction sequence should be chosen to simulate X86 cmpxchg?

long foo (long *ptr, long old, long new)
{
    return __sync_val_compare_and_swap (ptr, old, new);
}
Assembly:
        ldaxr   x3, [x0]        // 22   aarch64_load_exclusivedi        [length = 4]
        cmp     x3, x1  // 23   *cmpdi/1        [length = 4]
        bne     .L3     // 24   *condjump       [length = 4]
        stlxr   w4, x2, [x0]    // 25   aarch64_store_exclusivedi       [length = 4]
        cbnz    w4, .L2 // 26   *cbnesi1        [length = 4]


Regards!
wei

From aph at redhat.com  Tue Jun 30 18:10:47 2015
From: aph at redhat.com (Andrew Haley)
Date: Tue, 30 Jun 2015 19:10:47 +0100
Subject: [aarch64-port-dev ] barrier issue in cmpxchgptr implementation
In-Reply-To: <C8D1E566CC4CA845813731FB6FD5C530010FCC50@SZXEMI503-MBX.china.huawei.com>
References: <C8D1E566CC4CA845813731FB6FD5C530010FCC50@SZXEMI503-MBX.china.huawei.com>
Message-ID: <5592DBA7.6010501@redhat.com>

Hi,

On 06/30/2015 03:34 PM, Tangwei (Euler) wrote:

>   I checked the MacroAssembler::cmpxchgptr implementation in the JVM
> and found that a load-acquire/store-release pair followed by a full
> memory barrier is used to simulate the behavior of X86 cmpxchg.

Well, it's not necessarily doing that.  It is a cmpxchg which needs to
be strong enough for HotSpot's usage: it does not necessarily have to
be as strong as x86.  But let's move on...

> The code from that function is shown below for reference. In my
> opinion, cmpxchg must provide full bi-directional fence
> semantics. Can anyone explain why one membar at the end is enough
> for cmpxchg? Or does the commented-out memory barrier at the
> beginning need to be added as well?

>   // membar(AnyAny)  // is this needed?
> retry_load:

LoadLoad|LoadStore

>   ldaxr(tmp, addr);
>   cmp(tmp, oldv);
>   br(Assembler::NE, nope);

StoreStore|LoadStore

>   stlxr(tmp, newv, addr);
>   cbzw(tmp, succeed);
>   b(retry_load);
> nope:
>   membar(AnyAny);

AnyAny.  Nothing can pass here.

Please explain what problem you see with this sequence.  We need an
example of incorrect operation.

> But after checking the GCC intrinsic with the following simple test
> case, I found that there is no barrier around the
> load-acquire/store-release pair.  I am a little confused now. Which
> instruction sequence should be chosen to simulate X86 cmpxchg?

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

As I said, that depends on whether you really need to simulate X86 cmpxchg.
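
For anyone who wants to experiment outside the JVM, the shape of the sequence under discussion (acquire on the load, release on the store, then a full barrier) can be written down roughly in portable C++ as below. This is only a sketch: the function name is invented, and the instructions a given compiler emits for these orderings are not guaranteed to match the HotSpot sequence quoted above.

```cpp
#include <atomic>
#include <cstdint>

// Returns the value observed at addr. The exchange took place if and
// only if the returned value equals old_value.
inline intptr_t cmpxchg_then_fence(std::atomic<intptr_t>& addr,
                                   intptr_t old_value,
                                   intptr_t new_value) {
    intptr_t observed = old_value;
    // Acquire on the load side and release on the store side, loosely
    // analogous to the ldaxr/stlxr retry loop. On failure the current
    // value is written back into 'observed'.
    addr.compare_exchange_strong(observed, new_value,
                                 std::memory_order_acq_rel,
                                 std::memory_order_acquire);
    // Trailing full fence, playing the role of membar(AnyAny).
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return observed;
}
```

Whether a leading fence is also required depends on what the callers need, which is the question above.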

Andrew.