From openjdk-testing at linaro.org Tue Apr 1 13:44:49 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Tue, 1 Apr 2014 13:44:49 +0000 (UTC)
Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140401134634.04FEE1FD1D@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.

The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/091/summary.html

===============================================================================
server-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/21  pass: 413; fail: 2; error: 2
Build 1: aarch64/2014/mar/25  pass: 413; fail: 2; error: 2
Build 2: aarch64/2014/mar/26  pass: 414; fail: 2; error: 1
Build 3: aarch64/2014/mar/27  pass: 435; fail: 1; error: 2
Build 4: aarch64/2014/mar/28  pass: 435; fail: 1; error: 2
Build 5: aarch64/2014/mar/29  pass: 435; fail: 1; error: 2
Build 6: aarch64/2014/apr/01  pass: 435; fail: 1; error: 2
-------------------------------------------------------------------------------

===============================================================================
server-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/20  pass: 2,939; error: 33
Build 1: aarch64/2014/mar/21  pass: 2,955; error: 17
Build 2: aarch64/2014/mar/25  pass: 2,943; error: 29
Build 3: aarch64/2014/mar/26  pass: 2,960; error: 12
Build 4: aarch64/2014/mar/27  pass: 2,941; error: 31
Build 5: aarch64/2014/mar/28  pass: 2,939; error: 33
Build 6: aarch64/2014/apr/01  pass: 2,939; error: 33
-------------------------------------------------------------------------------

===============================================================================
server-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/20  pass: 5,284; fail: 124; error: 40
Build 1: aarch64/2014/mar/21  pass: 5,285; fail: 123; error: 40
Build 2: aarch64/2014/mar/27  pass: 5,281; fail: 130; error: 39
Build 3: aarch64/2014/mar/28  pass: 5,288; fail: 122; error: 40
Build 4: aarch64/2014/apr/01  pass: 5,286; fail: 124; error: 40
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Tue Apr 1 13:50:39 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Tue, 1 Apr 2014 13:50:39 +0000 (UTC)
Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140401135212.F349A1FD1D@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.

The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/091/summary.html

===============================================================================
client-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/21  pass: 412; fail: 3; error: 2
Build 1: aarch64/2014/mar/25  pass: 413; fail: 3; error: 1
Build 2: aarch64/2014/mar/26  pass: 414; fail: 3
Build 3: aarch64/2014/mar/27  pass: 431; fail: 5; error: 2
Build 4: aarch64/2014/mar/28  pass: 431; fail: 5; error: 2
Build 5: aarch64/2014/mar/29  pass: 431; fail: 5; error: 2
Build 6: aarch64/2014/apr/01  pass: 431; fail: 5; error: 2
-------------------------------------------------------------------------------

===============================================================================
client-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/19  pass: 2,949; error: 23
Build 1: aarch64/2014/mar/20  pass: 2,939; error: 33
Build 2: aarch64/2014/mar/21  pass: 2,946; error: 26
Build 3: aarch64/2014/mar/26  pass: 2,950; error: 22
Build 4: aarch64/2014/mar/27  pass: 2,941; error: 31
Build 5: aarch64/2014/mar/28  pass: 2,940; error: 32
Build 6: aarch64/2014/apr/01  pass: 2,941; error: 31
-------------------------------------------------------------------------------

===============================================================================
client-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/20  pass: 5,271; fail: 130; error: 47
Build 1: aarch64/2014/mar/21  pass: 5,277; fail: 128; error: 43
Build 2: aarch64/2014/mar/27  pass: 5,273; fail: 131; error: 46
Build 3: aarch64/2014/mar/28  pass: 5,273; fail: 131; error: 46
Build 4: aarch64/2014/apr/01  pass: 5,268; fail: 134; error: 48
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From aph at redhat.com Tue Apr 1 16:28:03 2014
From: aph at redhat.com (Andrew Haley)
Date: Tue, 01 Apr 2014 17:28:03 +0100
Subject: [aarch64-port-dev ] Remove unnecessary memory barriers around CAS operations
Message-ID: <533AE913.6020802@redhat.com>

After much careful reading of the AArch64 spec and many runs of jcstress,
I have decided that it is safe to remove the barriers around CAS.

A more thorough reworking of memory barriers is on my list of things
to do, but it needs some HotSpot changes from upstream that aren't yet
in our code base.

Andrew.


# HG changeset patch
# User aph
# Date 1396369343 14400
#      Tue Apr 01 12:22:23 2014 -0400
# Node ID 780ed75ea21a727949abcfe57ea5544f7a1ca22c
# Parent  e176eb39c5f53127fe18a7e528ca6bc6f0c23cea
Remove unnecessary memory barriers around CAS operations

diff -r e176eb39c5f5 -r 780ed75ea21a src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Mon Mar 31 10:20:26 2014 -0400
+++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Tue Apr 01 12:22:23 2014 -0400
@@ -1584,9 +1584,8 @@
   Label retry_load, nope;
   // flush and load exclusive from the memory location
   // and fail if it is not what we expect
-  __ membar(__ AnyAny);
   __ bind(retry_load);
-  __ ldxrw(rscratch1, addr);
+  __ ldaxrw(rscratch1, addr);
   __ cmpw(rscratch1, cmpval);
   __ cset(rscratch1, Assembler::NE);
   __ br(Assembler::NE, nope);
@@ -1603,9 +1602,8 @@
   Label retry_load, nope;
   // flush and load exclusive from the memory location
   // and fail if it is not what we expect
-  __ membar(__ AnyAny);
   __ bind(retry_load);
-  __ ldxr(rscratch1, addr);
+  __ ldaxr(rscratch1, addr);
   __ cmp(rscratch1, cmpval);
   __ cset(rscratch1, Assembler::NE);
   __ br(Assembler::NE, nope);
diff -r e176eb39c5f5 -r 780ed75ea21a src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon Mar 31 10:20:26 2014 -0400
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Tue Apr 01 12:22:23 2014 -0400
@@ -1817,13 +1817,11 @@
   bind(retry_load);
   // flush and load exclusive from the memory location
   // and fail if it is not what we expect
-  membar(AnyAny);
-  ldxr(tmp, addr);
+  ldaxr(tmp, addr);
   cmp(tmp, oldv);
   br(Assembler::NE, nope);
   // if we store+flush with no intervening write tmp wil be zero
-  stxr(tmp, newv, addr);
-  membar(AnyAny);
+  stlxr(tmp, newv, addr);
   cbzw(tmp, succeed);
   // retry so we only ever return after a load fails to compare
   // ensures we don't return a stale value after a failed write.
@@ -1847,13 +1845,11 @@
   bind(retry_load);
   // flush and load exclusive from the memory location
   // and fail if it is not what we expect
-  membar(AnyAny);
-  ldxrw(tmp, addr);
+  ldaxrw(tmp, addr);
   cmp(tmp, oldv);
   br(Assembler::NE, nope);
   // if we store+flush with no intervening write tmp wil be zero
-  stxrw(tmp, newv, addr);
-  membar(AnyAny);
+  stlxrw(tmp, newv, addr);
   cbzw(tmp, succeed);
   // retry so we only ever return after a load fails to compare
   // ensures we don't return a stale value after a failed write.

From aph at redhat.com Tue Apr 1 16:29:03 2014
From: aph at redhat.com (aph at redhat.com)
Date: Tue, 01 Apr 2014 16:29:03 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: 2 new changesets
Message-ID: <201404011629.s31GT5PX025943@aojmv0008>

Changeset: e176eb39c5f5
Author:    aph
Date:      2014-03-31 10:20 -0400
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/e176eb39c5f5

Remove special-case handling of division arguments. AArch64 doesn't need it.

! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp

Changeset: 780ed75ea21a
Author:    aph
Date:      2014-04-01 12:22 -0400
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/780ed75ea21a

Remove unnecessary memory barriers around CAS operations

! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp

From D.Sturm42 at gmail.com Wed Apr 2 00:32:23 2014
From: D.Sturm42 at gmail.com (D.Sturm)
Date: Wed, 2 Apr 2014 02:32:23 +0200
Subject: [aarch64-port-dev ] FP load/stores and atomicity guarantees
Message-ID:

Hi,

what are the current ideas with regard to atomicity of floating point
loads and stores?

JLS 17.7 states that writes and reads of floats are atomic, while for
doubles the two 32-bit halves may be written separately. For volatile
doubles it also requires atomicity.

The problem with these requirements is that the only guarantee the A64 ISA
gives is byte-wise atomicity (B2.6.3):
"No memory accesses involving SIMD and floating-point registers[...] have
single-copy atomicity of any quantity greater than individual bytes."

It seems to me the only way to implement the requirements given by the JLS
is to move non-local floating point values (local ones that are only
visible to one thread are exempt, obviously) into an integer register for
load/store instructions (ouch).

Any thoughts on the topic? Maybe existing processor designs give stronger
guarantees? At the moment it seems that HotSpot is generating normal
ldr/str dX, address instructions indiscriminately.

-- Daniel

From D.Sturm42 at gmail.com Wed Apr 2 01:48:30 2014
From: D.Sturm42 at gmail.com (D.Sturm)
Date: Wed, 2 Apr 2014 03:48:30 +0200
Subject: [aarch64-port-dev ] Remove unnecessary memory barriers around CAS operations
In-Reply-To: <533AE913.6020802@redhat.com>
References: <533AE913.6020802@redhat.com>
Message-ID:

Hey,
could you explain your thinking about why it is safe to remove the second
memory barrier? We are definitely in agreement on the first one (see
previous mails on the matter), but I don't see how we can remove the second
one.

It is my understanding that - using the vocabulary from
http://gee.cs.oswego.edu/dl/jmm/cookbook.html - a lda[x]r instruction is
equivalent to load;LoadStore;LoadLoad and the stl[x]r instruction is
equivalent to LoadStore;StoreStore;store. Consequently, getting all the
necessary ordering guarantees would mean inserting a StoreLoad barrier
after the store instruction (which is equivalent to an AnyAny barrier) -
there are other options, but in any case we need a StoreLoad barrier
*somewhere* (and putting it after writes rather than before reads seems
more efficient in practice - that's what x64 does in HotSpot, at least I
think).

To quote the cookbook:
"The sequence: Store1; StoreLoad; Load2 ensures that Store1's data are made
visible to other processors (i.e., flushed to main memory) before data
accessed by Load2 and all subsequent load instructions are loaded.
StoreLoad barriers protect against a subsequent load incorrectly using
Store1's data value rather than that from a more recent store to the same
location performed by a different processor."

while the A64 reference has to say the following about a store-release:
"A store-release will be observed by each observer after that observer
observes any loads or stores that appear in program order before the
store-release, *but says nothing about loads and stores appearing after the
store-release* (emphasis mine)"

So I don't think we can remove the AnyAny barrier after volatile writes,
and since a CAS has to fulfil the same requirements as a volatile read and
write, it seems to me that the 2nd barrier is necessary.

-- Daniel

On 1 April 2014 18:28, Andrew Haley wrote:

> After much careful reading of the AArch64 spec and many runs of jcstress,
> I have decided that it is safe to remove the barriers around CAS.
>
> A more thorough reworking of memory barriers is on my list of things
> to do, but it needs some HotSpot changes from upstream that aren't yet
> in our code base.
>
> Andrew.
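
[Editor's sketch, not part of the original message.] For readers following
the argument above, the two positions can be written down in portable C++
as a rough analogue. This is only an illustration under stated assumptions:
the function names cas_release_acquire and cas_with_trailing_fence are
hypothetical, and how a given compiler lowers these orderings to AArch64
instructions (for example to ldaxr/stlxr, with or without a dmb) is an
assumption that should be checked against the generated code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical analogue of the patched sequence: the CAS itself carries
// acquire semantics on its load and release semantics on its store, and
// no separate fence follows it.
inline bool cas_release_acquire(std::atomic<int32_t>& word,
                                int32_t expected, int32_t desired) {
  return word.compare_exchange_strong(expected, desired,
                                      std::memory_order_acq_rel,
                                      std::memory_order_acquire);
}

// Hypothetical analogue of the position argued in the mail above: the same
// CAS followed by a full fence, playing the role of the trailing
// membar(AnyAny), i.e. the StoreLoad barrier.
inline bool cas_with_trailing_fence(std::atomic<int32_t>& word,
                                    int32_t expected, int32_t desired) {
  bool ok = word.compare_exchange_strong(expected, desired,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire);
  // Orders the store above against any later loads in this thread.
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return ok;
}
```

Both helpers return true only when the word held the expected value and
was replaced. They differ only in what ordering later loads in the same
thread get relative to the store, which is exactly the point in dispute.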
>
>
> # HG changeset patch
> # User aph
> # Date 1396369343 14400
> #      Tue Apr 01 12:22:23 2014 -0400
> # Node ID 780ed75ea21a727949abcfe57ea5544f7a1ca22c
> # Parent  e176eb39c5f53127fe18a7e528ca6bc6f0c23cea
> Remove unnecessary memory barriers around CAS operations
>
> diff -r e176eb39c5f5 -r 780ed75ea21a src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp
> --- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Mon Mar 31 10:20:26 2014 -0400
> +++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp Tue Apr 01 12:22:23 2014 -0400
> @@ -1584,9 +1584,8 @@
>    Label retry_load, nope;
>    // flush and load exclusive from the memory location
>    // and fail if it is not what we expect
> -  __ membar(__ AnyAny);
>    __ bind(retry_load);
> -  __ ldxrw(rscratch1, addr);
> +  __ ldaxrw(rscratch1, addr);
>    __ cmpw(rscratch1, cmpval);
>    __ cset(rscratch1, Assembler::NE);
>    __ br(Assembler::NE, nope);
> @@ -1603,9 +1602,8 @@
>    Label retry_load, nope;
>    // flush and load exclusive from the memory location
>    // and fail if it is not what we expect
> -  __ membar(__ AnyAny);
>    __ bind(retry_load);
> -  __ ldxr(rscratch1, addr);
> +  __ ldaxr(rscratch1, addr);
>    __ cmp(rscratch1, cmpval);
>    __ cset(rscratch1, Assembler::NE);
>    __ br(Assembler::NE, nope);
> diff -r e176eb39c5f5 -r 780ed75ea21a src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
> --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Mon Mar 31 10:20:26 2014 -0400
> +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Tue Apr 01 12:22:23 2014 -0400
> @@ -1817,13 +1817,11 @@
>    bind(retry_load);
>    // flush and load exclusive from the memory location
>    // and fail if it is not what we expect
> -  membar(AnyAny);
> -  ldxr(tmp, addr);
> +  ldaxr(tmp, addr);
>    cmp(tmp, oldv);
>    br(Assembler::NE, nope);
>    // if we store+flush with no intervening write tmp wil be zero
> -  stxr(tmp, newv, addr);
> -  membar(AnyAny);
> +  stlxr(tmp, newv, addr);
>    cbzw(tmp, succeed);
>    // retry so we only ever return after a load fails to compare
>    // ensures we don't return a stale value after a failed write.
> @@ -1847,13 +1845,11 @@
>    bind(retry_load);
>    // flush and load exclusive from the memory location
>    // and fail if it is not what we expect
> -  membar(AnyAny);
> -  ldxrw(tmp, addr);
> +  ldaxrw(tmp, addr);
>    cmp(tmp, oldv);
>    br(Assembler::NE, nope);
>    // if we store+flush with no intervening write tmp wil be zero
> -  stxrw(tmp, newv, addr);
> -  membar(AnyAny);
> +  stlxrw(tmp, newv, addr);
>    cbzw(tmp, succeed);
>    // retry so we only ever return after a load fails to compare
>    // ensures we don't return a stale value after a failed write.
>

From edward.nevill at linaro.org Wed Apr 2 11:01:10 2014
From: edward.nevill at linaro.org (Edward Nevill)
Date: Wed, 02 Apr 2014 12:01:10 +0100
Subject: [aarch64-port-dev ] RFR: Save callee save FP registers on entry to java code
Message-ID: <1396436470.29419.12.camel@localhost.localdomain>

Hi,

The following patch fixes a bug whereby the callee save FP registers
D8-D15 were not saved on entry to Java.

These registers are used (without being saved) by both C1 and C2 JIT.

Thanks to Matthias Klose for finding this, and to Andrew Haley for
debugging this from just the hs_err log.

I have tested the client & server builds against Hotspot, and have done a
basic test on the simulator build.

OK to push?
Ed.

--- CUT HERE ---
exporting patch:
# HG changeset patch
# User Edward Nevill edward.nevill at linaro.org
# Date 1396435308 -3600
#      Wed Apr 02 11:41:48 2014 +0100
# Node ID 273f8f0e7109ba0abe8f3697f2f48e34afe0d2f3
# Parent  780ed75ea21a727949abcfe57ea5544f7a1ca22c
Preserve callee save FP registers around call to java code

diff -r 780ed75ea21a -r 273f8f0e7109 src/cpu/aarch64/vm/frame_aarch64.hpp
--- a/src/cpu/aarch64/vm/frame_aarch64.hpp Tue Apr 01 12:22:23 2014 -0400
+++ b/src/cpu/aarch64/vm/frame_aarch64.hpp Wed Apr 02 11:41:48 2014 +0100
@@ -133,7 +133,7 @@
   // Entry frames
   // n.b. these values are determined by the layout defined in
   // stubGenerator for the Java call stub
-  entry_frame_after_call_words = 19,
+  entry_frame_after_call_words = 27,
   entry_frame_call_wrapper_offset = -8,

   // we don't need a save area
diff -r 780ed75ea21a -r 273f8f0e7109 src/cpu/aarch64/vm/stubGenerator_aarch64.cpp
--- a/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Tue Apr 01 12:22:23 2014 -0400
+++ b/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Wed Apr 02 11:41:48 2014 +0100
@@ -135,8 +135,16 @@
   //     [ return_from_Java ] <--- sp
   //     [ argument word n ]
   //     ...
-  // -19 [ argument word 1 ]
-  // -18 [ saved r28 ] <--- sp_after_call
+  // -27 [ argument word 1 ]
+  // -26 [ saved d15 ] <--- sp_after_call
+  // -25 [ saved d14 ]
+  // -24 [ saved d13 ]
+  // -23 [ saved d12 ]
+  // -22 [ saved d11 ]
+  // -21 [ saved d10 ]
+  // -20 [ saved d9 ]
+  // -19 [ saved d8 ]
+  // -18 [ saved r28 ]
   // -17 [ saved r27 ]
   // -16 [ saved r26 ]
   // -15 [ saved r25 ]
@@ -159,7 +167,17 @@

   // Call stub stack layout word offsets from fp
   enum call_stub_layout {
-    sp_after_call_off = -18,
+    sp_after_call_off = -26,
+
+    d15_off = -26,
+    d14_off = -25,
+    d13_off = -24,
+    d12_off = -23,
+    d11_off = -22,
+    d10_off = -21,
+    d9_off = -20,
+    d8_off = -19,
+
+
     r28_off = -18,
     r27_off = -17,
     r26_off = -16,
@@ -202,6 +220,15 @@

   const Address thread (rfp, thread_off * wordSize);

+  const Address d15_save (rfp, d15_off * wordSize);
+  const Address d14_save (rfp, d14_off * wordSize);
+  const Address d13_save (rfp, d13_off * wordSize);
+  const Address d12_save (rfp, d12_off * wordSize);
+  const Address d11_save (rfp, d11_off * wordSize);
+  const Address d10_save (rfp, d10_off * wordSize);
+  const Address d9_save (rfp, d9_off * wordSize);
+  const Address d8_save (rfp, d8_off * wordSize);
+
   const Address r28_save (rfp, r28_off * wordSize);
   const Address r27_save (rfp, r27_off * wordSize);
   const Address r26_save (rfp, r26_off * wordSize);
@@ -220,11 +247,8 @@

   address aarch64_entry = __ pc();

-// AED: this should fix Ed's problem -- we only save the sender's SP for our sim
 #ifdef BUILTIN_SIM
   // Save sender's SP for stack traces.
-// ECN: FIXME
-// #if 0
   __ mov(rscratch1, sp);
   __ str(rscratch1, Address(__ pre(sp, -2 * wordSize)));
 #endif
@@ -254,6 +278,15 @@
   __ str(r27, r27_save);
   __ str(r28, r28_save);

+  __ strd(v8, d8_save);
+  __ strd(v9, d9_save);
+  __ strd(v10, d10_save);
+  __ strd(v11, d11_save);
+  __ strd(v12, d12_save);
+  __ strd(v13, d13_save);
+  __ strd(v14, d14_save);
+  __ strd(v15, d15_save);
+
   // install Java thread in global register now we have saved
   // whatever value it held
   __ mov(rthread, c_rarg7);
@@ -358,6 +391,15 @@
 #endif

   // restore callee-save registers
+  __ ldrd(v15, d15_save);
+  __ ldrd(v14, d14_save);
+  __ ldrd(v13, d13_save);
+  __ ldrd(v12, d12_save);
+  __ ldrd(v11, d11_save);
+  __ ldrd(v10, d10_save);
+  __ ldrd(v9, d9_save);
+  __ ldrd(v8, d8_save);
+
   __ ldr(r28, r28_save);
   __ ldr(r27, r27_save);
   __ ldr(r26, r26_save);
--- CUT HERE ---

From aph at redhat.com Wed Apr 2 11:07:25 2014
From: aph at redhat.com (Andrew Haley)
Date: Wed, 02 Apr 2014 12:07:25 +0100
Subject: [aarch64-port-dev ] RFR: Save callee save FP registers on entry to java code
In-Reply-To: <1396436470.29419.12.camel@localhost.localdomain>
References: <1396436470.29419.12.camel@localhost.localdomain>
Message-ID: <533BEF6D.1040508@redhat.com>

On 04/02/2014 12:01 PM, Edward Nevill wrote:
> Preserve callee save FP registers around call to java code

OK, thanks.

Andrew.

From ed at camswl.com Wed Apr 2 11:50:00 2014
From: ed at camswl.com (ed at camswl.com)
Date: Wed, 02 Apr 2014 11:50:00 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Preserve callee save FP registers around call to java code
Message-ID: <201404021150.s32Bo27C013200@aojmv0008>

Changeset: 273f8f0e7109
Author:    Edward Nevill edward.nevill at linaro.org
Date:      2014-04-02 11:41 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/273f8f0e7109

Preserve callee save FP registers around call to java code

! src/cpu/aarch64/vm/frame_aarch64.hpp
! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp

From aph at redhat.com Wed Apr 2 11:00:15 2014
From: aph at redhat.com (Andrew Haley)
Date: Wed, 02 Apr 2014 12:00:15 +0100
Subject: [aarch64-port-dev ] Remove unnecessary memory barriers around CAS operations
In-Reply-To:
References: <533AE913.6020802@redhat.com>
Message-ID: <533BEDBF.40108@redhat.com>

On 04/02/2014 02:48 AM, D.Sturm wrote:
> Hey,
> could you explain your thinking about why it is safe to remove the second
> memory barrier? We are definitely in agreement on the first one (see
> previous mails on the matter), but I don't see how we can remove the second
> one.
>
> It is my understanding that - using the vocabulary from
> http://gee.cs.oswego.edu/dl/jmm/cookbook.html - a lda[x]r instruction is
> equivalent to load;LoadStore;LoadLoad and the stl[x]r instruction is
> equivalent to LoadStore;StoreStore;store. Consequently, getting all the
> necessary ordering guarantees would mean inserting a StoreLoad barrier
> after the store instruction (which is equivalent to an AnyAny barrier) -
> there are other options, but in any case we need a StoreLoad barrier
> *somewhere* (and putting it after writes rather than before reads seems
> more efficient in practice - that's what x64 does in HotSpot, at least I
> think).

AIUI the barrier after stl[x]r is only needed to prevent a following
load from moving before it.

As Hans Boehm put it,

"Could someone post a test case that they think should work, but that
doesn't work with the acquire/release implementation (without added
fences)? Clearly it does not work as a general purpose fence
replacement, e.g. when used on an object accessed by only one thread.
But I hope that was not intended. It does seem to me that it does
preserve the property that properly synchronized programs are
sequentially consistent."

"On ARMv8, I would expect a volatile store to be compiled to a store
release, and a volatile load to be compiled to a load acquire. Period.
Unlike on Itanium, a release store is ordered with respect to a later
acquire load, so the fence between them should not be needed. Thus
there is no a priori reason to expect that a CAS would require a fence
either.

"I think that's completely uncontroversial. ARMv8 load acquire and
store release are believed to suffice for Java volatile loads and
stores respectively. Even the fence-less implementation used a
release store exclusive. Unless I'm missing something, examples like
this should be handled correctly by all proposed implementations,
whether or not fences are added.

"As far as I can tell, the only use case that require the fences to be
added are essentially abuses of CAS as a fence."

If you think otherwise, please join the discussion on
concurrency-interest at cs.oswego.edu, in the thread "Semantics of
compareAndSwapX".

Andrew.

From openjdk-testing at linaro.org Wed Apr 2 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Wed, 2 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140402140212.CD2151FBB9@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.

The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/092/summary.html

===============================================================================
client-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/25  pass: 413; fail: 3; error: 1
Build 1: aarch64/2014/mar/26  pass: 414; fail: 3
Build 2: aarch64/2014/mar/27  pass: 431; fail: 5; error: 2
Build 3: aarch64/2014/mar/28  pass: 431; fail: 5; error: 2
Build 4: aarch64/2014/mar/29  pass: 431; fail: 5; error: 2
Build 5: aarch64/2014/apr/01  pass: 431; fail: 5; error: 2
Build 6: aarch64/2014/apr/02  pass: 431; fail: 5; error: 2
-------------------------------------------------------------------------------

===============================================================================
client-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/20  pass: 2,939; error: 33
Build 1: aarch64/2014/mar/21  pass: 2,946; error: 26
Build 2: aarch64/2014/mar/26  pass: 2,950; error: 22
Build 3: aarch64/2014/mar/27  pass: 2,941; error: 31
Build 4: aarch64/2014/mar/28  pass: 2,940; error: 32
Build 5: aarch64/2014/apr/01  pass: 2,941; error: 31
Build 6: aarch64/2014/apr/02  pass: 2,941; error: 31
-------------------------------------------------------------------------------

===============================================================================
client-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/20  pass: 5,271; fail: 130; error: 47
Build 1: aarch64/2014/mar/21  pass: 5,277; fail: 128; error: 43
Build 2: aarch64/2014/mar/27  pass: 5,273; fail: 131; error: 46
Build 3: aarch64/2014/mar/28  pass: 5,273; fail: 131; error: 46
Build 4: aarch64/2014/apr/01  pass: 5,268; fail: 134; error: 48
Build 5: aarch64/2014/apr/02  pass: 5,276; fail: 128; error: 46
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Wed Apr 2 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Wed, 2 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140402140225.8E23A1F49E@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.

The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/092/summary.html

===============================================================================
server-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/25  pass: 413; fail: 2; error: 2
Build 1: aarch64/2014/mar/26  pass: 414; fail: 2; error: 1
Build 2: aarch64/2014/mar/27  pass: 435; fail: 1; error: 2
Build 3: aarch64/2014/mar/28  pass: 435; fail: 1; error: 2
Build 4: aarch64/2014/mar/29  pass: 435; fail: 1; error: 2
Build 5: aarch64/2014/apr/01  pass: 435; fail: 1; error: 2
Build 6: aarch64/2014/apr/02  pass: 435; fail: 1; error: 2
-------------------------------------------------------------------------------

===============================================================================
server-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/21  pass: 2,955; error: 17
Build 1: aarch64/2014/mar/25  pass: 2,943; error: 29
Build 2: aarch64/2014/mar/26  pass: 2,960; error: 12
Build 3: aarch64/2014/mar/27  pass: 2,941; error: 31
Build 4: aarch64/2014/mar/28  pass: 2,939; error: 33
Build 5: aarch64/2014/apr/01  pass: 2,939; error: 33
Build 6: aarch64/2014/apr/02  pass: 2,938; error: 34
-------------------------------------------------------------------------------

===============================================================================
server-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/20  pass: 5,284; fail: 124; error: 40
Build 1: aarch64/2014/mar/21  pass: 5,285; fail: 123; error: 40
Build 2: aarch64/2014/mar/27  pass: 5,281; fail: 130; error: 39
Build 3: aarch64/2014/mar/28  pass: 5,288; fail: 122; error: 40
Build 4: aarch64/2014/apr/01  pass: 5,286; fail: 124; error: 40
Build 5: aarch64/2014/apr/02  pass: 5,284; fail: 125; error: 41
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From ed at camswl.com Wed Apr 2 16:59:34 2014
From: ed at camswl.com (Edward Nevill)
Date: Wed, 02 Apr 2014 17:59:34 +0100
Subject: [aarch64-port-dev ] RFC: Use __clear_cache to do cache flushing
Message-ID: <1396457974.11627.6.camel@localhost.localdomain>

Hi,

The current implementation of cache flushing/invalidation assumes that
the line size is 64 bytes. This is not necessarily the case.

To do this properly we would have to read CTR_EL0. Added to this is the
complexity that the D cache and I cache may have different line sizes,
so we must be careful to use the D cache line size when flushing the
D cache and the I cache line size when invalidating the I cache.

I think it would be much better all around if we just used the gcc
intrinsic __clear_cache to do this as in the patch below.

Comments?
Ed.

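[Editor's sketch, not part of the original message.] To make the two
options above concrete, here is a minimal C++ illustration, assuming a GCC
or Clang toolchain. The names CacheLineSizes, decode_ctr_el0 and
flush_code_range are hypothetical and do not appear in HotSpot. The field
positions follow the description in this thread: CTR_EL0[3:0] and
CTR_EL0[19:16] hold log2 of the I-cache and D-cache line sizes in 4-byte
words. The sample register values used in the note below are illustrative
inputs, not readings from real hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Minimum cache line sizes in bytes, as decoded from a CTR_EL0 value.
struct CacheLineSizes {
  unsigned icache_bytes;
  unsigned dcache_bytes;
};

// Each field is log2 of the line size in 4-byte words, so the size in
// bytes is 4 << field. The caller supplies the raw register value; on
// AArch64 it could be obtained with "mrs %0, ctr_el0" (assumption: the
// OS permits that read from user space).
inline CacheLineSizes decode_ctr_el0(uint64_t ctr) {
  CacheLineSizes sizes;
  sizes.icache_bytes = 4u << (ctr & 0xF);          // bits [3:0]
  sizes.dcache_bytes = 4u << ((ctr >> 16) & 0xF);  // bits [19:16]
  return sizes;
}

// The alternative proposed above: let the compiler runtime handle line
// sizes. The range is half-open, [start, start + nbytes).
inline void flush_code_range(char* start, std::size_t nbytes) {
  assert(start != nullptr);
  __builtin___clear_cache(start, start + nbytes);
}
```

With these definitions, decode_ctr_el0(0x84448004) yields 64 bytes for both
caches, while decode_ctr_el0(0x00050003) yields 32 bytes for the I-cache
and 128 bytes for the D-cache, which is the mismatch case the message warns
about. On targets that need no explicit cache maintenance the builtin is
expected to compile to nothing.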
--- CUT HERE ---
exporting patch:
# HG changeset patch
# User Edward Nevill edward.nevill at linaro.org
# Date 1396457502 -3600
#      Wed Apr 02 17:51:42 2014 +0100
# Node ID 080aebbebb5386e29298abfd5da240f7588ecbc0
# Parent  273f8f0e7109ba0abe8f3697f2f48e34afe0d2f3
Use gcc __clear_cache instead of doing it ourselves

diff -r 273f8f0e7109 -r 080aebbebb53 src/cpu/aarch64/vm/icache_aarch64.cpp
--- a/src/cpu/aarch64/vm/icache_aarch64.cpp Wed Apr 02 11:41:48 2014 +0100
+++ b/src/cpu/aarch64/vm/icache_aarch64.cpp Wed Apr 02 17:51:42 2014 +0100
@@ -29,41 +29,10 @@
 #include "runtime/icache.hpp"

 extern void aarch64TestHook();
-extern "C" void setup_arm_sim();

-#define __ _masm->
-
-void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flush_icache_stub) {
-
+void ICacheStubGenerator::generate_icache_flush(
+                ICache::flush_icache_stub_t* flush_icache_stub) {
   aarch64TestHook();
-
-  StubCodeMark mark(this, "ICache", "flush_icache_stub");
-
-  address entry = __ pc();
-
-  // generate a c stub prolog which will bootstrap into the ARM code
-  // which follows, loading the 3 general registers passed by the
-  // caller into the first 3 ARM registers
-
-#ifdef BUILTIN_SIM
-  __ c_stub_prolog(3, 0, MacroAssembler::ret_type_integral);
-#endif
-
-  Register start = r0, lines = r1, auto_magic = r2;
-
-  // First flush the dcache
-  __ generate_flush_loop(&Assembler::dc, start, lines);
-
-  __ dsb(Assembler::SY);
-
-  // And then the icache
-  __ generate_flush_loop(&Assembler::ic, start, lines);
-
-  // the stub is supposed to return the 3rd argument
-  __ mov(r0, auto_magic);
-  __ ret(r30);
-
-  *flush_icache_stub = (ICache::flush_icache_stub_t)entry;
+  // Give anyone who calls this a surprise
+  *flush_icache_stub = (ICache::flush_icache_stub_t)NULL;
 }
-
-#undef __
diff -r 273f8f0e7109 -r 080aebbebb53 src/cpu/aarch64/vm/icache_aarch64.hpp
--- a/src/cpu/aarch64/vm/icache_aarch64.hpp Wed Apr 02 11:41:48 2014 +0100
+++ b/src/cpu/aarch64/vm/icache_aarch64.hpp Wed Apr 02 17:51:42 2014 +0100
@@ -27,16 +27,19 @@
 #ifndef CPU_AARCH64_VM_ICACHE_AARCH64_HPP
 #define CPU_AARCH64_VM_ICACHE_AARCH64_HPP

-// Interface for updating the instruction cache. Whenever the VM modifies
-// code, part of the processor instruction cache potentially has to be flushed.
+// Interface for updating the instruction cache. Whenever the VM
+// modifies code, part of the processor instruction cache potentially
+// has to be flushed.

 class ICache : public AbstractICache {
  public:
-  enum {
-    stub_size = 128, // Size of the icache flush stub in bytes
-    line_size = 64, // Icache line size in bytes
-    log2_line_size = 6 // log2(line_size)
-  };
+  static void initialize() {}
+  static void invalidate_word(address addr) {
+      __clear_cache((char *)addr, (char *)(addr + 3));
+  }
+  static void invalidate_range(address start, int nbytes) {
+      __clear_cache((char *)start, (char *)(start + nbytes));
+  }
 };

 #endif // CPU_AARCH64_VM_ICACHE_AARCH64_HPP
--- CUT HERE ---

From aph at redhat.com Thu Apr 3 09:57:21 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 03 Apr 2014 10:57:21 +0100
Subject: [aarch64-port-dev ] RFC: Use __clear_cache to do cache flushing
In-Reply-To: <1396457974.11627.6.camel@localhost.localdomain>
References: <1396457974.11627.6.camel@localhost.localdomain>
Message-ID: <533D3081.5030908@redhat.com>

Hi,

On 04/02/2014 05:59 PM, Edward Nevill wrote:

> The current implementation of cache flushing/invalidation assumes that
> the line size is 64 bytes. This is not necessarily the case.
>
> To do this properly we would have to read CTR_EL0. Added to this is the
> complexity that the D cache and I cache may have different line sizes,
> so we must be careful to use the D cache line size when flushing the
> D cache and the I cache line size when invalidating the I cache.
>
> I think it would be much better all around if we just used the gcc
> intrinsic __clear_cache to do this as in the patch below.
>
> Comments?

Hmm. I guess this is OK.
I am uncomfortable that this leaves
ICache::line_size incorrectly defined, but maybe that doesn't matter.
I have written a patch that correctly initializes
ICache::line_size as an alternative for you to consider.

If you decide to go with the GCC version, please:

Make sure you delete the enums for things that we know to be wrong
like line_size and log2_line_size.

Fix your indentation in icache_aarch64.hpp.

Andrew.


diff -r 780ed75ea21a src/cpu/aarch64/vm/icache_aarch64.cpp
--- a/src/cpu/aarch64/vm/icache_aarch64.cpp Tue Apr 01 12:22:23 2014 -0400
+++ b/src/cpu/aarch64/vm/icache_aarch64.cpp Thu Apr 03 05:52:33 2014 -0400
@@ -33,6 +33,27 @@
 
 #define __ _masm->
 
+int ICache::line_size = 64;
+int ICache::log2_line_size = 6;
+static int dcache_line_size = 64;
+
+class InitICache {
+public:
+  InitICache() {
+#ifndef BUILTIN_SIM
+    unsigned int cache_info = 0;
+    /* CTR_EL0 [3:0] contains log2 of icache line size in words.
+       CTR_EL0 [19:16] contains log2 of dcache line size in words. */
+    asm volatile ("mrs\t%0, ctr_el0":"=r" (cache_info));
+    ICache::log2_line_size = (cache_info & 0xF) + 2;
+    ICache::line_size = 1 << ICache::log2_line_size;
+    dcache_line_size = 4 << ((cache_info >> 16) & 0xF);
+#endif
+  }
+};
+
+static InitICache nothing; // Call the constructor
+
 void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flush_icache_stub) {
 
   aarch64TestHook();
@@ -52,12 +73,12 @@
   Register start = r0, lines = r1, auto_magic = r2;
 
   // First flush the dcache
-  __ generate_flush_loop(&Assembler::dc, start, lines);
+  __ generate_flush_loop(&Assembler::dc, start, lines, dcache_line_size);
 
   __ dsb(Assembler::SY);
 
   // And then the icache
-  __ generate_flush_loop(&Assembler::ic, start, lines);
+  __ generate_flush_loop(&Assembler::ic, start, lines, ICache::line_size);
 
   // the stub is supposed to return the 3rd argument
   __ mov(r0, auto_magic);
diff -r 780ed75ea21a src/cpu/aarch64/vm/icache_aarch64.hpp
--- a/src/cpu/aarch64/vm/icache_aarch64.hpp Tue Apr 01 12:22:23 2014
-0400
+++ b/src/cpu/aarch64/vm/icache_aarch64.hpp Thu Apr 03 05:52:33 2014 -0400
@@ -34,9 +34,8 @@
  public:
   enum {
     stub_size = 128, // Size of the icache flush stub in bytes
-    line_size = 64, // Icache line size in bytes
-    log2_line_size = 6 // log2(line_size)
   };
+  static int line_size, log2_line_size; // Icache line size in bytes
 };
 
 #endif // CPU_AARCH64_VM_ICACHE_AARCH64_HPP
diff -r 780ed75ea21a src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Tue Apr 01 12:22:23 2014 -0400
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Thu Apr 03 05:52:33 2014 -0400
@@ -2889,7 +2889,8 @@
   }
 }
 
-void MacroAssembler::generate_flush_loop(flush_insn flush, Register start, Register lines) {
+void MacroAssembler::generate_flush_loop(flush_insn flush, Register start, Register lines,
+                                         int line_size) {
   Label again, exit;
 
   assert_different_registers(start, lines, rscratch1, rscratch2);
@@ -2901,7 +2902,7 @@
   bind(again);
   (this->*flush)(rscratch1);
   sub(rscratch2, rscratch2, 1);
-  add(rscratch1, rscratch1, ICache::line_size);
+  add(rscratch1, rscratch1, line_size);
   cbnz(rscratch2, again);
   bind(exit);
 }
diff -r 780ed75ea21a src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Tue Apr 01 12:22:23 2014 -0400
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Thu Apr 03 05:52:33 2014 -0400
@@ -1362,7 +1362,8 @@
   address read_polling_page(Register r, relocInfo::relocType rtype);
 
   typedef void (Assembler::* flush_insn)(Register Rt);
-  void generate_flush_loop(flush_insn flush, Register start, Register lines);
+  void generate_flush_loop(flush_insn flush, Register start, Register lines,
+                           int line_size);
 
   // Used by aarch64.ad to control code generation
   static bool use_acq_rel_for_volatile_fields();

From openjdk-testing at linaro.org Thu Apr 3 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Thu, 3 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] server JTREG
results for OpenJDK 8 on AArch64
Message-ID: <20140403140034.85E201F5F8@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/093/summary.html

===============================================================================
server-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/26 pass: 414; fail: 2; error: 1
Build 1: aarch64/2014/mar/27 pass: 435; fail: 1; error: 2
Build 2: aarch64/2014/mar/28 pass: 435; fail: 1; error: 2
Build 3: aarch64/2014/mar/29 pass: 435; fail: 1; error: 2
Build 4: aarch64/2014/apr/01 pass: 435; fail: 1; error: 2
Build 5: aarch64/2014/apr/02 pass: 435; fail: 1; error: 2
Build 6: aarch64/2014/apr/03 pass: 435; fail: 1; error: 2
-------------------------------------------------------------------------------

===============================================================================
server-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/25 pass: 2,943; error: 29
Build 1: aarch64/2014/mar/26 pass: 2,960; error: 12
Build 2: aarch64/2014/mar/27 pass: 2,941; error: 31
Build 3: aarch64/2014/mar/28 pass: 2,939; error: 33
Build 4: aarch64/2014/apr/01 pass: 2,939; error: 33
Build 5: aarch64/2014/apr/02 pass: 2,938; error: 34
Build 6: aarch64/2014/apr/03 pass: 2,939; error: 33
-------------------------------------------------------------------------------

===============================================================================
server-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/20 pass: 5,284; fail: 124; error: 40
Build 1: aarch64/2014/mar/21 pass: 5,285; fail: 123; error: 40
Build 2:
aarch64/2014/mar/27 pass: 5,281; fail: 130; error: 39
Build 3: aarch64/2014/mar/28 pass: 5,288; fail: 122; error: 40
Build 4: aarch64/2014/apr/01 pass: 5,286; fail: 124; error: 40
Build 5: aarch64/2014/apr/02 pass: 5,284; fail: 125; error: 41
Build 6: aarch64/2014/apr/03 pass: 5,284; fail: 127; error: 39
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Thu Apr 3 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Thu, 3 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140403140033.ADAA91FBC3@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/093/summary.html

===============================================================================
client-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/26 pass: 414; fail: 3
Build 1: aarch64/2014/mar/27 pass: 431; fail: 5; error: 2
Build 2: aarch64/2014/mar/28 pass: 431; fail: 5; error: 2
Build 3: aarch64/2014/mar/29 pass: 431; fail: 5; error: 2
Build 4: aarch64/2014/apr/01 pass: 431; fail: 5; error: 2
Build 5: aarch64/2014/apr/02 pass: 431; fail: 5; error: 2
Build 6: aarch64/2014/apr/03 pass: 431; fail: 5; error: 2
-------------------------------------------------------------------------------

===============================================================================
client-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/21 pass: 2,946; error: 26
Build 1:
aarch64/2014/mar/26 pass: 2,950; error: 22
Build 2: aarch64/2014/mar/27 pass: 2,941; error: 31
Build 3: aarch64/2014/mar/28 pass: 2,940; error: 32
Build 4: aarch64/2014/apr/01 pass: 2,941; error: 31
Build 5: aarch64/2014/apr/02 pass: 2,941; error: 31
Build 6: aarch64/2014/apr/03 pass: 2,941; error: 31
-------------------------------------------------------------------------------

===============================================================================
client-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/20 pass: 5,271; fail: 130; error: 47
Build 1: aarch64/2014/mar/21 pass: 5,277; fail: 128; error: 43
Build 2: aarch64/2014/mar/27 pass: 5,273; fail: 131; error: 46
Build 3: aarch64/2014/mar/28 pass: 5,273; fail: 131; error: 46
Build 4: aarch64/2014/apr/01 pass: 5,268; fail: 134; error: 48
Build 5: aarch64/2014/apr/02 pass: 5,276; fail: 128; error: 46
Build 6: aarch64/2014/apr/03 pass: 5,272; fail: 132; error: 46

1 fatal errors were detected; please follow the link above for more detail.
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From ed at camswl.com Thu Apr 3 22:10:07 2014
From: ed at camswl.com (Edward Nevill)
Date: Thu, 03 Apr 2014 23:10:07 +0100
Subject: [aarch64-port-dev ] RFC: Use __clear_cache to do cache flushing
In-Reply-To: <533D3081.5030908@redhat.com>
References: <1396457974.11627.6.camel@localhost.localdomain> <533D3081.5030908@redhat.com>
Message-ID: <1396563007.13254.27.camel@mint>

Hi Andrew,

On Thu, 2014-04-03 at 10:57 +0100, Andrew Haley wrote:
> Hi,
>
> On 04/02/2014 05:59 PM, Edward Nevill wrote:
> > The current implementation of cache flushing/invalidation assumes that the line size is 64 bytes. This is not necessarily the case.
> >
> > To do this properly we would have to read CTR_EL0.
> > Added to this is the complexity that the D cache and I cache may have different line sizes, so we must be careful to use the D cache line size when flushing the D cache and the I cache line size when invalidating the I cache.
> >
> > I think it would be much better all around if we just used the gcc intrinsic __clear_cache to do this as in the patch below.
> >
> > Comments?
>
> Hmm. I guess this is OK. I am uncomfortable that this leaves
> ICache::line_size incorrectly defined, but maybe that doesn't matter.
> I have written a patch that correctly initializes
> ICache::line_size as an alternative for you to consider.
>
> If you decide to go with the GCC version, please:

Thanks for this. I think the GCC version offers a more generic solution to the problems with the existing code. I do not believe that line_size is referenced anywhere outside the cache flushing code.

> Make sure you delete the enums for things that we know to be wrong
> like line_size and log2_line_size.

I have ensured that line_size, log2_line_size and stub_size do not appear in any aarch64-specific code. I have deleted generate_flush_loop in macroAssembler_aarch64.{cpp,hpp} as it is no longer referenced and it references ICache::line_size, which might cause confusion.

> Fix your indentation in icache_aarch64.hpp.

Fixed.

New patch below,

All the best,
Ed.
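
For anyone reading this thread later, here is a minimal sketch of the two approaches under discussion. It is not the code that was committed. decode_ctr_el0 shows how the I cache and D cache line sizes fall out of a CTR_EL0 value (bits [3:0] and [19:16] each hold log2 of the line size in 4-byte words, as described in the comment in Andrew's patch), and flush_code_range wraps the GCC builtin. The names CacheLineSizes, decode_ctr_el0 and flush_code_range are made up for illustration, and the decoder takes the register value as a parameter so the sketch does not depend on running on AArch64.

```cpp
#include <cstddef>
#include <cstdint>

// Cache line sizes in bytes, as decoded from a CTR_EL0 value.
// (Hypothetical type, for illustration only.)
struct CacheLineSizes {
  unsigned icache;  // smallest I cache line size, in bytes
  unsigned dcache;  // smallest D cache line size, in bytes
};

// CTR_EL0[3:0] and CTR_EL0[19:16] each hold log2 of the line size in
// 4-byte words, so the size in bytes is 4 << field.
inline CacheLineSizes decode_ctr_el0(std::uint64_t ctr) {
  CacheLineSizes s;
  s.icache = 4u << (ctr & 0xF);
  s.dcache = 4u << ((ctr >> 16) & 0xF);
  return s;
}

// Hypothetical helper, not part of HotSpot: hand the whole job to the
// compiler builtin instead of looping over cache lines ourselves.
// Assumption: the range is [start, start + nbytes), i.e. end exclusive.
inline void flush_code_range(char* start, std::size_t nbytes) {
  __builtin___clear_cache(start, start + nbytes);
}
```

With an illustrative register value of 0x84448004 (not read from real hardware), both fields are 4, so both line sizes come out as 4 << 4 = 64 bytes. As far as I can tell from the GCC documentation, the builtin treats the range as begin inclusive and end exclusive, so under that reading a single 4-byte instruction word would be covered by [addr, addr + 4).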
--- CUT HERE ---
exporting patch:
# HG changeset patch
# User Edward Nevill edward.nevill at linaro.org
# Date 1396561902 -3600
#      Thu Apr 03 22:51:42 2014 +0100
# Node ID 5a8c184c37d4fc7d6c91b9c79401bd7d8242f4e8
# Parent  273f8f0e7109ba0abe8f3697f2f48e34afe0d2f3
Use gcc __clear_cache instead of doing it ourselves

diff -r 273f8f0e7109 -r 5a8c184c37d4 src/cpu/aarch64/vm/icache_aarch64.cpp
--- a/src/cpu/aarch64/vm/icache_aarch64.cpp Wed Apr 02 11:41:48 2014 +0100
+++ b/src/cpu/aarch64/vm/icache_aarch64.cpp Thu Apr 03 22:51:42 2014 +0100
@@ -29,41 +29,10 @@
 #include "runtime/icache.hpp"
 
 extern void aarch64TestHook();
-extern "C" void setup_arm_sim();
 
-#define __ _masm->
-
-void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flush_icache_stub) {
-
+void ICacheStubGenerator::generate_icache_flush(
+                ICache::flush_icache_stub_t* flush_icache_stub) {
   aarch64TestHook();
-
-  StubCodeMark mark(this, "ICache", "flush_icache_stub");
-
-  address entry = __ pc();
-
-  // generate a c stub prolog which will bootstrap into the ARM code
-  // which follows, loading the 3 general registers passed by the
-  // caller into the first 3 ARM registers
-
-#ifdef BUILTIN_SIM
-  __ c_stub_prolog(3, 0, MacroAssembler::ret_type_integral);
-#endif
-
-  Register start = r0, lines = r1, auto_magic = r2;
-
-  // First flush the dcache
-  __ generate_flush_loop(&Assembler::dc, start, lines);
-
-  __ dsb(Assembler::SY);
-
-  // And then the icache
-  __ generate_flush_loop(&Assembler::ic, start, lines);
-
-  // the stub is supposed to return the 3rd argument
-  __ mov(r0, auto_magic);
-  __ ret(r30);
-
-  *flush_icache_stub = (ICache::flush_icache_stub_t)entry;
+  // Give anyone who calls this a surprise
+  *flush_icache_stub = (ICache::flush_icache_stub_t)NULL;
 }
-
-#undef __
diff -r 273f8f0e7109 -r 5a8c184c37d4 src/cpu/aarch64/vm/icache_aarch64.hpp
--- a/src/cpu/aarch64/vm/icache_aarch64.hpp Wed Apr 02 11:41:48 2014 +0100
+++ b/src/cpu/aarch64/vm/icache_aarch64.hpp Thu Apr 03 22:51:42 2014
+0100
@@ -27,16 +27,19 @@
 #ifndef CPU_AARCH64_VM_ICACHE_AARCH64_HPP
 #define CPU_AARCH64_VM_ICACHE_AARCH64_HPP
 
-// Interface for updating the instruction cache. Whenever the VM modifies
-// code, part of the processor instruction cache potentially has to be flushed.
+// Interface for updating the instruction cache. Whenever the VM
+// modifies code, part of the processor instruction cache potentially
+// has to be flushed.
 
 class ICache : public AbstractICache {
  public:
-  enum {
-    stub_size = 128, // Size of the icache flush stub in bytes
-    line_size = 64, // Icache line size in bytes
-    log2_line_size = 6 // log2(line_size)
-  };
+  static void initialize() {}
+  static void invalidate_word(address addr) {
+    __clear_cache((char *)addr, (char *)(addr + 3));
+  }
+  static void invalidate_range(address start, int nbytes) {
+    __clear_cache((char *)start, (char *)(start + nbytes));
+  }
 };
 
 #endif // CPU_AARCH64_VM_ICACHE_AARCH64_HPP
diff -r 273f8f0e7109 -r 5a8c184c37d4 src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Wed Apr 02 11:41:48 2014 +0100
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Thu Apr 03 22:51:42 2014 +0100
@@ -2889,23 +2889,6 @@
   }
 }
 
-void MacroAssembler::generate_flush_loop(flush_insn flush, Register start, Register lines) {
-  Label again, exit;
-
-  assert_different_registers(start, lines, rscratch1, rscratch2);
-
-  cmp(lines, zr);
-  br(Assembler::LE, exit);
-  mov(rscratch1, start);
-  mov(rscratch2, lines);
-  bind(again);
-  (this->*flush)(rscratch1);
-  sub(rscratch2, rscratch2, 1);
-  add(rscratch1, rscratch1, ICache::line_size);
-  cbnz(rscratch2, again);
-  bind(exit);
-}
-
 bool MacroAssembler::use_acq_rel_for_volatile_fields() {
 #ifdef PRODUCT
   return false;
diff -r 273f8f0e7109 -r 5a8c184c37d4 src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Wed Apr 02 11:41:48 2014 +0100
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Thu Apr 03 22:51:42 2014 +0100
@@
-1361,9 +1361,6 @@
   address read_polling_page(Register r, address page, relocInfo::relocType rtype);
   address read_polling_page(Register r, relocInfo::relocType rtype);
 
-  typedef void (Assembler::* flush_insn)(Register Rt);
-  void generate_flush_loop(flush_insn flush, Register start, Register lines);
-
   // Used by aarch64.ad to control code generation
   static bool use_acq_rel_for_volatile_fields();
 };
--- CUT HERE ---

From aph at redhat.com Fri Apr 4 08:20:16 2014
From: aph at redhat.com (Andrew Haley)
Date: Fri, 04 Apr 2014 09:20:16 +0100
Subject: [aarch64-port-dev ] RFC: Use __clear_cache to do cache flushing
In-Reply-To: <1396563007.13254.27.camel@mint>
References: <1396457974.11627.6.camel@localhost.localdomain> <533D3081.5030908@redhat.com> <1396563007.13254.27.camel@mint>
Message-ID: <533E6B40.5030207@redhat.com>

On 04/03/2014 11:10 PM, Edward Nevill wrote:
> New patch below,

OK.

Andrew.

From ed at camswl.com Fri Apr 4 09:13:06 2014
From: ed at camswl.com (ed at camswl.com)
Date: Fri, 04 Apr 2014 09:13:06 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Use gcc __clear_cache instead of doing it ourselves
Message-ID: <201404040913.s349D7J4005681@aojmv0008>

Changeset: 5a8c184c37d4
Author:    Edward Nevill edward.nevill at linaro.org
Date:      2014-04-03 22:51 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/5a8c184c37d4

Use gcc __clear_cache instead of doing it ourselves

! src/cpu/aarch64/vm/icache_aarch64.cpp
! src/cpu/aarch64/vm/icache_aarch64.hpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp

From openjdk-testing at linaro.org Fri Apr 4 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Fri, 4 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140404140448.B4373200A2@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/094/summary.html

===============================================================================
client-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/27 pass: 431; fail: 5; error: 2
Build 1: aarch64/2014/mar/28 pass: 431; fail: 5; error: 2
Build 2: aarch64/2014/mar/29 pass: 431; fail: 5; error: 2
Build 3: aarch64/2014/apr/01 pass: 431; fail: 5; error: 2
Build 4: aarch64/2014/apr/02 pass: 431; fail: 5; error: 2
Build 5: aarch64/2014/apr/03 pass: 431; fail: 5; error: 2
Build 6: aarch64/2014/apr/04 pass: 431; fail: 5; error: 2
-------------------------------------------------------------------------------

===============================================================================
client-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/26 pass: 2,950; error: 22
Build 1: aarch64/2014/mar/27 pass: 2,941; error: 31
Build 2: aarch64/2014/mar/28 pass: 2,940; error: 32
Build 3: aarch64/2014/apr/01 pass: 2,941; error: 31
Build 4: aarch64/2014/apr/02 pass: 2,941; error: 31
Build 5: aarch64/2014/apr/03 pass: 2,941; error: 31
Build 6: aarch64/2014/apr/04 pass: 2,931; error: 41
-------------------------------------------------------------------------------

===============================================================================
client-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/21 pass: 5,277; fail: 128; error: 43
Build 1: aarch64/2014/mar/27 pass: 5,273; fail: 131; error: 46
Build 2: aarch64/2014/mar/28 pass: 5,273; fail: 131; error: 46
Build 3: aarch64/2014/apr/01 pass: 5,268; fail: 134; error: 48
Build 4: aarch64/2014/apr/02 pass: 5,276;
fail: 128; error: 46
Build 5: aarch64/2014/apr/03 pass: 5,272; fail: 132; error: 46
Build 6: aarch64/2014/apr/04 pass: 5,268; fail: 130; error: 52

1 fatal errors were detected; please follow the link above for more detail.
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Fri Apr 4 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Fri, 4 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140404140449.799EF20093@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/094/summary.html

===============================================================================
server-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/27 pass: 435; fail: 1; error: 2
Build 1: aarch64/2014/mar/28 pass: 435; fail: 1; error: 2
Build 2: aarch64/2014/mar/29 pass: 435; fail: 1; error: 2
Build 3: aarch64/2014/apr/01 pass: 435; fail: 1; error: 2
Build 4: aarch64/2014/apr/02 pass: 435; fail: 1; error: 2
Build 5: aarch64/2014/apr/03 pass: 435; fail: 1; error: 2
Build 6: aarch64/2014/apr/04 pass: 434; fail: 1; error: 3
-------------------------------------------------------------------------------

===============================================================================
server-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/26 pass: 2,960; error: 12
Build 1: aarch64/2014/mar/27 pass: 2,941; error: 31
Build 2: aarch64/2014/mar/28
pass: 2,939; error: 33
Build 3: aarch64/2014/apr/01 pass: 2,939; error: 33
Build 4: aarch64/2014/apr/02 pass: 2,938; error: 34
Build 5: aarch64/2014/apr/03 pass: 2,939; error: 33
Build 6: aarch64/2014/apr/04 pass: 2,918; error: 54
-------------------------------------------------------------------------------

===============================================================================
server-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/21 pass: 5,285; fail: 123; error: 40
Build 1: aarch64/2014/mar/27 pass: 5,281; fail: 130; error: 39
Build 2: aarch64/2014/mar/28 pass: 5,288; fail: 122; error: 40
Build 3: aarch64/2014/apr/01 pass: 5,286; fail: 124; error: 40
Build 4: aarch64/2014/apr/02 pass: 5,284; fail: 125; error: 41
Build 5: aarch64/2014/apr/03 pass: 5,284; fail: 127; error: 39
Build 6: aarch64/2014/apr/04 pass: 5,280; fail: 124; error: 46
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Sat Apr 5 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Sat, 5 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140405140036.76405200C1@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/095/summary.html

===============================================================================
client-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/28 pass: 431; fail: 5; error: 2
Build 1: aarch64/2014/mar/29 pass: 431; fail: 5; error: 2
Build 2: aarch64/2014/apr/01 pass: 431; fail: 5; error: 2
Build 3: aarch64/2014/apr/02 pass: 431; fail: 5; error: 2
Build 4: aarch64/2014/apr/03 pass: 431; fail: 5; error: 2
Build 5: aarch64/2014/apr/04 pass: 431; fail: 5; error: 2
Build 6: aarch64/2014/apr/05 pass: 431; fail: 5; error: 2
-------------------------------------------------------------------------------

===============================================================================
client-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/27 pass: 2,941; error: 31
Build 1: aarch64/2014/mar/28 pass: 2,940; error: 32
Build 2: aarch64/2014/apr/01 pass: 2,941; error: 31
Build 3: aarch64/2014/apr/02 pass: 2,941; error: 31
Build 4: aarch64/2014/apr/03 pass: 2,941; error: 31
Build 5: aarch64/2014/apr/04 pass: 2,931; error: 41
Build 6: aarch64/2014/apr/05 pass: 2,939; error: 33
-------------------------------------------------------------------------------

===============================================================================
client-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/27 pass: 5,273; fail: 131; error: 46
Build 1: aarch64/2014/mar/28 pass: 5,273; fail: 131; error: 46
Build 2: aarch64/2014/apr/01 pass: 5,268; fail: 134; error: 48
Build 3: aarch64/2014/apr/02 pass: 5,276; fail: 128; error: 46
Build 4: aarch64/2014/apr/03 pass: 5,272; fail: 132; error: 46
Build 5: aarch64/2014/apr/04 pass:
5,268; fail: 130; error: 52
Build 6: aarch64/2014/apr/05 pass: 5,274; fail: 128; error: 48
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Sat Apr 5 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Sat, 5 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140405140036.18157200EB@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/095/summary.html

===============================================================================
server-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/28 pass: 435; fail: 1; error: 2
Build 1: aarch64/2014/mar/29 pass: 435; fail: 1; error: 2
Build 2: aarch64/2014/apr/01 pass: 435; fail: 1; error: 2
Build 3: aarch64/2014/apr/02 pass: 435; fail: 1; error: 2
Build 4: aarch64/2014/apr/03 pass: 435; fail: 1; error: 2
Build 5: aarch64/2014/apr/04 pass: 434; fail: 1; error: 3
Build 6: aarch64/2014/apr/05 pass: 435; fail: 1; error: 2
-------------------------------------------------------------------------------

===============================================================================
server-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/27 pass: 2,941; error: 31
Build 1: aarch64/2014/mar/28 pass: 2,939; error: 33
Build 2: aarch64/2014/apr/01 pass: 2,939; error: 33
Build 3: aarch64/2014/apr/02 pass: 2,938; error: 34
Build 4: aarch64/2014/apr/03 pass: 2,939; error: 33
Build
5: aarch64/2014/apr/04 pass: 2,918; error: 54
Build 6: aarch64/2014/apr/05 pass: 2,939; error: 33
-------------------------------------------------------------------------------

===============================================================================
server-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/27 pass: 5,281; fail: 130; error: 39
Build 1: aarch64/2014/mar/28 pass: 5,288; fail: 122; error: 40
Build 2: aarch64/2014/apr/01 pass: 5,286; fail: 124; error: 40
Build 3: aarch64/2014/apr/02 pass: 5,284; fail: 125; error: 41
Build 4: aarch64/2014/apr/03 pass: 5,284; fail: 127; error: 39
Build 5: aarch64/2014/apr/04 pass: 5,280; fail: 124; error: 46
Build 6: aarch64/2014/apr/05 pass: 5,282; fail: 127; error: 41
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Sun Apr 6 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Sun, 6 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140406140029.6B39620167@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/096/summary.html

===============================================================================
server-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/29 pass: 435; fail: 1; error: 2
Build 1: aarch64/2014/apr/01 pass: 435; fail: 1; error: 2
Build 2: aarch64/2014/apr/02 pass: 435; fail: 1; error: 2
Build 3: aarch64/2014/apr/03 pass: 435; fail: 1; error: 2
Build 4: aarch64/2014/apr/04 pass: 434; fail: 1; error: 3
Build 5: aarch64/2014/apr/05 pass: 435; fail: 1; error: 2
Build 6: aarch64/2014/apr/06 pass: 435; fail: 1; error: 2
-------------------------------------------------------------------------------

===============================================================================
server-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/28 pass: 2,939; error: 33
Build 1: aarch64/2014/apr/01 pass: 2,939; error: 33
Build 2: aarch64/2014/apr/02 pass: 2,938; error: 34
Build 3: aarch64/2014/apr/03 pass: 2,939; error: 33
Build 4: aarch64/2014/apr/04 pass: 2,918; error: 54
Build 5: aarch64/2014/apr/05 pass: 2,939; error: 33
Build 6: aarch64/2014/apr/06 pass: 2,936; error: 36
-------------------------------------------------------------------------------

===============================================================================
server-release/jdk
===============================================================================
Build 0: aarch64/2014/mar/28 pass: 5,288; fail: 122; error: 40
Build 1: aarch64/2014/apr/01 pass: 5,286; fail: 124; error: 40
Build 2: aarch64/2014/apr/02 pass: 5,284; fail: 125; error: 41
Build 3: aarch64/2014/apr/03 pass: 5,284; fail: 127; error: 39
Build 4: aarch64/2014/apr/04 pass: 5,280; fail: 124; error: 46
Build 5: aarch64/2014/apr/05 pass:
5,282; fail: 127; error: 41
Build 6: aarch64/2014/apr/06 pass: 5,280; fail: 129; error: 41
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Sun Apr 6 14:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Sun, 6 Apr 2014 14:00:01 +0000 (UTC)
Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140406140028.F0214201ED@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/096/summary.html

===============================================================================
client-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/mar/29 pass: 431; fail: 5; error: 2
Build 1: aarch64/2014/apr/01 pass: 431; fail: 5; error: 2
Build 2: aarch64/2014/apr/02 pass: 431; fail: 5; error: 2
Build 3: aarch64/2014/apr/03 pass: 431; fail: 5; error: 2
Build 4: aarch64/2014/apr/04 pass: 431; fail: 5; error: 2
Build 5: aarch64/2014/apr/05 pass: 431; fail: 5; error: 2
Build 6: aarch64/2014/apr/06 pass: 431; fail: 5; error: 2
-------------------------------------------------------------------------------

===============================================================================
client-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/mar/28 pass: 2,940; error: 32
Build 1: aarch64/2014/apr/01 pass: 2,941; error: 31
Build 2: aarch64/2014/apr/02 pass: 2,941; error: 31
Build 3: aarch64/2014/apr/03 pass: 2,941; error: 31
Build 4: aarch64/2014/apr/04 pass: 2,931; error: 41
Build
5: aarch64/2014/apr/05 pass: 2,939; error: 33 Build 6: aarch64/2014/apr/06 pass: 2,941; error: 31 ------------------------------------------------------------------------------- =============================================================================== client-release/jdk =============================================================================== Build 0: aarch64/2014/mar/28 pass: 5,273; fail: 131; error: 46 Build 1: aarch64/2014/apr/01 pass: 5,268; fail: 134; error: 48 Build 2: aarch64/2014/apr/02 pass: 5,276; fail: 128; error: 46 Build 3: aarch64/2014/apr/03 pass: 5,272; fail: 132; error: 46 Build 4: aarch64/2014/apr/04 pass: 5,268; fail: 130; error: 52 Build 5: aarch64/2014/apr/05 pass: 5,274; fail: 128; error: 48 Build 6: aarch64/2014/apr/06 pass: 5,274; fail: 127; error: 49 1 fatal errors were detected; please follow the link above for more detail. ------------------------------------------------------------------------------- Previous results can be found here: http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html From aph at redhat.com Wed Apr 9 13:41:37 2014 From: aph at redhat.com (Andrew Haley) Date: Wed, 09 Apr 2014 14:41:37 +0100 Subject: [aarch64-port-dev ] New instruction costs Message-ID: <53454E11.9090102@redhat.com> These are based on the Cortex-A57 costs in GCC and a little bit of guesswork. It doesn't make very much difference to the generated code, but it's better than using PowerPC costs. Andrew. -------------- next part -------------- # HG changeset patch # User aph # Date 1396965510 -3600 # Tue Apr 08 14:58:30 2014 +0100 # Node ID a16c651450e4b0822cfabb248e19f3b371582fce # Parent 5a8c184c37d4fc7d6c91b9c79401bd7d8242f4e8 New cost model for instruction selection. 
diff -r 5a8c184c37d4 -r a16c651450e4 src/cpu/aarch64/vm/aarch64.ad --- a/src/cpu/aarch64/vm/aarch64.ad Thu Apr 03 22:51:42 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64.ad Tue Apr 08 14:58:30 2014 +0100 @@ -703,18 +703,11 @@ // something definitions %{ - // The default cost (of an ALU instruction). - int_def DEFAULT_COST_LOW ( 30, 30); - int_def DEFAULT_COST ( 100, 100); - int_def HUGE_COST (1000000, 1000000); - - // Memory refs - int_def MEMORY_REF_COST_LOW ( 200, DEFAULT_COST * 2); - int_def MEMORY_REF_COST ( 300, DEFAULT_COST * 3); - - // Branches are even more expensive. - int_def BRANCH_COST ( 900, DEFAULT_COST * 9); - int_def CALL_COST ( 1300, DEFAULT_COST * 13); + // The default cost (of a register move instruction). + int_def INSN_COST ( 100, 100); + int_def BRANCH_COST ( 200, 2 * INSN_COST); + int_def CALL_COST ( 200, 2 * INSN_COST); + int_def VOLATILE_REF_COST ( 1000, 10 * INSN_COST); %} @@ -3301,7 +3294,7 @@ op_attrib op_cost(1); // Required cost attribute //----------Instruction Attributes--------------------------------------------- -ins_attrib ins_cost(DEFAULT_COST); // Required cost attribute +ins_attrib ins_cost(INSN_COST); // Required cost attribute ins_attrib ins_size(32); // Required size attribute (in bits) ins_attrib ins_short_branch(0); // Required flag: is this instruction // a non-matching short branch variant @@ -3327,26 +3320,15 @@ %{ match(ConI); - op_cost(10); - format %{ %} - interface(CONST_INTER); -%} - -// 32 bit zero -operand immI0() -%{ - predicate(n->get_int() == 0); - match(ConI); - op_cost(0); format %{ %} interface(CONST_INTER); %} -// 32 bit unit increment -operand immI_1() -%{ - predicate(n->get_int() == 1); +// 32 bit zero +operand immI0() +%{ + predicate(n->get_int() == 0); match(ConI); op_cost(0); @@ -3354,10 +3336,10 @@ interface(CONST_INTER); %} -// 32 bit unit decrement -operand immI_M1() -%{ - predicate(n->get_int() == -1); +// 32 bit unit increment +operand immI_1() +%{ + predicate(n->get_int() == 1); match(ConI); 
op_cost(0); @@ -3365,9 +3347,10 @@ interface(CONST_INTER); %} -operand immI_8() -%{ - predicate(n->get_int() == 8); +// 32 bit unit decrement +operand immI_M1() +%{ + predicate(n->get_int() == -1); match(ConI); op_cost(0); @@ -3375,9 +3358,9 @@ interface(CONST_INTER); %} -operand immI_16() -%{ - predicate(n->get_int() == 16); +operand immI_8() +%{ + predicate(n->get_int() == 8); match(ConI); op_cost(0); @@ -3385,9 +3368,9 @@ interface(CONST_INTER); %} -operand immI_24() -%{ - predicate(n->get_int() == 24); +operand immI_16() +%{ + predicate(n->get_int() == 16); match(ConI); op_cost(0); @@ -3395,9 +3378,9 @@ interface(CONST_INTER); %} -operand immI_32() -%{ - predicate(n->get_int() == 32); +operand immI_24() +%{ + predicate(n->get_int() == 24); match(ConI); op_cost(0); @@ -3405,9 +3388,9 @@ interface(CONST_INTER); %} -operand immI_48() -%{ - predicate(n->get_int() == 48); +operand immI_32() +%{ + predicate(n->get_int() == 32); match(ConI); op_cost(0); @@ -3415,9 +3398,9 @@ interface(CONST_INTER); %} -operand immI_56() -%{ - predicate(n->get_int() == 56); +operand immI_48() +%{ + predicate(n->get_int() == 48); match(ConI); op_cost(0); @@ -3425,9 +3408,9 @@ interface(CONST_INTER); %} -operand immI_64() -%{ - predicate(n->get_int() == 64); +operand immI_56() +%{ + predicate(n->get_int() == 56); match(ConI); op_cost(0); @@ -3435,9 +3418,9 @@ interface(CONST_INTER); %} -operand immI_255() -%{ - predicate(n->get_int() == 255); +operand immI_64() +%{ + predicate(n->get_int() == 64); match(ConI); op_cost(0); @@ -3445,9 +3428,9 @@ interface(CONST_INTER); %} -operand immI_65535() -%{ - predicate(n->get_int() == 65535); +operand immI_255() +%{ + predicate(n->get_int() == 255); match(ConI); op_cost(0); @@ -3455,9 +3438,9 @@ interface(CONST_INTER); %} -operand immL_255() -%{ - predicate(n->get_int() == 255); +operand immI_65535() +%{ + predicate(n->get_int() == 65535); match(ConI); op_cost(0); @@ -3465,19 +3448,19 @@ interface(CONST_INTER); %} -operand immL_65535() -%{ - 
predicate(n->get_long() == 65535L); - match(ConL); +operand immL_255() +%{ + predicate(n->get_int() == 255); + match(ConI); op_cost(0); format %{ %} interface(CONST_INTER); %} -operand immL_4294967295() -%{ - predicate(n->get_long() == 4294967295L); +operand immL_65535() +%{ + predicate(n->get_long() == 65535L); match(ConL); op_cost(0); @@ -3485,6 +3468,16 @@ interface(CONST_INTER); %} +operand immL_4294967295() +%{ + predicate(n->get_long() == 4294967295L); + match(ConL); + + op_cost(0); + format %{ %} + interface(CONST_INTER); +%} + operand immL_bitmask() %{ predicate(((n->get_long() & 0xc000000000000000l) == 0) @@ -3513,6 +3506,7 @@ predicate(0 <= n->get_int() && (n->get_int() <= 3)); match(ConI); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3523,6 +3517,7 @@ predicate(((-(1 << 25)) <= n->get_int()) && (n->get_int() < (1 << 25))); match(ConI); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3533,6 +3528,7 @@ predicate(((-(1 << 18)) <= n->get_int()) && (n->get_int() < (1 << 18))); match(ConI); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3543,6 +3539,7 @@ predicate((0 <= n->get_int()) && (n->get_int() < (1 << 12))); match(ConI); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3552,6 +3549,7 @@ predicate((0 <= n->get_long()) && (n->get_long() < (1 << 12))); match(ConL); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3562,6 +3560,7 @@ predicate(Address::offset_ok_for_immed(n->get_int())); match(ConI); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3571,6 +3570,7 @@ predicate(Address::offset_ok_for_immed(n->get_long())); match(ConL); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3591,6 +3591,7 @@ %{ predicate(Assembler::operand_valid_for_logical_immediate(/*is32*/true, (unsigned long)n->get_int())); match(ConI); + op_cost(0); format %{ %} interface(CONST_INTER); @@ -3602,7 +3603,7 @@ %{ match(ConL); - op_cost(20); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3689,7 +3690,7 
@@ %{ match(ConP); - op_cost(10); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3700,7 +3701,7 @@ predicate(n->get_ptr() == 0); match(ConP); - op_cost(5); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3712,7 +3713,7 @@ predicate(n->get_ptr() == 1); match(ConP); - op_cost(5); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3724,7 +3725,7 @@ predicate(n->get_ptr() == -1); match(ConP); - op_cost(5); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3736,7 +3737,7 @@ predicate(n->get_ptr() == -2); match(ConP); - op_cost(5); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3746,7 +3747,7 @@ operand immD() %{ match(ConD); - op_cost(40); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3776,7 +3777,7 @@ operand immF() %{ match(ConF); - op_cost(40); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3808,7 +3809,7 @@ %{ match(ConN); - op_cost(10); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3819,7 +3820,7 @@ predicate(n->get_narrowcon() == 0); match(ConN); - op_cost(5); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3828,7 +3829,7 @@ %{ match(ConNKlass); - op_cost(10); + op_cost(0); format %{ %} interface(CONST_INTER); %} @@ -3840,6 +3841,7 @@ constraint(ALLOC_IN_RC(any_reg32)); match(RegI); match(iRegINoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3849,6 +3851,7 @@ %{ constraint(ALLOC_IN_RC(no_special_reg32)); match(RegI); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3860,6 +3863,7 @@ constraint(ALLOC_IN_RC(any_reg)); match(RegL); match(iRegLNoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3885,6 +3889,7 @@ //match(iRegP_R4); //match(iRegP_R5); match(thread_RegP); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3900,6 +3905,7 @@ // match(iRegP_R4); // match(iRegP_R5); // match(thread_RegP); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3911,6 +3917,7 @@ match(RegP); // match(iRegP); match(iRegPNoSp); + op_cost(0); format %{ 
%} interface(REG_INTER); %} @@ -3922,6 +3929,7 @@ match(RegP); // match(iRegP); match(iRegPNoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3933,6 +3941,7 @@ match(RegP); // match(iRegP); match(iRegPNoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3944,6 +3953,7 @@ match(RegP); // match(iRegP); match(iRegPNoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3955,6 +3965,7 @@ match(RegP); // match(iRegP); match(iRegPNoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3965,6 +3976,7 @@ constraint(ALLOC_IN_RC(r11_reg)); match(RegL); match(iRegLNoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3975,6 +3987,7 @@ constraint(ALLOC_IN_RC(fp_reg)); match(RegP); // match(iRegP); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3985,6 +3998,7 @@ constraint(ALLOC_IN_RC(r0_reg)); match(RegI); match(iRegINoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -3996,6 +4010,7 @@ constraint(ALLOC_IN_RC(any_reg32)); match(RegN); match(iRegNNoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -4005,6 +4020,7 @@ %{ constraint(ALLOC_IN_RC(no_special_reg32)); match(RegN); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -4015,6 +4031,7 @@ %{ constraint(ALLOC_IN_RC(heapbase_reg)); match(RegI); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -4026,6 +4043,7 @@ constraint(ALLOC_IN_RC(float_reg)); match(RegF); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -4037,6 +4055,7 @@ constraint(ALLOC_IN_RC(double_reg)); match(RegD); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -4065,6 +4084,7 @@ constraint(ALLOC_IN_RC(int_flags)); match(RegFlags); + op_cost(0); format %{ "RFLAGS" %} interface(REG_INTER); %} @@ -4075,6 +4095,7 @@ constraint(ALLOC_IN_RC(int_flags)); match(RegFlags); + op_cost(0); format %{ "RFLAGSU" %} interface(REG_INTER); %} @@ -4087,6 +4108,7 @@ constraint(ALLOC_IN_RC(method_reg)); // inline_cache_reg match(reg); match(iRegPNoSp); + op_cost(0); format %{ %} 
interface(REG_INTER); %} @@ -4096,6 +4118,7 @@ constraint(ALLOC_IN_RC(method_reg)); // interpreter_method_oop_reg match(reg); match(iRegPNoSp); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -4105,6 +4128,7 @@ %{ constraint(ALLOC_IN_RC(thread_reg)); // link_reg match(reg); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -4113,6 +4137,7 @@ %{ constraint(ALLOC_IN_RC(lr_reg)); // link_reg match(reg); + op_cost(0); format %{ %} interface(REG_INTER); %} @@ -4137,7 +4162,7 @@ %{ constraint(ALLOC_IN_RC(ptr_reg)); match(AddP (AddP reg (LShiftL lreg scale)) off); - op_cost(DEFAULT_COST); + op_cost(INSN_COST); format %{ "$reg, $lreg lsl($scale), $off" %} interface(MEMORY_INTER) %{ base($reg); @@ -4151,7 +4176,7 @@ %{ constraint(ALLOC_IN_RC(ptr_reg)); match(AddP (AddP reg (LShiftL lreg scale)) off); - op_cost(DEFAULT_COST); + op_cost(INSN_COST); format %{ "$reg, $lreg lsl($scale), $off" %} interface(MEMORY_INTER) %{ base($reg); @@ -4165,7 +4190,7 @@ %{ constraint(ALLOC_IN_RC(ptr_reg)); match(AddP (AddP reg (LShiftL (ConvI2L ireg) scale)) off); - op_cost(DEFAULT_COST); + op_cost(INSN_COST); format %{ "$reg, $ireg sxtw($scale), $off I2L" %} interface(MEMORY_INTER) %{ base($reg); @@ -4221,7 +4246,7 @@ %{ constraint(ALLOC_IN_RC(ptr_reg)); match(AddP reg off); - op_cost(0); + op_cost(INSN_COST); format %{ "[$reg, $off]" %} interface(MEMORY_INTER) %{ base($reg); @@ -4266,7 +4291,7 @@ predicate(Universe::narrow_oop_shift() == 0); constraint(ALLOC_IN_RC(ptr_reg)); match(AddP (AddP (DecodeN reg) (LShiftL lreg scale)) off); - op_cost(DEFAULT_COST); + op_cost(0); format %{ "$reg, $lreg lsl($scale), $off\t# narrow" %} interface(MEMORY_INTER) %{ base($reg); @@ -4281,7 +4306,7 @@ predicate(Universe::narrow_oop_shift() == 0); constraint(ALLOC_IN_RC(ptr_reg)); match(AddP (AddP (DecodeN reg) (LShiftL lreg scale)) off); - op_cost(DEFAULT_COST); + op_cost(INSN_COST); format %{ "$reg, $lreg lsl($scale), $off\t# narrow" %} interface(MEMORY_INTER) %{ base($reg); @@ -4296,7 +4321,7 
@@ predicate(Universe::narrow_oop_shift() == 0); constraint(ALLOC_IN_RC(ptr_reg)); match(AddP (AddP (DecodeN reg) (LShiftL (ConvI2L ireg) scale)) off); - op_cost(DEFAULT_COST); + op_cost(INSN_COST); format %{ "$reg, $ireg sxtw($scale), $off I2L\t# narrow" %} interface(MEMORY_INTER) %{ base($reg); @@ -4695,7 +4720,7 @@ match(Set dst (LoadB mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrsbw $dst, $mem\t# byte" %} ins_encode(aarch64_enc_ldrsbw(dst, mem)); @@ -4709,7 +4734,7 @@ match(Set dst (ConvI2L (LoadB mem))); predicate(!treat_as_volatile(((MemNode*)(n->in(1))))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrsb $dst, $mem\t# byte" %} ins_encode(aarch64_enc_ldrsb(dst, mem)); @@ -4723,7 +4748,7 @@ match(Set dst (LoadUB mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrbw $dst, $mem\t# byte" %} ins_encode(aarch64_enc_ldrb(dst, mem)); @@ -4737,7 +4762,7 @@ match(Set dst (ConvI2L (LoadUB mem))); predicate(!treat_as_volatile(((MemNode*)(n->in(1))))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrb $dst, $mem\t# byte" %} ins_encode(aarch64_enc_ldrb(dst, mem)); @@ -4751,7 +4776,7 @@ match(Set dst (LoadS mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrshw $dst, $mem\t# short" %} ins_encode(aarch64_enc_ldrshw(dst, mem)); @@ -4765,7 +4790,7 @@ match(Set dst (ConvI2L (LoadS mem))); predicate(!treat_as_volatile(((MemNode*)(n->in(1))))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrsh $dst, $mem\t# short" %} ins_encode(aarch64_enc_ldrsh(dst, mem)); @@ -4779,7 +4804,7 @@ match(Set dst (LoadUS mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrh $dst, $mem\t# short" %} ins_encode(aarch64_enc_ldrh(dst, mem)); @@ 
-4793,7 +4818,7 @@ match(Set dst (ConvI2L (LoadUS mem))); predicate(!treat_as_volatile(((MemNode*)(n->in(1))))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrh $dst, $mem\t# short" %} ins_encode(aarch64_enc_ldrh(dst, mem)); @@ -4807,7 +4832,7 @@ match(Set dst (LoadI mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $mem\t# int" %} ins_encode(aarch64_enc_ldrw(dst, mem)); @@ -4821,7 +4846,7 @@ match(Set dst (ConvI2L (LoadI mem))); predicate(!treat_as_volatile(((MemNode*)(n->in(1))))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrsw $dst, $mem\t# int" %} ins_encode(aarch64_enc_ldrsw(dst, mem)); @@ -4835,7 +4860,7 @@ match(Set dst (AndL (ConvI2L (LoadI mem)) mask)); predicate(!treat_as_volatile(((MemNode*)(n->in(1))->in(1)))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $mem\t# int" %} ins_encode(aarch64_enc_ldrw(dst, mem)); @@ -4849,7 +4874,7 @@ match(Set dst (LoadL mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldr $dst, $mem\t# int" %} ins_encode(aarch64_enc_ldr(dst, mem)); @@ -4862,7 +4887,7 @@ %{ match(Set dst (LoadRange mem)); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $mem\t# range" %} ins_encode(aarch64_enc_ldrw(dst, mem)); @@ -4876,7 +4901,7 @@ match(Set dst (LoadP mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldr $dst, $mem\t# ptr" %} ins_encode(aarch64_enc_ldr(dst, mem)); @@ -4890,7 +4915,7 @@ match(Set dst (LoadN mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $mem\t# compressed ptr" %} ins_encode(aarch64_enc_ldrw(dst, mem)); @@ -4904,7 +4929,7 @@ match(Set dst (LoadKlass mem)); predicate(!treat_as_volatile(((MemNode*)n))); - 
ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldr $dst, $mem\t# class" %} ins_encode(aarch64_enc_ldr(dst, mem)); @@ -4918,7 +4943,7 @@ match(Set dst (LoadNKlass mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $mem\t# compressed class ptr" %} ins_encode(aarch64_enc_ldrw(dst, mem)); @@ -4932,7 +4957,7 @@ match(Set dst (LoadF mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrs $dst, $mem\t# float" %} ins_encode( aarch64_enc_ldrs(dst, mem) ); @@ -4946,7 +4971,7 @@ match(Set dst (LoadD mem)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ldrd $dst, $mem\t# double" %} ins_encode( aarch64_enc_ldrd(dst, mem) ); @@ -4960,7 +4985,7 @@ %{ match(Set dst src); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "mov $dst, $src\t# int" %} ins_encode( aarch64_enc_movw_imm(dst, src) ); @@ -4973,7 +4998,7 @@ %{ match(Set dst src); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "mov $dst, $src\t# long" %} ins_encode( aarch64_enc_mov_imm(dst, src) ); @@ -4987,7 +5012,7 @@ %{ match(Set dst con); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 4); format %{ "mov $dst, $con\t# ptr\n\t" %} @@ -5003,7 +5028,7 @@ %{ match(Set dst con); - ins_cost(DEFAULT_COST_LOW); + ins_cost(INSN_COST); format %{ "mov $dst, $con\t# NULL ptr" %} ins_encode(aarch64_enc_mov_p0(dst, con)); @@ -5017,7 +5042,7 @@ %{ match(Set dst con); - ins_cost(DEFAULT_COST_LOW); + ins_cost(INSN_COST); format %{ "mov $dst, $con\t# NULL ptr" %} ins_encode(aarch64_enc_mov_p1(dst, con)); @@ -5031,7 +5056,7 @@ %{ match(Set dst con); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 4); format %{ "mov $dst, $con\t# compressed ptr" %} ins_encode(aarch64_enc_mov_n(dst, con)); @@ -5045,7 +5070,7 @@ %{ match(Set dst con); - ins_cost(DEFAULT_COST_LOW); + ins_cost(INSN_COST); 
format %{ "mov $dst, $con\t# compressed NULL ptr" %} ins_encode(aarch64_enc_mov_n0(dst, con)); @@ -5059,7 +5084,7 @@ %{ match(Set dst con); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "mov $dst, $con\t# compressed klass ptr" %} ins_encode(aarch64_enc_mov_nk(dst, con)); @@ -5071,7 +5096,7 @@ instruct loadConF_packed(vRegF dst, immFPacked con) %{ match(Set dst con); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 4); format %{ "fmovs $dst, $con"%} ins_encode %{ __ fmovs(as_FloatRegister($dst$$reg), (double)$con$$constant); @@ -5085,7 +5110,7 @@ instruct loadConF(vRegF dst, immF con) %{ match(Set dst con); - ins_cost(DEFAULT_COST * 2); + ins_cost(INSN_COST * 4); format %{ "ldrs $dst, [$constantaddress]\t# load from constant table: float=$con\n\t" @@ -5102,7 +5127,7 @@ instruct loadConD_packed(vRegD dst, immDPacked con) %{ match(Set dst con); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "fmovd $dst, $con"%} ins_encode %{ __ fmovd(as_FloatRegister($dst$$reg), $con$$constant); @@ -5116,7 +5141,7 @@ instruct loadConD(vRegD dst, immD con) %{ match(Set dst con); - ins_cost(DEFAULT_COST * 2); + ins_cost(INSN_COST * 5); format %{ "ldrd $dst, [$constantaddress]\t# load from constant table: float=$con\n\t" %} @@ -5135,7 +5160,7 @@ %{ match(Set mem (StoreCM mem zero)); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strb zr, $mem\t# byte" %} ins_encode(aarch64_enc_strb0(mem)); @@ -5149,7 +5174,7 @@ match(Set mem (StoreB mem src)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strb $src, $mem\t# byte" %} ins_encode(aarch64_enc_strb(src, mem)); @@ -5163,7 +5188,7 @@ match(Set mem (StoreB mem zero)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strb zr, $mem\t# byte" %} ins_encode(aarch64_enc_strb0(mem)); @@ -5177,7 +5202,7 @@ match(Set mem (StoreC mem src)); predicate(!treat_as_volatile(((MemNode*)n))); - 
ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strh $src, $mem\t# short" %} ins_encode(aarch64_enc_strh(src, mem)); @@ -5190,7 +5215,7 @@ match(Set mem (StoreC mem zero)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strh zr, $mem\t# short" %} ins_encode(aarch64_enc_strh0(mem)); @@ -5205,7 +5230,7 @@ match(Set mem(StoreI mem src)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strw $src, $mem\t# int" %} ins_encode(aarch64_enc_strw(src, mem)); @@ -5218,7 +5243,7 @@ match(Set mem(StoreI mem zero)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strw zr, $mem\t# int" %} ins_encode(aarch64_enc_strw0(mem)); @@ -5232,7 +5257,7 @@ match(Set mem (StoreL mem src)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "str $src, $mem\t# int" %} ins_encode(aarch64_enc_str(src, mem)); @@ -5246,7 +5271,7 @@ match(Set mem (StoreL mem zero)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "str zr, $mem\t# int" %} ins_encode(aarch64_enc_str0(mem)); @@ -5260,7 +5285,7 @@ match(Set mem (StoreP mem src)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "str $src, $mem\t# ptr" %} ins_encode(aarch64_enc_str(src, mem)); @@ -5274,7 +5299,7 @@ match(Set mem (StoreP mem zero)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "str zr, $mem\t# ptr" %} ins_encode(aarch64_enc_str0(mem)); @@ -5301,7 +5326,7 @@ %{ match(Set mem (StoreP mem dummy_m1)); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "str ., $mem\t# save pc to thread (no ret addr)" %} // use opcode to indicate that we have no return address argument @@ -5316,7 +5341,7 @@ %{ 
match(Set mem (StoreP mem dummy_m2)); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "str ., $mem\t# save pc to thread (w ret addr)" %} // use opcode to indicate that we have a return address argument @@ -5333,7 +5358,7 @@ match(Set mem (StoreN mem src)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strw $src, $mem\t# compressed ptr" %} ins_encode(aarch64_enc_strw(src, mem)); @@ -5348,7 +5373,7 @@ Universe::narrow_klass_base() == NULL && !((MemNode*)n)->is_volatile()); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "strw rheapbase, $mem\t# compressed ptr (rheapbase==0)" %} ins_encode(aarch64_enc_strw(heapbase, mem)); @@ -5362,7 +5387,7 @@ match(Set mem (StoreF mem src)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strs $src, $mem\t# float" %} ins_encode( aarch64_enc_strs(src, mem) ); @@ -5379,7 +5404,7 @@ match(Set mem (StoreD mem src)); predicate(!treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strd $src, $mem\t# double" %} ins_encode( aarch64_enc_strd(src, mem) ); @@ -5392,7 +5417,7 @@ %{ match(Set mem (StoreNKlass mem src)); - ins_cost(MEMORY_REF_COST); + ins_cost(INSN_COST); format %{ "strw $src, $mem\t# compressed klass ptr" %} ins_encode(aarch64_enc_strw(src, mem)); @@ -5409,6 +5434,7 @@ instruct prefetchr( memory mem ) %{ match(PrefetchRead mem); + ins_cost(INSN_COST); format %{ "prfm $mem, PLDL1KEEP\t# Prefetch into level 1 cache read keep" %} ins_encode( aarch64_enc_prefetchr(mem) ); @@ -5419,6 +5445,7 @@ instruct prefetchw( memory mem ) %{ match(PrefetchAllocation mem); + ins_cost(INSN_COST); format %{ "prfm $mem, PSTL1KEEP\t# Prefetch into level 1 cache write keep" %} ins_encode( aarch64_enc_prefetchw(mem) ); @@ -5429,6 +5456,7 @@ instruct prefetchnta( memory mem ) %{ match(PrefetchWrite mem); + ins_cost(INSN_COST); format %{ "prfm $mem, 
PSTL1STRM\t# Prefetch into level 1 cache write streaming" %} ins_encode( aarch64_enc_prefetchnta(mem) ); @@ -5444,7 +5472,7 @@ match(Set dst (LoadB mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarsb $dst, $mem\t# byte" %} ins_encode(aarch64_enc_ldarsb(dst, mem)); @@ -5458,7 +5486,7 @@ match(Set dst (ConvI2L (LoadB mem))); predicate(treat_as_volatile(((MemNode*)(n->in(1))))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarsb $dst, $mem\t# byte" %} ins_encode(aarch64_enc_ldarsb(dst, mem)); @@ -5472,7 +5500,7 @@ match(Set dst (LoadUB mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarb $dst, $mem\t# byte" %} ins_encode(aarch64_enc_ldarb(dst, mem)); @@ -5486,7 +5514,7 @@ match(Set dst (ConvI2L (LoadUB mem))); predicate(treat_as_volatile(((MemNode*)(n->in(1))))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarb $dst, $mem\t# byte" %} ins_encode(aarch64_enc_ldarb(dst, mem)); @@ -5500,7 +5528,7 @@ match(Set dst (LoadS mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarshw $dst, $mem\t# short" %} ins_encode(aarch64_enc_ldarshw(dst, mem)); @@ -5513,7 +5541,7 @@ match(Set dst (LoadUS mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarhw $dst, $mem\t# short" %} ins_encode(aarch64_enc_ldarhw(dst, mem)); @@ -5527,7 +5555,7 @@ match(Set dst (ConvI2L (LoadUS mem))); predicate(treat_as_volatile(((MemNode*)(n->in(1))))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarh $dst, $mem\t# short" %} ins_encode(aarch64_enc_ldarh(dst, mem)); @@ -5541,7 +5569,7 @@ match(Set dst (ConvI2L (LoadS mem))); predicate(treat_as_volatile(((MemNode*)(n->in(1))))); - ins_cost(MEMORY_REF_COST); + 
ins_cost(VOLATILE_REF_COST); format %{ "ldarh $dst, $mem\t# short" %} ins_encode(aarch64_enc_ldarsh(dst, mem)); @@ -5555,7 +5583,7 @@ match(Set dst (LoadI mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarw $dst, $mem\t# int" %} ins_encode(aarch64_enc_ldarw(dst, mem)); @@ -5569,7 +5597,7 @@ match(Set dst (AndL (ConvI2L (LoadI mem)) mask)); predicate(treat_as_volatile(((MemNode*)(n->in(1))->in(1)))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarw $dst, $mem\t# int" %} ins_encode(aarch64_enc_ldarw(dst, mem)); @@ -5583,7 +5611,7 @@ match(Set dst (LoadL mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldar $dst, $mem\t# int" %} ins_encode(aarch64_enc_ldar(dst, mem)); @@ -5597,7 +5625,7 @@ match(Set dst (LoadP mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldar $dst, $mem\t# ptr" %} ins_encode(aarch64_enc_ldar(dst, mem)); @@ -5611,7 +5639,7 @@ match(Set dst (LoadN mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldarw $dst, $mem\t# compressed ptr" %} ins_encode(aarch64_enc_ldarw(dst, mem)); @@ -5625,7 +5653,7 @@ match(Set dst (LoadF mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldars $dst, $mem\t# float" %} ins_encode( aarch64_enc_fldars(dst, mem) ); @@ -5639,7 +5667,7 @@ match(Set dst (LoadD mem)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldard $dst, $mem\t# double" %} ins_encode( aarch64_enc_fldard(dst, mem) ); @@ -5653,7 +5681,7 @@ match(Set mem (StoreB mem src)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "stlrb $src, 
$mem\t# byte" %} ins_encode(aarch64_enc_stlrb(src, mem)); @@ -5667,7 +5695,7 @@ match(Set mem (StoreC mem src)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "stlrh $src, $mem\t# short" %} ins_encode(aarch64_enc_stlrh(src, mem)); @@ -5682,7 +5710,7 @@ match(Set mem(StoreI mem src)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "stlrw $src, $mem\t# int" %} ins_encode(aarch64_enc_stlrw(src, mem)); @@ -5696,7 +5724,7 @@ match(Set mem (StoreL mem src)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "stlr $src, $mem\t# int" %} ins_encode(aarch64_enc_stlr(src, mem)); @@ -5710,7 +5738,7 @@ match(Set mem (StoreP mem src)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "stlr $src, $mem\t# ptr" %} ins_encode(aarch64_enc_stlr(src, mem)); @@ -5724,7 +5752,7 @@ match(Set mem (StoreN mem src)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "stlrw $src, $mem\t# compressed ptr" %} ins_encode(aarch64_enc_stlrw(src, mem)); @@ -5738,7 +5766,7 @@ match(Set mem (StoreF mem src)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "stlrs $src, $mem\t# float" %} ins_encode( aarch64_enc_fstlrs(src, mem) ); @@ -5755,7 +5783,7 @@ match(Set mem (StoreD mem src)); predicate(treat_as_volatile(((MemNode*)n))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "stlrd $src, $mem\t# double" %} ins_encode( aarch64_enc_fstlrd(src, mem) ); @@ -5771,7 +5799,7 @@ instruct bytes_reverse_int(iRegINoSp dst) %{ match(Set dst (ReverseBytesI dst)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "revw $dst, $dst" %} ins_encode %{ @@ -5784,7 +5812,7 @@ instruct 
bytes_reverse_long(iRegLNoSp dst) %{ match(Set dst (ReverseBytesL dst)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "rev $dst, $dst" %} ins_encode %{ @@ -5797,7 +5825,7 @@ instruct bytes_reverse_unsigned_short(iRegINoSp dst) %{ match(Set dst (ReverseBytesUS dst)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "rev16w $dst, $dst" %} ins_encode %{ @@ -5810,7 +5838,7 @@ instruct bytes_reverse_short(iRegINoSp dst) %{ match(Set dst (ReverseBytesS dst)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "rev16w $dst, $dst\n\t" "sbfmw $dst, $dst, #0, #15" %} @@ -5828,7 +5856,7 @@ instruct membar_acquire() %{ match(MemBarAcquire); - ins_cost(4*MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "MEMBAR-acquire\t# ???" %} @@ -5845,7 +5873,7 @@ instruct membar_release() %{ match(MemBarRelease); - ins_cost(4*MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "MEMBAR-release" %} ins_encode %{ @@ -5859,7 +5887,7 @@ instruct membar_volatile() %{ match(MemBarVolatile); - ins_cost(4*MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "MEMBAR-volatile?" 
%} @@ -5883,7 +5911,7 @@ instruct membar_storestore() %{ match(MemBarStoreStore); - ins_cost(4*MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); ins_encode %{ __ membar(Assembler::StoreStore); @@ -5899,7 +5927,7 @@ ins_encode %{ __ block_comment("membar-acquire-lock"); - __ membar(Assembler::Membar_mask_bits(Assembler::LoadLoad|Assembler::LoadStore)); + // __ membar(Assembler::Membar_mask_bits(Assembler::LoadLoad|Assembler::LoadStore)); %} ins_pipe(pipe_class_memory); @@ -5912,7 +5940,7 @@ ins_encode %{ __ block_comment("MEMBAR-release-lock"); - __ membar(Assembler::AnyAny); + // __ membar(Assembler::AnyAny); %} ins_pipe(pipe_class_memory); @@ -5924,7 +5952,7 @@ instruct castX2P(iRegPNoSp dst, iRegL src) %{ match(Set dst (CastX2P src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "mov $dst, $src\t# long -> ptr" %} ins_encode %{ @@ -5939,7 +5967,7 @@ instruct castP2X(iRegLNoSp dst, iRegP src) %{ match(Set dst (CastP2X src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "mov $dst, $src\t# ptr -> long" %} ins_encode %{ @@ -5955,7 +5983,7 @@ instruct convP2I(iRegINoSp dst, iRegP src) %{ match(Set dst (ConvL2I (CastP2X src))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "movw $dst, $src\t# ptr -> int" %} ins_encode %{ __ movw($dst$$Register, $src$$Register); @@ -5971,7 +5999,7 @@ predicate(Universe::narrow_oop_shift() == 0); match(Set dst (ConvL2I (CastP2X (DecodeN src)))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "mov dst, $src\t# compressed ptr -> int" %} ins_encode %{ __ movw($dst$$Register, $src$$Register); @@ -5986,7 +6014,7 @@ predicate(n->bottom_type()->make_ptr()->ptr() != TypePtr::NotNull); match(Set dst (EncodeP src)); effect(KILL cr); - ins_cost(DEFAULT_COST * 3); + ins_cost(INSN_COST * 3); format %{ "encode_heap_oop $dst, $src" %} ins_encode %{ Register s = $src$$Register; @@ -5999,7 +6027,7 @@ instruct encodeHeapOop_not_null(iRegNNoSp dst, iRegP src, rFlagsReg cr) %{ 
predicate(n->bottom_type()->make_ptr()->ptr() == TypePtr::NotNull); match(Set dst (EncodeP src)); - ins_cost(DEFAULT_COST * 3); + ins_cost(INSN_COST * 3); format %{ "encode_heap_oop_not_null $dst, $src" %} ins_encode %{ __ encode_heap_oop_not_null($dst$$Register, $src$$Register); @@ -6011,7 +6039,7 @@ predicate(n->bottom_type()->is_ptr()->ptr() != TypePtr::NotNull && n->bottom_type()->is_ptr()->ptr() != TypePtr::Constant); match(Set dst (DecodeN src)); - ins_cost(DEFAULT_COST * 3); + ins_cost(INSN_COST * 3); format %{ "decode_heap_oop $dst, $src" %} ins_encode %{ Register s = $src$$Register; @@ -6025,7 +6053,7 @@ predicate(n->bottom_type()->is_ptr()->ptr() == TypePtr::NotNull || n->bottom_type()->is_ptr()->ptr() == TypePtr::Constant); match(Set dst (DecodeN src)); - ins_cost(DEFAULT_COST * 3); + ins_cost(INSN_COST * 3); format %{ "decode_heap_oop_not_null $dst, $src" %} ins_encode %{ Register s = $src$$Register; @@ -6042,7 +6070,7 @@ instruct encodeKlass_not_null(iRegNNoSp dst, iRegP src) %{ match(Set dst (EncodePKlass src)); - ins_cost(DEFAULT_COST * 3); + ins_cost(INSN_COST * 3); format %{ "encode_klass_not_null $dst,$src" %} ins_encode %{ @@ -6057,7 +6085,7 @@ instruct decodeKlass_not_null(iRegPNoSp dst, iRegN src) %{ match(Set dst (DecodeNKlass src)); - ins_cost(DEFAULT_COST * 3); + ins_cost(INSN_COST * 3); format %{ "decode_klass_not_null $dst,$src" %} ins_encode %{ @@ -6141,7 +6169,7 @@ %{ match(Set dst (LoadPLocked mem)); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "ldaxr $dst, $mem\t# ptr linked acquire" %} @@ -6159,7 +6187,7 @@ %{ match(Set cr (StorePConditional heap_top_ptr (Binary oldval newval))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); // TODO // do we need to do a store-conditional release or can we just use a @@ -6180,7 +6208,7 @@ %{ match(Set cr (StoreLConditional mem (Binary oldval newval))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "cmpxchg rscratch1, $mem, $oldval, 
$newval, $mem\t# if $mem == $oldval then $mem <-- $newval" @@ -6197,7 +6225,7 @@ %{ match(Set cr (StoreIConditional mem (Binary oldval newval))); - ins_cost(MEMORY_REF_COST); + ins_cost(VOLATILE_REF_COST); format %{ "cmpxchgw rscratch1, $mem, $oldval, $newval, $mem\t# if $mem == $oldval then $mem <-- $newval" @@ -6304,7 +6332,7 @@ instruct cmovI_reg_reg(cmpOp cmp, rFlagsReg cr, iRegINoSp dst, iRegI src1, iRegI src2) %{ match(Set dst (CMoveI (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, $src2, $src1 $cmp\t# signed, int" %} ins_encode %{ @@ -6320,7 +6348,7 @@ instruct cmovUI_reg_reg(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst, iRegI src1, iRegI src2) %{ match(Set dst (CMoveI (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, $src2, $src1 $cmp\t# unsigned, int" %} ins_encode %{ @@ -6345,7 +6373,7 @@ instruct cmovI_zero_reg(cmpOp cmp, rFlagsReg cr, iRegINoSp dst, immI0 zero, iRegI src2) %{ match(Set dst (CMoveI (Binary cmp cr) (Binary zero src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, $src2, zr $cmp\t# signed, int" %} ins_encode %{ @@ -6361,7 +6389,7 @@ instruct cmovUI_zero_reg(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst, immI0 zero, iRegI src2) %{ match(Set dst (CMoveI (Binary cmp cr) (Binary zero src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, $src2, zr $cmp\t# unsigned, int" %} ins_encode %{ @@ -6377,7 +6405,7 @@ instruct cmovI_reg_zero(cmpOp cmp, rFlagsReg cr, iRegINoSp dst, iRegI src1, immI0 zero) %{ match(Set dst (CMoveI (Binary cmp cr) (Binary src1 zero))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, zr, $src1 $cmp\t# signed, int" %} ins_encode %{ @@ -6393,7 +6421,7 @@ instruct cmovUI_reg_zero(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst, iRegI src1, immI0 zero) %{ match(Set dst (CMoveI (Binary cmp cr) (Binary src1 zero))); - 
ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, zr, $src1 $cmp\t# unsigned, int" %} ins_encode %{ @@ -6414,7 +6442,7 @@ instruct cmovI_reg_zero_one(cmpOp cmp, rFlagsReg cr, iRegINoSp dst, immI0 zero, immI_1 one) %{ match(Set dst (CMoveI (Binary cmp cr) (Binary one zero))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csincw $dst, zr, zr $cmp\t# signed, int" %} ins_encode %{ @@ -6433,7 +6461,7 @@ instruct cmovUI_reg_zero_one(cmpOpU cmp, rFlagsRegU cr, iRegINoSp dst, immI0 zero, immI_1 one) %{ match(Set dst (CMoveI (Binary cmp cr) (Binary one zero))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csincw $dst, zr, zr $cmp\t# unsigned, int" %} ins_encode %{ @@ -6452,7 +6480,7 @@ instruct cmovL_reg_reg(cmpOp cmp, rFlagsReg cr, iRegLNoSp dst, iRegL src1, iRegL src2) %{ match(Set dst (CMoveL (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, $src2, $src1 $cmp\t# signed, long" %} ins_encode %{ @@ -6468,7 +6496,7 @@ instruct cmovUL_reg_reg(cmpOpU cmp, rFlagsRegU cr, iRegLNoSp dst, iRegL src1, iRegL src2) %{ match(Set dst (CMoveL (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, $src2, $src1 $cmp\t# unsigned, long" %} ins_encode %{ @@ -6486,7 +6514,7 @@ instruct cmovL_reg_zero(cmpOp cmp, rFlagsReg cr, iRegLNoSp dst, iRegL src1, immL0 zero) %{ match(Set dst (CMoveL (Binary cmp cr) (Binary src1 zero))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, zr, $src1 $cmp\t# signed, long" %} ins_encode %{ @@ -6502,7 +6530,7 @@ instruct cmovUL_reg_zero(cmpOpU cmp, rFlagsRegU cr, iRegLNoSp dst, iRegL src1, immL0 zero) %{ match(Set dst (CMoveL (Binary cmp cr) (Binary src1 zero))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, zr, $src1 $cmp\t# unsigned, long" %} ins_encode %{ @@ -6518,7 +6546,7 @@ instruct cmovL_zero_reg(cmpOp cmp, rFlagsReg cr, 
iRegLNoSp dst, immL0 zero, iRegL src2) %{ match(Set dst (CMoveL (Binary cmp cr) (Binary zero src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, $src2, zr $cmp\t# signed, long" %} ins_encode %{ @@ -6534,7 +6562,7 @@ instruct cmovUL_zero_reg(cmpOpU cmp, rFlagsRegU cr, iRegLNoSp dst, immL0 zero, iRegL src2) %{ match(Set dst (CMoveL (Binary cmp cr) (Binary zero src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, $src2, zr $cmp\t# unsigned, long" %} ins_encode %{ @@ -6550,7 +6578,7 @@ instruct cmovP_reg_reg(cmpOp cmp, rFlagsReg cr, iRegPNoSp dst, iRegP src1, iRegP src2) %{ match(Set dst (CMoveP (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, $src2, $src1 $cmp\t# signed, ptr" %} ins_encode %{ @@ -6566,7 +6594,7 @@ instruct cmovUP_reg_reg(cmpOpU cmp, rFlagsRegU cr, iRegPNoSp dst, iRegP src1, iRegP src2) %{ match(Set dst (CMoveP (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, $src2, $src1 $cmp\t# unsigned, ptr" %} ins_encode %{ @@ -6584,7 +6612,7 @@ instruct cmovP_reg_zero(cmpOp cmp, rFlagsReg cr, iRegPNoSp dst, iRegP src1, immP0 zero) %{ match(Set dst (CMoveP (Binary cmp cr) (Binary src1 zero))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, zr, $src1 $cmp\t# signed, ptr" %} ins_encode %{ @@ -6600,7 +6628,7 @@ instruct cmovUP_reg_zero(cmpOpU cmp, rFlagsRegU cr, iRegPNoSp dst, iRegP src1, immP0 zero) %{ match(Set dst (CMoveP (Binary cmp cr) (Binary src1 zero))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, zr, $src1 $cmp\t# unsigned, ptr" %} ins_encode %{ @@ -6616,7 +6644,7 @@ instruct cmovP_zero_reg(cmpOp cmp, rFlagsReg cr, iRegPNoSp dst, immP0 zero, iRegP src2) %{ match(Set dst (CMoveP (Binary cmp cr) (Binary zero src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, $src2, zr $cmp\t# signed, 
ptr" %} ins_encode %{ @@ -6632,7 +6660,7 @@ instruct cmovUP_zero_reg(cmpOpU cmp, rFlagsRegU cr, iRegPNoSp dst, immP0 zero, iRegP src2) %{ match(Set dst (CMoveP (Binary cmp cr) (Binary zero src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "csel $dst, $src2, zr $cmp\t# unsigned, ptr" %} ins_encode %{ @@ -6648,7 +6676,7 @@ instruct cmovN_reg_reg(cmpOp cmp, rFlagsReg cr, iRegNNoSp dst, iRegN src1, iRegN src2) %{ match(Set dst (CMoveN (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, $src2, $src1 $cmp\t# signed, compressed ptr" %} ins_encode %{ @@ -6664,7 +6692,7 @@ instruct cmovUN_reg_reg(cmpOpU cmp, rFlagsRegU cr, iRegNNoSp dst, iRegN src1, iRegN src2) %{ match(Set dst (CMoveN (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, $src2, $src1 $cmp\t# signed, compressed ptr" %} ins_encode %{ @@ -6682,7 +6710,7 @@ instruct cmovN_reg_zero(cmpOp cmp, rFlagsReg cr, iRegNNoSp dst, iRegN src1, immN0 zero) %{ match(Set dst (CMoveN (Binary cmp cr) (Binary src1 zero))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, zr, $src1 $cmp\t# signed, compressed ptr" %} ins_encode %{ @@ -6698,7 +6726,7 @@ instruct cmovUN_reg_zero(cmpOpU cmp, rFlagsRegU cr, iRegNNoSp dst, iRegN src1, immN0 zero) %{ match(Set dst (CMoveN (Binary cmp cr) (Binary src1 zero))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, zr, $src1 $cmp\t# unsigned, compressed ptr" %} ins_encode %{ @@ -6714,7 +6742,7 @@ instruct cmovN_zero_reg(cmpOp cmp, rFlagsReg cr, iRegNNoSp dst, immN0 zero, iRegN src2) %{ match(Set dst (CMoveN (Binary cmp cr) (Binary zero src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, $src2, zr $cmp\t# signed, compressed ptr" %} ins_encode %{ @@ -6730,7 +6758,7 @@ instruct cmovUN_zero_reg(cmpOpU cmp, rFlagsRegU cr, iRegNNoSp dst, immN0 zero, iRegN src2) %{ 
match(Set dst (CMoveN (Binary cmp cr) (Binary zero src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "cselw $dst, $src2, zr $cmp\t# unsigned, compressed ptr" %} ins_encode %{ @@ -6747,7 +6775,7 @@ %{ match(Set dst (CMoveF (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 3); format %{ "fcsels $dst, $src1, $src2, $cmp\t# signed cmove float\n\t" %} ins_encode %{ @@ -6765,7 +6793,7 @@ %{ match(Set dst (CMoveF (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 3); format %{ "fcsels $dst, $src1, $src2, $cmp\t# unsigned cmove float\n\t" %} ins_encode %{ @@ -6783,7 +6811,7 @@ %{ match(Set dst (CMoveD (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 3); format %{ "fcseld $dst, $src1, $src2, $cmp\t# signed cmove float\n\t" %} ins_encode %{ @@ -6801,7 +6829,7 @@ %{ match(Set dst (CMoveD (Binary cmp cr) (Binary src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 3); format %{ "fcseld $dst, $src1, $src2, $cmp\t# unsigned cmove float\n\t" %} ins_encode %{ @@ -6858,7 +6886,7 @@ instruct addI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{ match(Set dst (AddI src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "addw $dst, $src1, $src2" %} ins_encode %{ @@ -6873,7 +6901,7 @@ instruct addI_reg_imm(iRegINoSp dst, iRegI src1, immIAddSub src2) %{ match(Set dst (AddI src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "addw $dst, $src1, $src2" %} // use opcode to indicate that this is an add not a sub @@ -6887,7 +6915,7 @@ instruct addI_reg_imm_i2l(iRegINoSp dst, iRegL src1, immIAddSub src2) %{ match(Set dst (AddI (ConvL2I src1) src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "addw $dst, $src1, $src2" %} // use opcode to indicate that this is an add not a sub @@ -6902,7 +6930,7 @@ instruct addP_reg_reg(iRegPNoSp dst, iRegP src1, iRegL src2) %{ match(Set dst (AddP src1 
src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2\t# ptr" %} ins_encode %{ @@ -6917,7 +6945,7 @@ instruct addP_reg_reg_ext(iRegPNoSp dst, iRegP src1, iRegIorL2I src2) %{ match(Set dst (AddP src1 (ConvI2L src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2\t# ptr" %} ins_encode %{ @@ -6932,7 +6960,7 @@ instruct addP_reg_reg_lsl(iRegPNoSp dst, iRegP src1, iRegL src2, immIScale scale) %{ match(Set dst (AddP src1 (LShiftL src2 scale))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2, LShiftL $scale\t# ptr" %} ins_encode %{ @@ -6947,7 +6975,7 @@ instruct addP_reg_reg_ext_shift(iRegPNoSp dst, iRegP src1, iRegIorL2I src2, immIScale scale) %{ match(Set dst (AddP src1 (LShiftL (ConvI2L src2) scale))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2, I2L $scale\t# ptr" %} ins_encode %{ @@ -6962,7 +6990,7 @@ instruct lshift_ext(iRegLNoSp dst, iRegIorL2I src, immI scale, rFlagsReg cr) %{ match(Set dst (LShiftL (ConvI2L src) scale)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sbfiz $dst, $src, $scale & 63, -$scale & 63\t" %} ins_encode %{ @@ -6980,7 +7008,7 @@ instruct addP_reg_imm(iRegPNoSp dst, iRegP src1, immLAddSub src2) %{ match(Set dst (AddP src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2\t# ptr" %} // use opcode to indicate that this is an add not a sub @@ -6996,7 +7024,7 @@ match(Set dst (AddL src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2" %} ins_encode %{ @@ -7012,7 +7040,7 @@ instruct addL_reg_imm(iRegLNoSp dst, iRegL src1, immLAddSub src2) %{ match(Set dst (AddL src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2" %} // use opcode to indicate that this is an add not a sub @@ -7027,7 +7055,7 @@ instruct subI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I 
src2) %{ match(Set dst (SubI src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "subw $dst, $src1, $src2" %} ins_encode %{ @@ -7043,7 +7071,7 @@ instruct subI_reg_imm(iRegINoSp dst, iRegIorL2I src1, immIAddSub src2) %{ match(Set dst (SubI src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "subw $dst, $src1, $src2" %} // use opcode to indicate that this is a sub not an add @@ -7059,7 +7087,7 @@ match(Set dst (SubL src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sub $dst, $src1, $src2" %} ins_encode %{ @@ -7075,7 +7103,7 @@ instruct subL_reg_imm(iRegLNoSp dst, iRegL src1, immLAddSub src2) %{ match(Set dst (SubL src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sub$dst, $src1, $src2" %} // use opcode to indicate that this is a sub not an add @@ -7091,7 +7119,7 @@ instruct negI_reg(iRegINoSp dst, iRegIorL2I src, immI0 zero, rFlagsReg cr) %{ match(Set dst (SubI zero src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "negw $dst, $src\t# int" %} ins_encode %{ @@ -7107,7 +7135,7 @@ instruct negL_reg(iRegLNoSp dst, iRegIorL2I src, immL0 zero, rFlagsReg cr) %{ match(Set dst (SubL zero src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "neg $dst, $src\t# long" %} ins_encode %{ @@ -7123,7 +7151,7 @@ instruct mulI(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{ match(Set dst (MulI src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 3); format %{ "mulw $dst, $src1, $src2" %} ins_encode %{ @@ -7140,7 +7168,7 @@ instruct mulL(iRegLNoSp dst, iRegL src1, iRegL src2) %{ match(Set dst (MulL src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "mul $dst, $src1, $src2" %} ins_encode %{ @@ -7156,7 +7184,7 @@ %{ match(Set dst (MulHiL src1 src2)); - ins_cost(2 * DEFAULT_COST); + ins_cost(INSN_COST * 7); format %{ "smulh $dst, $src1, $src2, \t# mulhi" %} ins_encode %{ @@ -7173,7 +7201,7 @@ instruct maddI(iRegINoSp dst, iRegIorL2I src1, 
iRegIorL2I src2, iRegIorL2I src3) %{ match(Set dst (AddI src3 (MulI src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 3); format %{ "madd $dst, $src1, $src2, $src3" %} ins_encode %{ @@ -7189,7 +7217,7 @@ instruct msubI(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2, iRegIorL2I src3) %{ match(Set dst (SubI src3 (MulI src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 3); format %{ "msub $dst, $src1, $src2, $src3" %} ins_encode %{ @@ -7207,7 +7235,7 @@ instruct maddL(iRegLNoSp dst, iRegL src1, iRegL src2, iRegL src3) %{ match(Set dst (AddL src3 (MulL src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "madd $dst, $src1, $src2, $src3" %} ins_encode %{ @@ -7223,7 +7251,7 @@ instruct msubL(iRegLNoSp dst, iRegL src1, iRegL src2, iRegL src3) %{ match(Set dst (SubL src3 (MulL src1 src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "msub $dst, $src1, $src2, $src3" %} ins_encode %{ @@ -7241,15 +7269,8 @@ instruct divI(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{ match(Set dst (DivI src1 src2)); - ins_cost(10*DEFAULT_COST); - format %{ "cmpw $src1, #0x80000000\t# idiv\n\t" - "bne normal\n\t" - "cmnw $src2, #1\n\t" - "beq normal\n\t" - "movw $dst, $src1\n\t" - "b done\n" - "normal: sdivw $dst, $src1, $src2\n" - "done:" %} + ins_cost(INSN_COST * 19); + format %{ "sdivw $dst, $src1, $src2" %} ins_encode(aarch64_enc_divw(dst, src1, src2)); ins_pipe(pipe_class_default); @@ -7260,15 +7281,8 @@ instruct divL(iRegLNoSp dst, iRegL src1, iRegL src2) %{ match(Set dst (DivL src1 src2)); - ins_cost(10*DEFAULT_COST); - format %{ "cmp $src1, #0x8000000000000000\t# ldiv\n\t" - "bne normal\n\t" - "cmn $src2, #1\n\t" - "beq normal\n\t" - "mov $dst, $src1\n\t" - "b done\n" - "normal: sdiv $dst, $src1, $src2\n" - "done:" %} + ins_cost(INSN_COST * 35); + format %{ "sdiv $dst, $src1, $src2" %} ins_encode(aarch64_enc_div(dst, src1, src2)); ins_pipe(pipe_class_default); @@ -7279,15 +7293,9 @@ instruct modI(iRegINoSp dst, 
iRegIorL2I src1, iRegIorL2I src2) %{ match(Set dst (ModI src1 src2)); - format %{ "cmpw $src1, #0x80000000\t# imod\n\t" - "bne normal\n\t" - "cmnw $src2, #1\n\t" - "beq normal\n\t" - "movw $dst, zr\n\t" - "b done\n" - "normal: sdivw rscratch1, $src1, $src2\n\t" - "msubw($dst, rscratch1, $src2, $src1" - "done:" %} + ins_cost(INSN_COST * 22); + format %{ "sdivw rscratch1, $src1, $src2\n\t" + "msubw($dst, rscratch1, $src2, $src1" %} ins_encode(aarch64_enc_modw(dst, src1, src2)); ins_pipe(pipe_class_default); @@ -7298,16 +7306,9 @@ instruct modL(iRegLNoSp dst, iRegL src1, iRegL src2) %{ match(Set dst (ModL src1 src2)); - ins_cost(10*DEFAULT_COST); - format %{ "cmp $src1, #0x8000000000000000\t# lmod\n\t" - "bne normal\n\t" - "cmn $src2, #1\n\t" - "beq normal\n\t" - "mov $dst, zr\n\t" - "b done\n" - "normal: sdiv rscratch1, $src1, $src2\n" - "msub($dst, rscratch1, $src2, $src1" - "done:" %} + ins_cost(INSN_COST * 38); + format %{ "sdiv rscratch1, $src1, $src2\n" + "msub($dst, rscratch1, $src2, $src1" %} ins_encode(aarch64_enc_mod(dst, src1, src2)); ins_pipe(pipe_class_default); @@ -7319,7 +7320,7 @@ instruct lShiftI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{ match(Set dst (LShiftI src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "lslvw $dst, $src1, $src2" %} ins_encode %{ @@ -7335,6 +7336,7 @@ instruct lShiftI_reg_imm(iRegINoSp dst, iRegIorL2I src1, immI src2) %{ match(Set dst (LShiftI src1 src2)); + ins_cost(INSN_COST); format %{ "lslw $dst, $src1, ($src2 & 0x1f)" %} ins_encode %{ @@ -7350,7 +7352,7 @@ instruct urShiftI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{ match(Set dst (URShiftI src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "lsrvw $dst, $src1, $src2" %} ins_encode %{ @@ -7366,6 +7368,7 @@ instruct urShiftI_reg_imm(iRegINoSp dst, iRegIorL2I src1, immI src2) %{ match(Set dst (URShiftI src1 src2)); + ins_cost(INSN_COST); format %{ "lsrw $dst, $src1, ($src2 & 0x1f)" %} 
ins_encode %{ @@ -7381,7 +7384,7 @@ instruct rShiftI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{ match(Set dst (RShiftI src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "asrvw $dst, $src1, $src2" %} ins_encode %{ @@ -7397,6 +7400,7 @@ instruct rShiftI_reg_imm(iRegINoSp dst, iRegIorL2I src1, immI src2) %{ match(Set dst (RShiftI src1 src2)); + ins_cost(INSN_COST); format %{ "asrw $dst, $src1, ($src2 & 0x1f)" %} ins_encode %{ @@ -7417,7 +7421,7 @@ instruct lShiftL_reg_reg(iRegLNoSp dst, iRegL src1, iRegIorL2I src2) %{ match(Set dst (LShiftL src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "lslv $dst, $src1, $src2" %} ins_encode %{ @@ -7433,6 +7437,7 @@ instruct lShiftL_reg_imm(iRegLNoSp dst, iRegL src1, immI src2) %{ match(Set dst (LShiftL src1 src2)); + ins_cost(INSN_COST); format %{ "lsl $dst, $src1, ($src2 & 0x3f)" %} ins_encode %{ @@ -7448,7 +7453,7 @@ instruct urShiftL_reg_reg(iRegLNoSp dst, iRegL src1, iRegIorL2I src2) %{ match(Set dst (URShiftL src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "lsrv $dst, $src1, $src2" %} ins_encode %{ @@ -7464,6 +7469,7 @@ instruct urShiftL_reg_imm(iRegLNoSp dst, iRegL src1, immI src2) %{ match(Set dst (URShiftL src1 src2)); + ins_cost(INSN_COST); format %{ "lsr $dst, $src1, ($src2 & 0x3f)" %} ins_encode %{ @@ -7479,7 +7485,7 @@ instruct rShiftL_reg_reg(iRegLNoSp dst, iRegL src1, iRegIorL2I src2) %{ match(Set dst (RShiftL src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "asrv $dst, $src1, $src2" %} ins_encode %{ @@ -7495,6 +7501,7 @@ instruct rShiftL_reg_imm(iRegLNoSp dst, iRegL src1, immI src2) %{ match(Set dst (RShiftL src1 src2)); + ins_cost(INSN_COST); format %{ "asr $dst, $src1, ($src2 & 0x3f)" %} ins_encode %{ @@ -7512,7 +7519,7 @@ iRegL src1, immL_M1 m1, rFlagsReg cr) %{ match(Set dst (XorL src1 m1)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eon $dst, $src1, zr" %} ins_encode %{ 
@@ -7528,7 +7535,7 @@ iRegI src1, immI_M1 m1, rFlagsReg cr) %{ match(Set dst (XorI src1 m1)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eonw $dst, $src1, zr" %} ins_encode %{ @@ -7545,7 +7552,7 @@ iRegI src1, iRegI src2, immI_M1 m1, rFlagsReg cr) %{ match(Set dst (AndI src1 (XorI src2 m1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "bic $dst, $src1, $src2" %} ins_encode %{ @@ -7562,7 +7569,7 @@ iRegL src1, iRegL src2, immL_M1 m1, rFlagsReg cr) %{ match(Set dst (AndL src1 (XorL src2 m1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "bic $dst, $src1, $src2" %} ins_encode %{ @@ -7579,7 +7586,7 @@ iRegI src1, iRegI src2, immI_M1 m1, rFlagsReg cr) %{ match(Set dst (OrI src1 (XorI src2 m1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orn $dst, $src1, $src2" %} ins_encode %{ @@ -7596,7 +7603,7 @@ iRegL src1, iRegL src2, immL_M1 m1, rFlagsReg cr) %{ match(Set dst (OrL src1 (XorL src2 m1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orn $dst, $src1, $src2" %} ins_encode %{ @@ -7613,7 +7620,7 @@ iRegI src1, iRegI src2, immI_M1 m1, rFlagsReg cr) %{ match(Set dst (XorI m1 (XorI src2 src1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eon $dst, $src1, $src2" %} ins_encode %{ @@ -7630,7 +7637,7 @@ iRegL src1, iRegL src2, immL_M1 m1, rFlagsReg cr) %{ match(Set dst (XorL m1 (XorL src2 src1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eon $dst, $src1, $src2" %} ins_encode %{ @@ -7647,7 +7654,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (AndI src1 (XorI(URShiftI src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "bicw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7665,7 +7672,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (AndL src1 (XorL(URShiftL src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "bic $dst, $src1, $src2, 
LSR $src3" %} ins_encode %{ @@ -7683,7 +7690,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (AndI src1 (XorI(RShiftI src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "bicw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7701,7 +7708,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (AndL src1 (XorL(RShiftL src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "bic $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7719,7 +7726,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (AndI src1 (XorI(LShiftI src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "bicw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -7737,7 +7744,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (AndL src1 (XorL(LShiftL src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "bic $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -7755,7 +7762,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (XorI src4 (XorI(URShiftI src2 src3) src1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eonw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7773,7 +7780,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (XorL src4 (XorL(URShiftL src2 src3) src1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eon $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7791,7 +7798,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (XorI src4 (XorI(RShiftI src2 src3) src1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eonw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7809,7 +7816,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (XorL src4 (XorL(RShiftL src2 src3) src1))); - 
ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eon $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7827,7 +7834,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (XorI src4 (XorI(LShiftI src2 src3) src1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eonw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -7845,7 +7852,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (XorL src4 (XorL(LShiftL src2 src3) src1))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eon $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -7863,7 +7870,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (OrI src1 (XorI(URShiftI src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "ornw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7881,7 +7888,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (OrL src1 (XorL(URShiftL src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orn $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7899,7 +7906,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (OrI src1 (XorI(RShiftI src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "ornw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7917,7 +7924,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, rFlagsReg cr) %{ match(Set dst (OrL src1 (XorL(RShiftL src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orn $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -7935,7 +7942,7 @@ iRegI src1, iRegI src2, immI src3, immI_M1 src4, rFlagsReg cr) %{ match(Set dst (OrI src1 (XorI(LShiftI src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "ornw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -7953,7 +7960,7 @@ iRegL src1, iRegL src2, immI src3, immL_M1 src4, 
rFlagsReg cr) %{ match(Set dst (OrL src1 (XorL(LShiftL src2 src3) src4))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orn $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -7972,7 +7979,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndI src1 (URShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "andw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -7991,7 +7998,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndL src1 (URShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "andr $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8010,7 +8017,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndI src1 (RShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "andw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8029,7 +8036,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndL src1 (RShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "andr $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8048,7 +8055,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndI src1 (LShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "andw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8067,7 +8074,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AndL src1 (LShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "andr $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8086,7 +8093,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorI src1 (URShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eorw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8105,7 +8112,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorL src1 (URShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eor $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8124,7 +8131,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorI src1 (RShiftI src2 src3))); - 
ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eorw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8143,7 +8150,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorL src1 (RShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eor $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8162,7 +8169,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorI src1 (LShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eorw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8181,7 +8188,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (XorL src1 (LShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eor $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8200,7 +8207,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrI src1 (URShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orrw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8219,7 +8226,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrL src1 (URShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orr $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8238,7 +8245,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrI src1 (RShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orrw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8257,7 +8264,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrL src1 (RShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orr $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8276,7 +8283,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrI src1 (LShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orrw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8295,7 +8302,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (OrL src1 (LShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "orr $dst, $src1, $src2, LSL $src3" %} ins_encode 
%{ @@ -8314,7 +8321,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddI src1 (URShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "addw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8333,7 +8340,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddL src1 (URShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8352,7 +8359,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddI src1 (RShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "addw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8371,7 +8378,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddL src1 (RShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8390,7 +8397,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddI src1 (LShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "addw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8409,7 +8416,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (AddL src1 (LShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8428,7 +8435,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubI src1 (URShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "subw $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8447,7 +8454,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubL src1 (URShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sub $dst, $src1, $src2, LSR $src3" %} ins_encode %{ @@ -8466,7 +8473,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubI src1 (RShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "subw $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8485,7 +8492,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubL src1 (RShiftL src2 src3))); 
- ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sub $dst, $src1, $src2, ASR $src3" %} ins_encode %{ @@ -8504,7 +8511,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubI src1 (LShiftI src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "subw $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8523,7 +8530,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst (SubL src1 (LShiftL src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sub $dst, $src1, $src2, LSL $src3" %} ins_encode %{ @@ -8548,7 +8555,7 @@ predicate((unsigned int)n->in(2)->get_int() <= 63 && (unsigned int)n->in(1)->in(2)->get_int() <= 63); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "sbfm $dst, $src, $rshift_count - $lshift_count, #63 - $lshift_count" %} ins_encode %{ int lshift = $lshift_count$$constant, rshift = $rshift_count$$constant; @@ -8571,7 +8578,7 @@ predicate((unsigned int)n->in(2)->get_int() <= 31 && (unsigned int)n->in(1)->in(2)->get_int() <= 31); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "sbfmw $dst, $src, $rshift_count - $lshift_count, #31 - $lshift_count" %} ins_encode %{ int lshift = $lshift_count$$constant, rshift = $rshift_count$$constant; @@ -8594,7 +8601,7 @@ predicate((unsigned int)n->in(2)->get_int() <= 63 && (unsigned int)n->in(1)->in(2)->get_int() <= 63); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "ubfm $dst, $src, $rshift_count - $lshift_count, #63 - $lshift_count" %} ins_encode %{ int lshift = $lshift_count$$constant, rshift = $rshift_count$$constant; @@ -8617,7 +8624,7 @@ predicate((unsigned int)n->in(2)->get_int() <= 31 && (unsigned int)n->in(1)->in(2)->get_int() <= 31); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "ubfmw $dst, $src, $rshift_count - $lshift_count, #31 - $lshift_count" %} ins_encode %{ int lshift = $lshift_count$$constant, rshift = $rshift_count$$constant; @@ -8636,7 +8643,7 @@ %{ match(Set dst (AndI (URShiftI src rshift) mask)); - 
ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "ubfxw $dst, $src, $mask" %} ins_encode %{ int rshift = $rshift$$constant; @@ -8651,7 +8658,7 @@ %{ match(Set dst (AndL (URShiftL src rshift) mask)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "ubfx $dst, $src, $mask" %} ins_encode %{ int rshift = $rshift$$constant; @@ -8669,7 +8676,7 @@ %{ match(Set dst (ConvI2L (AndI (URShiftI src rshift) mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "ubfx $dst, $src, $mask" %} ins_encode %{ int rshift = $rshift$$constant; @@ -8688,7 +8695,7 @@ match(Set dst (OrL (LShiftL src1 lshift) (URShiftL src2 rshift))); predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 63)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "extr $dst, $src1, $src2, #$rshift" %} ins_encode %{ @@ -8703,7 +8710,7 @@ match(Set dst (OrI (LShiftI src1 lshift) (URShiftI src2 rshift))); predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 31)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "extr $dst, $src1, $src2, #$rshift" %} ins_encode %{ @@ -8718,7 +8725,7 @@ match(Set dst (AddL (LShiftL src1 lshift) (URShiftL src2 rshift))); predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 63)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "extr $dst, $src1, $src2, #$rshift" %} ins_encode %{ @@ -8733,7 +8740,7 @@ match(Set dst (AddI (LShiftI src1 lshift) (URShiftI src2 rshift))); predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 31)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "extr $dst, $src1, $src2, #$rshift" %} ins_encode %{ @@ -8751,7 +8758,7 @@ effect(DEF dst, USE src, USE shift); format %{ "rol $dst, $src, $shift" %} - ins_cost(2*DEFAULT_COST); + ins_cost(INSN_COST * 3); ins_encode %{ __ subw(rscratch1, zr, as_Register($shift$$reg)); __ rorv(as_Register($dst$$reg), as_Register($src$$reg), @@ -8767,7 +8774,7 @@ 
effect(DEF dst, USE src, USE shift); format %{ "rol $dst, $src, $shift" %} - ins_cost(2*DEFAULT_COST); + ins_cost(INSN_COST * 3); ins_encode %{ __ subw(rscratch1, zr, as_Register($shift$$reg)); __ rorvw(as_Register($dst$$reg), as_Register($src$$reg), @@ -8819,7 +8826,7 @@ effect(DEF dst, USE src, USE shift); format %{ "ror $dst, $src, $shift" %} - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); ins_encode %{ __ rorv(as_Register($dst$$reg), as_Register($src$$reg), as_Register($shift$$reg)); @@ -8834,7 +8841,7 @@ effect(DEF dst, USE src, USE shift); format %{ "ror $dst, $src, $shift" %} - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); ins_encode %{ __ rorvw(as_Register($dst$$reg), as_Register($src$$reg), as_Register($shift$$reg)); @@ -8883,7 +8890,7 @@ instruct AddExtI(iRegLNoSp dst, iRegL src1, iRegIorL2I src2, rFlagsReg cr) %{ match(Set dst (AddL src1 (ConvI2L src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2" %} ins_encode %{ @@ -8896,7 +8903,7 @@ instruct SubExtI(iRegLNoSp dst, iRegL src1, iRegIorL2I src2, rFlagsReg cr) %{ match(Set dst (SubL src1 (ConvI2L src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sub $dst, $src1, $src2" %} ins_encode %{ @@ -8910,7 +8917,7 @@ instruct AddExtI_sxth(iRegINoSp dst, iRegI src1, iRegI src2, immI_16 lshift, immI_16 rshift, rFlagsReg cr) %{ match(Set dst (AddI src1 (RShiftI (LShiftI src2 lshift) rshift))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, sxth $src2" %} ins_encode %{ @@ -8923,7 +8930,7 @@ instruct AddExtI_sxtb(iRegINoSp dst, iRegI src1, iRegI src2, immI_24 lshift, immI_24 rshift, rFlagsReg cr) %{ match(Set dst (AddI src1 (RShiftI (LShiftI src2 lshift) rshift))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, sxtb $src2" %} ins_encode %{ @@ -8936,7 +8943,7 @@ instruct AddExtI_uxtb(iRegINoSp dst, iRegI src1, iRegI src2, immI_24 lshift, immI_24 rshift, rFlagsReg cr) %{ match(Set dst (AddI src1 
(URShiftI (LShiftI src2 lshift) rshift))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, uxtb $src2" %} ins_encode %{ @@ -8949,7 +8956,7 @@ instruct AddExtL_sxth(iRegLNoSp dst, iRegL src1, iRegL src2, immI_48 lshift, immI_48 rshift, rFlagsReg cr) %{ match(Set dst (AddL src1 (RShiftL (LShiftL src2 lshift) rshift))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, sxth $src2" %} ins_encode %{ @@ -8962,7 +8969,7 @@ instruct AddExtL_sxtw(iRegLNoSp dst, iRegL src1, iRegL src2, immI_32 lshift, immI_32 rshift, rFlagsReg cr) %{ match(Set dst (AddL src1 (RShiftL (LShiftL src2 lshift) rshift))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, sxtw $src2" %} ins_encode %{ @@ -8975,7 +8982,7 @@ instruct AddExtL_sxtb(iRegLNoSp dst, iRegL src1, iRegL src2, immI_56 lshift, immI_56 rshift, rFlagsReg cr) %{ match(Set dst (AddL src1 (RShiftL (LShiftL src2 lshift) rshift))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, sxtb $src2" %} ins_encode %{ @@ -8988,7 +8995,7 @@ instruct AddExtL_uxtb(iRegLNoSp dst, iRegL src1, iRegL src2, immI_56 lshift, immI_56 rshift, rFlagsReg cr) %{ match(Set dst (AddL src1 (URShiftL (LShiftL src2 lshift) rshift))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, uxtb $src2" %} ins_encode %{ @@ -9002,7 +9009,7 @@ instruct AddExtI_uxtb_and(iRegINoSp dst, iRegI src1, iRegI src2, immI_255 mask, rFlagsReg cr) %{ match(Set dst (AddI src1 (AndI src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "addw $dst, $src1, $src2, uxtb" %} ins_encode %{ @@ -9015,7 +9022,7 @@ instruct AddExtI_uxth_and(iRegINoSp dst, iRegI src1, iRegI src2, immI_65535 mask, rFlagsReg cr) %{ match(Set dst (AddI src1 (AndI src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "addw $dst, $src1, $src2, uxth" %} ins_encode %{ @@ -9028,7 +9035,7 @@ instruct AddExtL_uxtb_and(iRegLNoSp dst, iRegL src1, iRegL 
src2, immL_255 mask, rFlagsReg cr) %{ match(Set dst (AddL src1 (AndL src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2, uxtb" %} ins_encode %{ @@ -9041,7 +9048,7 @@ instruct AddExtL_uxth_and(iRegLNoSp dst, iRegL src1, iRegL src2, immL_65535 mask, rFlagsReg cr) %{ match(Set dst (AddL src1 (AndL src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2, uxth" %} ins_encode %{ @@ -9054,7 +9061,7 @@ instruct AddExtL_uxtw_and(iRegLNoSp dst, iRegL src1, iRegL src2, immL_4294967295 mask, rFlagsReg cr) %{ match(Set dst (AddL src1 (AndL src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "add $dst, $src1, $src2, uxtw" %} ins_encode %{ @@ -9067,7 +9074,7 @@ instruct SubExtI_uxtb_and(iRegINoSp dst, iRegI src1, iRegI src2, immI_255 mask, rFlagsReg cr) %{ match(Set dst (SubI src1 (AndI src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "subw $dst, $src1, $src2, uxtb" %} ins_encode %{ @@ -9080,7 +9087,7 @@ instruct SubExtI_uxth_and(iRegINoSp dst, iRegI src1, iRegI src2, immI_65535 mask, rFlagsReg cr) %{ match(Set dst (SubI src1 (AndI src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "subw $dst, $src1, $src2, uxth" %} ins_encode %{ @@ -9093,7 +9100,7 @@ instruct SubExtL_uxtb_and(iRegLNoSp dst, iRegL src1, iRegL src2, immL_255 mask, rFlagsReg cr) %{ match(Set dst (SubL src1 (AndL src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sub $dst, $src1, $src2, uxtb" %} ins_encode %{ @@ -9106,7 +9113,7 @@ instruct SubExtL_uxth_and(iRegLNoSp dst, iRegL src1, iRegL src2, immL_65535 mask, rFlagsReg cr) %{ match(Set dst (SubL src1 (AndL src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sub $dst, $src1, $src2, uxth" %} ins_encode %{ @@ -9119,7 +9126,7 @@ instruct SubExtL_uxtw_and(iRegLNoSp dst, iRegL src1, iRegL src2, immL_4294967295 mask, rFlagsReg cr) %{ match(Set dst (SubL src1 (AndL src2 
mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sub $dst, $src1, $src2, uxtw" %} ins_encode %{ @@ -9132,16 +9139,13 @@ // END This section of the file is automatically generated. Do not edit -------------- -// Combined Long Mask and Right Shift (using UBFM) -// TODO - - // ============================================================================ // Floating Point Arithmetic Instructions instruct addF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{ match(Set dst (AddF src1 src2)); + ins_cost(INSN_COST * 5); format %{ "fadds $dst, $src1, $src2" %} ins_encode %{ @@ -9156,6 +9160,7 @@ instruct addD_reg_reg(vRegD dst, vRegD src1, vRegD src2) %{ match(Set dst (AddD src1 src2)); + ins_cost(INSN_COST * 5); format %{ "faddd $dst, $src1, $src2" %} ins_encode %{ @@ -9170,6 +9175,7 @@ instruct subF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{ match(Set dst (SubF src1 src2)); + ins_cost(INSN_COST * 5); format %{ "fsubs $dst, $src1, $src2" %} ins_encode %{ @@ -9184,6 +9190,7 @@ instruct subD_reg_reg(vRegD dst, vRegD src1, vRegD src2) %{ match(Set dst (SubD src1 src2)); + ins_cost(INSN_COST * 5); format %{ "fsubd $dst, $src1, $src2" %} ins_encode %{ @@ -9198,6 +9205,7 @@ instruct mulF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{ match(Set dst (MulF src1 src2)); + ins_cost(INSN_COST * 6); format %{ "fmuls $dst, $src1, $src2" %} ins_encode %{ @@ -9212,6 +9220,7 @@ instruct mulD_reg_reg(vRegD dst, vRegD src1, vRegD src2) %{ match(Set dst (MulD src1 src2)); + ins_cost(INSN_COST * 6); format %{ "fmuld $dst, $src1, $src2" %} ins_encode %{ @@ -9359,6 +9368,7 @@ instruct divF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{ match(Set dst (DivF src1 src2)); + ins_cost(INSN_COST * 18); format %{ "fdivs $dst, $src1, $src2" %} ins_encode %{ @@ -9373,6 +9383,7 @@ instruct divD_reg_reg(vRegD dst, vRegD src1, vRegD src2) %{ match(Set dst (DivD src1 src2)); + ins_cost(INSN_COST * 32); format %{ "fdivd $dst, $src1, $src2" %} ins_encode %{ @@ -9387,6 +9398,7 @@ instruct 
negF_reg_reg(vRegF dst, vRegF src) %{ match(Set dst (NegF src)); + ins_cost(INSN_COST * 3); format %{ "fneg $dst, $src" %} ins_encode %{ @@ -9400,6 +9412,7 @@ instruct negD_reg_reg(vRegD dst, vRegD src) %{ match(Set dst (NegD src)); + ins_cost(INSN_COST * 3); format %{ "fnegd $dst, $src" %} ins_encode %{ @@ -9413,6 +9426,7 @@ instruct absF_reg(vRegF dst, vRegF src) %{ match(Set dst (AbsF src)); + ins_cost(INSN_COST * 3); format %{ "fabss $dst, $src" %} ins_encode %{ __ fabss(as_FloatRegister($dst$$reg), @@ -9425,6 +9439,7 @@ instruct absD_reg(vRegD dst, vRegD src) %{ match(Set dst (AbsD src)); + ins_cost(INSN_COST * 3); format %{ "fabsd $dst, $src" %} ins_encode %{ __ fabsd(as_FloatRegister($dst$$reg), @@ -9440,10 +9455,6 @@ // Integer Logical Instructions // And Instructions -// TODO -// these currently set CR and are flagged as killing CR but we would -// like to isolate the cases where we want to set flags from those -// where we don't. need to work out how to do that. instruct andI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2, rFlagsReg cr) %{ @@ -9451,6 +9462,7 @@ format %{ "andw $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ andw(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9465,6 +9477,7 @@ format %{ "andsw $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ andw(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9481,6 +9494,7 @@ format %{ "orrw $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ orrw(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9495,6 +9509,7 @@ format %{ "orrw $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ orrw(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9511,6 +9526,7 @@ format %{ "eorw $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ eorw(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9525,6 +9541,7 @@ format %{ "eorw $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ 
eorw(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9542,6 +9559,7 @@ format %{ "and $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ andr(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9556,6 +9574,7 @@ format %{ "and $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ andr(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9572,6 +9591,7 @@ format %{ "orr $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ orr(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9586,6 +9606,7 @@ format %{ "orr $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ orr(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9602,6 +9623,7 @@ format %{ "eor $dst, $src1, $src2\t# int" %} + ins_cost(INSN_COST); ins_encode %{ __ eor(as_Register($dst$$reg), as_Register($src1$$reg), @@ -9614,7 +9636,7 @@ instruct xorL_reg_imm(iRegLNoSp dst, iRegL src1, immLLog src2) %{ match(Set dst (XorL src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "eor $dst, $src1, $src2\t# int" %} ins_encode %{ @@ -9630,7 +9652,7 @@ %{ match(Set dst (ConvI2L src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "sxtw $dst, $src\t# i2l" %} ins_encode %{ __ sbfm($dst$$Register, $src$$Register, 0, 31); @@ -9643,7 +9665,7 @@ %{ match(Set dst (AndL (ConvI2L src) mask)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "ubfm $dst, $src, 0, 31\t# ui2l" %} ins_encode %{ __ ubfm($dst$$Register, $src$$Register, 0, 31); @@ -9655,7 +9677,7 @@ instruct convL2I_reg(iRegINoSp dst, iRegL src) %{ match(Set dst (ConvL2I src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "movw $dst, $src \t// l2i" %} ins_encode %{ @@ -9704,7 +9726,7 @@ instruct convD2F_reg(vRegF dst, vRegD src) %{ match(Set dst (ConvD2F src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "fcvtd $dst, $src \t// d2f" %} ins_encode %{ @@ -9717,7 +9739,7 @@ instruct convF2D_reg(vRegD dst, vRegF src) %{ 
match(Set dst (ConvF2D src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "fcvts $dst, $src \t// f2d" %} ins_encode %{ @@ -9730,7 +9752,7 @@ instruct convF2I_reg_reg(iRegINoSp dst, vRegF src) %{ match(Set dst (ConvF2I src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "fcvtzsw $dst, $src \t// f2i" %} ins_encode %{ @@ -9743,7 +9765,7 @@ instruct convF2L_reg_reg(iRegLNoSp dst, vRegF src) %{ match(Set dst (ConvF2L src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "fcvtzs $dst, $src \t// f2l" %} ins_encode %{ @@ -9756,7 +9778,7 @@ instruct convI2F_reg_reg(vRegF dst, iRegI src) %{ match(Set dst (ConvI2F src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "scvtfws $dst, $src \t// i2f" %} ins_encode %{ @@ -9769,7 +9791,7 @@ instruct convL2F_reg_reg(vRegF dst, iRegL src) %{ match(Set dst (ConvL2F src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "scvtfs $dst, $src \t// l2f" %} ins_encode %{ @@ -9782,7 +9804,7 @@ instruct convD2I_reg_reg(iRegINoSp dst, vRegD src) %{ match(Set dst (ConvD2I src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "fcvtzdw $dst, $src \t// d2i" %} ins_encode %{ @@ -9795,7 +9817,7 @@ instruct convD2L_reg_reg(iRegLNoSp dst, vRegD src) %{ match(Set dst (ConvD2L src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "fcvtzd $dst, $src \t// d2l" %} ins_encode %{ @@ -9808,7 +9830,7 @@ instruct convI2D_reg_reg(vRegD dst, iRegI src) %{ match(Set dst (ConvI2D src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "scvtfwd $dst, $src \t// i2d" %} ins_encode %{ @@ -9821,7 +9843,7 @@ instruct convL2D_reg_reg(vRegD dst, iRegL src) %{ match(Set dst (ConvL2D src)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 5); format %{ "scvtfd $dst, $src \t// l2d" %} ins_encode %{ @@ -9839,7 +9861,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(4 * INSN_COST); format %{ "ldrw $dst, $src\t# 
MoveF2I_stack_reg" %} @@ -9857,7 +9879,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(4 * INSN_COST); format %{ "ldrs $dst, $src\t# MoveI2F_stack_reg" %} @@ -9875,7 +9897,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(4 * INSN_COST); format %{ "ldr $dst, $src\t# MoveD2L_stack_reg" %} @@ -9893,7 +9915,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(4 * INSN_COST); format %{ "ldrd $dst, $src\t# MoveL2D_stack_reg" %} @@ -9911,7 +9933,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "strs $src, $dst\t# MoveF2I_reg_stack" %} @@ -9929,7 +9951,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "strw $src, $dst\t# MoveI2F_reg_stack" %} @@ -9947,7 +9969,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "strd $dst, $src\t# MoveD2L_reg_stack" %} @@ -9965,7 +9987,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "str $src, $dst\t# MoveL2D_reg_stack" %} @@ -9983,7 +10005,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "fmovs $dst, $src\t# MoveF2I_reg_reg" %} @@ -10001,7 +10023,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "fmovs $dst, $src\t# MoveI2F_reg_reg" %} @@ -10019,7 +10041,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "fmovd $dst, $src\t# MoveD2L_reg_reg" %} @@ -10037,7 +10059,7 @@ effect(DEF dst, USE src); - ins_cost(MEMORY_REF_COST_LOW); + ins_cost(INSN_COST); format %{ "fmovd $dst, $src\t# MoveL2D_reg_reg" %} @@ -10057,7 +10079,7 @@ match(Set dummy (ClearArray cnt base)); effect(USE_KILL cnt, USE_KILL base); - ins_cost(MEMORY_REF_COST); + ins_cost(4 * INSN_COST); format %{ "ClearArray $cnt, $base" %} ins_encode(aarch64_enc_clear_array_reg_reg(cnt, 
base)); @@ -10074,7 +10096,7 @@ effect(DEF cr, USE op1, USE op2); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "cmpw $op1, $op2" %} ins_encode(aarch64_enc_cmpw(op1, op2)); @@ -10088,7 +10110,7 @@ effect(DEF cr, USE op1); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "cmpw $op1, 0" %} ins_encode(aarch64_enc_cmpw_imm_addsub(op1, zero)); @@ -10102,7 +10124,7 @@ effect(DEF cr, USE op1); - ins_cost(DEFAULT_COST + 1); + ins_cost(INSN_COST); format %{ "cmpw $op1, $op2" %} ins_encode(aarch64_enc_cmpw_imm_addsub(op1, op2)); @@ -10116,7 +10138,7 @@ effect(DEF cr, USE op1); - ins_cost(DEFAULT_COST + 2); + ins_cost(INSN_COST * 2); format %{ "cmpw $op1, $op2" %} ins_encode(aarch64_enc_cmpw_imm(op1, op2)); @@ -10134,7 +10156,7 @@ effect(DEF cr, USE op1, USE op2); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "cmpw $op1, $op2\t# unsigned" %} ins_encode(aarch64_enc_cmpw(op1, op2)); @@ -10148,7 +10170,7 @@ effect(DEF cr, USE op1); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "cmpw $op1, #0\t# unsigned" %} ins_encode(aarch64_enc_cmpw_imm_addsub(op1, zero)); @@ -10162,7 +10184,7 @@ effect(DEF cr, USE op1); - ins_cost(DEFAULT_COST + 1); + ins_cost(INSN_COST); format %{ "cmpw $op1, $op2\t# unsigned" %} ins_encode(aarch64_enc_cmpw_imm_addsub(op1, op2)); @@ -10176,7 +10198,7 @@ effect(DEF cr, USE op1); - ins_cost(DEFAULT_COST + 2); + ins_cost(INSN_COST * 2); format %{ "cmpw $op1, $op2\t# unsigned" %} ins_encode(aarch64_enc_cmpw_imm(op1, op2)); @@ -10190,7 +10212,7 @@ effect(DEF cr, USE op1, USE op2); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "cmp $op1, $op2" %} ins_encode(aarch64_enc_cmp(op1, op2)); @@ -10204,7 +10226,7 @@ effect(DEF cr, USE op1); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "tst $op1" %} ins_encode(aarch64_enc_cmp_imm_addsub(op1, zero)); @@ -10218,7 +10240,7 @@ effect(DEF cr, USE op1); - ins_cost(DEFAULT_COST + 1); + ins_cost(INSN_COST); format %{ "cmp $op1, $op2" %} 
ins_encode(aarch64_enc_cmp_imm_addsub(op1, op2)); @@ -10232,7 +10254,7 @@ effect(DEF cr, USE op1); - ins_cost(DEFAULT_COST + 2); + ins_cost(INSN_COST * 2); format %{ "cmp $op1, $op2" %} ins_encode(aarch64_enc_cmp_imm(op1, op2)); @@ -10246,7 +10268,7 @@ effect(DEF cr, USE op1, USE op2); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "cmp $op1, $op2\t // ptr" %} ins_encode(aarch64_enc_cmpp(op1, op2)); @@ -10260,7 +10282,7 @@ effect(DEF cr, USE op1, USE op2); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "cmp $op1, $op2\t // compressed ptr" %} ins_encode(aarch64_enc_cmpn(op1, op2)); @@ -10274,7 +10296,7 @@ effect(DEF cr, USE op1, USE zero); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "cmp $op1, 0\t // ptr" %} ins_encode(aarch64_enc_testp(op1)); @@ -10288,7 +10310,7 @@ effect(DEF cr, USE op1, USE zero); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "cmp $op1, 0\t // compressed ptr" %} ins_encode(aarch64_enc_testn(op1)); @@ -10305,7 +10327,7 @@ %{ match(Set cr (CmpF src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(3 * INSN_COST); format %{ "fcmps $src1, $src2" %} ins_encode %{ @@ -10319,7 +10341,7 @@ %{ match(Set cr (CmpF src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(3 * INSN_COST); format %{ "fcmps $src1, 0.0" %} ins_encode %{ @@ -10333,7 +10355,7 @@ %{ match(Set cr (CmpD src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(3 * INSN_COST); format %{ "fcmpd $src1, $src2" %} ins_encode %{ @@ -10347,7 +10369,7 @@ %{ match(Set cr (CmpD src1 src2)); - ins_cost(DEFAULT_COST); + ins_cost(3 * INSN_COST); format %{ "fcmpd $src1, 0.0" %} ins_encode %{ @@ -10362,7 +10384,7 @@ match(Set dst (CmpF3 src1 src2)); effect(KILL cr); - ins_cost(DEFAULT_COST * 3); + ins_cost(5 * INSN_COST); format %{ "fcmps $src1, $src2\n\t" "csinvw($dst, zr, zr, eq\n\t" "csnegw($dst, $dst, $dst, lt)" @@ -10390,7 +10412,7 @@ match(Set dst (CmpD3 src1 src2)); effect(KILL cr); - ins_cost(DEFAULT_COST * 3); + ins_cost(5 * INSN_COST); format %{ "fcmpd 
$src1, $src2\n\t" "csinvw($dst, zr, zr, eq\n\t" "csnegw($dst, $dst, $dst, lt)" @@ -10417,7 +10439,7 @@ match(Set dst (CmpF3 src1 zero)); effect(KILL cr); - ins_cost(DEFAULT_COST * 3); + ins_cost(5 * INSN_COST); format %{ "fcmps $src1, 0.0\n\t" "csinvw($dst, zr, zr, eq\n\t" "csnegw($dst, $dst, $dst, lt)" @@ -10444,7 +10466,7 @@ match(Set dst (CmpD3 src1 zero)); effect(KILL cr); - ins_cost(DEFAULT_COST * 3); + ins_cost(5 * INSN_COST); format %{ "fcmpd $src1, 0.0\n\t" "csinvw($dst, zr, zr, eq\n\t" "csnegw($dst, $dst, $dst, lt)" @@ -10470,7 +10492,7 @@ match(Set dst (CmpLTMask p q)); effect(KILL cr); - ins_cost(DEFAULT_COST); + ins_cost(3 * INSN_COST); format %{ "cmpw $p, $q\t# cmpLTMask\n\t" "csetw $dst, lt\n\t" @@ -10491,7 +10513,7 @@ match(Set dst (CmpLTMask src zero)); effect(KILL cr); - ins_cost(DEFAULT_COST_LOW); + ins_cost(INSN_COST); format %{ "asrw $dst, $src, #31\t# cmpLTMask0" %} @@ -10512,7 +10534,7 @@ effect(DEF dst, USE src1, USE src2, KILL cr); size(8); - ins_cost(DEFAULT_COST * 2); + ins_cost(INSN_COST * 3); format %{ "cmpw $src1 $src2\t signed int\n\t" "cselw $dst, $src1, $src2 lt\t" @@ -10537,7 +10559,7 @@ effect(DEF dst, USE src1, USE src2, KILL cr); size(8); - ins_cost(DEFAULT_COST * 2); + ins_cost(INSN_COST * 3); format %{ "cmpw $src1 $src2\t signed int\n\t" "cselw $dst, $src1, $src2 gt\t" @@ -10728,7 +10750,7 @@ // TODO // identify correct cost - ins_cost(DEFAULT_COST); + ins_cost(5 * INSN_COST); format %{ "fastlock $object,$box\t! kills $tmp,$tmp2" %} ins_encode(aarch64_enc_fast_lock(object, box, tmp, tmp2)); @@ -10741,7 +10763,7 @@ match(Set cr (FastUnlock object box)); effect(TEMP tmp, TEMP tmp2); - ins_cost(300); + ins_cost(5 * INSN_COST); format %{ "fastunlock $object,$box\t! 
kills $tmp, $tmp2" %} ins_encode(aarch64_enc_fast_unlock(object, box, tmp, tmp2)); @@ -10920,8 +10942,6 @@ %{ match(Set ex_oop (CreateEx)); - ins_cost(0); - format %{ " -- \t// exception oop; no code emitted" %} size(0); diff -r 5a8c184c37d4 -r a16c651450e4 src/cpu/aarch64/vm/aarch64_ad.m4 --- a/src/cpu/aarch64/vm/aarch64_ad.m4 Thu Apr 03 22:51:42 2014 +0100 +++ b/src/cpu/aarch64/vm/aarch64_ad.m4 Tue Apr 08 14:58:30 2014 +0100 @@ -7,7 +7,7 @@ immI src3, rFlagsReg cr) %{ match(Set dst ($2$1 src1 ($4$1 src2 src3))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "$3 $dst, $src1, $src2, $5 $src3" %} ins_encode %{ @@ -30,7 +30,7 @@ ifelse($2,Xor, match(Set dst (Xor$1 m1 (Xor$1 src2 src1)));, match(Set dst ($2$1 src1 (Xor$1 src2 m1)));) - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "$3 $dst, $src1, $src2" %} ins_encode %{ @@ -52,7 +52,7 @@ ifelse($2,Xor, match(Set dst ($2$1 src4 (Xor$1($4$1 src2 src3) src1)));, match(Set dst ($2$1 src1 (Xor$1($4$1 src2 src3) src4)));) - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "$3 $dst, $src1, $src2, $5 $src3" %} ins_encode %{ @@ -70,7 +70,7 @@ iReg$1 src1, imm$1_M1 m1, rFlagsReg cr) %{ match(Set dst (Xor$1 src1 m1)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "$2 $dst, $src1, zr" %} ins_encode %{ @@ -131,7 +131,7 @@ predicate((unsigned int)n->in(2)->get_int() <= $2 && (unsigned int)n->in(1)->in(2)->get_int() <= $2); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "$4 $dst, $src, $rshift_count - $lshift_count, #$2 - $lshift_count" %} ins_encode %{ int lshift = $lshift_count$$constant, rshift = $rshift_count$$constant; @@ -155,7 +155,7 @@ %{ match(Set dst (And$1 ($2$1 src rshift) mask)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "$3 $dst, $src, $mask" %} ins_encode %{ int rshift = $rshift$$constant; @@ -175,7 +175,7 @@ %{ match(Set dst (ConvI2L (AndI (URShiftI src rshift) mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST * 2); format %{ "ubfx 
$dst, $src, $mask" %} ins_encode %{ int rshift = $rshift$$constant; @@ -195,7 +195,7 @@ match(Set dst ($3$1 (LShift$1 src1 lshift) (URShift$1 src2 rshift))); predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & $2)); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "extr $dst, $src1, $src2, #$rshift" %} ins_encode %{ @@ -217,7 +217,7 @@ effect(DEF dst, USE src, USE shift); format %{ "$2 $dst, $src, $shift" %} - ins_cost(2*DEFAULT_COST); + ins_cost(INSN_COST * 3); ins_encode %{ __ subw(rscratch1, zr, as_Register($shift$$reg)); __ $3(as_Register($dst$$reg), as_Register($src$$reg), @@ -233,7 +233,7 @@ effect(DEF dst, USE src, USE shift); format %{ "$2 $dst, $src, $shift" %} - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); ins_encode %{ __ $3(as_Register($dst$$reg), as_Register($src$$reg), as_Register($shift$$reg)); @@ -277,7 +277,7 @@ instruct $3Ext$1(iReg$2NoSp dst, iReg$2 src1, iReg$1orL2I src2, rFlagsReg cr) %{ match(Set dst ($3$2 src1 (ConvI2L src2))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "$4 $dst, $src1, $6 $src2" %} ins_encode %{ @@ -293,7 +293,7 @@ instruct $3Ext$1_$6(iReg$1NoSp dst, iReg$1 src1, iReg$1 src2, immI_`'eval($7-$2) lshift, immI_`'eval($7-$2) rshift, rFlagsReg cr) %{ match(Set dst ($3$1 src1 EXTEND($1, $4, src2, lshift, rshift))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "$5 $dst, $src1, $6 $src2" %} ins_encode %{ @@ -315,7 +315,7 @@ instruct $3Ext$1_$5_and(iReg$1NoSp dst, iReg$1 src1, iReg$1 src2, imm$1_$2 mask, rFlagsReg cr) %{ match(Set dst ($3$1 src1 (And$1 src2 mask))); - ins_cost(DEFAULT_COST); + ins_cost(INSN_COST); format %{ "$4 $dst, $src1, $src2, $5" %} ins_encode %{ From aph at redhat.com Thu Apr 10 08:17:34 2014 From: aph at redhat.com (aph at redhat.com) Date: Thu, 10 Apr 2014 08:17:34 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: New cost model for instruction selection. 
Message-ID: <201404100817.s3A8HaKR008588@aojmv0008>

Changeset: a16c651450e4
Author:    aph
Date:      2014-04-08 14:58 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/a16c651450e4

New cost model for instruction selection.

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/aarch64_ad.m4

From aph at redhat.com Thu Apr 10 12:23:15 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 10 Apr 2014 13:23:15 +0100
Subject: [aarch64-port-dev ] Rewrite CAS operations to be more conservative
Message-ID: <53468D33.6060709@redhat.com>

After some discussion on concurrency-interest at cs.oswego.edu, I
have decided to change all of our CAS operations to use the form

        // atomic_op (B)
1:      ldaxr   x0, [B]         // Exclusive load with acquire
        stlxr   w1, x0, [B]     // Exclusive store with release
        cbnz    w1, 1b
        dmb     ish             // Full barrier

or a similar variant.

I'm not convinced that this is necessary, but it is safe.

I'll push this patch once the jcstress test run has finished.

Andrew.

-------------- next part --------------
# HG changeset patch
# User aph
# Date 1397127043 14400
#      Thu Apr 10 06:50:43 2014 -0400
# Node ID d9468835bc5160b7fac6709b0afbc751b2159fbb
# Parent  a16c651450e4b0822cfabb248e19f3b371582fce
Rewrite CAS operations to be more conservative

diff -r a16c651450e4 -r d9468835bc51 src/cpu/aarch64/vm/aarch64.ad
--- a/src/cpu/aarch64/vm/aarch64.ad	Tue Apr 08 14:58:30 2014 +0100
+++ b/src/cpu/aarch64/vm/aarch64.ad	Thu Apr 10 06:50:43 2014 -0400
@@ -2328,12 +2328,11 @@
  }
  }
  Label retry_load, done;
- __ membar(__ AnyAny);
  __ bind(retry_load);
- __ ldxr(rscratch1, addr_reg);
+ __ ldaxr(rscratch1, addr_reg);
  __ cmp(rscratch1, old_reg);
  __ br(Assembler::NE, done);
- __ stxr(rscratch1, new_reg, addr_reg);
+ __ stlxr(rscratch1, new_reg, addr_reg);
  __ cbnzw(rscratch1, retry_load);
  __ bind(done);
  __ membar(__ AnyAny);
@@ -2370,11 +2369,10 @@
  }
  Label retry_load, done;
  __ bind(retry_load);
- __ membar(__ AnyAny);
- __ ldxrw(rscratch1, addr_reg);
+ __ ldaxrw(rscratch1, addr_reg);
  __ cmpw(rscratch1, old_reg);
  __ br(Assembler::NE, done);
- __ stxrw(rscratch1, new_reg, addr_reg);
+ __ stlxrw(rscratch1, new_reg, addr_reg);
  __ cbnzw(rscratch1, retry_load);
  __ bind(done);
  __ membar(__ AnyAny);
@@ -5927,7 +5925,7 @@
  ins_encode %{
  __ block_comment("membar-acquire-lock");
- // __ membar(Assembler::Membar_mask_bits(Assembler::LoadLoad|Assembler::LoadStore));
+ __ membar(Assembler::Membar_mask_bits(Assembler::LoadLoad|Assembler::LoadStore));
  %}
  ins_pipe(pipe_class_memory);
@@ -5940,7 +5938,7 @@
  ins_encode %{
  __ block_comment("MEMBAR-release-lock");
- // __ membar(Assembler::AnyAny);
+ __ membar(Assembler::AnyAny);
  %}
  ins_pipe(pipe_class_memory);
diff -r a16c651450e4 -r d9468835bc51 src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp	Tue Apr 08 14:58:30 2014 +0100
+++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp	Thu Apr 10 06:50:43 2014 -0400
@@ -1590,7 +1590,7 @@
  __ cset(rscratch1, Assembler::NE);
  __ br(Assembler::NE, nope);
  // if we store+flush with no intervening write rscratch1 wil be zero
- __ stxrw(rscratch1, newval, addr);
+ __ stlxrw(rscratch1, newval, addr);
  // retry so we only ever return after a load fails to compare
  // ensures we don't return a stale value after a failed write.
  __ cbnzw(rscratch1, retry_load);
@@ -1608,7 +1608,7 @@
  __ cset(rscratch1, Assembler::NE);
  __ br(Assembler::NE, nope);
  // if we store+flush with no intervening write rscratch1 wil be zero
- __ stxr(rscratch1, newval, addr);
+ __ stlxr(rscratch1, newval, addr);
  // retry so we only ever return after a load fails to compare
  // ensures we don't return a stale value after a failed write.
  __ cbnz(rscratch1, retry_load);
@@ -3087,23 +3087,23 @@
  case T_INT:
  lda = &MacroAssembler::ldaxrw;
  add = &MacroAssembler::addw;
- stl = &MacroAssembler::stxrw;
+ stl = &MacroAssembler::stlxrw;
  break;
  case T_LONG:
  lda = &MacroAssembler::ldaxr;
  add = &MacroAssembler::add;
- stl = &MacroAssembler::stxr;
+ stl = &MacroAssembler::stlxr;
  break;
  case T_OBJECT:
  case T_ARRAY:
  if (UseCompressedOops) {
  lda = &MacroAssembler::ldaxrw;
  add = &MacroAssembler::addw;
- stl = &MacroAssembler::stxrw;
+ stl = &MacroAssembler::stlxrw;
  } else {
  lda = &MacroAssembler::ldaxr;
  add = &MacroAssembler::add;
- stl = &MacroAssembler::stxr;
+ stl = &MacroAssembler::stlxr;
  }
  break;
  default:

From aph at redhat.com Thu Apr 10 17:24:25 2014
From: aph at redhat.com (aph at redhat.com)
Date: Thu, 10 Apr 2014 17:24:25 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Rewrite CAS operations to be more conservative
Message-ID: <201404101724.s3AHOQEg007020@aojmv0008>

Changeset: d9468835bc51
Author:    aph
Date:      2014-04-10 06:50 -0400
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/d9468835bc51

Rewrite CAS operations to be more conservative

! src/cpu/aarch64/vm/aarch64.ad
! src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp

From ed at camswl.com Fri Apr 11 11:41:29 2014
From: ed at camswl.com (Edward Nevill)
Date: Fri, 11 Apr 2014 12:41:29 +0100
Subject: [aarch64-port-dev ] RFR: Documentation and config changes
Message-ID: <1397216489.13136.50.camel@mint>

Hi,

Various changes to documentation, config and test excludes

- Documentation changes

I have updated the documentation as it was getting rather out of date
(e.g. referring to C2 being incomplete, wrong links, etc.).
Also corrected spelling of README-Xcompile-aarc64.html.orig to README-Xcompile-aarch64.html (the 'h' key is dodgy on my laptop which is why this was misspelled in the first place - there is something ironic about an Irishman with a busted 'h' key) - Configure changes I have removed the --with-boot-jdk and the --with-cacerts-file arguments to configure in the cross_configure script. The OE build no longer builds a native JVM in the sysroots so these were pointing to non existent locations with the latest sysroots. The --with-cacerts-file argument was broken in any case as it meant that the build was picking up the cacerts from the native OpenJDK7 in the sysroots rather than using the one from the OpenJDK8 sources. For --with-boot-jdk the build will pick up the default java on the host. I have also changed the default build type from fastdebug,client to release,server as I feel that this is more in keeping with the state of the project and I wish to remove the possibility of people accidentally doing benchmarking on the wrong build. - Test changes Added a test to the exclude which is also failing on x86. Bug reported against JDK 1.8.0 and also JDK 7u51. OK to push? Ed. --- CUT HERE --- exporting patch: # HG changeset patch # User Edward Nevill edward.nevill at linaro.org # Date 1397213400 -3600 # Fri Apr 11 11:50:00 2014 +0100 # Node ID 1a17b52547d268cc34c87318e295b1406a1c0871 # Parent e03c59f25d2b8d12d3a4906d72887bfe14494c85 Documentation and build/configure changes diff -r e03c59f25d2b -r 1a17b52547d2 README-Xcompile-aarc64.html --- a/README-Xcompile-aarc64.html Wed Mar 26 17:19:44 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,249 +0,0 @@ - - -

Cross compile build for aarch64

-

This page describes the steps necessary to perform a cros compile build -for aarch64 and how to run the resultant images under the Foundation model -(ARM's freely available aarc64 simulator). - -

The steps basically comprise the following - -

-
Step 1
Obtain an aarch64 sysroots
-
Step 2
Configure and compile OpenJDK -
Step 3
Download the Foundation model -
Step 4
Obtain a kernel and root filesystem
-
Step 5
Copy your newly built OpenJDK image to the root filesystem -
Step 6
Boot the Foundation model and run your OpenJDK image -
- -

Step 1: Obtain an aarch64 sysroots

- -

The sysroots directory contains all the cross compilation tools, aarch64 -libraries and headers you will need to perform a cross compile build of -OpenJDK for aarch64. - -

For convenience I have provided a pre-built sysroots at - -http://people.linaro.org/~edward.nevill/aarch64/sysroots.tar.gz - -

Note that the above sysroots will only work if you are building on a 64-bit -x86 linux machine. If you are using anything else you will need to rebuild -the sysroots as the cross compilation tools will not work on your macine. - -

Download the sysroots.tar.gz file above and untar it into the directory containing the jdk8 repository so that the jdk8 directory and sysroots directory are at the same level. - -


-cd <directory containing jdk8>
-tar xfz sysroots.tar.gz
-ls
-
-jdk8	sysroots	sysroots.tar.gz
-
- -

Step 2: Configure and compile OpenJDK

- -

Two shell scripts have been provided to allow you to quickstart building OpenJDK. - -

-
cross_configure
Configure for a cross compiled release build of OpenJDK -
cross_compile
Build the j2re and j2sdk images for the current configuration -
- -

In theory all that should be necessary is - -


-cd jdk8
-sh ./cross_configure
-sh ./cross_compile 2>&1 | tee compile.log
-
- -

This should build a client compiler (C1) release version of OpenJDK which may be found in build/linux-aarch64-normal-client-release/images/j2sdk-image - -

Note: At the end of the build you may see an warning like - -

WARNING: You have the following ALT_ variables set:
-ALT_SDT_H=/home/ed/work/aa64/sysroots/genericarmv8/usr/include/sys/sdt.h
-ALT_ variables are deprecated and will be ignored. Please clean your environment.
-
- -

This warning may be ignored. The ALT_SDT_H is required and is not ignored! - -

Changing build/config options

- -

To build a debug version edit the cross_configure file and change --with-debug-level=release -to one of --with-debug-level=fastdebug or --with-debug-level=slowdebug - -

To build a Zero release change --with-jvm-variants=client to --with-jvm-variants=zero. You can -also specify --with-jvm-variants=server, however the server (C2) compiler is still under development -at the time of writing. - -

If you specify a new configuration such as --with-debug-level=slowdebug you may get the following -error when you run the cross_compile script. - -

No CONF given, but more than one configuration found in /home/ed/work/aarch64_tip/jdk8//build.
-Available configurations:
-* linux-aarch64-normal-client-slowdebug
-* linux-aarch64-normal-client-release
-Please retry building with CONF= (or SPEC=)
-
- -

To select which configuation you wish to build specify CONF=... before the cross_compile script. For example: - -


-CONF=linux-aarch64-normal-client-slowdebug sh ./cross_compile 2>&1 | tee compile.log
-
- -

Some other useful options you can specify to configure are: - -

-
--enable-debug-symbols
--disable-debug-symbols -
Enable/disable generation of debug symbols -
--enable-zip-debug-info
--disable-zip-debug-info -
Enable/disable zipping of debug-info files -
- -

Step 3: Download the Foundation model

- -Download the foundation model from ARM's website at - - -http://www.arm.com/products/tools/models/fast-models/foundation-model.php - - -

At the bottom of the page under Get started with Foundation Models -click on download now. You will be asked to login. If you have -prevously registered for an ARM account login, otherwise you will need -to check the license terms, ensure that you are happy with them and -register for an ARM account. - -

After logging in above, on the right hand side of the next page under -ARM V8 Foundation Model click Download Now. - -

The name of the file I downloaded was FM000-KT-00035-r0p8-48rel5.tgz. -Obviously the name of the file you download may differ depending on the -version number you download, but if you want to download exactly the -same version number you can look for it under "Display older versions". - -

Untar this archive. It will create a directory 'Foundation_v8pkg'. cd to -this directory - -


-tar xfz FM000-KT-00035-r0p8-48rel5.tgz
-cd Foundation_v8pkg
-
- -

Step 4: Download a kernel and root filesystem

- -

For convenience I have provided a pre-built kernel and pre-populated rootfs at - -

- -

Download both these files into the Foundation_v8pkg directory created above -and decompress the rootfs.ext2.gz - -


-gzip -d rootfs.ext2.gz
-
- -

Step 5: Copy your OpenJDK image into your rootfs

- -

First of all you must mount the root file system. Then copy the OpenJDK image into -the home directory of root. Then unmount the filesystem. - -


-sudo mount -o loop rootfs.ext2 /mnt
-sudo cp -r .../jdk8/build/linux-aarch64-normal-client-release/images/j2sdk-image /mnt/home/root
-sudo umount /mnt
-
- -

Step 6: Boot the Foundation model and run OpenJDK

- -

To boot the Foundation model enter - -


-./models/Linux64_GCC-4.1/Foundation_v8 --image kernel.axf --block-device rootfs.ext2 --network nat
-
- -

An xterm window will open and Linux will boot in the aarch64 simulator. After a short -while you will get the root '#' prompt. In the root directory you should find the -j2sdk-image directory you copied earlier in addition to a JavaApps directory -which contains some sample Java applications. - -


-ls
-JavaApps     j2sdk-image
-
- - - -

The following are brief instructions on running each of the provided sample applications. -In each case you can specify the following options. - -

-
-Xint
Run everything under the template interpreter -
-Xcomp
Attempt to compile every single method -
-Xmixed
The default. Mixed interpretation and compilation -
-XX:+PrintCompilation
Print a message as each method is compiled -
- -

-cd JavaApps/LinPack
-../../../j2sdk-image/bin/java Linpack
-Mflops/s: 5.012  Time: 0.14 secs  Norm Res: 1.43  Precision: 2.220446049250313E-16
-
-cd JavaApps/bm-1.1 - ../../j2sdk-image/bin/java -classpath dist/fullset/bench1.jar org.eembc.grinderbench.CmdlineWrapper -r 1 -m 1 -Copyright (c) 2003-2005, EDN Embedded Microprocessor Benchmark Consortium (EEMBC), Inc. -Parallel.....................8019 -kXML.........................4451 -PNG decoding.................8928 -Chess........................3564 -Crypto.......................4330 ---------------------------------- -
-cd JavaApps/dry -../../j2sdk-image/bin/java dhry -dhrystone (static): 10 iterations of 30000 executions.......... -total time: 3310ms -Result: 90634 dhrystone/sec. -
-cd JavaApps/ecm -../../j2sdk-image/bin/java CaffeineMarkEmbeddedApp -Sieve score = 1574 (98) -Loop score = 2413 (2017) -Logic score = 4102 (0) -String score = 1135 (708) -Float score = 1609 (185) -Method score = 1905 (166650) -Overall score = 1945 -
- -

The following programs require you to set the DISPLAY variable within - the simulator to the IP address of your X server. For example - -


-export DISPLAY=192.168.1.249:0.0
-
- -

You will also need to ensure that your X server is capable of accepting a remote connection. Also note that -if you want to specify the -X options on the appletviewer command you will need to precede them with -J. For -example "-J-Xcomp -J-XX:+PrintCompilation" - -


-cd JavaApps/galaxians
-../../j2sdk-image/bin/appletviewer Galaxians.html
-
-cd JavaApps/scared -../../j2sdk-image/bin/appletviewer game.html -
-cd JavaApps/cm3 -../../j2sdk-image/bin/appletviewer CaffeineMark30.html -
- - - \ No newline at end of file diff -r e03c59f25d2b -r 1a17b52547d2 README-Xcompile-aarch64.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/README-Xcompile-aarch64.html Fri Apr 11 11:50:00 2014 +0100 @@ -0,0 +1,249 @@ + + +

Cross compile build for aarch64

+

This page describes the steps necessary to perform a cros compile build +for aarch64 and how to run the resultant images under the Foundation model +(ARM's freely available aarc64 simulator). + +

The steps basically comprise the following + +

+
Step 1
Obtain an aarch64 sysroots
+
Step 2
Configure and compile OpenJDK +
Step 3
Download the Foundation model +
Step 4
Obtain a kernel and root filesystem
+
Step 5
Copy your newly built OpenJDK image to the root filesystem +
Step 6
Boot the Foundation model and run your OpenJDK image +
+ +

Step 1: Obtain an aarch64 sysroots

+ +

The sysroots directory contains all the cross compilation tools, aarch64 +libraries and headers you will need to perform a cross compile build of +OpenJDK for aarch64. + +

A pre-built sysroots may be downloaded from + +http://openjdk.linaro.org/aarch64_build/sysroots.tar.xz + +

Note that the above sysroots will only work if you are building on a 64-bit +x86 linux machine. If you are using anything else you will need to rebuild +the sysroots as the cross compilation tools will not work on your macine. + +

Download the sysroots.tar.xz file above and untar it into the directory containing the jdk8 repository so that the jdk8 directory and sysroots directory are at the same level. + +


+cd <directory containing jdk8>
+tar xf sysroots.tar.xz
+ls
+
+jdk8	sysroots	sysroots.tar.xz
+
+ +

Step 2: Configure and compile OpenJDK

+ +

Two shell scripts have been provided to allow you to quickstart building OpenJDK. + +

+
cross_configure
Configure for a cross compiled release build of OpenJDK +
cross_compile
Build the j2re and j2sdk images for the current configuration +
+ +

In theory all that should be necessary is + +


+cd jdk8
+sh ./cross_configure
+sh ./cross_compile 2>&1 | tee compile.log
+
+ +

This should build a server compiler (C2) release version of OpenJDK which may be found in build/linux-aarch64-normal-server-release/images/j2sdk-image + +

Note: At the end of the build you may see an warning like + +

WARNING: You have the following ALT_ variables set:
+ALT_SDT_H=/home/ed/work/aa64/sysroots/genericarmv8/usr/include/sys/sdt.h
+ALT_ variables are deprecated and will be ignored. Please clean your environment.
+
+ +

This warning may be ignored. The ALT_SDT_H is required and is not ignored! + +

Changing build/config options

+ +

To build a debug version edit the cross_configure file and change --with-debug-level=release +to one of --with-debug-level=fastdebug or --with-debug-level=slowdebug + +

To build a Client release change --with-jvm-variants=server to --with-jvm-variants=client. You can also build a combined client and server release with --with-jvm-variants=client,server. + +

To build a Zero release change --with-jvm-variants=client to --with-jvm-variants=zero. + +

If you specify a new configuration such as --with-debug-level=slowdebug you may get the following +error when you run the cross_compile script. + +

No CONF given, but more than one configuration found in /home/ed/work/aarch64_tip/jdk8//build.
+Available configurations:
+* linux-aarch64-normal-client-slowdebug
+* linux-aarch64-normal-client-release
+Please retry building with CONF= (or SPEC=)
+
+ +

To select which configuation you wish to build specify CONF=... before the cross_compile script. For example: + +


+CONF=linux-aarch64-normal-client-slowdebug sh ./cross_compile 2>&1 | tee compile.log
+
+ +

Some other useful options you can specify to configure are: + +

+
--enable-debug-symbols
--disable-debug-symbols +
Enable/disable generation of debug symbols +
--enable-zip-debug-info
--disable-zip-debug-info +
Enable/disable zipping of debug-info files +
+ +

Step 3: Download the Foundation model

+ +Download the foundation model from ARM's website at + + +http://www.arm.com/products/tools/models/fast-models/foundation-model.php + + +

At the bottom of the page under Get started with Foundation Models +click on download now. You will be asked to login. If you have +prevously registered for an ARM account login, otherwise you will need +to check the license terms, ensure that you are happy with them and +register for an ARM account. + +

After logging in above, on the right hand side of the next page under +ARM V8 Foundation Model click Download Now. + +

The name of the file I downloaded was FM000-KT-00035-r0p8-48rel5.tgz. +Obviously the name of the file you download may differ depending on the +version number you download, but if you want to download exactly the +same version number you can look for it under "Display older versions". + +

Untar this archive. It will create a directory 'Foundation_v8pkg'. cd to +this directory + +


+tar xfz FM000-KT-00035-r0p8-48rel5.tgz
+cd Foundation_v8pkg
+
+ +

Step 4: Download a kernel and root filesystem

+ +

For convenience I have provided a pre-built kernel and pre-populated rootfs at + +

+ +

Download both these files into the Foundation_v8pkg directory created above +and decompress the rootfs.ext2.gz + +


+gzip -d rootfs.ext2.gz
+
+ +

Step 5: Copy your OpenJDK image into your rootfs

+ +

First of all you must mount the root file system. Then copy the OpenJDK image into +the home directory of root. Then unmount the filesystem. + +


+sudo mount -o loop rootfs.ext2 /mnt
+sudo cp -r .../jdk8/build/linux-aarch64-normal-client-release/images/j2sdk-image /mnt/home/root
+sudo umount /mnt
+
+ +

Step 6: Boot the Foundation model and run OpenJDK

+ +

To boot the Foundation model enter + +


+./models/Linux64_GCC-4.1/Foundation_v8 --image kernel.axf --block-device rootfs.ext2 --network nat
+
+ +

An xterm window will open and Linux will boot in the aarch64 simulator. After a short +while you will get the root '#' prompt. In the root directory you should find the +j2sdk-image directory you copied earlier in addition to a JavaApps directory +which contains some sample Java applications. + +


+ls
+JavaApps     j2sdk-image
+
+ + + +

The following are brief instructions on running each of the provided sample applications. +In each case you can specify the following options. + +

+
-Xint
Run everything under the template interpreter +
-Xcomp
Attempt to compile every single method +
-Xmixed
The default. Mixed interpretation and compilation +
-XX:+PrintCompilation
Print a message as each method is compiled +
+ +

+cd JavaApps/LinPack
+../../../j2sdk-image/bin/java Linpack
+Mflops/s: 5.012  Time: 0.14 secs  Norm Res: 1.43  Precision: 2.220446049250313E-16
+
+cd JavaApps/bm-1.1 + ../../j2sdk-image/bin/java -classpath dist/fullset/bench1.jar org.eembc.grinderbench.CmdlineWrapper -r 1 -m 1 +Copyright (c) 2003-2005, EDN Embedded Microprocessor Benchmark Consortium (EEMBC), Inc. +Parallel.....................8019 +kXML.........................4451 +PNG decoding.................8928 +Chess........................3564 +Crypto.......................4330 +--------------------------------- +
+cd JavaApps/dry +../../j2sdk-image/bin/java dhry +dhrystone (static): 10 iterations of 30000 executions.......... +total time: 3310ms +Result: 90634 dhrystone/sec. +
+cd JavaApps/ecm +../../j2sdk-image/bin/java CaffeineMarkEmbeddedApp +Sieve score = 1574 (98) +Loop score = 2413 (2017) +Logic score = 4102 (0) +String score = 1135 (708) +Float score = 1609 (185) +Method score = 1905 (166650) +Overall score = 1945 +
+ +

The following programs require you to set the DISPLAY variable within + the simulator to the IP address of your X server. For example + +


+export DISPLAY=192.168.1.249:0.0
+
+ +

You will also need to ensure that your X server is capable of accepting a remote connection. Also note that +if you want to specify the -X options on the appletviewer command you will need to precede them with -J. For +example "-J-Xcomp -J-XX:+PrintCompilation" + +


+cd JavaApps/galaxians
+../../j2sdk-image/bin/appletviewer Galaxians.html
+
+cd JavaApps/scared +../../j2sdk-image/bin/appletviewer game.html +
+cd JavaApps/cm3 +../../j2sdk-image/bin/appletviewer CaffeineMark30.html +
+ + + diff -r e03c59f25d2b -r 1a17b52547d2 README.aarch64 --- a/README.aarch64 Wed Mar 26 17:19:44 2014 +0000 +++ b/README.aarch64 Fri Apr 11 11:50:00 2014 +0100 @@ -3,14 +3,12 @@ Overview -------- -The current AArch64 port of OpenJDK allows execution of a template -interpreter implemented using JITted AArch64 code and a C1 JIT -compiler which generates AArch64 code. It does not yet include a -complete implementation of the C2 JIT compiler. +The AArch64 port of OpenJDK provides C1 (Client) JIT, C2 (Server) +JIT support and Template Interpreter support for AArch64. -In the absence of available ARMv8 hardware, the AArch64 JVM has to -employ a simulator to execute the AArch64 code. There are two modes of -operation. +Due to the limited availability of ARMv8 HW, the AArch64 JVM may be +run using a simulator to execute the AArch64 code. There are two modes +of operation. The first is to run the ported JVM compiled for x86 hardware and Linux and use our small AArch64 simulator to execute AArch64 code. With this @@ -214,7 +212,7 @@ $ echo "simstopnew 1" > ~/.simgdbrc -Now when you run the program it will stop wen the initial thread first +Now when you run the program it will stop when the initial thread first starts executing AArch64 code (gdb) run Hello @@ -410,9 +408,10 @@ ------------------------------------------- Full details of how to obtain and set up ARM's Foundation model simulator and -how to to the AArch64 JVM for use on the Foundation model are provided at +how to to the AArch64 JVM for use on the Foundation model are provided in the +file - http://people.linaro.org/~edward.nevill/aarch64/README-cross-compile.html + README-Xcompile-aarch64.html The next few sections briefly summarise the steps needed to configure, compile and run on the Foundation model. 
@@ -435,15 +434,11 @@ $ bash cross_compile -After a few cups of coffee you should find that the AArch64 JVM sdk -and jre images have been installed into your sysroot tree (they should -appear as /j2sdk-image and /j2re-image) - Running the JVM using ARM's Foundation model simulator ------------------------------------------------------ -To exercise the resulting image you have to boot Linux on the -simulator and then at the # prompt type in +To exercise the resulting image boot Linux on the simulator and then +at the # prompt type in (for example) # ./jsdk-image/bin/javac Queens.java # ./j2re-image/bin/java Queens diff -r e03c59f25d2b -r 1a17b52547d2 cross_configure --- a/cross_configure Wed Mar 26 17:19:44 2014 +0000 +++ b/cross_configure Fri Apr 11 11:50:00 2014 +0100 @@ -3,7 +3,7 @@ if [ ! -d $IWD/sysroots ]; then echo "$IWD/sysroots not found!!!" echo "Please either install sysroots in $IWD/sysroots, or link $IWD/sysroots to your installed sysroots." - echo "For a limited period of time a pre-populated sysroots may be downloaded from http://people.linaro.org/~edward.nevill/sysroots.tar.gz" + echo "A pre-populated sysroots may be downloaded from http://openjdk.linaro.org/aarch64_build/sysroots.tar.xz" exit 1 fi mkdir -p /tmp/oe_45434e/jenkins-setup/build/tmp-eglibc @@ -25,4 +25,4 @@ export AS="aarch64-oe-linux-as" export AR="aarch64-oe-linux-ar" export PATH="$IWD/sysroots/x86_64-linux/usr/bin/aarch64-oe-linux:$IWD/sysroots/genericarmv8/usr/bin/crossscripts:$IWD/sysroots/x86_64-linux/usr/sbin:$IWD/sysroots/x86_64-linux/usr/bin:$IWD/sysroots/x86_64-linux/sbin:$IWD/sysroots/x86_64-linux//bin:/usr/sbin:/usr/bin:/sbin:/bin" -sh ./configure --with-debug-level=fastdebug --with-jvm-variants=client --with-sys-root=$IWD/sysroots/genericarmv8 --enable-unlimited-crypto --openjdk-target=aarch64-oe-linux --with-cacerts-file=$IWD/sysroots/x86_64-linux/usr/lib/jvm/icedtea7-native/jre/lib/security/cacerts --with-zlib=system --with-stdc++lib=dynamic 
--with-boot-jdk=$IWD/sysroots/x86_64-linux/usr/lib/jvm/icedtea7-native +sh ./configure --with-debug-level=release --with-jvm-variants=server --with-sys-root=$IWD/sysroots/genericarmv8 --enable-unlimited-crypto --openjdk-target=aarch64-oe-linux --with-zlib=system --with-stdc++lib=dynamic diff -r e03c59f25d2b -r 1a17b52547d2 test/exclude_aarch64.txt --- a/test/exclude_aarch64.txt Wed Mar 26 17:19:44 2014 +0000 +++ b/test/exclude_aarch64.txt Fri Apr 11 11:50:00 2014 +0100 @@ -15,3 +15,4 @@ runtime/SharedArchiveFile/CdsSameObjectAlignment.java generic-all runtime/SharedArchiveFile/CdsDifferentObjectAlignment.java generic-all compiler/whitebox/IsMethodCompilableTest.java generic-all +sun/java2d/OpenGL/DrawBufImgOp.java generic-all --- CUT HERE --- From ebourg at apache.org Wed Apr 16 08:22:49 2014 From: ebourg at apache.org (Emmanuel Bourg) Date: Wed, 16 Apr 2014 10:22:49 +0200 Subject: [aarch64-port-dev ] jdk8u vs aarch64-port Message-ID: <534E3DD9.1020808@apache.org> Hi, Java 8u5 has just been released but it seems the changes haven't been merged in the aarch64-port forest yet. I wondered if you planned to follow closely the main releases and keep the trees in sync, or if this will be performed infrequently? I'm asking because I'm preparing the OpenJDK 8 package for Debian and so far we bundled the aarch64 port along the main codebase. If the aarch64 port is likely to always lag behind we have to organize the package accordingly. Another question, is the aarch64 port likely to be merged in the main codebase during the JDK8 lifecycle, or will this only happen in JDK9? 
Thank you,

Emmanuel Bourg

From adinn at redhat.com Wed Apr 16 09:12:38 2014
From: adinn at redhat.com (Andrew Dinn)
Date: Wed, 16 Apr 2014 10:12:38 +0100
Subject: [aarch64-port-dev ] jdk8u vs aarch64-port
In-Reply-To: <534E3DD9.1020808@apache.org>
References: <534E3DD9.1020808@apache.org>
Message-ID: <534E4986.6030103@redhat.com>

On 16/04/14 09:22, Emmanuel Bourg wrote:
> Java 8u5 has just been released but it seems the changes haven't been
> merged in the aarch64-port forest yet. I wondered if you planned to
> follow closely the main releases and keep the trees in sync, or if this
> will be performed infrequently?

I don't think we are going to commit to keeping them in close sync just
now. Since no one is using AArch64 in production or, as yet, development,
there is no great pressure to do so, and I think we would rather have the
option to do this at a leisurely pace and concentrate on other priorities.

> I'm asking because I'm preparing the OpenJDK 8 package for Debian and so
> far we bundled the aarch64 port along the main codebase. If the aarch64
> port is likely to always lag behind we have to organize the package
> accordingly.

So, decoupling the two sounds like a good plan.

> Another question, is the aarch64 port likely to be merged in the main
> codebase during the JDK8 lifecycle, or will this only happen in JDK9?

We can only upstream the port into a development release (putting new
functionality into dev releases first is an OpenJDK policy). Andrew Haley
has just proposed our port for inclusion in JDK9 on the hotspot-dev list:

http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-April/013527.html

So, yes, we hope to get the port upstream for JDK9 but we are still
waiting for a plan to be formulated.
regards,

Andrew Dinn
-----------

From ebourg at apache.org Wed Apr 16 10:37:28 2014
From: ebourg at apache.org (Emmanuel Bourg)
Date: Wed, 16 Apr 2014 12:37:28 +0200
Subject: [aarch64-port-dev ] jdk8u vs aarch64-port
In-Reply-To: <534E4986.6030103@redhat.com>
References: <534E3DD9.1020808@apache.org> <534E4986.6030103@redhat.com>
Message-ID: <534E5D68.1080904@apache.org>

On 16/04/2014 11:12, Andrew Dinn wrote:
> So, decoupling the two sounds like a good plan.

Thank you for the information, Andrew. That's not the option I preferred
but at least I know what to expect. Good luck with the port.

Emmanuel

From openjdk-testing at linaro.org Thu Apr 17 13:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Thu, 17 Apr 2014 14:00:01 +0100 (BST)
Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140417130032.813191F4EB@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/107/summary.html

===============================================================================
server-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/apr/11 pass: 433; fail: 1; error: 4
Build 1: aarch64/2014/apr/12 pass: 433; fail: 1; error: 4
Build 2: aarch64/2014/apr/13 pass: 433; fail: 1; error: 4
Build 3: aarch64/2014/apr/14 pass: 433; fail: 1; error: 4
Build 4: aarch64/2014/apr/15 pass: 432; fail: 2; error: 4
Build 5: aarch64/2014/apr/16 pass: 433; fail: 1; error: 4
Build 6: aarch64/2014/apr/17 pass: 433; fail: 1; error: 4
-------------------------------------------------------------------------------

===============================================================================
server-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/apr/11 pass: 2,937; error: 35
Build 1: aarch64/2014/apr/12 pass: 2,939; error: 33
Build 2: aarch64/2014/apr/13 pass: 2,936; error: 36
Build 3: aarch64/2014/apr/14 pass: 2,935; error: 37
Build 4: aarch64/2014/apr/15 pass: 2,936; error: 36
Build 5: aarch64/2014/apr/16 pass: 2,934; error: 38
Build 6: aarch64/2014/apr/17 pass: 2,933; error: 39
-------------------------------------------------------------------------------

===============================================================================
server-release/jdk
===============================================================================
Build 0: aarch64/2014/apr/11 pass: 5,231; fail: 149; error: 70
Build 1: aarch64/2014/apr/12 pass: 5,237; fail: 142; error: 71
Build 2: aarch64/2014/apr/13 pass: 5,236; fail: 145; error: 69
Build 3: aarch64/2014/apr/14 pass: 5,245; fail: 134; error: 71
Build 4: aarch64/2014/apr/15 pass: 5,230; fail: 149; error: 71
Build 5: aarch64/2014/apr/16 pass: 5,242; fail: 134; error: 74
Build 6: aarch64/2014/apr/17 pass: 5,230; fail: 146; error: 74
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Thu Apr 17 13:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Thu, 17 Apr 2014 14:00:01 +0100 (BST)
Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140417130032.5FFAC1F531@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/summary/2014/107/summary.html

===============================================================================
client-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/apr/11 pass: 431; fail: 5; error: 2
Build 1: aarch64/2014/apr/12 pass: 430; fail: 5; error: 3
Build 2: aarch64/2014/apr/13 pass: 431; fail: 5; error: 2
Build 3: aarch64/2014/apr/14 pass: 431; fail: 5; error: 2
Build 4: aarch64/2014/apr/15 pass: 431; fail: 5; error: 2
Build 5: aarch64/2014/apr/16 pass: 429; fail: 5; error: 4
Build 6: aarch64/2014/apr/17 pass: 429; fail: 5; error: 4
-------------------------------------------------------------------------------

===============================================================================
client-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/apr/11 pass: 2,940; error: 32
Build 1: aarch64/2014/apr/12 pass: 2,937; error: 35
Build 2: aarch64/2014/apr/13 pass: 2,937; error: 35
Build 3: aarch64/2014/apr/14 pass: 2,938; error: 34
Build 4: aarch64/2014/apr/15 pass: 2,939; error: 33
Build 5: aarch64/2014/apr/16 pass: 2,936; error: 36
Build 6: aarch64/2014/apr/17 pass: 2,937; error: 35
-------------------------------------------------------------------------------

===============================================================================
client-release/jdk
===============================================================================
Build 0: aarch64/2014/apr/11 pass: 5,237; fail: 155; error: 58
Build 1: aarch64/2014/apr/12 pass: 5,233; fail: 160; error: 57
Build 2: aarch64/2014/apr/13 pass: 5,238; fail: 152; error: 60
Build 3: aarch64/2014/apr/14 pass: 5,236; fail: 157; error: 57
Build 4: aarch64/2014/apr/15 pass: 5,236; fail: 155; error: 59
Build 5: aarch64/2014/apr/16 pass: 5,231; fail: 156; error: 63
Build 6: aarch64/2014/apr/17 pass: 5,231; fail: 157; error: 62
-------------------------------------------------------------------------------

Previous results can be found here:
http://people.linaro.org/~andrew.mcdermott/openjdk8-jtreg-nightly-tests/index.html

From aph at redhat.com Tue Apr 22 18:10:47 2014
From: aph at redhat.com (Andrew Haley)
Date: Tue, 22 Apr 2014 19:10:47 +0100
Subject: [aarch64-port-dev ] Use an explicit set of registers rather than a bitmap for psh and pop operations.
Message-ID: <5356B0A7.1000203@redhat.com>

The code using push() and pop() was getting to be very hard to
understand. This patch defines a set of registers as a type,
transforming

    pop(0x3fffffff, sp);

into

    pop(RegSet::range(r0, r29), sp);

This is fairly delicate code, so I'd appreciate someone going over it
to check that I haven't made mistakes.

Andrew.

-------------- next part --------------
# HG changeset patch
# User aph
# Date 1398189279 -3600
# Tue Apr 22 18:54:39 2014 +0100
# Node ID 4c3b20781d5d4e187662896d4957e28aa332973b
# Parent d9468835bc5160b7fac6709b0afbc751b2159fbb
Use an explicit set of registers rather than a bitmap for psh and pop operations.
diff -r d9468835bc51 -r 4c3b20781d5d src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp --- a/src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp Thu Apr 10 06:50:43 2014 -0400 +++ b/src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp Tue Apr 22 18:54:39 2014 +0100 @@ -54,7 +54,7 @@ __ enter(); __ sub(sp, sp, 2 * wordSize); - __ push(0x3fffffff, sp); // integer registers except lr & sp + __ push(RegSet::range(r0, r29), sp); // integer registers except lr & sp for (int i = 30; i >= 0; i -= 2) // caller-saved fp registers if (i < 8 || i > 15) __ stpd(as_FloatRegister(i), as_FloatRegister(i+1), @@ -103,7 +103,7 @@ if (i < 8 || i > 15) __ ldpd(as_FloatRegister(i), as_FloatRegister(i+1), Address(__ post(sp, 2 * wordSize))); - __ pop(0x3fffffff, sp); + __ pop(RegSet::range(r0, r29), sp); __ ldr(as_reg(result()), Address(rfp, -wordSize)); __ leave(); diff -r d9468835bc51 -r 4c3b20781d5d src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp --- a/src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp Thu Apr 10 06:50:43 2014 -0400 +++ b/src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp Tue Apr 22 18:54:39 2014 +0100 @@ -69,7 +69,7 @@ int call_offset = offset(); // verify callee-saved register #ifdef ASSERT - push(0b1, sp); // r0 + push(RegSet::of(r0), sp); { Label L; get_thread(r0); cmp(rthread, r0); @@ -77,7 +77,7 @@ stop("StubAssembler::call_RT: rthread not callee saved?"); bind(L); } - pop(0b1, sp); + pop(RegSet::of(r0), sp); #endif reset_last_Java_frame(true, true); @@ -262,7 +262,7 @@ bool save_fpu_registers = true) { __ block_comment("save_live_registers"); - __ push(0x3fffffff, sp); // integer registers except lr & sp + __ push(RegSet::range(r0, r29), sp); // integer registers except lr & sp if (save_fpu_registers) { for (int i = 30; i >= 0; i -= 2) @@ -284,7 +284,7 @@ __ add(sp, sp, 32 * wordSize); } - __ pop(0x3fffffff, sp); + __ pop(RegSet::range(r0, r29), sp); } static void restore_live_registers_except_r0(StubAssembler* sasm, bool restore_fpu_registers = true) { @@ -298,7 +298,7 @@ } __ ldp(zr, r1, Address(__ 
post(sp, 16))); - __ pop(0x3ffffffc, sp); + __ pop(RegSet::range(r2, r29), sp); } @@ -1014,7 +1014,7 @@ }; __ set_info("slow_subtype_check", dont_gc_arguments); - __ push((1 << 0) | (1 <<2) | (1 << 4) | (1 << 5), sp); + __ push(RegSet::of(r0, r2, r4, r5), sp); // This is called by pushing args and not with C abi // __ ldr(r4, Address(sp, (klass_off) * VMRegImpl::stack_slot_size)); // subclass @@ -1028,12 +1028,12 @@ // fallthrough on success: __ mov(rscratch1, 1); __ str(rscratch1, Address(sp, (result_off) * VMRegImpl::stack_slot_size)); // result - __ pop((1 << 0) | (1 <<2) | (1 << 4) | (1 << 5), sp); + __ pop(RegSet::of(r0, r2, r4, r5), sp); __ ret(lr); __ bind(miss); __ str(zr, Address(sp, (result_off) * VMRegImpl::stack_slot_size)); // result - __ pop((1 << 0) | (1 <<2) | (1 << 4) | (1 << 5), sp); + __ pop(RegSet::of(r0, r2, r4, r5), sp); __ ret(lr); } break; @@ -1154,12 +1154,7 @@ #if INCLUDE_ALL_GCS // Registers to be saved around calls to g1_wb_pre or g1_wb_post -#define G1_SAVE_REGS ((r0->bit(1)|r1->bit(1)| \ - r2->bit(1)|r3->bit(1)|r4->bit(1)|r5->bit(1)| \ - r6->bit(1)|r7->bit(1)|r8->bit(1)|r9->bit(1)| \ - r10->bit(1)|r11->bit(1)|r12->bit(1)|r13->bit(1)| \ - r14->bit(1)|r15->bit(1)|r16->bit(1)|r17->bit(1)| \ - r18->bit(1))&~(rscratch1->bit(1)|rscratch2->bit(1))) +#define G1_SAVE_REGS (RegSet::range(r0, r18) - RegSet::of(rscratch1, rscratch2)) case g1_pre_barrier_slow_id: { @@ -1262,10 +1257,10 @@ const Register buffer_addr = r0; - __ push(r0->bit(1) | r1->bit(1), sp); + __ push(RegSet::of(r0, r1), sp); __ ldr(buffer_addr, buffer); __ str(card_addr, Address(buffer_addr, rscratch1)); - __ pop(r0->bit(1) | r1->bit(1), sp); + __ pop(RegSet::of(r0, r1), sp); __ b(done); __ bind(runtime); diff -r d9468835bc51 -r 4c3b20781d5d src/cpu/aarch64/vm/interp_masm_aarch64.hpp --- a/src/cpu/aarch64/vm/interp_masm_aarch64.hpp Thu Apr 10 06:50:43 2014 -0400 +++ b/src/cpu/aarch64/vm/interp_masm_aarch64.hpp Tue Apr 22 18:54:39 2014 +0100 @@ -146,8 +146,8 @@ void pop(TosState 
state); // transition vtos -> state void push(TosState state); // transition state -> vtos - void pop(unsigned bitset, Register stack) { ((MacroAssembler*)this)->pop(bitset, stack); } - void push(unsigned bitset, Register stack) { ((MacroAssembler*)this)->push(bitset, stack); } + void pop(RegSet regs, Register stack) { ((MacroAssembler*)this)->pop(regs, stack); } + void push(RegSet regs, Register stack) { ((MacroAssembler*)this)->push(regs, stack); } void empty_expression_stack() { ldr(esp, Address(rfp, frame::interpreter_frame_monitor_block_top_offset * wordSize)); diff -r d9468835bc51 -r 4c3b20781d5d src/cpu/aarch64/vm/macroAssembler_aarch64.hpp --- a/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Thu Apr 10 06:50:43 2014 -0400 +++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.hpp Tue Apr 22 18:54:39 2014 +0100 @@ -403,9 +403,15 @@ void mov_immediate64(Register dst, u_int64_t imm64); void mov_immediate32(Register dst, u_int32_t imm32); + int push(unsigned int bitset, Register stack); + int pop(unsigned int bitset, Register stack); + +public: + int push(RegSet regs, Register stack) { if (regs.bits()) push(regs.bits(), stack); } + int pop(RegSet regs, Register stack) { if (regs.bits()) pop(regs.bits(), stack); } + // now mov instructions for loading absolute addresses and 32 or // 64 bit integers -public: inline void mov(Register dst, address addr) { @@ -1229,9 +1235,6 @@ void pusha(); void popa(); - int push(unsigned int bitset, Register stack); - int pop(unsigned int bitset, Register stack); - void repne_scan(Register addr, Register value, Register count, Register scratch); void repne_scanw(Register addr, Register value, Register count, diff -r d9468835bc51 -r 4c3b20781d5d src/cpu/aarch64/vm/methodHandles_aarch64.cpp --- a/src/cpu/aarch64/vm/methodHandles_aarch64.cpp Thu Apr 10 06:50:43 2014 -0400 +++ b/src/cpu/aarch64/vm/methodHandles_aarch64.cpp Tue Apr 22 18:54:39 2014 +0100 @@ -77,7 +77,7 @@ BLOCK_COMMENT("verify_klass {"); __ verify_oop(obj); __ cbz(obj, 
L_bad); - __ push(temp->bit() | temp2->bit(), sp); + __ push(RegSet::of(temp, temp2), sp); __ load_klass(temp, obj); __ cmpptr(temp, ExternalAddress((address) klass_addr)); __ br(Assembler::EQ, L_ok); @@ -85,11 +85,11 @@ __ ldr(temp, Address(temp, super_check_offset)); __ cmpptr(temp, ExternalAddress((address) klass_addr)); __ br(Assembler::EQ, L_ok); - __ pop(temp->bit() | temp2->bit(), sp); + __ pop(RegSet::of(temp, temp2), sp); __ bind(L_bad); __ stop(error_message); __ BIND(L_ok); - __ pop(temp->bit() | temp2->bit(), sp); + __ pop(RegSet::of(temp, temp2), sp); BLOCK_COMMENT("} verify_klass"); } diff -r d9468835bc51 -r 4c3b20781d5d src/cpu/aarch64/vm/register_aarch64.hpp --- a/src/cpu/aarch64/vm/register_aarch64.hpp Thu Apr 10 06:50:43 2014 -0400 +++ b/src/cpu/aarch64/vm/register_aarch64.hpp Tue Apr 22 18:54:39 2014 +0100 @@ -232,4 +232,57 @@ static const int max_fpr; }; +// A set of registers +class RegSet { + uint32_t _bitset; + + RegSet(uint32_t bitset) : _bitset (bitset) { } + +public: + + RegSet() : _bitset(0) { } + + RegSet operator+(Register r1) const { + RegSet result(_bitset | r1->bit()); + return result; + } + + RegSet operator+(RegSet aSet) const { + RegSet result(_bitset | aSet._bitset); + return result; + } + + RegSet operator-(RegSet aSet) const { + RegSet result(_bitset & ~aSet._bitset); + return result; + } + + static RegSet of(Register r1) { + return RegSet(r1->bit()); + } + + static RegSet of(Register r1, Register r2) { + return of(r1) + r2; + } + + static RegSet of(Register r1, Register r2, Register r3) { + return of(r1, r2) + r3; + } + + static RegSet of(Register r1, Register r2, Register r3, Register r4) { + return of(r1, r2, r3) + r4; + } + + static RegSet range(Register start, Register end) { + uint32_t bits = ~0; + bits <<= start->encoding(); + bits <<= 31 - end->encoding(); + bits >>= 31 - end->encoding(); + + return RegSet(bits); + } + + uint32_t bits() const { return _bitset; } +}; + #endif // CPU_AARCH64_VM_REGISTER_AARCH64_HPP diff 
-r d9468835bc51 -r 4c3b20781d5d src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp --- a/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Thu Apr 10 06:50:43 2014 -0400 +++ b/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Tue Apr 22 18:54:39 2014 +0100 @@ -1052,29 +1052,27 @@ } } static void save_args(MacroAssembler *masm, int arg_count, int first_arg, VMRegPair *args) { - unsigned long x = 0; // Register bit vector + RegSet x; for ( int i = first_arg ; i < arg_count ; i++ ) { if (args[i].first()->is_Register()) { - x |= 1 << args[i].first()->as_Register()->encoding(); + x = x + args[i].first()->as_Register(); } else if (args[i].first()->is_FloatRegister()) { __ strd(args[i].first()->as_FloatRegister(), Address(__ pre(sp, -2 * wordSize))); } } - if (x) - __ push(x, sp); + __ push(x, sp); } static void restore_args(MacroAssembler *masm, int arg_count, int first_arg, VMRegPair *args) { - unsigned long x = 0; + RegSet x; for ( int i = first_arg ; i < arg_count ; i++ ) { if (args[i].first()->is_Register()) { - x |= 1 << args[i].first()->as_Register()->encoding(); + x = x + args[i].first()->as_Register(); } else { ; } } - if (x) - __ pop(x, sp); + __ pop(x, sp); for ( int i = first_arg ; i < arg_count ; i++ ) { if (args[i].first()->is_Register()) { ; diff -r d9468835bc51 -r 4c3b20781d5d src/cpu/aarch64/vm/stubGenerator_aarch64.cpp --- a/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Thu Apr 10 06:50:43 2014 -0400 +++ b/src/cpu/aarch64/vm/stubGenerator_aarch64.cpp Tue Apr 22 18:54:39 2014 +0100 @@ -805,7 +805,7 @@ __ bind(error); __ ldp(c_rarg3, c_rarg2, Address(__ post(sp, 16))); - __ push(0x7fffffff, sp); + __ push(RegSet::range(r0, r29), sp); // debug(char* msg, int64_t pc, int64_t regs[]) __ ldr(c_rarg0, Address(sp, rscratch1->encoding())); // pass address of error message __ mov(c_rarg1, Address(sp, lr)); // pass return address @@ -816,7 +816,7 @@ BLOCK_COMMENT("call MacroAssembler::debug"); __ mov(rscratch1, CAST_FROM_FN_PTR(address, MacroAssembler::debug64)); __ blrt(rscratch1, 
3, 0, 1); - __ pop(0x7fffffff, sp); + __ pop(RegSet::range(r0, r29), sp); __ ldp(rscratch2, lr, Address(__ post(sp, 2 * wordSize))); __ ldp(r0, rscratch1, Address(__ post(sp, 2 * wordSize))); @@ -880,7 +880,7 @@ case BarrierSet::G1SATBCTLogging: // With G1, don't generate the call if we statically know that the target in uninitialized if (!dest_uninitialized) { - __ push(0x3fffffff, sp); // integer registers except lr & sp + __ push(RegSet::range(r0, r29), sp); // integer registers except lr & sp if (count == c_rarg0) { if (addr == c_rarg1) { // exactly backwards!! @@ -895,7 +895,7 @@ __ mov(c_rarg1, count); } __ call_VM_leaf(CAST_FROM_FN_PTR(address, BarrierSet::static_write_ref_array_pre), 2); - __ pop(0x3fffffff, sp); // integer registers except lr & sp } + __ pop(RegSet::range(r0, r29), sp); // integer registers except lr & sp } break; case BarrierSet::CardTableModRef: case BarrierSet::CardTableExtension: @@ -926,7 +926,7 @@ case BarrierSet::G1SATBCTLogging: { - __ push(0x3fffffff, sp); // integer registers except lr & sp + __ push(RegSet::range(r0, r29), sp); // integer registers except lr & sp // must compute element count unless barrier set interface is changed (other platforms supply count) assert_different_registers(start, end, scratch); __ lea(scratch, Address(end, BytesPerHeapOop)); @@ -935,7 +935,7 @@ __ mov(c_rarg0, start); __ mov(c_rarg1, scratch); __ call_VM_leaf(CAST_FROM_FN_PTR(address, BarrierSet::static_write_ref_array_post), 2); - __ pop(0x3fffffff, sp); // integer registers except lr & sp } + __ pop(RegSet::range(r0, r29), sp); // integer registers except lr & sp } } break; case BarrierSet::CardTableModRef: @@ -1298,13 +1298,13 @@ } __ enter(); if (is_oop) { - __ push(d->bit() | count->bit(), sp); + __ push(RegSet::of(d, count), sp); // no registers are destroyed by this call gen_write_ref_array_pre_barrier(d, count, dest_uninitialized); } copy_memory(aligned, s, d, count, rscratch1, size); if (is_oop) { - __ pop(d->bit() | count->bit(), sp); + 
__ pop(RegSet::of(d, count), sp); if (VerifyOops) verify_oop_array(size, d, count, r16); __ sub(count, count, 1); // make an inclusive end pointer @@ -1350,13 +1350,13 @@ __ enter(); if (is_oop) { - __ push(d->bit() | count->bit(), sp); + __ push(RegSet::of(d, count), sp); // no registers are destroyed by this call gen_write_ref_array_pre_barrier(d, count, dest_uninitialized); } copy_memory(aligned, s, d, count, rscratch1, -size); if (is_oop) { - __ pop(d->bit() | count->bit(), sp); + __ pop(RegSet::of(d, count), sp); if (VerifyOops) verify_oop_array(size, d, count, r16); __ sub(count, count, 1); // make an inclusive end pointer @@ -1682,7 +1682,7 @@ // Empty array: Nothing to do. __ cbz(count, L_done); - __ push(r18->bit() | r19->bit() | r20->bit() | r21->bit(), sp); + __ push(RegSet::of(r18, r19, r20, r21), sp); #ifdef ASSERT BLOCK_COMMENT("assert consistent ckoff/ckval"); @@ -1743,7 +1743,7 @@ gen_write_ref_array_post_barrier(start_to, to, rscratch1); __ bind(L_done_pop); - __ pop(r18->bit() | r19->bit() | r20->bit()| r21->bit(), sp); + __ pop(RegSet::of(r18, r19, r20, r21), sp); inc_counter_np(SharedRuntime::_checkcast_array_copy_ctr); __ bind(L_done); diff -r d9468835bc51 -r 4c3b20781d5d src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp --- a/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Thu Apr 10 06:50:43 2014 -0400 +++ b/src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp Tue Apr 22 18:54:39 2014 +0100 @@ -1810,12 +1810,12 @@ __ push(lr); __ push(state); - __ push(0xffffu, sp); + __ push(RegSet::range(r0, r15), sp); __ mov(c_rarg2, r0); // Pass itos __ call_VM(noreg, CAST_FROM_FN_PTR(address, SharedRuntime::trace_bytecode), c_rarg1, c_rarg2, c_rarg3); - __ pop(0xffffu, sp); + __ pop(RegSet::range(r0, r15), sp); __ pop(state); __ pop(lr); __ ret(lr); // return from result handler From edward.nevill at linaro.org Wed Apr 23 08:34:07 2014 From: edward.nevill at linaro.org (Edward Nevill) Date: Wed, 23 Apr 2014 09:34:07 +0100 Subject: [aarch64-port-dev ] 
Use an explicit set of registers rather than a bitmap for psh and pop operations.
In-Reply-To: <5356B0A7.1000203@redhat.com>
References: <5356B0A7.1000203@redhat.com>
Message-ID: <1398242047.21532.5.camel@localhost.localdomain>

Hi Andrew,

It looks fine to me. A couple of small points

> Use an explicit set of registers rather than a bitmap for psh and pop operations.

'push' rather than 'psh' in the commit comment

(if this is a pain to fix don't bother)

In class RegSet for consistency I would provide

RegSet operator-(Register r1)

You have RegSet + Register and RegSet - RegSet. I would be surprised if RegSet - Register did not work for me.

Regards,
Ed.

On Tue, 2014-04-22 at 19:10 +0100, Andrew Haley wrote:
> The code using push() and pop() was getting to be very hard to understand.

From aph at redhat.com Wed Apr 23 09:29:52 2014
From: aph at redhat.com (Andrew Haley)
Date: Wed, 23 Apr 2014 10:29:52 +0100
Subject: [aarch64-port-dev ] Use an explicit set of registers rather than a bitmap for psh and pop operations.
In-Reply-To: <1398242047.21532.5.camel@localhost.localdomain>
References: <5356B0A7.1000203@redhat.com> <1398242047.21532.5.camel@localhost.localdomain>
Message-ID: <53578810.7010704@redhat.com>

On 04/23/2014 09:34 AM, Edward Nevill wrote:
> Hi Andrew,
>
> It looks fine to me. A couple of small points
>
>> Use an explicit set of registers rather than a bitmap for psh and pop operations.
>
> 'push' rather than 'psh' in the commit comment
>
> (if this is a pain to fix don't bother)
>
> In class RegSet for consistency I would provide
>
> RegSet operator-(Register r1)
>
> You have RegSet + Register and RegSet - RegSet. I would be surprised if RegSet - Register did not work for me.

OK. I think I'll provide a conversion operator from Register to RegSet
so that'll just work.

Thanks,
Andrew.
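
For anyone going over the patch as Andrew asks, the bit arithmetic in RegSet::range() and the effect of the proposed Register-to-RegSet conversion can be checked outside HotSpot. The following is only a minimal stand-alone sketch: Reg is a hypothetical stand-in for HotSpot's Register (just an encoding in 0..31), the private Raw tag is an invention to keep the two constructors apart, and the implicit constructor is an assumed shape for the conversion discussed above, not the code that was committed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for HotSpot's Register: only the encoding matters here.
struct Reg {
  unsigned enc;  // register number, 0..31
  std::uint32_t bit() const { return std::uint32_t(1) << enc; }
};

class RegSet {
  std::uint32_t _bitset;

  struct Raw {};  // private tag so raw bitmaps cannot be passed in by callers
  RegSet(Raw, std::uint32_t bits) : _bitset(bits) {}

public:
  RegSet() : _bitset(0) {}

  // Assumed form of the Register -> RegSet conversion: because it is
  // implicit, RegSet + Reg and RegSet - Reg both work through the
  // RegSet overloads below.
  RegSet(Reg r) : _bitset(r.bit()) {}

  RegSet operator+(RegSet s) const { return RegSet(Raw(), _bitset | s._bitset); }
  RegSet operator-(RegSet s) const { return RegSet(Raw(), _bitset & ~s._bitset); }

  // Same three shifts as the patch: clear the bits below start, then push
  // the bits above end off the top and shift back down.
  static RegSet range(Reg start, Reg end) {
    assert(start.enc <= end.enc && end.enc <= 31);
    std::uint32_t bits = ~std::uint32_t(0);
    bits <<= start.enc;
    bits <<= 31 - end.enc;
    bits >>= 31 - end.enc;
    return RegSet(Raw(), bits);
  }

  std::uint32_t bits() const { return _bitset; }
};
```

With this model, range(r0, r29) gives 0x3fffffff, range(r2, r29) gives 0x3ffffffc and range(r0, r15) gives 0xffff, matching the constants those calls replace in the patch. The mask 0x7fffffff in the stubGenerator hunk has bit 30 set as well, so in this model it corresponds to range(r0, r30) rather than range(r0, r29); that call site may be worth a second look.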

From aph at redhat.com Wed Apr 23 13:26:21 2014
From: aph at redhat.com (aph at redhat.com)
Date: Wed, 23 Apr 2014 13:26:21 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: 2 new changesets
Message-ID: <201404231326.s3NDQMth026990@aojmv0008>

Changeset: 4c3b20781d5d
Author:    aph
Date:      2014-04-22 18:54 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/4c3b20781d5d

Use an explicit set of registers rather than a bitmap for psh and pop operations.

! src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp
! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp
! src/cpu/aarch64/vm/interp_masm_aarch64.hpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.hpp
! src/cpu/aarch64/vm/methodHandles_aarch64.cpp
! src/cpu/aarch64/vm/register_aarch64.hpp
! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp
! src/cpu/aarch64/vm/stubGenerator_aarch64.cpp
! src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp

Changeset: 563e44ab11a3
Author:    aph
Date:      2014-04-23 09:26 -0400
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/563e44ab11a3

Add a constructor as a conversion from Register - RegSet. Use it.

! src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp
! src/cpu/aarch64/vm/register_aarch64.hpp

From ed at camswl.com Thu Apr 24 09:58:19 2014
From: ed at camswl.com (Edward Nevill)
Date: Thu, 24 Apr 2014 10:58:19 +0100
Subject: [aarch64-port-dev ] RFR: Fix biased locking and enable as default
Message-ID: <1398333499.20498.12.camel@localhost.localdomain>

Hi,

The following patch fixes a problem where I was seeing assertion failures with biased locking.

The main problem was that the swap_reg (i.e. the compare value) and tmp_reg (i.e. the new value) were the wrong way around in two calls to cmpxchgptr.

In most cases this worked OK: it simply meant that there was garbage in the compare value, so the compare always failed and the code went off to the slow case instead.

However, in certain cases this failed because it left the biased locking bits in the object header, and the ensuing code assumed that some thread must have cleared them because at least one thread must have succeeded with the compare and exchange, whereas in our case all threads were failing because the compare value was wrong to start with.

I have also re-enabled the call to biased_locking_enter in generate_native_wrapper, which was commented out for some reason, and fixed the register usage in that call (calls to biased_locking_enter must not pass rscratch1 as the tmp because that is used internally within biased_locking_enter).

Finally, I have re-enabled biased locking as the default so that the overnight tests will kick in and test this fully. This also removes changes to shared code, which is always good.

OK to push?
Ed.

--- CUT HERE ---
# HG changeset patch
# User Edward Nevill edward.nevill at linaro.org
# Date 1398332582 -3600
#      Thu Apr 24 10:43:02 2014 +0100
# Node ID ef2aa7fd06f354b221432fdbc9f8b4ba6bd0a7c4
# Parent  563e44ab11a3570b1ec0f89e736832e14f7414ae
Fix biased locking and enable as default

diff -r 563e44ab11a3 -r ef2aa7fd06f3 src/cpu/aarch64/vm/globals_aarch64.hpp
--- a/src/cpu/aarch64/vm/globals_aarch64.hpp Wed Apr 23 09:26:04 2014 -0400
+++ b/src/cpu/aarch64/vm/globals_aarch64.hpp Thu Apr 24 10:43:02 2014 +0100
@@ -73,9 +73,6 @@
 define_pd_global(uintx, TypeProfileLevel, 0);
-// avoid biased locking while we are bootstrapping the aarch64 build
-define_pd_global(bool, UseBiasedLocking, false);
-
 #if defined(COMPILER1) || defined(COMPILER2)
 define_pd_global(intx, InlineSmallCode, 1000);
 #endif
diff -r 563e44ab11a3 -r ef2aa7fd06f3 src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Wed Apr 23 09:26:04 2014 -0400
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp Thu Apr 24 10:43:02 2014 +0100
@@ -414,7 +414,7 @@
     Label here;
     load_prototype_header(tmp_reg, obj_reg);
     orr(tmp_reg, rthread, tmp_reg);
-    cmpxchgptr(tmp_reg, swap_reg, obj_reg, rscratch1, here, slow_case);
+    cmpxchgptr(swap_reg, tmp_reg, obj_reg, rscratch1, here, slow_case);
     // If the biasing toward our thread failed, then another thread
     // succeeded in biasing it toward itself and we need to revoke that
     // bias. The revocation will occur in the runtime in the slow case.
@@ -441,7 +441,7 @@
   {
     Label here, nope;
     load_prototype_header(tmp_reg, obj_reg);
-    cmpxchgptr(tmp_reg, swap_reg, obj_reg, rscratch1, here, &nope);
+    cmpxchgptr(swap_reg, tmp_reg, obj_reg, rscratch1, here, &nope);
     bind(here);
     // Fall through to the normal CAS-based lock, because no matter what
diff -r 563e44ab11a3 -r ef2aa7fd06f3 src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp
--- a/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Wed Apr 23 09:26:04 2014 -0400
+++ b/src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Thu Apr 24 10:43:02 2014 +0100
@@ -1815,7 +1815,7 @@
     __ ldr(obj_reg, Address(oop_handle_reg, 0));
     if (UseBiasedLocking) {
-      // __ biased_locking_enter(lock_reg, obj_reg, swap_reg, rscratch1, false, lock_done, &slow_path_lock);
+      __ biased_locking_enter(lock_reg, obj_reg, swap_reg, rscratch2, false, lock_done, &slow_path_lock);
     }
     // Load (object->mark() | 1) into swap_reg %r0
diff -r 563e44ab11a3 -r ef2aa7fd06f3 src/cpu/zero/vm/globals_zero.hpp
--- a/src/cpu/zero/vm/globals_zero.hpp Wed Apr 23 09:26:04 2014 -0400
+++ b/src/cpu/zero/vm/globals_zero.hpp Thu Apr 24 10:43:02 2014 +0100
@@ -59,12 +59,6 @@
 define_pd_global(uintx, TypeProfileLevel, 0);
-#ifdef AARCH64
-// This is declared as _pd for AARCH64 only in globals.hpp
-// so must match with definition here.
-define_pd_global(bool, UseBiasedLocking, false);
-#endif
-
 #define ARCH_FLAGS(develop, product, diagnostic, experimental, notproduct)
 #endif // CPU_ZERO_VM_GLOBALS_ZERO_HPP
diff -r 563e44ab11a3 -r ef2aa7fd06f3 src/share/vm/runtime/globals.hpp
--- a/src/share/vm/runtime/globals.hpp Wed Apr 23 09:26:04 2014 -0400
+++ b/src/share/vm/runtime/globals.hpp Thu Apr 24 10:43:02 2014 +0100
@@ -1229,10 +1229,8 @@
   product(bool, CompactFields, true, \
           "Allocate nonstatic fields in gaps between previous fields") \
   \
-  AARCH64_ONLY(product_pd(bool, UseBiasedLocking, \
-          "Enable biased locking in JVM")) \
-  NOT_AARCH64(product(bool, UseBiasedLocking, true, \
-          "Enable biased locking in JVM")) \
+  product(bool, UseBiasedLocking, true, \
+          "Enable biased locking in JVM") \
   notproduct(bool, PrintFieldLayout, false, \
           "Print field layout for each class") \
   \
--- CUT HERE ---

From aph at redhat.com Thu Apr 24 11:17:53 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 24 Apr 2014 12:17:53 +0100
Subject: [aarch64-port-dev ] RFR: Fix biased locking and enable as default
In-Reply-To: <1398333499.20498.12.camel@localhost.localdomain>
References: <1398333499.20498.12.camel@localhost.localdomain>
Message-ID: <5358F2E1.5030204@redhat.com>

On 04/24/2014 10:58 AM, Edward Nevill wrote:
> I have also re-enabled the call to biased_locking_enter in generate_native_wrapper which was commented out for some reason and also fixed the register usage in that call (calls to biased_locking_enter must not pass rscratch1 as the tmp because that is used internally within biased_locking_enter).

Can you assert that rscratch1 is not tmp in biased_locking_enter?

Otherwise OK.

Thanks,
Andrew.
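
The argument-order bug Edward describes in this thread can be reproduced with an ordinary compare-and-exchange, away from the assembler. This is a hedged sketch only: cmpxchg_word is a hypothetical stand-in for cmpxchgptr(oldv, newv, addr, ...), the header values used with it are made up, and std::atomic replaces the generated AArch64 code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

typedef std::uintptr_t word_t;

// Hypothetical stand-in for cmpxchgptr(oldv, newv, addr, ...): store newv
// into the word only if it still holds oldv, and report whether it did.
inline bool cmpxchg_word(word_t oldv, word_t newv, std::atomic<word_t>& word) {
  return word.compare_exchange_strong(oldv, newv);
}

// Argument order after the fix: compare value (swap_reg) first,
// new value (tmp_reg) second.
inline bool try_bias_fixed(std::atomic<word_t>& header,
                           word_t swap_val, word_t tmp_val) {
  return cmpxchg_word(swap_val, tmp_val, header);
}

// Argument order before the fix: the new value is used as the compare
// value, so the exchange fails whenever the header differs from it.
inline bool try_bias_swapped(std::atomic<word_t>& header,
                             word_t swap_val, word_t tmp_val) {
  return cmpxchg_word(tmp_val, swap_val, header);
}
```

Starting from a header of 0x5 with swap_val 0x5 and tmp_val 0x1005 (arbitrary illustrative values), try_bias_swapped fails and leaves the header untouched, which is the "always goes to the slow case" behaviour described above, while try_bias_fixed succeeds and installs the new value.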

From ed at camswl.com Thu Apr 24 12:33:33 2014
From: ed at camswl.com (ed at camswl.com)
Date: Thu, 24 Apr 2014 12:33:33 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Fix biased locking and enable as default
Message-ID: <201404241233.s3OCXY8l002962@aojmv0008>

Changeset: ef2aa7fd06f3
Author:    Edward Nevill edward.nevill at linaro.org
Date:      2014-04-24 10:43 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/ef2aa7fd06f3

Fix biased locking and enable as default

! src/cpu/aarch64/vm/globals_aarch64.hpp
! src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp
! src/cpu/zero/vm/globals_zero.hpp
! src/share/vm/runtime/globals.hpp

From openjdk-testing at linaro.org Thu Apr 24 13:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Thu, 24 Apr 2014 14:00:01 +0100 (BST)
Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140424130058.3B03B1F4EA@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:

http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/114/summary.html

===============================================================================
server-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/apr/12 pass: 433; fail: 1; error: 4
Build 1: aarch64/2014/apr/13 pass: 433; fail: 1; error: 4
Build 2: aarch64/2014/apr/14 pass: 433; fail: 1; error: 4
Build 3: aarch64/2014/apr/15 pass: 432; fail: 2; error: 4
Build 4: aarch64/2014/apr/16 pass: 433; fail: 1; error: 4
Build 5: aarch64/2014/apr/17 pass: 433; fail: 1; error: 4
Build 6: aarch64/2014/apr/24 pass: 416; fail: 1; error: 21
-------------------------------------------------------------------------------

===============================================================================
server-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/apr/12 pass: 2,939; error: 33
Build 1: aarch64/2014/apr/13 pass: 2,936; error: 36
Build 2: aarch64/2014/apr/14 pass: 2,935; error: 37
Build 3: aarch64/2014/apr/15 pass: 2,936; error: 36
Build 4: aarch64/2014/apr/16 pass: 2,934; error: 38
Build 5: aarch64/2014/apr/17 pass: 2,933; error: 39
Build 6: aarch64/2014/apr/24 pass: 2,894; error: 78
-------------------------------------------------------------------------------

===============================================================================
server-release/jdk
===============================================================================
Build 0: aarch64/2014/apr/12 pass: 5,237; fail: 142; error: 71
Build 1: aarch64/2014/apr/13 pass: 5,236; fail: 145; error: 69
Build 2: aarch64/2014/apr/14 pass: 5,245; fail: 134; error: 71
Build 3: aarch64/2014/apr/15 pass: 5,230; fail: 149; error: 71
Build 4: aarch64/2014/apr/16 pass: 5,242; fail: 134; error: 74
Build 5: aarch64/2014/apr/17 pass: 5,230; fail: 146; error: 74
Build 6: aarch64/2014/apr/24 pass: 4,699; fail: 472; error: 279
-------------------------------------------------------------------------------

Previous results can be found here:

http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html

From openjdk-testing at linaro.org Thu Apr 24 13:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Thu, 24 Apr 2014 14:00:01 +0100 (BST)
Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140424130057.BB4051F53A@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.

For detailed information on the test output please refer to:

http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/114/summary.html

===============================================================================
client-fastdebug/hotspot
===============================================================================
Build 0: aarch64/2014/apr/12 pass: 430; fail: 5; error: 3
Build 1: aarch64/2014/apr/13 pass: 431; fail: 5; error: 2
Build 2: aarch64/2014/apr/14 pass: 431; fail: 5; error: 2
Build 3: aarch64/2014/apr/15 pass: 431; fail: 5; error: 2
Build 4: aarch64/2014/apr/16 pass: 429; fail: 5; error: 4
Build 5: aarch64/2014/apr/17 pass: 429; fail: 5; error: 4
Build 6: aarch64/2014/apr/24 pass: 418; fail: 5; error: 15
-------------------------------------------------------------------------------

===============================================================================
client-fastdebug/langtools
===============================================================================
Build 0: aarch64/2014/apr/12 pass: 2,937; error: 35
Build 1: aarch64/2014/apr/13 pass: 2,937; error: 35
Build 2: aarch64/2014/apr/14 pass: 2,938; error: 34
Build 3: aarch64/2014/apr/15 pass: 2,939; error: 33
Build 4: aarch64/2014/apr/16 pass: 2,936; error: 36
Build 5: aarch64/2014/apr/17 pass: 2,937; error: 35
Build 6: aarch64/2014/apr/24 pass: 2,916; error: 56
-------------------------------------------------------------------------------

===============================================================================
client-release/jdk
===============================================================================
Build 0: aarch64/2014/apr/12 pass: 5,233; fail: 160; error: 57
Build 1: aarch64/2014/apr/13 pass: 5,238; fail: 152; error: 60
Build 2: aarch64/2014/apr/14 pass: 5,236; fail: 157; error: 57
Build 3: aarch64/2014/apr/15 pass: 5,236; fail: 155; error: 59
Build 4: aarch64/2014/apr/16 pass: 5,231; fail: 156; error: 63
Build 5: aarch64/2014/apr/17 pass: 5,231; fail: 157; error: 62
Build 6: aarch64/2014/apr/24 pass: 4,886; fail: 474; error: 90
-------------------------------------------------------------------------------

Previous results can be found here:

http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html

From ed at camswl.com Fri Apr 25 13:31:49 2014
From: ed at camswl.com (Edward Nevill)
Date: Fri, 25 Apr 2014 14:31:49 +0100
Subject: [aarch64-port-dev ] Add test to JTreg exclude which also fails on x86
Message-ID: <1398432709.14692.2.camel@mint>

Hi,

The following test fails on x86 also. I have filed a bug report against it on x86.

Patch to exclude from JTreg tests.

Regards,
Ed.

--- CUT HERE ---
# HG changeset patch
# User Edward Nevill edward.nevill at linaro.org
# Date 1398432523 -3600
#      Fri Apr 25 14:28:43 2014 +0100
# Node ID e78570806ec82432313f6543bb7601cc3197aae3
# Parent  e03c59f25d2b8d12d3a4906d72887bfe14494c85
Add test to JTreg exclude which also fails on x86

diff -r e03c59f25d2b -r e78570806ec8 test/exclude_aarch64.txt
--- a/test/exclude_aarch64.txt Wed Mar 26 17:19:44 2014 +0000
+++ b/test/exclude_aarch64.txt Fri Apr 25 14:28:43 2014 +0100
@@ -15,3 +15,4 @@
 runtime/SharedArchiveFile/CdsSameObjectAlignment.java generic-all
 runtime/SharedArchiveFile/CdsDifferentObjectAlignment.java generic-all
 compiler/whitebox/IsMethodCompilableTest.java generic-all
+sun/java2d/OpenGL/DrawBufImgOp.java generic-all
--- CUT HERE ---

From ed at camswl.com Fri Apr 25 16:02:22 2014
From: ed at camswl.com (ed at camswl.com)
Date: Fri, 25 Apr 2014 16:02:22 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8: Add test to JTreg exclude which also fails on x86
Message-ID: <201404251602.s3PG2NGE015954@aojmv0008>

Changeset: e78570806ec8
Author:    Edward Nevill edward.nevill at linaro.org
Date:      2014-04-25 14:28 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/rev/e78570806ec8

Add test to JTreg exclude which also fails on x86

! test/exclude_aarch64.txt

From D.Sturm42 at gmail.com Fri Apr 25 22:16:56 2014
From: D.Sturm42 at gmail.com (D.Sturm)
Date: Sat, 26 Apr 2014 00:16:56 +0200
Subject: [aarch64-port-dev ] frintz instruction in simulator
Message-ID:

Hi,

Since I need a frintz instruction when generating the floating point remainder operation* I had to implement it in the simulator.

The implementation should be correct except in the case of a signaling NaN, since that should throw an exception (it could make a difference if someone generates a signaling NaN manually from the integer pattern, but that seems like an extreme edge case).

Maybe useful for the general branch too.
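
As a rough illustration of the behaviour described above, and not Daniel's simulator code (his attachment is not part of this archive): FRINTZ rounds a floating-point value to an integral value toward zero, which is what std::trunc does in C++. The function names below are hypothetical, and the remainder formula is just one plausible way such an instruction might be used; it is a naive sketch, since the rounded quotient makes it less precise than std::fmod for large ratios.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model of FRINTZ on a double: round to integral, toward zero.
// std::trunc returns a NaN for a NaN input but does not model the exception
// a signaling NaN should raise, the same gap noted in the message above.
inline double frintz_model(double x) {
  return std::trunc(x);
}

// Naive remainder built on it: x - trunc(x / y) * y. Illustration only;
// x / y is rounded before truncation, so results can drift for large
// quotients where std::fmod stays exact.
inline double remainder_sketch(double x, double y) {
  return x - frintz_model(x / y) * y;
}
```

For small, exactly representable inputs this agrees with the truncating remainder Java uses for floating-point %, for example 7.5 and 2.0 give 1.5, and -7.5 and 2.0 give -1.5.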
--Daniel From openjdk-testing at linaro.org Fri Apr 25 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Fri, 25 Apr 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140425130038.47ECC1F546@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/115/summary.html =============================================================================== server-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/13 pass: 433; fail: 1; error: 4 Build 1: aarch64/2014/apr/14 pass: 433; fail: 1; error: 4 Build 2: aarch64/2014/apr/15 pass: 432; fail: 2; error: 4 Build 3: aarch64/2014/apr/16 pass: 433; fail: 1; error: 4 Build 4: aarch64/2014/apr/17 pass: 433; fail: 1; error: 4 Build 5: aarch64/2014/apr/24 pass: 416; fail: 1; error: 21 Build 6: aarch64/2014/apr/25 pass: 425; fail: 1; error: 12 ------------------------------------------------------------------------------- =============================================================================== server-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/13 pass: 2,936; error: 36 Build 1: aarch64/2014/apr/14 pass: 2,935; error: 37 Build 2: aarch64/2014/apr/15 pass: 2,936; error: 36 Build 3: aarch64/2014/apr/16 pass: 2,934; error: 38 Build 4: aarch64/2014/apr/17 pass: 2,933; error: 39 Build 5: aarch64/2014/apr/24 pass: 2,894; error: 78 Build 6: aarch64/2014/apr/25 pass: 2,911; error: 61 ------------------------------------------------------------------------------- =============================================================================== server-release/jdk 
=============================================================================== Build 0: aarch64/2014/apr/13 pass: 5,236; fail: 145; error: 69 Build 1: aarch64/2014/apr/14 pass: 5,245; fail: 134; error: 71 Build 2: aarch64/2014/apr/15 pass: 5,230; fail: 149; error: 71 Build 3: aarch64/2014/apr/16 pass: 5,242; fail: 134; error: 74 Build 4: aarch64/2014/apr/17 pass: 5,230; fail: 146; error: 74 Build 5: aarch64/2014/apr/24 pass: 4,699; fail: 472; error: 279 Build 6: aarch64/2014/apr/25 pass: 4,725; fail: 472; error: 253 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From openjdk-testing at linaro.org Fri Apr 25 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Fri, 25 Apr 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140425130038.5E3A21F53B@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/115/summary.html =============================================================================== client-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/13 pass: 431; fail: 5; error: 2 Build 1: aarch64/2014/apr/14 pass: 431; fail: 5; error: 2 Build 2: aarch64/2014/apr/15 pass: 431; fail: 5; error: 2 Build 3: aarch64/2014/apr/16 pass: 429; fail: 5; error: 4 Build 4: aarch64/2014/apr/17 pass: 429; fail: 5; error: 4 Build 5: aarch64/2014/apr/24 pass: 418; fail: 5; error: 15 Build 6: aarch64/2014/apr/25 pass: 421; fail: 5; error: 12 ------------------------------------------------------------------------------- =============================================================================== client-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/13 pass: 2,937; error: 35 Build 1: aarch64/2014/apr/14 pass: 2,938; error: 34 Build 2: aarch64/2014/apr/15 pass: 2,939; error: 33 Build 3: aarch64/2014/apr/16 pass: 2,936; error: 36 Build 4: aarch64/2014/apr/17 pass: 2,937; error: 35 Build 5: aarch64/2014/apr/24 pass: 2,916; error: 56 Build 6: aarch64/2014/apr/25 pass: 2,917; error: 55 ------------------------------------------------------------------------------- =============================================================================== client-release/jdk =============================================================================== Build 0: aarch64/2014/apr/13 pass: 5,238; fail: 152; error: 60 Build 1: aarch64/2014/apr/14 pass: 5,236; fail: 157; error: 57 Build 2: aarch64/2014/apr/15 pass: 5,236; fail: 155; error: 59 Build 3: aarch64/2014/apr/16 pass: 5,231; fail: 156; error: 63 Build 4: aarch64/2014/apr/17 pass: 5,231; fail: 157; error: 62 Build 5: aarch64/2014/apr/24 pass: 4,886; fail: 474; 
error: 90 Build 6: aarch64/2014/apr/25 pass: 4,905; fail: 474; error: 71 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From aph at redhat.com Sat Apr 26 13:47:50 2014 From: aph at redhat.com (Andrew Haley) Date: Sat, 26 Apr 2014 14:47:50 +0100 Subject: [aarch64-port-dev ] frintz instruction in simulator In-Reply-To: References: Message-ID: <535BB906.7070906@redhat.com> On 04/25/2014 11:16 PM, D.Sturm wrote: > Hi, > Since I need a frintz instruction when generating the floating point > remainder operation* I had to implement it in the simulator. > > The implementation should be correct except in case of a signaling NaN > since that should throw an exception (could make a difference if someone > generates a signaling NaN manually from the integer pattern, but that seems > like an extreme edge case). > > Maybe useful for the general branch too. Sure, er, please post it. :-) Andrew. From D.Sturm42 at gmail.com Sat Apr 26 14:45:47 2014 From: D.Sturm42 at gmail.com (D.Sturm) Date: Sat, 26 Apr 2014 16:45:47 +0200 Subject: [aarch64-port-dev ] frintz instruction in simulator In-Reply-To: <535BB906.7070906@redhat.com> References: <535BB906.7070906@redhat.com> Message-ID: Ah does the mailing list swallow the attachments or did you overlook it? (not usual to not inline patches here I see, sorry!) Here's the patch: # HG changeset patch # User Daniel Sturm # Date 1398462569 -7200 # Fri Apr 25 23:49:29 2014 +0200 # Node ID 8866f73c6637b0ff0107eb9f14c1fd319e1f50bd # Parent d70a2050eb423046283bc8f1280fd9d9e1fabc21 implemented frintz instruction. 
diff -r d70a2050eb42 -r 8866f73c6637 simulator.cpp
--- a/simulator.cpp	Fri Jan 03 13:42:36 2014 +0000
+++ b/simulator.cpp	Fri Apr 25 23:49:29 2014 +0200
@@ -1373,10 +1373,19 @@
     case 0b000111: // FCVT double/single to half precision
       errorCode = ERROR_NYI;
       return error();
+    case 0b001011: // FRINTZ
+      if (type == 0b01) {
+        frintzd();
+      } else if (type == 0b00) {
+        frintzs();
+      } else {
+        errorCode = ERROR_UNALLOC;
+        return error();
+      }
+      return STATUS_READY;
     case 0b001000: // FRINTN etc
     case 0b001001:
     case 0b001010:
-    case 0b001011:
     case 0b001100:
     case 0b001110:
     case 0b001111:
@@ -7988,6 +7997,32 @@
   dreg(0) = (double)sreg(5);
 }
 
+template <typename T>
+static inline T frintz(T val)
+{
+  if (val == -0.0 || isinf(val) || isnan(val)) {
+    return val;
+  }
+  return (T)(int64_t)val;
+}
+
+// round to integral towards zero
+void AArch64Simulator::frintzs()
+{
+  // instr[9,5] = Sn
+  // instr[4,0] = Sd
+  sreg(0) = frintz(sreg(5));
+}
+
+// round to integral towards zero
+void AArch64Simulator::frintzd()
+{
+  // instr[9,5] = Dn
+  // instr[4,0] = Dd
+  dreg(0) = frintz(dreg(5));
+}
+
+
 // 2 sources
 
 // float add
diff -r d70a2050eb42 -r 8866f73c6637 simulator.hpp
--- a/simulator.hpp	Fri Jan 03 13:42:36 2014 +0000
+++ b/simulator.hpp	Fri Apr 25 23:49:29 2014 +0200
@@ -1287,6 +1287,10 @@
   // etc
 
   // TODO FP round to integral/ nearest integral floating
+  // float round to integral, towards zero.
+  void frintzs();
+  // double round to integral, towards zero.
+  void frintzd();
 
   // TODO FP arithmetic
   //

On 26 April 2014 15:47, Andrew Haley wrote:

> On 04/25/2014 11:16 PM, D.Sturm wrote:
> > Hi,
> > Since I need a frintz instruction when generating the floating point
> > remainder operation* I had to implement it in the simulator.
> >
> > The implementation should be correct except in case of a signaling NaN
> > since that should throw an exception (could make a difference if someone
> > generates a signaling NaN manually from the integer pattern, but that seems
> > like an extreme edge case).
> > > > Maybe useful for the general branch too. > > Sure, er, please post it. :-) > > Andrew. > > > From D.Sturm42 at gmail.com Sat Apr 26 18:03:22 2014 From: D.Sturm42 at gmail.com (D.Sturm) Date: Sat, 26 Apr 2014 20:03:22 +0200 Subject: [aarch64-port-dev ] simulator: multiplication/divsion for LONG/INT_MIN and -1 Message-ID: The simulator does not correctly handle multiplications/divisions in the case where we have the smallest negative value and -1. This is undefined in C/C++ anyhow, but works correctly for me in the case of multiplication (INT_MIN * -1 == INT_MIN), and is actually guaranteed to work for the int division due to the way the code is written but causes an arithmetic exception for LONG_MIN / -1 under Linux Mint 16 gcc 4.8.1. That exception is caught by the signal handler in the JVM and then correctly handled there in most situations - I ran into some throwing assertion when it was simulating graal code but haven't looked further into it. The following patch should take care of this. The smulh instruction uses inline assembly but I think x86 does the right thing(TM) there. 
-- Daniel

# HG changeset patch
# User Daniel Sturm
# Date 1398525661 -7200
#      Sat Apr 26 17:21:01 2014 +0200
# Node ID 31460830078ecdb590d602ccfa12f93c8e022ba9
# Parent  d70a2050eb423046283bc8f1280fd9d9e1fabc21
fixed signed multiply and division in case of MIN_VAL and -1 as operands

diff -r d70a2050eb42 -r 31460830078e simulator.cpp
--- a/simulator.cpp	Fri Jan 03 13:42:36 2014 +0000
+++ b/simulator.cpp	Sat Apr 26 17:21:01 2014 +0200
@@ -44,6 +44,7 @@
 #include
 #include
 #include
+#include <limits>
 
 #define DEBUG
 // #include
@@ -7073,6 +7074,16 @@
 
 // multiply
 
+template <typename T>
+static T mul(T a, T b) {
+  if (a == -1 || b == -1) {
+    if (a == std::numeric_limits<T>::min() || b == std::numeric_limits<T>::min()) {
+      return std::numeric_limits<T>::min();
+    }
+  }
+  return a * b;
+}
+
 // 32 bit multiply and add
 void AArch64Simulator::madd32()
 {
@@ -7080,7 +7091,7 @@
   // instr[14,10] = ra : may not be SP
   // instr[9,5] = rn : may not be SP
   // instr[4,0] = rd : may not be SP
-  xreg(0, NO_SP) = wreg(10, NO_SP) + wreg(5, NO_SP) * wreg(16, NO_SP);
+  xreg(0, NO_SP) = wreg(10, NO_SP) + mul(wreg(5, NO_SP), wreg(16, NO_SP));
 }
 
 // 64 bit multiply and add
@@ -7090,7 +7101,7 @@
   // instr[14,10] = ra : may not be SP
   // instr[9,5] = rn : may not be SP
   // instr[4,0] = rd : may not be SP
-  xreg(0, NO_SP) = xreg(10, NO_SP) + xreg(5, NO_SP) * xreg(16, NO_SP);
+  xreg(0, NO_SP) = xreg(10, NO_SP) + mul(xreg(5, NO_SP), xreg(16, NO_SP));
 }
 
 // 32 bit multiply and sub
@@ -7100,7 +7111,7 @@
   // instr[14,10] = ra : may not be SP
   // instr[9,5] = rn : may not be SP
   // instr[4,0] = rd : may not be SP
-  xreg(0, NO_SP) = wreg(10, NO_SP) - wreg(5, NO_SP) * wreg(16, NO_SP);
+  xreg(0, NO_SP) = wreg(10, NO_SP) - mul(wreg(5, NO_SP), wreg(16, NO_SP));
 }
 
 // 64 bit multiply and sub
@@ -7110,7 +7121,7 @@
   // instr[14,10] = ra : may not be SP
   // instr[9,5] = rn : may not be SP
   // instr[4,0] = rd : may not be SP
-  xreg(0, NO_SP) = xreg(10, NO_SP) - xreg(5, NO_SP) * xreg(16, NO_SP);
+  xreg(0, NO_SP) = xreg(10, NO_SP) - mul(xreg(5, NO_SP), xreg(16, NO_SP));
 }
 
 // signed multiply add long -- source, source2 : 32 bit, source3 : 64 bit
@@ -7122,7 +7133,7 @@
   // instr[4,0] = rd : may not be SP
   // n.b. we need to multiply the signed 32 bit values in rn, rm to
   // obtain a 64 bit product
-  xregs(0, NO_SP) = xregs(10, NO_SP) + ((int64_t)wregs(5, NO_SP)) * ((int64_t)wregs(16, NO_SP));
+  xregs(0, NO_SP) = xregs(10, NO_SP) + mul((int64_t)wregs(5, NO_SP), (int64_t)wregs(16, NO_SP));
 }
 
 // signed multiply sub long -- source, source2 : 32 bit, source3 : 64 bit
@@ -7134,7 +7145,7 @@
   // instr[4,0] = rd : may not be SP
   // n.b. we need to multiply the signed 32 bit values in rn, rm to
   // obtain a 64 bit product
-  xregs(0, NO_SP) = xregs(10, NO_SP) - ((int64_t)wregs(5, NO_SP)) * ((int64_t)wregs(16, NO_SP));
+  xregs(0, NO_SP) = xregs(10, NO_SP) - mul((int64_t)wregs(5, NO_SP), (int64_t)wregs(16, NO_SP));
 }
 
 // signed multiply high, source, source2 : 64 bit, dest <-- high 64-bit of result
@@ -7225,8 +7236,13 @@
   // instr[9,5] = rn : may not be SP
   // instr[4,0] = rd : may not be SP
   // TODO : check that this rounds towards zero as required
+  int64_t dividend = xregs(5, NO_SP);
   int64_t divisor = xregs(16, NO_SP);
-  xregs(0, NO_SP) = (divisor ? (xregs(5, NO_SP) / divisor) : 0);
+  if (divisor == -1 && dividend == std::numeric_limits<int64_t>::min()) {
+    xregs(0, NO_SP) = std::numeric_limits<int64_t>::min();
+  } else {
+    xregs(0, NO_SP) = (divisor ? (dividend / divisor) : 0);
+  }
 }
 
 // 32 bit unsigned divide

From aph at redhat.com Sun Apr 27 08:51:49 2014
From: aph at redhat.com (Andrew Haley)
Date: Sun, 27 Apr 2014 09:51:49 +0100
Subject: [aarch64-port-dev ] simulator: multiplication/divsion for LONG/INT_MIN and -1
In-Reply-To:
References:
Message-ID: <535CC525.6070106@redhat.com>

On 04/26/2014 07:03 PM, D.Sturm wrote:
> The simulator does not correctly handle multiplications/divisions in the
> case where we have the smallest negative value and -1. This is undefined in
> C/C++ anyhow, but works correctly for me in the case of multiplication
> (INT_MIN * -1 == INT_MIN), and is actually guaranteed to work for the int
> division due to the way the code is written but causes an arithmetic
> exception for LONG_MIN / -1 under Linux Mint 16 gcc 4.8.1.

I'm a bit mystified by the problem with multiplication. Can you
tell me what error you got?

> That exception is caught by the signal handler in the JVM and then
> correctly handled there in most situations - I ran into some throwing
> assertion when it was simulating graal code but haven't looked further into
> it.
>
> The following patch should take care of this. The smulh instruction uses
> inline assembly but I think x86 does the right thing(TM) there.

Have you got a http://sourceforge.net account? Then you can check stuff
in.

Andrew.

From D.Sturm42 at gmail.com Sun Apr 27 23:07:05 2014
From: D.Sturm42 at gmail.com (D.Sturm)
Date: Mon, 28 Apr 2014 01:07:05 +0200
Subject: [aarch64-port-dev ] simulator: multiplication/divsion for LONG/INT_MIN and -1
In-Reply-To: <535CC525.6070106@redhat.com>
References: <535CC525.6070106@redhat.com>
Message-ID:

>
> I'm a bit mystified by the problem with multiplication. Can you
> tell me what error you got?

Multiplication works fine for me, but it's undefined behavior according to
the standard, so gcc could introduce some subtle bugs - I thought it better
to fix both while at it.

> > That exception is caught by the signal handler in the JVM and then
> > correctly handled there in most situations - I ran into some throwing
> > assertion when it was simulating graal code but haven't looked further into
> > it.
> >
> > The following patch should take care of this. The smulh instruction uses
> > inline assembly but I think x86 does the right thing(TM) there.
>
> Have you got a http://sourceforge.net account? Then you can check stuff in.

User dansturm, but I doubt I'll have to supply any more patches to the
simulator, seems pretty solid apart from this small thing :-)

--Daniel

From edward.nevill at linaro.org Tue Apr 29 09:16:44 2014
From: edward.nevill at linaro.org (Edward Nevill)
Date: Tue, 29 Apr 2014 10:16:44 +0100
Subject: [aarch64-port-dev ] RFR: Minor optimisation for divide by 2
Message-ID: <1398763004.20174.7.camel@localhost.localdomain>

Hi,

C2 currently generates

  mov rdst, rsrc, asr #31
  mov rdst, rdst, lsr #31
  add rdst, rsrc, rdst
  mov rdst, rdst, asr #1

for divide by 2. The following patch reduces this to

  add rdst, rsrc, rsrc, lsr #31
  mov rdst, rdst, asr #1

I know this is very minor, but it offends me:-)

OK?
Ed.

--- CUT HERE ---
# HG changeset patch
# User Edward Nevill edward.nevill at linaro.org
# Date 1398762402 -3600
#      Tue Apr 29 10:06:42 2014 +0100
# Node ID 7f9ab7b86d7a690e04ffc6331c2b9519aae2a565
# Parent  d9468835bc5160b7fac6709b0afbc751b2159fbb
Minor optimisation for divide by 2

diff -r d9468835bc51 -r 7f9ab7b86d7a src/cpu/aarch64/vm/aarch64.ad
--- a/src/cpu/aarch64/vm/aarch64.ad	Thu Apr 10 06:50:43 2014 -0400
+++ b/src/cpu/aarch64/vm/aarch64.ad	Tue Apr 29 10:06:42 2014 +0100
@@ -3356,6 +3356,16 @@
   interface(CONST_INTER);
 %}
 
+operand immI_31()
+%{
+  predicate(n->get_int() == 31);
+  match(ConI);
+
+  op_cost(0);
+  format %{ %}
+  interface(CONST_INTER);
+%}
+
 operand immI_8()
 %{
   predicate(n->get_int() == 8);
@@ -7274,6 +7284,30 @@
   ins_pipe(pipe_class_default);
 %}
 
+instruct signExtract(iRegINoSp dst, iRegI src, immI_31 div1, immI_31 div2) %{
+  match(Set dst (URShiftI (RShiftI src div1) div2));
+  ins_cost(INSN_COST);
+  format %{ "lsrw $dst, $src, $div1" %}
+  ins_encode %{
+    __ lsrw(as_Register($dst$$reg), as_Register($src$$reg), 31);
+  %}
+  ins_pipe(pipe_class_default);
+%}
+
+instruct div2Round(iRegINoSp dst, iRegI src, immI_31 div1, immI_31 div2) %{
+  match(Set dst (AddI src (URShiftI (RShiftI src div1) div2)));
+  ins_cost(INSN_COST);
+  format %{ "addw $dst, $src, $div1" %}
+
+  ins_encode %{
+    __ addw(as_Register($dst$$reg),
+            as_Register($src$$reg),
+            as_Register($src$$reg),
+            Assembler::LSR, 31);
+  %}
+  ins_pipe(pipe_class_default);
+%}
+
 // Long Divide
 
 instruct divL(iRegLNoSp dst, iRegL src1, iRegL src2) %{
--- CUT HERE ---

From aph at redhat.com Tue Apr 29 09:37:06 2014
From: aph at redhat.com (Andrew Haley)
Date: Tue, 29 Apr 2014 10:37:06 +0100
Subject: [aarch64-port-dev ] RFR: Minor optimisation for divide by 2
In-Reply-To: <1398763004.20174.7.camel@localhost.localdomain>
References: <1398763004.20174.7.camel@localhost.localdomain>
Message-ID: <535F72C2.8000308@redhat.com>

On 04/29/2014 10:16 AM, Edward Nevill wrote:
> C2 currently generates
>
> mov rdst, rsrc, asr #31
> mov rdst, rdst, lsr #31
> add rdst, rsrc, rdst
> mov rdst, rdst, asr #1
>
> for divide by 2. The following patch reduces this to
>
> add rdst, rsrc, rsrc, lsr #31
> mov rdst, rdst, asr #1
>
> I know this is very minor, but it offends me:-)
>
> OK?

How strange. That must be a generic C2 bug, but OK.

I'm a bit mystified why the add and shift operation isn't being
combined. It should match AddI_reg_URShift_reg.

Andrew.

From aph at redhat.com Tue Apr 29 13:21:24 2014
From: aph at redhat.com (Andrew Haley)
Date: Tue, 29 Apr 2014 14:21:24 +0100
Subject: [aarch64-port-dev ] JDK9
Message-ID: <535FA754.1090007@redhat.com>

I'm building a JDK 9 tree as the first stage of upstreaming. I'll write
a JEP proposal for this project next.

Andrew.

From ed at camswl.com Tue Apr 29 13:32:18 2014
From: ed at camswl.com (Edward Nevill)
Date: Tue, 29 Apr 2014 14:32:18 +0100
Subject: [aarch64-port-dev ] JDK9
In-Reply-To: <535FA754.1090007@redhat.com>
References: <535FA754.1090007@redhat.com>
Message-ID: <1398778338.20174.12.camel@localhost.localdomain>

Hi Andrew,

Good news. I am happy to help merging into JDK9.

How will this work. Do you

a) Populate the jdk9 tree initially with the jdk8 aarch64 port and then
pull in jdk9

or

b) populate with jdk9 and merge in the aarch64 changes.

b) might be easier, but it will lose all the aarch64 history?

Regards,
Ed.

On Tue, 2014-04-29 at 14:21 +0100, Andrew Haley wrote:
> I'm building a JDK 9 tree as the first stage of upstreaming. I'll write
> a JEP proposal for this project next.
>
> Andrew.

From ed at camswl.com Tue Apr 29 13:59:24 2014
From: ed at camswl.com (ed at camswl.com)
Date: Tue, 29 Apr 2014 13:59:24 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8/hotspot: Minor optimisation for divide by 2
Message-ID: <201404291359.s3TDxPQe024650@aojmv0008>

Changeset: 9d641fdeea4d
Author:    Edward Nevill edward.nevill at linaro.org
Date:      2014-04-29 14:58 +0100
URL:       http://hg.openjdk.java.net/aarch64-port/jdk8/hotspot/rev/9d641fdeea4d

Minor optimisation for divide by 2

! src/cpu/aarch64/vm/aarch64.ad

From aph at redhat.com Tue Apr 29 14:37:12 2014
From: aph at redhat.com (Andrew Haley)
Date: Tue, 29 Apr 2014 15:37:12 +0100
Subject: [aarch64-port-dev ] JDK9
In-Reply-To: <1398778338.20174.12.camel@localhost.localdomain>
References: <535FA754.1090007@redhat.com> <1398778338.20174.12.camel@localhost.localdomain>
Message-ID: <535FB918.4060302@redhat.com>

On 04/29/2014 02:32 PM, Edward Nevill wrote:
> Hi Andrew,
>
> Good news. I am happy to help merging into JDK9.
>
> How will this work. Do you
>
> a) Populate the jdk9 tree initially with the jdk8 aarch64 port and then
> pull in jdk9
>
> or
>
> b) populate with jdk9 and merge in the aarch64 changes.
>
>
> b) might be easier, but it will lose all the aarch64 history?

b) is what I'm doing. AIUI a) would fail jcheck.

Andrew.

From openjdk-testing at linaro.org Wed Apr 30 13:00:01 2014
From: openjdk-testing at linaro.org (OpenJDK Automated Test)
Date: Wed, 30 Apr 2014 14:00:01 +0100 (BST)
Subject: [aarch64-port-dev ] server JTREG results for OpenJDK 8 on AArch64
Message-ID: <20140430130046.93FF81F558@apm4.linaro.org>

This is a summary of the JTREG test results for OpenJDK 8 on AArch64.
The build and test results are cycled on a weekly basis.
For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/120/summary.html =============================================================================== server-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/15 pass: 432; fail: 2; error: 4 Build 1: aarch64/2014/apr/16 pass: 433; fail: 1; error: 4 Build 2: aarch64/2014/apr/17 pass: 433; fail: 1; error: 4 Build 3: aarch64/2014/apr/24 pass: 416; fail: 1; error: 21 Build 4: aarch64/2014/apr/25 pass: 425; fail: 1; error: 12 Build 5: aarch64/2014/apr/26 pass: 423; fail: 1; error: 14 Build 6: aarch64/2014/apr/30 pass: 421; fail: 2; error: 15 ------------------------------------------------------------------------------- =============================================================================== server-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/15 pass: 2,936; error: 36 Build 1: aarch64/2014/apr/16 pass: 2,934; error: 38 Build 2: aarch64/2014/apr/17 pass: 2,933; error: 39 Build 3: aarch64/2014/apr/24 pass: 2,894; error: 78 Build 4: aarch64/2014/apr/25 pass: 2,911; error: 61 Build 5: aarch64/2014/apr/26 pass: 2,912; error: 60 Build 6: aarch64/2014/apr/30 pass: 2,915; error: 57 ------------------------------------------------------------------------------- =============================================================================== server-release/jdk =============================================================================== Build 0: aarch64/2014/apr/15 pass: 5,230; fail: 149; error: 71 Build 1: aarch64/2014/apr/16 pass: 5,242; fail: 134; error: 74 Build 2: aarch64/2014/apr/17 pass: 5,230; fail: 146; error: 74 Build 3: aarch64/2014/apr/24 pass: 4,699; fail: 472; error: 279 Build 4: aarch64/2014/apr/25 pass: 4,725; fail: 472; error: 253 Build 5: aarch64/2014/apr/26 pass: 4,723; fail: 
475; error: 251 Build 6: aarch64/2014/apr/30 pass: 4,725; fail: 470; error: 254 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html From openjdk-testing at linaro.org Wed Apr 30 13:00:01 2014 From: openjdk-testing at linaro.org (OpenJDK Automated Test) Date: Wed, 30 Apr 2014 14:00:01 +0100 (BST) Subject: [aarch64-port-dev ] client JTREG results for OpenJDK 8 on AArch64 Message-ID: <20140430130046.BDC891F557@apm4.linaro.org> This is a summary of the JTREG test results for OpenJDK 8 on AArch64. The build and test results are cycled on a weekly basis. For detailed information on the test output please refer to: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/summary/2014/120/summary.html =============================================================================== client-fastdebug/hotspot =============================================================================== Build 0: aarch64/2014/apr/15 pass: 431; fail: 5; error: 2 Build 1: aarch64/2014/apr/16 pass: 429; fail: 5; error: 4 Build 2: aarch64/2014/apr/17 pass: 429; fail: 5; error: 4 Build 3: aarch64/2014/apr/24 pass: 418; fail: 5; error: 15 Build 4: aarch64/2014/apr/25 pass: 421; fail: 5; error: 12 Build 5: aarch64/2014/apr/26 pass: 421; fail: 5; error: 12 Build 6: aarch64/2014/apr/30 pass: 422; fail: 5; error: 11 ------------------------------------------------------------------------------- =============================================================================== client-fastdebug/langtools =============================================================================== Build 0: aarch64/2014/apr/15 pass: 2,939; error: 33 Build 1: aarch64/2014/apr/16 pass: 2,936; error: 36 Build 2: aarch64/2014/apr/17 pass: 2,937; error: 35 Build 3: aarch64/2014/apr/24 pass: 2,916; error: 56 Build 4: aarch64/2014/apr/25 pass: 2,917; error: 55 Build 5: aarch64/2014/apr/26 pass: 2,917; 
error: 55 Build 6: aarch64/2014/apr/30 pass: 2,917; error: 55 ------------------------------------------------------------------------------- =============================================================================== client-release/jdk =============================================================================== Build 0: aarch64/2014/apr/15 pass: 5,236; fail: 155; error: 59 Build 1: aarch64/2014/apr/16 pass: 5,231; fail: 156; error: 63 Build 2: aarch64/2014/apr/17 pass: 5,231; fail: 157; error: 62 Build 3: aarch64/2014/apr/24 pass: 4,886; fail: 474; error: 90 Build 4: aarch64/2014/apr/25 pass: 4,905; fail: 474; error: 71 Build 5: aarch64/2014/apr/26 pass: 4,905; fail: 472; error: 72 Build 6: aarch64/2014/apr/30 pass: 4,904; fail: 473; error: 72 ------------------------------------------------------------------------------- Previous results can be found here: http://openjdk.linaro.org/openjdk8-jtreg-nightly-tests/index.html