/hg/icedtea8-forest/hotspot: 35 new changesets
andrew at icedtea.classpath.org
Sun Apr 10 00:29:53 UTC 2016
changeset 5cd005a0470b in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=5cd005a0470b
author: adinn
date: Wed Aug 26 17:13:59 2015 +0100
8134322, PR2922: AArch64: Fix several errors in C2 biased locking implementation
Summary: Several errors in C2 biased locking require fixing
Reviewed-by: kvn
Contributed-by: hui.shi at linaro.org
changeset babe8ca2d61e in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=babe8ca2d61e
author: enevill
date: Tue Sep 15 12:59:51 2015 +0000
8136524, PR2922: aarch64: test/compiler/runtime/7196199/Test7196199.java fails
Summary: Fix safepoint handlers to save 128 bits on vector poll
Reviewed-by: kvn
Contributed-by: felix.yang at linaro.org
changeset 0896e50fab35 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=0896e50fab35
author: roland
date: Thu Feb 25 09:43:56 2016 -0500
8136596, PR2922: Remove aarch64: MemBarRelease when final field's allocation is NoEscape or ArgEscape
Summary: elide MemBar when AllocateNode _is_non_escaping
Reviewed-by: kvn, roland
Contributed-by: hui.shi at linaro.org
changeset b317b9da87e4 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=b317b9da87e4
author: enevill
date: Wed Sep 16 13:50:57 2015 +0000
8136615, PR2922: aarch64: elide DecodeN when followed by CmpP 0
Summary: remove DecodeN when comparing a narrow oop with 0
Reviewed-by: kvn, adinn
changeset c192885e7c16 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=c192885e7c16
author: aph
date: Mon Sep 28 16:18:15 2015 +0000
8136165, PR2922: AARCH64: Tidy up compiled native calls
Summary: Do some cleaning
Reviewed-by: roland, kvn, enevill
changeset 75ae9026eadd in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=75ae9026eadd
author: aph
date: Wed Sep 30 13:23:46 2015 +0000
8138641, PR2922: Disable C2 peephole by default for aarch64
Reviewed-by: roland
Contributed-by: felix.yang at linaro.org
changeset 953c4e38008b in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=953c4e38008b
author: aph
date: Tue Sep 29 17:01:37 2015 +0000
8138575, PR2922: Improve generated code for profile counters
Reviewed-by: kvn
changeset f987924334cd in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=f987924334cd
author: enevill
date: Thu Oct 15 15:33:54 2015 +0000
8139674, PR2922: aarch64: guarantee failure in TestOptionsWithRanges.java
Summary: Fix negative overflow in instruction field
Reviewed-by: kvn, roland, adinn, aph
changeset 6e4896ac5bbc in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=6e4896ac5bbc
author: ecaspole
date: Mon Sep 21 10:36:36 2015 -0400
8131645, PR2922: [ARM64] crash on Cavium when using G1
Summary: Add a fence when creating the CodeRootSetTable so the readers do not see invalid memory.
Reviewed-by: aph, tschatzl
changeset 6a589c3915be in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=6a589c3915be
author: adinn
date: Thu Oct 08 11:06:07 2015 -0400
PR2922: Backport optimization of volatile puts/gets and CAS to use ldar/stlr
changeset 0b5123ad9c31 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=0b5123ad9c31
author: enevill
date: Wed Oct 28 17:47:45 2015 +0000
PR2922: Fix thinko when backporting 8131645. Table ends up being allocated twice.
changeset 86b2d612adf1 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=86b2d612adf1
author: enevill
date: Wed Oct 28 17:51:10 2015 +0000
8140611, PR2922: aarch64: jtreg test jdk/tools/pack200/UnpackerMemoryTest.java SEGVs
Summary: Fix register usage on calling native synchronized methods
Reviewed-by: kvn, adinn
changeset 27acb51158b9 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=27acb51158b9
author: enevill
date: Thu Feb 25 05:44:08 2016 -0500
PR2922: Some 32 bit shifts still being anded with 0x3f instead of 0x1f.
changeset 2bbfb04230ec in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=2bbfb04230ec
author: aph
date: Tue Sep 08 14:08:58 2015 +0100
8135157, PR2922: DMB elimination in AArch64 C2 synchronization implementation
Summary: Reduce memory barrier usage in C2 fast lock and unlock.
Reviewed-by: kvn
Contributed-by: wei.tang at linaro.org, aph at redhat.com
changeset 14f41a6da05f in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=14f41a6da05f
author: aph
date: Wed Nov 04 13:38:38 2015 +0100
8138966, PR2922: Intermittent SEGV running ParallelGC
Summary: Add necessary memory fences so that the parallel threads are unable to observe partially filled block tables.
Reviewed-by: tschatzl
changeset a0284b5f2c3a in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=a0284b5f2c3a
author: enevill
date: Thu Nov 19 15:15:20 2015 +0000
8143067, PR2922: aarch64: guarantee failure in javac
Summary: Fix adrp going out of range during code relocation
Reviewed-by: aph, kvn
changeset 498c0173ac25 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=498c0173ac25
author: hshi
date: Tue Nov 24 09:02:26 2015 +0000
8143285, PR2922: aarch64: Missing load acquire when checking if ConstantPoolCacheEntry is resolved
Reviewed-by: roland, aph
changeset 285af921daec in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=285af921daec
author: enevill
date: Fri Feb 26 03:44:38 2016 -0500
PR2922: Add support for large code cache
changeset 384b670295d9 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=384b670295d9
author: enevill
date: Tue Jan 05 17:40:17 2016 +0000
PR2922: Fix client build after addition of large code cache support
changeset 6ff8db505d54 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=6ff8db505d54
author: enevill
date: Tue Dec 29 16:47:34 2015 +0000
8146286, PR2922: aarch64: guarantee failures with large code cache sizes on jtreg test java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java
Summary: patch trampoline calls with special case bl to itself which does not cause guarantee failure
Reviewed-by: aph
changeset 216100b310c3 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=216100b310c3
author: hshi
date: Thu Nov 26 15:37:04 2015 +0000
8143584, PR2922: Load constant pool tag and class status with load acquire
Reviewed-by: roland, aph
changeset b286409be4b9 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=b286409be4b9
author: aph
date: Wed Nov 25 18:13:13 2015 +0000
8144028, PR2922: Use AArch64 bit-test instructions in C2
Reviewed-by: kvn
changeset 27d7474e68ca in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=27d7474e68ca
author: fyang
date: Mon Dec 07 21:23:02 2015 +0800
8144587, PR2922: aarch64: generate vectorized MLA/MLS instructions
Summary: Add support for MLA/MLS (vector) instructions
Reviewed-by: roland
changeset 8fae3f3129fd in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=8fae3f3129fd
author: aph
date: Tue Dec 15 19:18:05 2015 +0000
8145438, PR2922: Guarantee failures since 8144028: Use AArch64 bit-test instructions in C2
Summary: Implement short and long versions of bit test instructions.
Reviewed-by: kvn
changeset a0a416432508 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=a0a416432508
author: aph
date: Wed Dec 16 11:35:59 2015 +0000
8144582, PR2922: AArch64 does not generate correct branch profile data
Reviewed-by: kvn
changeset 03c02db49d16 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=03c02db49d16
author: fyang
date: Mon Dec 07 21:14:56 2015 +0800
8144201, PR2922: aarch64: jdk/test/com/sun/net/httpserver/Test6a.java fails with --enable-unlimited-crypto
Summary: Fix typo in stub generate_cipherBlockChaining_decryptAESCrypt
Reviewed-by: roland
changeset 9b413b1b49a9 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=9b413b1b49a9
author: enevill
date: Fri Jan 08 11:39:47 2016 +0000
8146678, PR2922: aarch64: assertion failure: call instruction in an infinite loop
Summary: Remove assertion
Reviewed-by: aph
changeset 8344270ca8ca in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=8344270ca8ca
author: enevill
date: Tue Jan 12 14:55:15 2016 +0000
8146843, PR2922: aarch64: add scheduling support for FP and vector instructions
Summary: add pipeline classes for FP/vector pipeline
Reviewed-by: aph
changeset 31421ce3f8a1 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=31421ce3f8a1
author: aph
date: Tue Jan 19 17:52:52 2016 +0000
8146709, PR2922: AArch64: Incorrect use of ADRP for byte_map_base
Reviewed-by: roland
changeset 2d6aa4a52092 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=2d6aa4a52092
author: hshi
date: Wed Jan 20 04:56:51 2016 -0800
8147805, PR2922: aarch64: C1 segmentation fault due to inline Unsafe.getAndSetObject
Summary: In Aarch64 LIR_Assembler.atomic_op, keep stored data reference register in decompressed forms as it may be used later
Reviewed-by: aph
Contributed-by: hui.shi at linaro.org, felix.yang at linaro.org
changeset b0a61be7e092 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=b0a61be7e092
author: enevill
date: Tue Jan 26 14:04:01 2016 +0000
8148240, PR2922: aarch64: random infrequent null pointer exceptions in javac
Summary: Disable fp as an allocatable register
Reviewed-by: aph
changeset ecca96e2dfcf in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=ecca96e2dfcf
author: andrew
date: Tue Mar 01 02:00:13 2016 +0000
PR2922: Apply ReservedCodeCacheSize default limiting to AArch64 only.
changeset 15b7a15b9310 in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=15b7a15b9310
author: enevill
date: Thu Mar 31 08:30:30 2016 +0000
PR2922: Add missing includes to macroAssembler_aarch64.cpp
changeset 5e587a29a6aa in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=5e587a29a6aa
author: aph
date: Thu Feb 25 14:59:44 2016 +0000
8150652, PR2922: Remove unused code in AArch64 back end
Reviewed-by: kvn
changeset 49b8cecd1bbe in /hg/icedtea8-forest/hotspot
details: http://icedtea.classpath.org/hg/icedtea8-forest/hotspot?cmd=changeset;node=49b8cecd1bbe
author: andrew
date: Sun Apr 10 01:08:29 2016 +0100
Added tag icedtea-3.0.0 for changeset 5e587a29a6aa
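Note on changeset 27acb51158b9 (32 bit shifts masked with 0x3f instead of 0x1f): the sketch below is not taken from the changeset. It only illustrates, under the assumption that the fix concerns Java shift semantics, why the mask width matters. Java uses the low 5 bits of the shift count for int and the low 6 bits for long. The function names java_ishl, java_lshl and buggy_ishl are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Java `int << n`: only the low 5 bits of the count are used.
inline int32_t java_ishl(int32_t value, int32_t count) {
  return static_cast<int32_t>(static_cast<uint32_t>(value) << (count & 0x1f));
}

// Java `long << n`: the low 6 bits of the count are used.
inline int64_t java_lshl(int64_t value, int32_t count) {
  return static_cast<int64_t>(static_cast<uint64_t>(value) << (count & 0x3f));
}

// Hypothetical model of the bug: a 32-bit shift whose count is masked
// with 0x3f. The shift is done in 64 bits (to stay well defined in C++)
// and the result is truncated back to 32 bits.
inline int32_t buggy_ishl(int32_t value, int32_t count) {
  uint64_t wide = static_cast<uint32_t>(value);
  return static_cast<int32_t>(static_cast<uint32_t>(wide << (count & 0x3f)));
}
```

With a count of 33, java_ishl(1, 33) shifts by 1, whereas the 0x3f mask shifts the bit out of the 32-bit result entirely.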
diffstat:
.hgtags | 1 +
src/cpu/aarch64/vm/aarch64.ad | 3743 +++++++++-
src/cpu/aarch64/vm/assembler_aarch64.cpp | 5 -
src/cpu/aarch64/vm/assembler_aarch64.hpp | 32 +-
src/cpu/aarch64/vm/c1_CodeStubs_aarch64.cpp | 32 +-
src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp | 39 +-
src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.hpp | 8 +-
src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.cpp | 4 +-
src/cpu/aarch64/vm/c1_MacroAssembler_aarch64.hpp | 1 +
src/cpu/aarch64/vm/c1_Runtime1_aarch64.cpp | 57 +-
src/cpu/aarch64/vm/c2_globals_aarch64.hpp | 2 +-
src/cpu/aarch64/vm/compiledIC_aarch64.cpp | 6 +-
src/cpu/aarch64/vm/globalDefinitions_aarch64.hpp | 4 +
src/cpu/aarch64/vm/globals_aarch64.hpp | 12 +-
src/cpu/aarch64/vm/icBuffer_aarch64.cpp | 21 +-
src/cpu/aarch64/vm/interp_masm_aarch64.cpp | 23 +-
src/cpu/aarch64/vm/macroAssembler_aarch64.cpp | 330 +-
src/cpu/aarch64/vm/macroAssembler_aarch64.hpp | 118 +-
src/cpu/aarch64/vm/methodHandles_aarch64.cpp | 4 +-
src/cpu/aarch64/vm/nativeInst_aarch64.cpp | 141 +-
src/cpu/aarch64/vm/nativeInst_aarch64.hpp | 75 +-
src/cpu/aarch64/vm/relocInfo_aarch64.cpp | 29 +-
src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp | 541 +-
src/cpu/aarch64/vm/stubGenerator_aarch64.cpp | 6 +-
src/cpu/aarch64/vm/templateInterpreter_aarch64.cpp | 4 +-
src/cpu/aarch64/vm/templateTable_aarch64.cpp | 12 +-
src/cpu/aarch64/vm/vm_version_aarch64.cpp | 8 +
src/cpu/aarch64/vm/vtableStubs_aarch64.cpp | 2 +-
src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp | 9 +-
src/share/vm/adlc/formssel.cpp | 3 +-
src/share/vm/gc_implementation/g1/g1CodeCacheRemSet.cpp | 4 +-
src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp | 7 +-
src/share/vm/opto/callnode.hpp | 14 +
src/share/vm/opto/graphKit.cpp | 2 +-
src/share/vm/opto/macro.cpp | 8 +-
src/share/vm/opto/memnode.cpp | 4 +-
src/share/vm/runtime/arguments.cpp | 12 +-
src/share/vm/utilities/globalDefinitions.hpp | 5 +
test/compiler/codegen/8144028/BitTests.java | 164 +
39 files changed, 4620 insertions(+), 872 deletions(-)
diffs (truncated from 8663 to 500 lines):
diff -r 9a57d01ddf03 -r 49b8cecd1bbe .hgtags
--- a/.hgtags Fri Dec 18 08:55:47 2015 +0100
+++ b/.hgtags Sun Apr 10 01:08:29 2016 +0100
@@ -830,3 +830,4 @@
ddd297e340b1170d3cec011ee64e729f8b493c86 jdk8u77-b01
1b4072e4bb3ad54c4e894998486a8b33f0689160 jdk8u77-b02
e9585e814cc954c06e870f3bdf37171029da0d5e icedtea-3.0.0pre10
+5e587a29a6aac06d6b5a7ebeea99a291d82520c8 icedtea-3.0.0
diff -r 9a57d01ddf03 -r 49b8cecd1bbe src/cpu/aarch64/vm/aarch64.ad
--- a/src/cpu/aarch64/vm/aarch64.ad Fri Dec 18 08:55:47 2015 +0100
+++ b/src/cpu/aarch64/vm/aarch64.ad Sun Apr 10 01:08:29 2016 +0100
@@ -545,7 +545,7 @@
R26
/* R27, */ // heapbase
/* R28, */ // thread
- R29, // fp
+ /* R29, */ // fp
/* R30, */ // lr
/* R31 */ // sp
);
@@ -579,7 +579,7 @@
R26, R26_H,
/* R27, R27_H, */ // heapbase
/* R28, R28_H, */ // thread
- R29, R29_H, // fp
+ /* R29, R29_H, */ // fp
/* R30, R30_H, */ // lr
/* R31, R31_H */ // sp
);
@@ -952,20 +952,1864 @@
static int emit_deopt_handler(CodeBuffer& cbuf);
static uint size_exception_handler() {
- // count up to 4 movz/n/k instructions and one branch instruction
- return 5 * NativeInstruction::instruction_size;
+ return MacroAssembler::far_branch_size();
}
static uint size_deopt_handler() {
- // count one adr and one branch instruction
- return 2 * NativeInstruction::instruction_size;
+ // count one adr and one far branch instruction
+ // return 4 * NativeInstruction::instruction_size;
+ return NativeInstruction::instruction_size + MacroAssembler::far_branch_size();
}
};
+ // graph traversal helpers
+
+ MemBarNode *parent_membar(const Node *n);
+ MemBarNode *child_membar(const MemBarNode *n);
+ bool leading_membar(const MemBarNode *barrier);
+
+ bool is_card_mark_membar(const MemBarNode *barrier);
+ bool is_CAS(int opcode);
+
+ MemBarNode *leading_to_normal(MemBarNode *leading);
+ MemBarNode *normal_to_leading(const MemBarNode *barrier);
+ MemBarNode *card_mark_to_trailing(const MemBarNode *barrier);
+ MemBarNode *trailing_to_card_mark(const MemBarNode *trailing);
+ MemBarNode *trailing_to_leading(const MemBarNode *trailing);
+
+ // predicates controlling emit of ldr<x>/ldar<x> and associated dmb
+
+ bool unnecessary_acquire(const Node *barrier);
+ bool needs_acquiring_load(const Node *load);
+
+ // predicates controlling emit of str<x>/stlr<x> and associated dmbs
+
+ bool unnecessary_release(const Node *barrier);
+ bool unnecessary_volatile(const Node *barrier);
+ bool needs_releasing_store(const Node *store);
+
+ // predicate controlling translation of CompareAndSwapX
+ bool needs_acquiring_load_exclusive(const Node *load);
+
+ // predicate controlling translation of StoreCM
+ bool unnecessary_storestore(const Node *storecm);
%}
source %{
+ // Optimization of volatile gets and puts
+ // --------------------------------------
+ //
+ // AArch64 has ldar<x> and stlr<x> instructions which we can safely
+ // use to implement volatile reads and writes. For a volatile read
+ // we simply need
+ //
+ // ldar<x>
+ //
+ // and for a volatile write we need
+ //
+ // stlr<x>
+ //
+ // Alternatively, we can implement them by pairing a normal
+ // load/store with a memory barrier. For a volatile read we need
+ //
+ // ldr<x>
+ // dmb ishld
+ //
+ // for a volatile write
+ //
+ // dmb ish
+ // str<x>
+ // dmb ish
+ //
+ // We can also use ldaxr and stlxr to implement compare and swap CAS
+ // sequences. These are normally translated to an instruction
+ // sequence like the following
+ //
+ // dmb ish
+ // retry:
+ // ldxr<x> rval raddr
+ // cmp rval rold
+ // b.ne done
+ // stlxr<x> rval, rnew, rold
+ // cbnz rval retry
+ // done:
+ // cset r0, eq
+ // dmb ishld
+ //
+ // Note that the exclusive store is already using an stlxr
+ // instruction. That is required to ensure visibility to other
+ // threads of the exclusive write (assuming it succeeds) before that
+ // of any subsequent writes.
+ //
+ // The following instruction sequence is an improvement on the above
+ //
+ // retry:
+ // ldaxr<x> rval raddr
+ // cmp rval rold
+ // b.ne done
+ // stlxr<x> rval, rnew, rold
+ // cbnz rval retry
+ // done:
+ // cset r0, eq
+ //
+ // We don't need the leading dmb ish since the stlxr guarantees
+ // visibility of prior writes in the case that the swap is
+ // successful. Crucially we don't have to worry about the case where
+ // the swap is not successful since no valid program should be
+ // relying on visibility of prior changes by the attempting thread
+ // in the case where the CAS fails.
+ //
+ // Similarly, we don't need the trailing dmb ishld if we substitute
+ // an ldaxr instruction since that will provide all the guarantees we
+ // require regarding observation of changes made by other threads
+ // before any change to the CAS address observed by the load.
+ //
+ // In order to generate the desired instruction sequence we need to
+ // be able to identify specific 'signature' ideal graph node
+ // sequences which i) occur as a translation of volatile reads or
+ // writes or CAS operations and ii) do not occur through any other
+ // translation or graph transformation. We can then provide
+ // alternative adlc matching rules which translate these node
+ // sequences to the desired machine code sequences. Selection of the
+ // alternative rules can be implemented by predicates which identify
+ // the relevant node sequences.
+ //
+ // The ideal graph generator translates a volatile read to the node
+ // sequence
+ //
+ // LoadX[mo_acquire]
+ // MemBarAcquire
+ //
+ // As a special case when using the compressed oops optimization we
+ // may also see this variant
+ //
+ // LoadN[mo_acquire]
+ // DecodeN
+ // MemBarAcquire
+ //
+ // A volatile write is translated to the node sequence
+ //
+ // MemBarRelease
+ // StoreX[mo_release] {CardMark}-optional
+ // MemBarVolatile
+ //
+ // n.b. the above node patterns are generated with a strict
+ // 'signature' configuration of input and output dependencies (see
+ // the predicates below for exact details). The card mark may be as
+ // simple as a few extra nodes or, in a few GC configurations, may
+ // include more complex control flow between the leading and
+ // trailing memory barriers. However, whatever the card mark
+ // configuration these signatures are unique to translated volatile
+ // reads/stores -- they will not appear as a result of any other
+ // bytecode translation or inlining nor as a consequence of
+ // optimizing transforms.
+ //
+ // We also want to catch inlined unsafe volatile gets and puts and
+ // be able to implement them using either ldar<x>/stlr<x> or some
+ // combination of ldr<x>/str<x> and dmb instructions.
+ //
+ // Inlined unsafe volatile puts manifest as a minor variant of the
+ // normal volatile put node sequence containing an extra cpuorder
+ // membar
+ //
+ // MemBarRelease
+ // MemBarCPUOrder
+ // StoreX[mo_release] {CardMark}-optional
+ // MemBarVolatile
+ //
+ // n.b. as an aside, the cpuorder membar is not itself subject to
+ // matching and translation by adlc rules. However, the rule
+ // predicates need to detect its presence in order to correctly
+ // select the desired adlc rules.
+ //
+ // Inlined unsafe volatile gets manifest as a somewhat different
+ // node sequence to a normal volatile get
+ //
+ // MemBarCPUOrder
+ // || \\
+ // MemBarAcquire LoadX[mo_acquire]
+ // ||
+ // MemBarCPUOrder
+ //
+ // In this case the acquire membar does not directly depend on the
+ // load. However, we can be sure that the load is generated from an
+ // inlined unsafe volatile get if we see it dependent on this unique
+ // sequence of membar nodes. Similarly, given an acquire membar we
+ // can know that it was added because of an inlined unsafe volatile
+ // get if it is fed and feeds a cpuorder membar and if its feed
+ // membar also feeds an acquiring load.
+ //
+ // Finally an inlined (Unsafe) CAS operation is translated to the
+ // following ideal graph
+ //
+ // MemBarRelease
+ // MemBarCPUOrder
+ // CompareAndSwapX {CardMark}-optional
+ // MemBarCPUOrder
+ // MemBarAcquire
+ //
+ // So, where we can identify these volatile read and write
+ // signatures we can choose to plant either of the above two code
+ // sequences. For a volatile read we can simply plant a normal
+ // ldr<x> and translate the MemBarAcquire to a dmb. However, we can
+ // also choose to inhibit translation of the MemBarAcquire and
+ // inhibit planting of the ldr<x>, instead planting an ldar<x>.
+ //
+ // When we recognise a volatile store signature we can choose to
+ // plant a dmb ish as a translation for the MemBarRelease, a
+ // normal str<x> and then a dmb ish for the MemBarVolatile.
+ // Alternatively, we can inhibit translation of the MemBarRelease
+ // and MemBarVolatile and instead plant a simple stlr<x>
+ // instruction.
+ //
+ // When we recognise a CAS signature we can choose to plant a dmb
+ // ish as a translation for the MemBarRelease, the conventional
+ // macro-instruction sequence for the CompareAndSwap node (which
+ // uses ldxr<x>) and then a dmb ishld for the MemBarAcquire.
+ // Alternatively, we can elide generation of the dmb instructions
+ // and plant the alternative CompareAndSwap macro-instruction
+ // sequence (which uses ldaxr<x>).
+ //
+ // Of course, the above only applies when we see these signature
+ // configurations. We still want to plant dmb instructions in any
+ // other cases where we may see a MemBarAcquire, MemBarRelease or
+ // MemBarVolatile. For example, at the end of a constructor which
+ // writes final/volatile fields we will see a MemBarRelease
+ // instruction and this needs a 'dmb ish' lest we risk the
+ // constructed object being visible without making the
+ // final/volatile field writes visible.
+ //
+ // n.b. the translation rules below which rely on detection of the
+ // volatile signatures and insert ldar<x> or stlr<x> are failsafe.
+ // If we see anything other than the signature configurations we
+ // always just translate the loads and stores to ldr<x> and str<x>
+ // and translate acquire, release and volatile membars to the
+ // relevant dmb instructions.
+ //
+
+ // graph traversal helpers used for volatile put/get and CAS
+ // optimization
+
+ // 1) general purpose helpers
+
+ // if node n is linked to a parent MemBarNode by an intervening
+ // Control and Memory ProjNode return the MemBarNode otherwise return
+ // NULL.
+ //
+ // n may only be a Load or a MemBar.
+
+ MemBarNode *parent_membar(const Node *n)
+ {
+ Node *ctl = NULL;
+ Node *mem = NULL;
+ Node *membar = NULL;
+
+ if (n->is_Load()) {
+ ctl = n->lookup(LoadNode::Control);
+ mem = n->lookup(LoadNode::Memory);
+ } else if (n->is_MemBar()) {
+ ctl = n->lookup(TypeFunc::Control);
+ mem = n->lookup(TypeFunc::Memory);
+ } else {
+ return NULL;
+ }
+
+ if (!ctl || !mem || !ctl->is_Proj() || !mem->is_Proj()) {
+ return NULL;
+ }
+
+ membar = ctl->lookup(0);
+
+ if (!membar || !membar->is_MemBar()) {
+ return NULL;
+ }
+
+ if (mem->lookup(0) != membar) {
+ return NULL;
+ }
+
+ return membar->as_MemBar();
+ }
+
+ // if n is linked to a child MemBarNode by intervening Control and
+ // Memory ProjNodes return the MemBarNode otherwise return NULL.
+
+ MemBarNode *child_membar(const MemBarNode *n)
+ {
+ ProjNode *ctl = n->proj_out(TypeFunc::Control);
+ ProjNode *mem = n->proj_out(TypeFunc::Memory);
+
+ // MemBar needs to have both a Ctl and Mem projection
+ if (! ctl || ! mem)
+ return NULL;
+
+ MemBarNode *child = NULL;
+ Node *x;
+
+ for (DUIterator_Fast imax, i = ctl->fast_outs(imax); i < imax; i++) {
+ x = ctl->fast_out(i);
+ // if we see a membar we keep hold of it. we may also see a new
+ // arena copy of the original but it will appear later
+ if (x->is_MemBar()) {
+ child = x->as_MemBar();
+ break;
+ }
+ }
+
+ if (child == NULL) {
+ return NULL;
+ }
+
+ for (DUIterator_Fast imax, i = mem->fast_outs(imax); i < imax; i++) {
+ x = mem->fast_out(i);
+ // the Mem projection must also feed the child membar found via
+ // the Ctl projection above
+ if (x == child) {
+ return child;
+ }
+ }
+ return NULL;
+ }
+
+ // helper predicate used to filter candidates for a leading memory
+ // barrier
+ //
+ // returns true if barrier is a MemBarRelease or a MemBarCPUOrder
+ // whose Ctl and Mem feeds come from a MemBarRelease otherwise false
+
+ bool leading_membar(const MemBarNode *barrier)
+ {
+ int opcode = barrier->Opcode();
+ // if this is a release membar we are ok
+ if (opcode == Op_MemBarRelease) {
+ return true;
+ }
+ // if it's a cpuorder membar . . .
+ if (opcode != Op_MemBarCPUOrder) {
+ return false;
+ }
+ // then the parent has to be a release membar
+ MemBarNode *parent = parent_membar(barrier);
+ if (!parent) {
+ return false;
+ }
+ opcode = parent->Opcode();
+ return opcode == Op_MemBarRelease;
+ }
+
+ // 2) card mark detection helper
+
+ // helper predicate which can be used to detect a volatile membar
+ // introduced as part of a conditional card mark sequence either by
+ // G1 or by CMS when UseCondCardMark is true.
+ //
+ // membar can be definitively determined to be part of a card mark
+ // sequence if and only if all the following hold
+ //
+ // i) it is a MemBarVolatile
+ //
+ // ii) either UseG1GC or (UseConcMarkSweepGC && UseCondCardMark) is
+ // true
+ //
+ // iii) the node's Mem projection feeds a StoreCM node.
+
+ bool is_card_mark_membar(const MemBarNode *barrier)
+ {
+ if (!UseG1GC && !(UseConcMarkSweepGC && UseCondCardMark)) {
+ return false;
+ }
+
+ if (barrier->Opcode() != Op_MemBarVolatile) {
+ return false;
+ }
+
+ ProjNode *mem = barrier->proj_out(TypeFunc::Memory);
+
+ for (DUIterator_Fast imax, i = mem->fast_outs(imax); i < imax ; i++) {
+ Node *y = mem->fast_out(i);
+ if (y->Opcode() == Op_StoreCM) {
+ return true;
+ }
+ }
+
+ return false;
+ }
+
+
+ // 3) helper predicates to traverse volatile put or CAS graphs which
+ // may contain GC barrier subgraphs
+
+ // Preamble
+ // --------
+ //
+ // for volatile writes we can omit generating barriers and employ a
+ // releasing store when we see a node sequence with a
+ // leading MemBarRelease and a trailing MemBarVolatile as follows
+ //
+ // MemBarRelease
+ // { || } -- optional
+ // {MemBarCPUOrder}
+ // || \\
+ // || StoreX[mo_release]
+ // | \ /
+ // | MergeMem
+ // | /
+ // MemBarVolatile
+ //
+ // where
+ // || and \\ represent Ctl and Mem feeds via Proj nodes
+ // | \ and / indicate further routing of the Ctl and Mem feeds
+ //
+ // this is the graph we see for non-object stores. however, for a
+ // volatile Object store (StoreN/P) we may see other nodes below the
+ // leading membar because of the need for a GC pre- or post-write
+ // barrier.
+ //
+ // with most GC configurations we will see this simple variant which
+ // includes a post-write barrier card mark.
+ //
+ // MemBarRelease______________________________
+ // || \\ Ctl \ \\
+ // || StoreN/P[mo_release] CastP2X StoreB/CM
+ // | \ / . . . /
+ // | MergeMem
+ // | /
+ // || /
+ // MemBarVolatile
+ //
+ // i.e. the leading membar feeds Ctl to a CastP2X (which converts
+ // the object address to an int used to compute the card offset) and
+ // Ctl+Mem to a StoreB node (which does the actual card mark).
+ //
+ // n.b. a StoreCM node will only appear in this configuration when
+ // using CMS. StoreCM differs from a normal card mark write (StoreB)
+ // because it implies a requirement to order visibility of the card
+ // mark (StoreCM) relative to the object put (StoreP/N) using a
+ // StoreStore memory barrier (arguably this ought to be represented
+ // explicitly in the ideal graph but that is not how it works). This
+ // ordering is required for both non-volatile and volatile
+ // puts. Normally that means we need to translate a StoreCM using
+ // the sequence
+ //
+ // dmb ishst
+ // strb
+ //
+ // However, in the case of a volatile put if we can recognise this
+ // configuration and plant an stlr for the object write then we can
+ // omit the dmb and just plant an strb since visibility of the stlr
+ // is ordered before visibility of subsequent stores. StoreCM nodes
+ // also arise when using G1 or using CMS with conditional card
+ // marking. In these cases (as we shall see) we don't need to insert
+ // the dmb when translating StoreCM because there is already an
+ // intervening StoreLoad barrier between it and the StoreP/N.
+ //
+ // It is also possible to perform the card mark conditionally on it
+ // currently being unmarked in which case the volatile put graph
+ // will look slightly different
+ //
+ // MemBarRelease____________________________________________
+ // || \\ Ctl \ Ctl \ \\ Mem \
+ // || StoreN/P[mo_release] CastP2X If LoadB |
+ // | \ / \ |
+ // | MergeMem . . . StoreB
+ // | / /
+ // || /
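The diff is truncated at this point. As a source-level illustration of the ordering discussed in the aarch64.ad comments above, here is a minimal C++ sketch. It is not part of the changeset and uses none of the HotSpot helpers. It assumes that sequentially consistent std::atomic loads and stores are a reasonable stand-in for Java volatile accesses. On AArch64, compilers commonly lower these to ldar and stlr, and a CAS loop to an ldaxr/stlxr pair, but that depends on the compiler and target flags. The names ready, payload, publish, consume and fetch_increment are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for a Java volatile field and the plain data it guards.
static std::atomic<int> ready{0};
static int payload = 0;

// "Volatile write": the plain store to payload must become visible
// before the store to ready (release semantics).
void publish(int value) {
  payload = value;
  ready.store(1, std::memory_order_seq_cst);
}

// "Volatile read": once ready is observed as non-zero, the earlier
// store to payload is guaranteed to be visible (acquire semantics).
int consume() {
  while (ready.load(std::memory_order_seq_cst) == 0) {
    // spin until the writer has published
  }
  return payload;
}

// CAS retry loop: on failure `old` is refreshed with the current value
// and the exchange is attempted again. Returns the value seen before
// the increment.
int fetch_increment(std::atomic<int>& counter) {
  int old = counter.load(std::memory_order_relaxed);
  while (!counter.compare_exchange_weak(old, old + 1,
                                        std::memory_order_seq_cst,
                                        std::memory_order_relaxed)) {
    // retry with the refreshed value in `old`
  }
  return old;
}
```

In a real program publish would run on one thread and consume on another. The failure ordering of the CAS is relaxed here, in line with the comment's point that no valid program should rely on ordering when the CAS fails.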