RFR 8214239 (S): Missing x86_64.ad patterns for clearing and setting long vector bits
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Nov 12 09:00:21 UTC 2019
Hi Bernard,
Nice improvement!
> http://cr.openjdk.java.net/~bsrbnd/jdk8214239/webrev.00/
I don't see cases for non-constant masks John suggested covered. Have
you tried to implement them? Any problems encountered or did you just
leave them for future improvement?
====================================================================
I'd like to see asserts in new MacroAssembler methods that imm8 fits
into a byte, like:
+void Assembler::btsq(Address dst, int imm8) {
+ assert(isByte(imm8), "not a byte");
+ InstructionMark im(this);
====================================================================
src/hotspot/cpu/x86/x86_64.ad:
+operand immPow2L()
+%{
+ // n should be a pure 64-bit power of 2 immediate.
+ predicate(is_power_of_2_long(n->get_long()) &&
log2_long(n->get_long()) > 31);
+operand immPow2NotL()
+%{
+ // n should be a pure 64-bit immediate given that not(n) is a power of 2.
Why do you limit the optimization to bits in upper half? Is it because
ordinary andq/orq instructions work well for the rest? If that's the
case, it deserves a comment.
(immPow2NotL is a bit misleading: I read it as "power of 2, but not a
long". What do you think about immL_NegPow2/immL_Pow2? Not sure how to
encode that it's > 2^32, but I would just skip it for now.)
====================================================================
+instruct btrL_mem_imm(memory dst, immPow2NotL src, rFlagsReg cr) %{
+ match(Set dst (StoreL dst (AndL (LoadL dst) src)));
+instruct btsL_mem_imm(memory dst, immPow2L src, rFlagsReg cr) %{
+ match(Set dst (StoreL dst (OrL (LoadL dst) src)));
Does it make sense to cover 32-/16-bit cases the same way? (Relates to
the earlier question on bits from upper half.)
Do you leave out in-register updates because they don't give any
benefits compared to andq/orq?
Also, please, use "con" instead of "src". It's easier to read when the
name reflects that the value is a constant.
====================================================================
test/hotspot/jtreg/compiler/c2/TestBitSetAndReset.java:
As I understand, the only test case which exercises new code is:
64 private static void test63() {
65 andq &= ~MASK63;
66 orq |= MASK63;
67 }
Please, at least, add a case for MASK32.
29 * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions
-XX:+UnlockDiagnosticVMOptions
30 * -XX:-TieredCompilation
-XX:CompileThresholdScaling=0.1 -XX:-Inline
31 *
-XX:CompileCommand=print,compiler/c2/TestBitSetAndReset.test*
32 *
-XX:CompileCommand=compileonly,compiler/c2/TestBitSetAndReset.test*
33 * compiler.c2.TestBitSetAndReset
Since you explicitly disable tiered, you can just directly set the
threshold instead (-XX:CompileThreshold=1000).
-XX:-Inline is redundant and you can replace compileonly directive with
dontinline to speed up the test.
Best regards,
Vladimir Ivanov
>
> It includes:
> * Vladimir's requested changes about instruction encoding and testing
> along with:
> * John's suggested complementary benchmark following Sandhya's note
> regarding throughput.
>
> Thanks (hotspot:tier1 is OK on Linux/x86_64),
> Bernard
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8214239
> [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-November/035699.html
>
More information about the hotspot-compiler-dev
mailing list