RFR 8214239 (S): Missing x86_64.ad patterns for clearing and setting long vector bits

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Tue Nov 12 09:00:21 UTC 2019


Hi Bernard,

Nice improvement!

> http://cr.openjdk.java.net/~bsrbnd/jdk8214239/webrev.00/

I don't see cases for non-constant masks John suggested covered. Have 
you tried to implement them? Any problems encountered or did you just 
leave them for future improvement?

====================================================================

I'd like to see asserts in new MacroAssembler methods that imm8 fits 
into a byte, like:

+void Assembler::btsq(Address dst, int imm8) {
+  assert(isByte(imm8), "not a byte");
+  InstructionMark im(this);


====================================================================

src/hotspot/cpu/x86/x86_64.ad:

+operand immPow2L()
+%{
+  // n should be a pure 64-bit power of 2 immediate.
+  predicate(is_power_of_2_long(n->get_long()) && 
log2_long(n->get_long()) > 31);

+operand immPow2NotL()
+%{
+  // n should be a pure 64-bit immediate given that not(n) is a power of 2.

Why do you limit the optimization to bits in upper half? Is it because 
ordinary andq/orq instructions work well for the rest? If that's the 
case, it deserves a comment.

(immPow2NotL is a bit misleading: I read it as "power of 2, but not a 
long". What do you think about immL_NegPow2/immL_Pow2? Not sure how to 
encode that it's > 2^32, but I would just skip it for now.)

====================================================================

+instruct btrL_mem_imm(memory dst, immPow2NotL src, rFlagsReg cr) %{
+  match(Set dst (StoreL dst (AndL (LoadL dst) src)));

+instruct btsL_mem_imm(memory dst, immPow2L    src, rFlagsReg cr) %{
+  match(Set dst (StoreL dst (OrL (LoadL dst) src)));

Does it make sense to cover 32-/16-bit cases the same way? (Relates to 
the earlier question on bits from upper half.)

Do you leave out in-register updates because they don't give any 
benefits compared to andq/orq?

Also, please, use "con" instead of "src". It's easier to read when the 
name reflects that the value is a constant.

====================================================================

test/hotspot/jtreg/compiler/c2/TestBitSetAndReset.java:

As I understand, the only test case which exercises new code is:

   64     private static void test63() {
   65         andq &= ~MASK63;
   66         orq |= MASK63;
   67     }

Please, at least, add a case for MASK32.

   29  * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions 
-XX:+UnlockDiagnosticVMOptions
   30  *                   -XX:-TieredCompilation 
-XX:CompileThresholdScaling=0.1 -XX:-Inline
   31  * 
-XX:CompileCommand=print,compiler/c2/TestBitSetAndReset.test*
   32  * 
-XX:CompileCommand=compileonly,compiler/c2/TestBitSetAndReset.test*
   33  *                   compiler.c2.TestBitSetAndReset

Since you explicitly disable tiered, you can just directly set the 
threshold instead (-XX:CompileThreshold=1000).

-XX:-Inline is redundant and you can replace compileonly directive with 
dontinline to speed up the test.

Best regards,
Vladimir Ivanov
> 
> It includes:
> * Vladimir's requested changes about instruction encoding and testing
> along with:
> * John's suggested complementary benchmark following Sandhya's note
> regarding throughput.
> 
> Thanks (hotspot:tier1 is OK on Linux/x86_64),
> Bernard
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8214239
> [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-November/035699.html
> 


More information about the hotspot-compiler-dev mailing list