Bit set intrinsic
B. Blaser
bsrbnd at gmail.com
Mon Nov 5 21:21:02 UTC 2018
On Wed, 31 Oct 2018 at 15:51, B. Blaser <bsrbnd at gmail.com> wrote:
>
> Last but not least, I implemented the c2 part (using the 8-bit
> AND/OR variant) to do sharper comparisons also on non-concurrent
> execution:
>
> http://cr.openjdk.java.net/~bsrbnd/boolpack/webrev.02/
>
> With 10e6 iterations the lock latency seems to be more or less
> negligible and removing it would make the intrinsic about 10% faster
> than BitSet without synchronization.
This difference actually seems to be due to the following ANDB/ORB
patterns, which are missing from x86_64.ad:
instruct andB_mem_rReg(memory dst, rRegI src, rFlagsReg cr)
%{
  match(Set dst (StoreB dst (AndI (LoadB dst) src)));
  effect(KILL cr);

  ins_cost(150);
  format %{ "andb    $dst, $src\t# byte" %}
  opcode(0x20);
  ins_encode(REX_breg_mem(src, dst), OpcP, reg_mem(src, dst));
  ins_pipe(ialu_mem_reg);
%}

instruct orB_mem_rReg(memory dst, rRegI src, rFlagsReg cr)
%{
  match(Set dst (StoreB dst (OrI (LoadB dst) src)));
  effect(KILL cr);

  ins_cost(150);
  format %{ "orb    $dst, $src\t# byte" %}
  opcode(0x08);
  ins_encode(REX_breg_mem(src, dst), OpcP, reg_mem(src, dst));
  ins_pipe(ialu_mem_reg);
%}
The next two lines:
1) bits[index>>>3] |= (byte)(1 << (index & 7));
2) bits[index>>>3] &= (byte)~(1 << (index & 7));
were assembled as:
1)
024 movsbl R8, [RSI + #16 + R10] # byte
02a movl R11, #1 # int
030 sall R11, RCX
033 movsbl R11, R11 # i2b
037 orl R11, R8 # int
03a movb [RSI + #16 + R10], R11 # byte
2)
024 movsbl R8, [RSI + #16 + R10] # byte
02a movl R11, #1 # int
030 sall R11, RCX
033 not R11
036 movsbl R11, R11 # i2b
03a andl R8, R11 # int
03d movb [RSI + #16 + R10], R8 # byte
instead of:
1)
024 movl R11, #1 # int
02a sall R11, RCX
02d movsbl R11, R11 # i2b
031 orb [RSI + #16 + R10], R11 # byte
2)
024 movl R11, #1 # int
02a sall R11, RCX
02d not R11
030 movsbl R11, R11 # i2b
034 andb [RSI + #16 + R10], R11 # byte
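
For anyone who wants to reproduce the listings, here is a minimal Java sketch built around the two statements above. The class name ByteBitSet, its constructor and the get() method are assumptions made for illustration only; they are not taken from the webrev, and the sketch does no synchronization or bounds handling beyond what the array access itself provides.

```java
// Hypothetical illustration class, not part of the webrev.
final class ByteBitSet {
    private final byte[] bits;

    ByteBitSet(int nbits) {
        // One byte holds 8 bits, so round up to a whole number of bytes.
        bits = new byte[(nbits + 7) >>> 3];
    }

    void set(int index) {
        // Statement 1) from the mail: load byte, OR with the mask, store byte.
        bits[index >>> 3] |= (byte) (1 << (index & 7));
    }

    void clear(int index) {
        // Statement 2) from the mail: load byte, AND with the inverted mask, store byte.
        bits[index >>> 3] &= (byte) ~(1 << (index & 7));
    }

    boolean get(int index) {
        return (bits[index >>> 3] & (1 << (index & 7))) != 0;
    }

    public static void main(String[] args) {
        ByteBitSet s = new ByteBitSet(64);
        s.set(10);
        s.set(15);
        s.clear(10);
        System.out.println(s.get(10) + " " + s.get(15)); // prints "false true"
    }
}
```

The set() and clear() bodies have exactly the StoreB(AndI/OrI(LoadB)) shape that the proposed patterns match, so they are the methods to look at in the C2 output.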
So, as a first step, I would create a JBS issue and send out an
RFR on hotspot-dev for this simple enhancement, if there are no
objections.
Any opinion is welcome.
Thanks,
Bernard