Bit set intrinsic
Roman Kennke
rkennke at redhat.com
Mon Nov 5 21:24:12 UTC 2018
Just want to say that I like this effort. Please go ahead and create an
issue and send it out for review.
Roman
> On Wed, 31 Oct 2018 at 15:51, B. Blaser <bsrbnd at gmail.com> wrote:
>>
>> Last but not least, I implemented the C2 part (using the 8-bit
>> AND/OR variant) to allow sharper comparisons on non-concurrent
>> execution as well:
>>
>> http://cr.openjdk.java.net/~bsrbnd/boolpack/webrev.02/
>>
>> With 10e6 iterations the lock latency seems to be more or less
>> negligible and removing it would make the intrinsic about 10% faster
>> than BitSet without synchronization.
>
> Which actually seems to be due to the following missing ANDB/ORB
> patterns in x86_64.ad:
>
> instruct andB_mem_rReg(memory dst, rRegI src, rFlagsReg cr)
> %{
> match(Set dst (StoreB dst (AndI (LoadB dst) src)));
> effect(KILL cr);
>
> ins_cost(150);
> format %{ "andb $dst, $src\t# byte" %}
> opcode(0x20);
> ins_encode(REX_breg_mem(src, dst), OpcP, reg_mem(src, dst));
> ins_pipe(ialu_mem_reg);
> %}
>
> instruct orB_mem_rReg(memory dst, rRegI src, rFlagsReg cr)
> %{
> match(Set dst (StoreB dst (OrI (LoadB dst) src)));
> effect(KILL cr);
>
> ins_cost(150);
> format %{ "orb $dst, $src\t# byte" %}
> opcode(0x08);
> ins_encode(REX_breg_mem(src, dst), OpcP, reg_mem(src, dst));
> ins_pipe(ialu_mem_reg);
> %}
>
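The match rules quoted above describe a byte load, an int OR/AND, and a byte store back to the same address. That is the same shape a Java compound assignment on a byte array element takes, since `E1 op= E2` behaves like `E1 = (T)((E1) op (E2))`. A minimal sketch to check that equivalence at the Java level follows; the class and helper names are hypothetical and are not part of the webrev.

```java
// Hypothetical illustration, not part of the webrev.
class ByteCompoundAssign {
    // Explicit form: byte load (widened to int), int OR, narrowing store.
    static byte orExplicit(byte stored, int src) {
        int widened = stored;
        return (byte) (widened | src);
    }

    // Explicit form: byte load (widened to int), int AND, narrowing store.
    static byte andExplicit(byte stored, int src) {
        int widened = stored;
        return (byte) (widened & src);
    }

    // Compares the compound-assignment form with the explicit form for
    // every byte value and every single-bit mask within a byte.
    static boolean formsAgree() {
        byte[] cell = new byte[1];
        for (int v = -128; v <= 127; v++) {
            for (int bit = 0; bit < 8; bit++) {
                int mask = 1 << bit;

                cell[0] = (byte) v;
                cell[0] |= (byte) mask;
                if (cell[0] != orExplicit((byte) v, (byte) mask)) {
                    return false;
                }

                cell[0] = (byte) v;
                cell[0] &= (byte) ~mask;
                if (cell[0] != andExplicit((byte) v, (byte) ~mask)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(formsAgree());
    }
}
```

This only checks the source-level semantics; whether C2 actually selects the new patterns still has to be confirmed from the generated assembly.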
> The next two lines:
> 1) bits[index>>>3] |= (byte)(1 << (index & 7));
> 2) bits[index>>>3] &= (byte)~(1 << (index & 7));
>
> were assembled as:
> 1)
> 024 movsbl R8, [RSI + #16 + R10] # byte
> 02a movl R11, #1 # int
> 030 sall R11, RCX
> 033 movsbl R11, R11 # i2b
> 037 orl R11, R8 # int
> 03a movb [RSI + #16 + R10], R11 # byte
> 2)
> 024 movsbl R8, [RSI + #16 + R10] # byte
> 02a movl R11, #1 # int
> 030 sall R11, RCX
> 033 not R11
> 036 movsbl R11, R11 # i2b
> 03a andl R8, R11 # int
> 03d movb [RSI + #16 + R10], R8 # byte
>
> instead of:
> 1)
> 024 movl R11, #1 # int
> 02a sall R11, RCX
> 02d movsbl R11, R11 # i2b
> 031 orb [RSI + #16 + R10], R11 # byte
> 2)
> 024 movl R11, #1 # int
> 02a sall R11, RCX
> 02d not R11
> 030 movsbl R11, R11 # i2b
> 034 andb [RSI + #16 + R10], R11 # byte
>
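For context, the two source lines quoted above can be wrapped in a small byte-packed bit set so they are easy to exercise in isolation. This is only a sketch: the class name `PackedBits`, its constructor and the `get` method are assumptions for illustration, and it has none of the synchronization or intrinsic handling discussed in this thread.

```java
// Hypothetical illustration, not the code from the webrev.
final class PackedBits {
    private final byte[] bits;

    PackedBits(int nbits) {
        // One byte holds 8 bits; round up to a whole number of bytes.
        bits = new byte[(nbits + 7) >>> 3];
    }

    // Line 1) from the mail: set the bit with an 8-bit OR.
    void set(int index) {
        bits[index >>> 3] |= (byte) (1 << (index & 7));
    }

    // Line 2) from the mail: clear the bit with an 8-bit AND.
    void clear(int index) {
        bits[index >>> 3] &= (byte) ~(1 << (index & 7));
    }

    // Assumed helper: test a bit without modifying the array.
    boolean get(int index) {
        return (bits[index >>> 3] & (1 << (index & 7))) != 0;
    }

    public static void main(String[] args) {
        PackedBits p = new PackedBits(20);
        p.set(7);
        p.set(8);
        p.clear(7);
        System.out.println(p.get(7) + " " + p.get(8));
    }
}
```

Running `set` and `clear` in a hot loop and inspecting the generated code is one way to check which instruction sequence gets emitted for the byte update.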
> So, as a first step, I would probably create a JBS issue and send out
> an RFR on hotspot-dev for this simple enhancement, if there are no
> objections.
>
> Any opinion is welcome.
>
> Thanks,
> Bernard
>