RFR: JDK-8213473: Prevent transformation of LoadB->AndI->CmpI pattern to facilitate testb instruction matching
Roman Kennke
rkennke at redhat.com
Thu Nov 8 14:58:09 UTC 2018
Also, I wonder if it's worth to check for strictly CmpI with zero only,
because this is what the testb matcher is matching on. ?
Roman
> Hi Vladimir,
>
> I haven't (yet) come up with an actual micro-benchmark, and haven't yet
> run specjvm or such, but I've written a program that triggers the
> matching and shows the assembly.
>
> This:
> https://paste.fedoraproject.org/paste/n7vad~kq2-qB65x6PdCuGw
>
> Executed like this (usual jmh):
> java -jar target/benchmarks.jar -prof perfasm
>
> Yields the following assembly (without patch):
> 0x00007f5ff0587edc: movzbl 0xc(%rsi),%r10d
> // unrelated
> 0,06% 0x00007f5ff0587ee1: mov $0x2,%r8d
> // unrelated
> 0x00007f5ff0587ee7: mov $0x1,%eax
> 0x00007f5ff0587eec: test $0x2a,%r10d
>
> and with patch:
>
> 0,08% 0x00007f27605888e7: testb $0x2a,0xc(%rsi)
>
> I don't know if there's any chance that this performs worse? Maybe the
> two instructions schedule better because CPU can do other work while
> waiting for memory? It surely saves a register, which might come useful
> under register pressure. And it compiles to more compact code.
>
> Do you want me to take it further and come up with an actual working
> microbenchmark? Or do you have benchmarks that you want to run?
>
> Thanks,
> Roman
>
>> Roman, your change seems fine.
>>
>> I thought for SPARC it may regress but the only complex rule with
>> AndI+LoadUB I found is for ConvI2L:
>>
>> match(Set dst (ConvI2L (AndI (LoadUB mem) mask)));
>>
>> I did not find anything on other platforms.
>>
>> Anyway it would be nice to run some performance testing.
>>
>> Thanks,
>> Vladimir
>>
>> On 11/7/18 3:46 AM, Roman Kennke wrote:
>>> In JDK-8203628, Aleksey added matchers to generate testb instruction (on
>>> x86) for LoadB->AndI->CmpI sequence. However, in AndINode::Ideal() we
>>> transform any LoadB->AndI to LoadUB->AndI which fails to match this
>>> pattern. Which means we never actually match for testb. The original
>>> matchers for LoadUB->AndI->CmpI needed to be removed because of
>>> JDK-8204479.
>>>
>>> The proposed enhancement would prevent the transformation if we have a
>>> straight LoadB->AndI->CmpI sequence. While the matcher is x86-only, I
>>> don't think it actually hurts to prevent this particular sequence. It
>>> should generate better code (e.g. single testb instructions) for things
>>> like:
>>>
>>> byte field;
>>> ..
>>>
>>> if ((field & 3) == 0) ..
>>>
>>> or even:
>>>
>>> if (field == x) .. because we always mask byte loads
>>>
>>> or even (not sure) same for boolean fields.
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8213473
>>> Webrev:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8213473/webrev.00/
>>>
>>> Testing: passes hotspot/jtreg:tier1 here
>>>
>>> Thoughts? Reviews?
>>>
>>> Roman
>>>
>>>
>>>
>>>
>>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20181108/107f3d99/signature.asc>
More information about the hotspot-compiler-dev
mailing list