RFR: JDK-8213473: Prevent transformation of LoadB->AndI->CmpI pattern to facilitate testb instruction matching
Roman Kennke
rkennke at redhat.com
Thu Nov 8 15:30:28 UTC 2018
That would be:
http://cr.openjdk.java.net/~rkennke/JDK-8213473/webrev.01/
Thoughts?
Roman
> Also, I wonder if it's worth to check for strictly CmpI with zero only,
> because this is what the testb matcher is matching on. ?
>
> Roman
>
>> Hi Vladimir,
>>
>> I haven't (yet) come up with an actual micro-benchmark, and haven't yet
>> run specjvm or such, but I've written a program that triggers the
>> matching and shows the assembly.
>>
>> This:
>> https://paste.fedoraproject.org/paste/n7vad~kq2-qB65x6PdCuGw
>>
>> Executed like this (usual jmh):
>> java -jar target/benchmarks.jar -prof perfasm
>>
>> Yields the following assembly (without patch):
>> 0x00007f5ff0587edc: movzbl 0xc(%rsi),%r10d
>> // unrelated
>> 0,06% 0x00007f5ff0587ee1: mov $0x2,%r8d
>> // unrelated
>> 0x00007f5ff0587ee7: mov $0x1,%eax
>> 0x00007f5ff0587eec: test $0x2a,%r10d
>>
>> and with patch:
>>
>> 0,08% 0x00007f27605888e7: testb $0x2a,0xc(%rsi)
>>
>> I don't know if there's any chance that this performs worse? Maybe the
>> two instructions schedule better because CPU can do other work while
>> waiting for memory? It surely saves a register, which might come useful
>> under register pressure. And it compiles to more compact code.
>>
>> Do you want me to take it further and come up with an actual working
>> microbenchmark? Or do you have benchmarks that you want to run?
>>
>> Thanks,
>> Roman
>>
>>> Roman, your change seems fine.
>>>
>>> I thought for SPARC it may regress but the only complex rule with
>>> AndI+LoadUB I found is for ConvI2L:
>>>
>>> match(Set dst (ConvI2L (AndI (LoadUB mem) mask)));
>>>
>>> I did not find anything on other platforms.
>>>
>>> Anyway it would be nice to run some performance testing.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 11/7/18 3:46 AM, Roman Kennke wrote:
>>>> In JDK-8203628, Aleksey added matchers to generate testb instruction (on
>>>> x86) for LoadB->AndI->CmpI sequence. However, in AndINode::Ideal() we
>>>> transform any LoadB->AndI to LoadUB->AndI which fails to match this
>>>> pattern. Which means we never actually match for testb. The original
>>>> matchers for LoadUB->AndI->CmpI needed to be removed because of
>>>> JDK-8204479.
>>>>
>>>> The proposed enhancement would prevent the transformation if we have a
>>>> straight LoadB->AndI->CmpI sequence. While the matcher is x86-only, I
>>>> don't think it actually hurts to prevent this particular sequence. It
>>>> should generate better code (e.g. single testb instructions) for things
>>>> like:
>>>>
>>>> byte field;
>>>> ..
>>>>
>>>> if ((field & 3) == 0) ..
>>>>
>>>> or even:
>>>>
>>>> if (field == x) .. because we always mask byte loads
>>>>
>>>> or even (not sure) same for boolean fields.
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8213473
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8213473/webrev.00/
>>>>
>>>> Testing: passes hotspot/jtreg:tier1 here
>>>>
>>>> Thoughts? Reviews?
>>>>
>>>> Roman
>>>>
>>>>
>>>>
>>>>
>>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20181108/3bd7a44b/signature.asc>
More information about the hotspot-compiler-dev
mailing list