RFR: JDK-8213473: Prevent transformation of LoadB->AndI->CmpI pattern to facilitate testb instruction matching
Roman Kennke
rkennke at redhat.com
Thu Nov 8 20:17:46 UTC 2018
Hi Vladimir,
> On 11/8/18 11:51 AM, Roman Kennke wrote:
>> Webrev implementing that change:
>> http://cr.openjdk.java.net/~rkennke/JDK-8213473/webrev.02/
>
> Good.
Thanks!
>> And since the existing testB_mem_imm never matches, I took the liberty
>> to replace it with testUB_mem_imm. Or do you think there's any chance
>> that the LoadB->AndI->CmpI(0) pattern slips through somehow? Or might be
>> useful otherwise?
>
> I looked again through all .ad files we have in OpenJDK and I don't see
> this patter except this one case. I think it is okay to replace it.
Very good! Thanks for checking!
I'll push it through jdk/submit while waiting for another review.
Speaking of jdk/submit, is anything up with that? I'm waiting for my
branch JDK-8213199-2 to get a result since ~24hours.
Thanks,
Roman
> Thanks,
> Vladimir
>
>>
>> Roman
>>
>>> Hmm. May be we should undo JDK-8204479 changes for testUB_mem_imm?
>>>
>>> JDK-8204479 issue was with signed compare cmpb. Compare with 0 should
>>> not be affected. Right?
>>>
>>> Vladimir
>>>
>>> On 11/8/18 7:30 AM, Roman Kennke wrote:
>>>> That would be:
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8213473/webrev.01/
>>>>
>>>> Thoughts?
>>>>
>>>> Roman
>>>>
>>>>> Also, I wonder if it's worth to check for strictly CmpI with zero
>>>>> only,
>>>>> because this is what the testb matcher is matching on. ?
>>>>>
>>>>> Roman
>>>>>
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> I haven't (yet) come up with an actual micro-benchmark, and
>>>>>> haven't yet
>>>>>> run specjvm or such, but I've written a program that triggers the
>>>>>> matching and shows the assembly.
>>>>>>
>>>>>> This:
>>>>>> https://paste.fedoraproject.org/paste/n7vad~kq2-qB65x6PdCuGw
>>>>>>
>>>>>> Executed like this (usual jmh):
>>>>>> java -jar target/benchmarks.jar -prof perfasm
>>>>>>
>>>>>> Yields the following assembly (without patch):
>>>>>> 0x00007f5ff0587edc: movzbl 0xc(%rsi),%r10d
>>>>>> // unrelated
>>>>>> 0,06% 0x00007f5ff0587ee1: mov $0x2,%r8d
>>>>>> // unrelated
>>>>>> 0x00007f5ff0587ee7: mov $0x1,%eax
>>>>>> 0x00007f5ff0587eec: test $0x2a,%r10d
>>>>>>
>>>>>> and with patch:
>>>>>>
>>>>>> 0,08% 0x00007f27605888e7: testb $0x2a,0xc(%rsi)
>>>>>>
>>>>>> I don't know if there's any chance that this performs worse? Maybe
>>>>>> the
>>>>>> two instructions schedule better because CPU can do other work while
>>>>>> waiting for memory? It surely saves a register, which might come
>>>>>> useful
>>>>>> under register pressure. And it compiles to more compact code.
>>>>>>
>>>>>> Do you want me to take it further and come up with an actual working
>>>>>> microbenchmark? Or do you have benchmarks that you want to run?
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>> Roman, your change seems fine.
>>>>>>>
>>>>>>> I thought for SPARC it may regress but the only complex rule with
>>>>>>> AndI+LoadUB I found is for ConvI2L:
>>>>>>>
>>>>>>> match(Set dst (ConvI2L (AndI (LoadUB mem) mask)));
>>>>>>>
>>>>>>> I did not find anything on other platforms.
>>>>>>>
>>>>>>> Anyway it would be nice to run some performance testing.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 11/7/18 3:46 AM, Roman Kennke wrote:
>>>>>>>> In JDK-8203628, Aleksey added matchers to generate testb
>>>>>>>> instruction (on
>>>>>>>> x86) for LoadB->AndI->CmpI sequence. However, in
>>>>>>>> AndINode::Ideal() we
>>>>>>>> transform any LoadB->AndI to LoadUB->AndI which fails to match this
>>>>>>>> pattern. Which means we never actually match for testb. The
>>>>>>>> original
>>>>>>>> matchers for LoadUB->AndI->CmpI needed to be removed because of
>>>>>>>> JDK-8204479.
>>>>>>>>
>>>>>>>> The proposed enhancement would prevent the transformation if we
>>>>>>>> have a
>>>>>>>> straight LoadB->AndI->CmpI sequence. While the matcher is
>>>>>>>> x86-only, I
>>>>>>>> don't think it actually hurts to prevent this particular
>>>>>>>> sequence. It
>>>>>>>> should generate better code (e.g. single testb instructions) for
>>>>>>>> things
>>>>>>>> like:
>>>>>>>>
>>>>>>>> byte field;
>>>>>>>> ..
>>>>>>>>
>>>>>>>> if ((field & 3) == 0) ..
>>>>>>>>
>>>>>>>> or even:
>>>>>>>>
>>>>>>>> if (field == x) .. because we always mask byte loads
>>>>>>>>
>>>>>>>> or even (not sure) same for boolean fields.
>>>>>>>>
>>>>>>>> Bug:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8213473
>>>>>>>> Webrev:
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8213473/webrev.00/
>>>>>>>>
>>>>>>>> Testing: passes hotspot/jtreg:tier1 here
>>>>>>>>
>>>>>>>> Thoughts? Reviews?
>>>>>>>>
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20181108/998a94e8/signature.asc>
More information about the hotspot-compiler-dev
mailing list