RFR: JDK-8213473: Prevent transformation of LoadB->AndI->CmpI pattern to facilitate testb instruction matching

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Nov 8 18:43:06 UTC 2018


Hmm. May be we should undo JDK-8204479 changes for testUB_mem_imm?

JDK-8204479 issue was with signed compare cmpb. Compare with 0 should not be affected. Right?

Vladimir

On 11/8/18 7:30 AM, Roman Kennke wrote:
> That would be:
> 
> http://cr.openjdk.java.net/~rkennke/JDK-8213473/webrev.01/
> 
> Thoughts?
> 
> Roman
> 
>> Also, I wonder if it's worth to check for strictly CmpI with zero only,
>> because this is what the testb matcher is matching on. ?
>>
>> Roman
>>
>>> Hi Vladimir,
>>>
>>> I haven't (yet) come up with an actual micro-benchmark, and haven't yet
>>> run specjvm or such, but I've written a program that triggers the
>>> matching and shows the assembly.
>>>
>>> This:
>>> https://paste.fedoraproject.org/paste/n7vad~kq2-qB65x6PdCuGw
>>>
>>> Executed like this (usual jmh):
>>> java -jar target/benchmarks.jar -prof perfasm
>>>
>>> Yields the following assembly (without patch):
>>>             0x00007f5ff0587edc: movzbl 0xc(%rsi),%r10d
>>> // unrelated
>>>    0,06%    0x00007f5ff0587ee1: mov    $0x2,%r8d
>>> // unrelated
>>>             0x00007f5ff0587ee7: mov    $0x1,%eax
>>>             0x00007f5ff0587eec: test   $0x2a,%r10d
>>>
>>> and with patch:
>>>
>>>    0,08%     0x00007f27605888e7: testb  $0x2a,0xc(%rsi)
>>>
>>> I don't know if there's any chance that this performs worse? Maybe the
>>> two instructions schedule better because CPU can do other work while
>>> waiting for memory? It surely saves a register, which might come useful
>>> under register pressure. And it compiles to more compact code.
>>>
>>> Do you want me to take it further and come up with an actual working
>>> microbenchmark? Or do you have benchmarks that you want to run?
>>>
>>> Thanks,
>>> Roman
>>>
>>>> Roman, your change seems fine.
>>>>
>>>> I thought for SPARC it may regress but the only complex rule with
>>>> AndI+LoadUB I found is for ConvI2L:
>>>>
>>>> match(Set dst (ConvI2L (AndI (LoadUB mem) mask)));
>>>>
>>>> I did not find anything on other platforms.
>>>>
>>>> Anyway it would be nice to run some performance testing.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 11/7/18 3:46 AM, Roman Kennke wrote:
>>>>> In JDK-8203628, Aleksey added matchers to generate testb instruction (on
>>>>> x86) for LoadB->AndI->CmpI sequence. However, in AndINode::Ideal() we
>>>>> transform any LoadB->AndI to LoadUB->AndI which fails to match this
>>>>> pattern. Which means we never actually match for testb. The original
>>>>> matchers for LoadUB->AndI->CmpI needed to be removed because of
>>>>> JDK-8204479.
>>>>>
>>>>> The proposed enhancement would prevent the transformation if we have a
>>>>> straight LoadB->AndI->CmpI sequence. While the matcher is x86-only, I
>>>>> don't think it actually hurts to prevent this particular sequence. It
>>>>> should generate better code (e.g. single testb instructions) for things
>>>>> like:
>>>>>
>>>>> byte field;
>>>>> ..
>>>>>
>>>>> if ((field & 3) == 0) ..
>>>>>
>>>>> or even:
>>>>>
>>>>> if (field == x) .. because we always mask byte loads
>>>>>
>>>>> or even (not sure) same for boolean fields.
>>>>>
>>>>> Bug:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8213473
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8213473/webrev.00/
>>>>>
>>>>> Testing: passes hotspot/jtreg:tier1 here
>>>>>
>>>>> Thoughts? Reviews?
>>>>>
>>>>> Roman
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
> 


More information about the hotspot-compiler-dev mailing list