RFR 8248043: Need to eliminate excessive i2l conversions

Claes Redestad claes.redestad at oracle.com
Tue Jun 30 21:28:13 UTC 2020


+1

Maybe add tests for reversed variants to TestSkipLongToIntCast too? No
need for a new webrev if you do.

/Claes

On 2020-06-30 23:13, Vladimir Kozlov wrote:
> Good optimization. Reviewed.
> 
> Thanks,
> Vladimir
> 
> On 6/30/20 11:04 AM, Boris Ulasevich wrote:
>> Hi Claes,
>>
>>  > Seems like the optimization is mostly effective, but not getting 
>> all the way.
>>
>> Good point about LHS, thanks! CmpL turned to be not canonized on the 
>> moment.
>> I moved the optimization to CmpLNode::Ideal and transformations now 
>> works as follows:
>> 1. CmpINode::Ideal: CmpI(CmpL3)->CmpL
>> 2. BoolNode::Ideal: 
>> Bool(CmpL(const,val),test)->Bool(CmpL(val,const),test_invert)
>> 3. CmpLNode::Ideal: CmpL(ConvI2L(val),ConL)->CmpI(val,ConI)
>>
>> I applied your test to the benchmark. The result is:
>> Benchmark                            Mode  Cnt   Score   Error Units
>> SkipIntToLongCast.skipCastTestLeft   avgt    5  14.288 ± 0.052 ns/op
>> SkipIntToLongCast.skipCastTestRight  avgt    5  14.338 ± 0.088 ns/op
>>
>> Updated webrev:
>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b
>>
>> thanks,
>> Boris
>>
>> On 26.06.2020 21:31, Claes Redestad wrote:
>>> Hi Boris,
>>>
>>> this looks like a nice improvement! I just have some comments about the
>>> micro.
>>>
>>> I was curious whether the optimization works when the constant is on
>>> the LHS and added a variant of the micro to try that[1]. Results are
>>> interesting (Intel Xeon):
>>>
>>> Benchmark                            Mode  Cnt   Score   Error Units
>>> SkipIntToLongCast.skipCastTest       avgt    5  30.937 ± 0.056 ns/op
>>> SkipIntToLongCast.skipCastTestLeft   avgt    5  30.937 ± 0.140 ns/op
>>>
>>> With your patch:
>>> Benchmark                            Mode  Cnt   Score   Error Units
>>> SkipIntToLongCast.skipCastTest       avgt    5  14.123 ± 0.035 ns/op
>>> SkipIntToLongCast.skipCastTestLeft   avgt    5  17.420 ± 0.044 ns/op
>>>
>>> Seems like the optimization is mostly effective, but not getting all
>>> the way. I wouldn't worry about it for this RFE, but perhaps something
>>> to investigate in a follow-up. Feel free to include such a variant in
>>> your patch though (no attribution necessary).
>>>
>>> The micro also stabilizes very quickly, so you might want to provide
>>> some default tuning to keep runtime in check, e.g., something like:
>>>
>>> @Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>>> @Measurement(iterations = 5, time = 1000, timeUnit = 
>>> TimeUnit.MILLISECONDS)
>>> @Fork(3)
>>>
>>> Thanks!
>>>
>>> /Claes
>>>
>>> [1]
>>>     @Benchmark
>>>     public int skipCastTestLeft() {
>>>         for (int i = 0; i < ARRAYSIZE_L; i++) {
>>>             if (ARRAYSIZE_L == intValues[i]) {
>>>                 return i;
>>>             }
>>>         }
>>>         return 0;
>>>     }
>>>
>>> On 2020-06-26 17:05, Boris Ulasevich wrote:
>>>> Hi all,
>>>>
>>>> Please review the change to eliminate the unnecessary i2l conversion
>>>> for expressions like this: "if (intValue == 1L)".
>>>>
>>>> http://bugs.openjdk.java.net/browse/JDK-8248043
>>>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.01
>>>>
>>>> The provided benchmark shows performance boost on all platforms:
>>>> - Intel Xeon: 32.705 --> 14.234 ns/op
>>>> - arm64: 42.060 --> 25.456 ns/op
>>>> - arm32: 618.763 --> 314.040 ns/op
>>>> - ppc8:  81.218 --> 63.026 ns/op
>>>>
>>>> Testing done: jtreg, jck.
>>>>
>>>> thanks,
>>>> Boris
>>


More information about the hotspot-compiler-dev mailing list