RFR 8248043: Need to eliminate excessive i2l conversions

Wed Jul 1 05:15:02 UTC 2020

I think Igor said that you can't swap arguments of compare without changing condition test. For example, if it was CC_LT 
it should be CC_GT after swap.

It is not clear why you need swapping in CmpLNode::Ideal() if BoolNode::Ideal() should do it already. If it does not you 
need to investigate why.

Also your list of steps 1.-3. does not reflect changes in webrev.02b:
http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b/src/hotspot/share/opto/subnode.cpp.udiff.html

Regards,
Vladimir

On 6/30/20 9:33 PM, Boris Ulasevich wrote:
> Hi Igor,
> 
> By BoolNode I mean the canonicalization that is already in place:
> https://hg.openjdk.java.net/jdk/jdk/file/de6ad5f86276/src/hotspot/share/opto/subnode.cpp#l1391
> 
> thanks,
> Boris
> 
> On Wed, Jul 1, 2020 at 5:07 AM Igor Veresov <igor.veresov at oracle.com> wrote:
> 
>> I think you forgot to include changes to BoolNode in the webrev.
>>
>> igor
>>
>>
>>
>> On Jun 30, 2020, at 11:04 AM, Boris Ulasevich <boris.ulasevich at bell-sw.com>
>> wrote:
>>
>> Hi Claes,
>>
>>> Seems like the optimization is mostly effective, but not getting all the
>> way.
>>
>> Good point about LHS, thanks! CmpL turned to be not canonized on the
>> moment.
>> I moved the optimization to CmpLNode::Ideal and transformations now works
>> as follows:
>> 1. CmpINode::Ideal: CmpI(CmpL3)->CmpL
>> 2. BoolNode::Ideal:
>> Bool(CmpL(const,val),test)->Bool(CmpL(val,const),test_invert)
>> 3. CmpLNode::Ideal: CmpL(ConvI2L(val),ConL)->CmpI(val,ConI)
>>
>> I applied your test to the benchmark. The result is:
>> Benchmark                            Mode  Cnt   Score   Error Units
>> SkipIntToLongCast.skipCastTestLeft   avgt    5  14.288 ± 0.052 ns/op
>> SkipIntToLongCast.skipCastTestRight  avgt    5  14.338 ± 0.088 ns/op
>>
>> Updated webrev:
>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b
>>
>> thanks,
>> Boris
>>
>> On 26.06.2020 21:31, Claes Redestad wrote:
>>
>> Hi Boris,
>>
>> this looks like a nice improvement! I just have some comments about the
>> micro.
>>
>> I was curious whether the optimization works when the constant is on
>> the LHS and added a variant of the micro to try that[1]. Results are
>> interesting (Intel Xeon):
>>
>> Benchmark                            Mode  Cnt   Score   Error Units
>> SkipIntToLongCast.skipCastTest       avgt    5  30.937 ± 0.056 ns/op
>> SkipIntToLongCast.skipCastTestLeft   avgt    5  30.937 ± 0.140 ns/op
>>
>> With your patch:
>> Benchmark                            Mode  Cnt   Score   Error Units
>> SkipIntToLongCast.skipCastTest       avgt    5  14.123 ± 0.035 ns/op
>> SkipIntToLongCast.skipCastTestLeft   avgt    5  17.420 ± 0.044 ns/op
>>
>> Seems like the optimization is mostly effective, but not getting all
>> the way. I wouldn't worry about it for this RFE, but perhaps something
>> to investigate in a follow-up. Feel free to include such a variant in
>> your patch though (no attribution necessary).
>>
>> The micro also stabilizes very quickly, so you might want to provide
>> some default tuning to keep runtime in check, e.g., something like:
>>
>> @Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @Measurement(iterations = 5, time = 1000, timeUnit = TimeUnit.MILLISECONDS)
>> @Fork(3)
>>
>> Thanks!
>>
>> /Claes
>>
>> [1]
>>      @Benchmark
>>      public int skipCastTestLeft() {
>>          for (int i = 0; i < ARRAYSIZE_L; i++) {
>>              if (ARRAYSIZE_L == intValues[i]) {
>>                  return i;
>>              }
>>          }
>>          return 0;
>>      }
>>
>> On 2020-06-26 17:05, Boris Ulasevich wrote:
>>
>> Hi all,
>>
>> Please review the change to eliminate the unnecessary i2l conversion
>> for expressions like this: "if (intValue == 1L)".
>>
>> http://bugs.openjdk.java.net/browse/JDK-8248043
>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.01
>>
>> The provided benchmark shows performance boost on all platforms:
>> - Intel Xeon: 32.705 --> 14.234 ns/op
>> - arm64: 42.060 --> 25.456 ns/op
>> - arm32: 618.763 --> 314.040 ns/op
>> - ppc8:  81.218 --> 63.026 ns/op
>>
>> Testing done: jtreg, jck.
>>
>> thanks,
>> Boris
>>
>>
>>
>>