RFR 8248043: Need to eliminate excessive i2l conversions
Boris Ulasevich
boris.ulasevich at bell-sw.com
Thu Jul 2 18:02:49 UTC 2020
Thank you, Igor and Vladimir.
Boris
On 01.07.2020 22:29, Igor Veresov wrote:
> That looks good.
>
> igor
>
>
>
>> On Jul 1, 2020, at 2:16 AM, Boris Ulasevich
>> <boris.ulasevich at bell-sw.com <mailto:boris.ulasevich at bell-sw.com>> wrote:
>>
>> Hi,
>>
>> It is the third attempt to send a correct link. Sorry for that ;)
>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02c
>>
>> Thanks,
>> Boris
>>
>> On Wednesday, July 1, 2020, Boris Ulasevich
>> <boris.ulasevich at bell-sw.com <mailto:boris.ulasevich at bell-sw.com>> wrote:
>>
>> Hi,
>>
>> I'm deeply sorry. Yes, webrev.02b is certainly wrong!
>> Correct link is webrev.02c:
>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02
>> <http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b/src/hotspot/share/opto/subnode.cpp.udiff.html>c
>> - this is the change I described in my mail and wanted to review.
>>
>> my apologies,
>> Boris
>>
>> On Wednesday, July 1, 2020, Igor Veresov <igor.veresov at oracle.com
>> <mailto:igor.veresov at oracle.com>> wrote:
>>
>> > On Jun 30, 2020, at 10:15 PM, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>> wrote:
>> >
>> > I think Igor said that you can't swap arguments of compare
>> without changing condition test. For example, if it was CC_LT
>> it should be CC_GT after swap.
>>
>> Yes, that’s exactly what I had in mind. Condition must be
>> inverted. Otherwise your transformation [3] is not valid for
>> anything else but equality, so that’s not going to work. May
>> be if [3] didn’t work, perhaps there is another user of the
>> CmpLNode in addition to BoolNode ?
>>
>> igor
>>
>> >
>> > It is not clear why you need swapping in CmpLNode::Ideal()
>> if BoolNode::Ideal() should do it already. If it does not you
>> need to investigate why.
>> >
>> > Also your list of steps 1.-3. does not reflect changes in
>> webrev.02b:
>> >
>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b/src/hotspot/share/opto/subnode.cpp.udiff.html
>> <http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b/src/hotspot/share/opto/subnode.cpp.udiff.html>
>> >
>> > Regards,
>> > Vladimir
>> >
>> > On 6/30/20 9:33 PM, Boris Ulasevich wrote:
>> >> Hi Igor,
>> >> By BoolNode I mean the canonicalization that is already in
>> place:
>> >>
>> https://hg.openjdk.java.net/jdk/jdk/file/de6ad5f86276/src/hotspot/share/opto/subnode.cpp#l1391
>> <https://hg.openjdk.java.net/jdk/jdk/file/de6ad5f86276/src/hotspot/share/opto/subnode.cpp#l1391>
>> >> thanks,
>> >> Boris
>> >> On Wed, Jul 1, 2020 at 5:07 AM Igor Veresov
>> <igor.veresov at oracle.com <mailto:igor.veresov at oracle.com>> wrote:
>> >>> I think you forgot to include changes to BoolNode in the
>> webrev.
>> >>>
>> >>> igor
>> >>>
>> >>>
>> >>>
>> >>> On Jun 30, 2020, at 11:04 AM, Boris Ulasevich
>> <boris.ulasevich at bell-sw.com
>> <mailto:boris.ulasevich at bell-sw.com>>
>> >>> wrote:
>> >>>
>> >>> Hi Claes,
>> >>>
>> >>>> Seems like the optimization is mostly effective, but not
>> getting all the
>> >>> way.
>> >>>
>> >>> Good point about LHS, thanks! CmpL turned to be not
>> canonized on the
>> >>> moment.
>> >>> I moved the optimization to CmpLNode::Ideal and
>> transformations now works
>> >>> as follows:
>> >>> 1. CmpINode::Ideal: CmpI(CmpL3)->CmpL
>> >>> 2. BoolNode::Ideal:
>> >>> Bool(CmpL(const,val),test)->Bool(CmpL(val,const),test_invert)
>> >>> 3. CmpLNode::Ideal: CmpL(ConvI2L(val),ConL)->CmpI(val,ConI)
>> >>>
>> >>> I applied your test to the benchmark. The result is:
>> >>> Benchmark Mode Cnt Score Error Units
>> >>> SkipIntToLongCast.skipCastTestLeft avgt 5 14.288 ±
>> 0.052 ns/op
>> >>> SkipIntToLongCast.skipCastTestRight avgt 5 14.338 ±
>> 0.088 ns/op
>> >>>
>> >>> Updated webrev:
>> >>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b
>> <http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b>
>> >>>
>> >>> thanks,
>> >>> Boris
>> >>>
>> >>> On 26.06.2020 21:31, Claes Redestad wrote:
>> >>>
>> >>> Hi Boris,
>> >>>
>> >>> this looks like a nice improvement! I just have some
>> comments about the
>> >>> micro.
>> >>>
>> >>> I was curious whether the optimization works when the
>> constant is on
>> >>> the LHS and added a variant of the micro to try that[1].
>> Results are
>> >>> interesting (Intel Xeon):
>> >>>
>> >>> Benchmark Mode Cnt Score Error Units
>> >>> SkipIntToLongCast.skipCastTest avgt 5 30.937 ±
>> 0.056 ns/op
>> >>> SkipIntToLongCast.skipCastTestLeft avgt 5 30.937 ±
>> 0.140 ns/op
>> >>>
>> >>> With your patch:
>> >>> Benchmark Mode Cnt Score Error Units
>> >>> SkipIntToLongCast.skipCastTest avgt 5 14.123 ±
>> 0.035 ns/op
>> >>> SkipIntToLongCast.skipCastTestLeft avgt 5 17.420 ±
>> 0.044 ns/op
>> >>>
>> >>> Seems like the optimization is mostly effective, but not
>> getting all
>> >>> the way. I wouldn't worry about it for this RFE, but
>> perhaps something
>> >>> to investigate in a follow-up. Feel free to include such
>> a variant in
>> >>> your patch though (no attribution necessary).
>> >>>
>> >>> The micro also stabilizes very quickly, so you might want
>> to provide
>> >>> some default tuning to keep runtime in check, e.g.,
>> something like:
>> >>>
>> >>> @Warmup(iterations = 10, time = 500, timeUnit =
>> TimeUnit.MILLISECONDS)
>> >>> @Measurement(iterations = 5, time = 1000, timeUnit =
>> TimeUnit.MILLISECONDS)
>> >>> @Fork(3)
>> >>>
>> >>> Thanks!
>> >>>
>> >>> /Claes
>> >>>
>> >>> [1]
>> >>> @Benchmark
>> >>> public int skipCastTestLeft() {
>> >>> for (int i = 0; i < ARRAYSIZE_L; i++) {
>> >>> if (ARRAYSIZE_L == intValues[i]) {
>> >>> return i;
>> >>> }
>> >>> }
>> >>> return 0;
>> >>> }
>> >>>
>> >>> On 2020-06-26 17:05, Boris Ulasevich wrote:
>> >>>
>> >>> Hi all,
>> >>>
>> >>> Please review the change to eliminate the unnecessary i2l
>> conversion
>> >>> for expressions like this: "if (intValue == 1L)".
>> >>>
>> >>> http://bugs.openjdk.java.net/browse/JDK-8248043
>> <http://bugs.openjdk.java.net/browse/JDK-8248043>
>> >>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.01
>> <http://cr.openjdk.java.net/~bulasevich/8248043/webrev.01>
>> >>>
>> >>> The provided benchmark shows performance boost on all
>> platforms:
>> >>> - Intel Xeon: 32.705 --> 14.234 ns/op
>> >>> - arm64: 42.060 --> 25.456 ns/op
>> >>> - arm32: 618.763 --> 314.040 ns/op
>> >>> - ppc8: 81.218 --> 63.026 ns/op
>> >>>
>> >>> Testing done: jtreg, jck.
>> >>>
>> >>> thanks,
>> >>> Boris
>> >>>
>> >>>
>> >>>
>> >>>
>>
>
More information about the hotspot-compiler-dev
mailing list