[patterns-instanceof-primitive] RFR: 8303374: Compiler Implementation for Primitive types in patterns, instanceof, and switch [v3]
Raffaello Giulietti
rgiulietti at openjdk.org
Mon Mar 13 13:35:39 UTC 2023
On Sun, 12 Mar 2023 18:07:00 GMT, Michael Hixson <duke at openjdk.org> wrote:
>> I added a benchmarks file (we may or may not keep it in the final PR) that we can discuss during the finalization of the PR on openjdk/jdk/. In any case, feel free to take a look at the test.
>>
>> I followed these [instructions](https://openjdk.org/groups/build/doc/testing.html); see them for more info.
>>
>> The following are quick measurements on an Apple M1 Max chip/macOS 13.2.1 (22D68) -- quick meaning that if we want to keep them, we should run them with 3 forks for 15wi/15i at least.
>>
>>
>> sh make/devkit/createJMHBundle.sh
>>
>> bash configure --with-boot-jdk="<.....>/prebuilt/jdk-19.jdk/Contents/Home/" --with-jmh=<.....>/jdk/make/devkit/../../build/jmh/jars/ --disable-warnings-as-errors
>>
>> make test TEST="micro:org.openjdk.bench.jdk.preview.patterns.Exactness" MICRO="OPTIONS=-p "pollute=true";VM_OPTIONS=--enable-native-access=ALL-UNNAMED;RESULTS_FORMAT=json" CONF=rel
>>
>> Exactness.test_float_int_based_on_compare avgt 5 22001.714 ± 21.124 ms/op
>> Exactness.test_float_int_based_on_filtering avgt 5 21459.505 ± 159.797 ms/op
>> Exactness.test_int_float_based_on_filtering avgt 5 1464.611 ± 5.375 ms/op
>> Exactness.test_int_float_based_on_leading_trailing avgt 5 2741.839 ± 10.300 ms/op
>> Exactness.test_long_double_based_on_filtering avgt 5 1379.058 ± 4.329 ms/op
>> Exactness.test_long_double_based_on_leading_trailing avgt 5 2439.624 ± 4.366 ms/op
>> Exactness.test_long_float_based_on_filtering avgt 5 1491.983 ± 8.019 ms/op
>> Exactness.test_long_float_based_on_leading_trailing avgt 5 2736.461 ± 9.508 ms/op
>>
>>
>> It is interesting that both float_int variants perform about the same. I also observe that the filtering (or compact) variants are faster than all the analytical (leading_trailing) ones.
>
> @biboudis It's interesting that your `long_double` results conflict with mine.
>
> My results on an i7-7700K + Windows:
>
> Benchmark Mode Cnt Score Error Units
> Exactness.test_long_double_based_on_filtering avgt 5 4579.436 ± 66.288 ms/op
> Exactness.test_long_double_based_on_leading_trailing avgt 5 3322.384 ± 110.679 ms/op
>
>
> My results on an i9-7960X + Windows:
>
> Benchmark Mode Cnt Score Error Units
> Exactness.test_long_double_based_on_filtering avgt 5 5403.940 ± 186.774 ms/op
> Exactness.test_long_double_based_on_leading_trailing avgt 5 3828.349 ± 81.673 ms/op
>
>
> I've never been able to build the JDK, so I copied your test code into a standalone project with Java 19 and JMH 1.36. So there's that, the processors, and the operating systems as possible sources of the difference.
>
> I'm not sure how to tell whose results are more "correct", if there is such a thing. I suppose that if the performance is a wash, the filtering approach is more attractive: it's easier to understand and it's stylistically similar to the rest of the exactness methods.
>
> Thanks for trying out the benchmarks anyway.
>
> ---
>
> **Edit:** I can't tell if M1 supports the `lzcnt` and `tzcnt` instructions. For me, the generated assembly of the benchmark uses those for `Long.numberOf{Leading,Trailing}Zeros`. If it's falling back to something less efficient on M1, that could explain things.
It seems that AArch64 has fast integer<->floating-point conversion instructions with semantics very similar to Java's, whereas on x86-64 the conversions are rather convoluted.
However, if the check is then followed by the cast, as in a successful `instanceof`, for example, then maybe the "filtering" variant works well even on x86-64, because the result might already be available in a register from the preceding check.
Hard to tell from the results above, though.
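
To make the two variants concrete, here is a minimal sketch for the `long` -> `double` case. It is not the code from the PR's benchmark: the class and method names are hypothetical, and the bodies are only my assumption of what a "filtering" check (round-trip and compare) and a "leading_trailing" check (count the significant bits) look like. The `main` method shows the check-then-cast shape discussed above.

```java
// Hypothetical sketch; names and bodies are assumptions, not the PR's code.
public class ExactnessSketch {

    // "Filtering" style: convert to double and back, then compare.
    // Long.MAX_VALUE must be excluded: it rounds up to 2^63 as a double,
    // and the narrowing cast back to long saturates at Long.MAX_VALUE.
    static boolean exactByFiltering(long l) {
        return l == (long) (double) l && l != Long.MAX_VALUE;
    }

    // "Leading/trailing" style: the value is exact if the bits between its
    // highest and lowest set bit fit into the 53-bit double significand.
    static boolean exactByBitCounts(long l) {
        // Math.abs(Long.MIN_VALUE) stays Long.MIN_VALUE, whose bit pattern
        // is a single set bit (2^63), so the count below still works.
        long abs = Math.abs(l);
        return Long.numberOfLeadingZeros(abs)
                + Long.numberOfTrailingZeros(abs) >= 64 - 53;
    }

    public static void main(String[] args) {
        long[] samples = {
            0L, 1L, -1L, 1L << 53, (1L << 53) + 1,
            Long.MAX_VALUE, Long.MIN_VALUE
        };
        for (long l : samples) {
            if (exactByFiltering(l)) {
                // The cast that follows a successful check; a JIT may be
                // able to reuse the conversion already done in the check.
                double d = (double) l;
                System.out.println(l + " -> " + d);
            } else {
                System.out.println(l + " is not exactly representable as a double");
            }
        }
    }
}
```

Both methods should agree on every input; whether the JIT actually reuses the converted value in the check-then-cast case would have to be confirmed from the generated assembly.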
-------------
PR: https://git.openjdk.org/amber/pull/91