[patterns-instanceof-primitive] RFR: 8303374: Compiler Implementation for Primitive types in patterns, instanceof, and switch [v3]

Mon Mar 13 13:35:39 UTC 2023

On Sun, 12 Mar 2023 18:07:00 GMT, Michael Hixson <duke at openjdk.org> wrote:

>> I added a benchmarks file (we may or may not keep it in the final PR) that we can discuss during the finalization of the PR on openjdk/jdk. In any case, feel free to take a look at the test.
>> 
>> I followed these [instructions](https://openjdk.org/groups/build/doc/testing.html); see them for more info.
>> 
>> The following are quick measurements on an Apple M1 Max chip/macOS 13.2.1 (22D68) -- quick meaning that if we want to keep them, we should run them with 3 forks for 15wi/15i at least.
>> 
>> 
>> sh make/devkit/createJMHBundle.sh
>> 
>> bash configure --with-boot-jdk="<.....>/prebuilt/jdk-19.jdk/Contents/Home/" --with-jmh=<.....>/jdk/make/devkit/../../build/jmh/jars/ --disable-warnings-as-errors
>> 
>> make test TEST="micro:org.openjdk.bench.jdk.preview.patterns.Exactness" MICRO="OPTIONS=-p "pollute=true";VM_OPTIONS=--enable-native-access=ALL-UNNAMED;RESULTS_FORMAT=json" CONF=rel
>> 
>> Benchmark                                             Mode  Cnt      Score     Error  Units
>> Exactness.test_float_int_based_on_compare             avgt    5  22001.714 ±  21.124  ms/op
>> Exactness.test_float_int_based_on_filtering           avgt    5  21459.505 ± 159.797  ms/op
>> Exactness.test_int_float_based_on_filtering           avgt    5   1464.611 ±   5.375  ms/op
>> Exactness.test_int_float_based_on_leading_trailing    avgt    5   2741.839 ±  10.300  ms/op
>> Exactness.test_long_double_based_on_filtering         avgt    5   1379.058 ±   4.329  ms/op
>> Exactness.test_long_double_based_on_leading_trailing  avgt    5   2439.624 ±   4.366  ms/op
>> Exactness.test_long_float_based_on_filtering          avgt    5   1491.983 ±   8.019  ms/op
>> Exactness.test_long_float_based_on_leading_trailing   avgt    5   2736.461 ±   9.508  ms/op
>> 
>> 
>> Interestingly, both float_int variants perform about the same, and the filtering (or compact) variants are faster than all the analytical (leading_trailing) ones.
>
> @biboudis It's interesting that your `long_double` results conflict with mine.
> 
> My results on an i7-7700K + Windows:
> 
> Benchmark                                             Mode  Cnt     Score     Error  Units
> Exactness.test_long_double_based_on_filtering         avgt    5  4579.436 ±  66.288  ms/op
> Exactness.test_long_double_based_on_leading_trailing  avgt    5  3322.384 ± 110.679  ms/op
> 
> 
> My results on an i9-7960X + Windows:
> 
> Benchmark                                             Mode  Cnt     Score     Error  Units
> Exactness.test_long_double_based_on_filtering         avgt    5  5403.940 ± 186.774  ms/op
> Exactness.test_long_double_based_on_leading_trailing  avgt    5  3828.349 ±  81.673  ms/op
> 
> 
> I've never been able to build the JDK, so I copied your test code into a standalone project with Java 19 and JMH 1.36.  So there's that, the processors, and the operating systems as possible sources of the difference.
> 
> I'm not sure how to tell whose results are more "correct", if there is such a thing.  I suppose that if the performance is a wash, the filtering approach is more attractive: it's easier to understand and it's stylistically similar to the rest of the exactness methods.
> 
> Thanks for trying out the benchmarks anyway.
> 
> ---
> 
> **Edit:** I can't tell if M1 supports the `lzcnt` and `tzcnt` instructions.  For me, the generated assembly of the benchmark uses those for `Long.numberOf{Leading,Trailing}Zeros`.  If it's falling back to something less efficient on M1, that could explain things.

It seems there are fast integer<->floating-point conversion instructions on AArch64 with semantics very similar to Java's, whereas on x86-64 the conversions are rather convoluted.
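
For reference, here is a minimal sketch of the Java-level semantics that any generated conversion code has to reproduce: NaN converts to zero and out-of-range values saturate. The class and method names are hypothetical and only serve the illustration; nothing here is taken from the PR.

```java
// Illustration only: Java's floating-point to integer narrowing rules.
// Class and method names are hypothetical, not part of the PR.
final class ConversionSemantics {

    // Plain cast: rounds toward zero, maps NaN to 0, saturates at the int range.
    static int toInt(float f) {
        return (int) f;
    }

    // Same rules for double -> long.
    static long toLong(double d) {
        return (long) d;
    }

    public static void main(String[] args) {
        System.out.println(toInt(Float.NaN));                // 0
        System.out.println(toInt(Float.POSITIVE_INFINITY));  // 2147483647
        System.out.println(toInt(-1e20f));                   // -2147483648
        System.out.println(toInt(3.9f));                     // 3
        System.out.println(toLong(Double.NaN));              // 0
    }
}
```

A hardware conversion instruction that already saturates and maps NaN to zero needs no fix-up code after it, which is presumably why the round trip is cheap on some targets and not on others.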

However, if the check is then followed by the cast, as in a successful `instanceof`, for example, then maybe the "filtering" variant works well even on x86-64, because the result might already be available in a register from the preceding check.
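
To make the "check, then cast" shape concrete, here is a rough sketch for `int` to `float`. The helper `isIntToFloatExact` is my own reconstruction of a round-trip ("filtering") check, not the method from the PR, and whether the JIT actually reuses the converted value is an assumption that would need to be confirmed in the generated assembly.

```java
// Sketch of a "filtering" exactness check followed by the cast it guards.
// Names are hypothetical; this is not the PR's implementation.
final class FilteringThenCast {

    // Round-trip check: convert, convert back, compare.
    static boolean isIntToFloatExact(int n) {
        float f = (float) n;
        // (int) f saturates at Integer.MAX_VALUE, so a value that rounded
        // up to 2^31 must be rejected explicitly before the comparison.
        return f != 0x1p31f && n == (int) f;
    }

    // Roughly what a successful primitive `instanceof float` would do:
    // run the check, then perform the same conversion again.
    static String describe(int n) {
        if (isIntToFloatExact(n)) {
            float f = (float) n; // same conversion as inside the check
            return n + " converts exactly to " + f;
        }
        return n + " is not exactly representable as float";
    }

    public static void main(String[] args) {
        System.out.println(describe(16_777_216)); // 2^24
        System.out.println(describe(16_777_217)); // 2^24 + 1
    }
}
```

If the check is inlined, the `(float) n` in `describe` repeats a conversion the check has just performed, so the compiler has a chance to keep that value in a register instead of converting twice.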

Hard to tell from the results above, though.
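
Independent of the timings, the two `long_double` variants can at least be checked for agreement before comparing their speed. The sketch below is my own reconstruction of both ideas: `byFiltering` does a round trip through `double`, and `byLeadingTrailing` counts significant bits with `Long.numberOfLeadingZeros` and `Long.numberOfTrailingZeros`. Both method bodies are assumptions and may differ from the benchmark's code.

```java
// Two hypothetical ways to test whether a long converts exactly to double.
// Neither body is copied from the PR; both are illustrative reconstructions.
final class LongDoubleExactness {

    // "Filtering": convert, convert back, compare.
    static boolean byFiltering(long n) {
        double d = (double) n;
        // (long) d saturates at Long.MAX_VALUE, so a value that rounded
        // up to 2^63 must be rejected explicitly.
        return d != 0x1p63 && n == (long) d;
    }

    // "Leading/trailing": the value is exact if its significant bits
    // fit into the 53-bit significand of a double.
    static boolean byLeadingTrailing(long n) {
        if (n == Long.MIN_VALUE) {
            return true; // -2^63 is a power of two, hence exact
        }
        long abs = Math.abs(n);
        int significantBits = 64
                - (Long.numberOfLeadingZeros(abs) + Long.numberOfTrailingZeros(abs));
        return significantBits <= 53;
    }

    public static void main(String[] args) {
        long[] samples = {
            0L, 1L, -1L,
            1L << 53, (1L << 53) + 1, -((1L << 53) + 1),
            Long.MAX_VALUE, Long.MAX_VALUE - 1023,
            Long.MIN_VALUE, Long.MIN_VALUE + 1
        };
        for (long n : samples) {
            if (byFiltering(n) != byLeadingTrailing(n)) {
                throw new AssertionError("variants disagree at " + n);
            }
        }
        System.out.println("variants agree on all samples");
    }
}
```

The samples concentrate on the boundaries where the two approaches are most likely to diverge: around 2^53, where `long` values stop being representable one by one, and at both ends of the `long` range, where the cast back saturates.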

-------------

PR: https://git.openjdk.org/amber/pull/91