Comments / metadata in assembly listings don't make sense for code vectorized using Vector API

Paul Sandoz paul.sandoz at oracle.com
Mon Mar 22 16:03:23 UTC 2021


Hi Piotr,

Sorry to hear that your submission was not accepted.

However, I don’t think all is lost. Your advocacy and experimentation are valuable. Mandelbrot is a good use case for testing the masking implementation support (currently being developed).

Paul.

> On Mar 21, 2021, at 10:19 AM, Piotr Tarsa <piotr.tarsa at gmail.com> wrote:
> 
> Hi Paul,
> 
> I've submitted my "mandelbrot" implementation (i.e. the one discussed
> in this topic) to the benchmark owner, but sadly he rejected it (due
> to the use of a non-standard API, i.e. the Vector API):
> https://salsa.debian.org/benchmarksgame-team/benchmarksgame/-/issues/424
> 
> I've set up a public repo on GitHub instead:
> https://github.com/tarsa/benchmarksgame-java-fast
> I hope it will help someone when experimenting with either Project
> Panama or Project Valhalla.
> 
> Regards,
> Piotr
> 
> On Tue, Feb 2, 2021 at 18:54, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>> 
>> Hi Piotr,
>> 
>> Thanks for the update.
>> 
>> If the 30 iterations are performed from within Java itself, I suspect you will see improvements, since the same compilation would not have to be performed every time. But I dunno if that is within the benchmark rules.
>> 
>> Paul.
>> 
>>> On Jan 30, 2021, at 5:14 AM, Piotr Tarsa <piotr.tarsa at gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> I was busy with other things, which is why my reply was delayed.
>>> 
>>> On Tue, Jan 19, 2021 at 21:17, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>>>> 
>>>> Hi Piotr,
>>>> 
>>>> Thanks for further sharing. I am glad you managed to make progress. I was not aware there were some benchmark rules you needed to adhere to.
>>> 
>>> Rules are here:
>>> https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/mandelbrot.html#mandelbrot
>>> They require bit-perfect output and also the same algorithm. Well, in
>>> the end it's a benchmark of programming languages, not of algorithms.
>>> 
>>>> Re: masks, yes there is still work to do for some mask operations.
>>> 
>>> OK, good to know.
>>> 
>>>> Re: execution from the command line. You can run with -XX:-TieredCompilation (Remi, thanks for the correction in the prior email :-) ), and it's also possible to reduce the compilation threshold (at the expense of potentially less accurate profiling information) using, say, -XX:CompileThreshold=1000 (the default is 10000).
>>>> It’s always a bit tricky to compare a static system (Rust) with a dynamic one that needs to warm up.
>>>> 
>>>> Paul.
>>> 
>>> I've tested the provided options, but they don't improve performance
>>> on the real benchmark:
>>> 
>>> $ time for run in {1..30}; do ~/devel/jdk-16/bin/java
>>> -XX:-TieredCompilation -XX:CompileThreshold=1000 --add-modules
>>> jdk.incubator.vector -cp
>>> target/classes/:/home/piotrek/.m2/repository/org/openjdk/jmh/jmh-core/1.27/jmh-core-1.27.jar
>>> pl.tarsa.mandelbrot_simd_1 16000 > /dev/null; done
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> ... (repeated for each run)
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> real 0m42,213s
>>> user 2m35,740s
>>> sys 0m4,094s
>>> 
>>> $ time for run in {1..30}; do ~/devel/jdk-16/bin/java
>>> -XX:-TieredCompilation --add-modules jdk.incubator.vector -cp
>>> target/classes/:/home/piotrek/.m2/repository/org/openjdk/jmh/jmh-core/1.27/jmh-core-1.27.jar
>>> pl.tarsa.mandelbrot_simd_1 16000 > /dev/null; done
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> ... (repeated for each run)
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> real 0m40,038s
>>> user 2m23,227s
>>> sys 0m3,515s
>>> 
>>> $ time for run in {1..30}; do ~/devel/jdk-16/bin/java --add-modules
>>> jdk.incubator.vector -cp
>>> target/classes/:/home/piotrek/.m2/repository/org/openjdk/jmh/jmh-core/1.27/jmh-core-1.27.jar
>>> pl.tarsa.mandelbrot_simd_1 16000 > /dev/null; done
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> ... (repeated for each run)
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> real 0m37,743s
>>> user 2m16,316s
>>> sys 0m3,758s
>>> 
>>> Looks like the default settings yield the best performance. That's a
>>> positive thing, actually.
>>> 
>>> I'll probably send my version to the benchmarks game maintainer when he
>>> switches to Java 16 and then leave tuning to others.
>>> 
>>> Thanks for the conversation,
>>> Piotr
>>> 
>> 
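
The suggestion quoted above, running the 30 iterations inside a single JVM so that JIT-compiled code is reused across runs, could look roughly like the sketch below. It is illustrative only and not code from this thread: the class name InProcessRepeats, the scalar countInside kernel (a simple stand-in for the real Vector API implementation), and the grid size are all assumptions. Whether such a layout is allowed under the benchmark rules is a separate question.

```java
// Illustrative sketch only; names and sizes are hypothetical, not from the thread.
class InProcessRepeats {

    // Hypothetical stand-in kernel: counts points of an n x n grid over
    // [-1.5, 0.5) x [-1, 1) that stay bounded (|z|^2 <= 4) for 50 iterations.
    static int countInside(int n) {
        int inside = 0;
        for (int y = 0; y < n; y++) {
            double ci = 2.0 * y / n - 1.0;
            for (int x = 0; x < n; x++) {
                double cr = 2.0 * x / n - 1.5;
                double zr = 0.0, zi = 0.0;
                int i = 0;
                while (i < 50 && zr * zr + zi * zi <= 4.0) {
                    double t = zr * zr - zi * zi + cr;
                    zi = 2.0 * zr * zi + ci;
                    zr = t;
                    i++;
                }
                if (i == 50) {
                    inside++;
                }
            }
        }
        return inside;
    }

    // Runs the kernel `runs` times in this JVM and returns each run's
    // wall-clock time in nanoseconds.
    static long[] timeRuns(int runs, int n) {
        long[] nanos = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            int result = countInside(n);
            nanos[r] = System.nanoTime() - start;
            if (result < 0) {
                // Never true; uses the result so the call cannot be optimized away.
                throw new AssertionError("unexpected negative count");
            }
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRuns(30, 1000);
        for (int r = 0; r < nanos.length; r++) {
            System.out.printf("run %2d: %d us%n", r + 1, nanos[r] / 1_000);
        }
    }
}
```

With a layout like this, the first few runs typically include interpretation and compilation time, while later runs mostly execute already compiled code, which is the effect that launching a fresh JVM 30 times cannot show.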


