Comments / metadata in assembly listings don't make sense for code vectorized using Vector API
Paul Sandoz
paul.sandoz at oracle.com
Mon Mar 22 16:03:23 UTC 2021
Hi Piotr,
Sorry to hear that your submission was not accepted.
However, I don’t think all is lost. Your advocacy and experimentation are valuable. Mandelbrot is a good use case for testing the masking implementation support (currently being developed).
Paul.
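
To illustrate why Mandelbrot exercises masking, here is a minimal scalar sketch in plain Java. It does not use the Vector API: a boolean array stands in for a vector mask, and each array slot stands in for a lane. The class name, the lane count of 8, the 50-iteration cap, and the escape limit of 4.0 are assumptions made for illustration and are not taken from Piotr's code.

```java
import java.util.Arrays;

// Scalar emulation of masked lanes; NOT the Vector API and not Piotr's implementation.
final class MaskedLanesSketch {
    static final int LANES = 8;      // assumed: one output byte covers 8 pixels
    static final int MAX_ITER = 50;  // assumed iteration cap

    // Returns a byte value where bit (7 - lane) is set if that lane's point stayed bounded.
    public static int computeByte(double[] cr, double ci) {
        double[] zr = new double[LANES];
        double[] zi = new double[LANES];
        boolean[] active = new boolean[LANES]; // stand-in for a vector mask
        Arrays.fill(active, true);

        for (int iter = 0; iter < MAX_ITER; iter++) {
            boolean anyActive = false;
            for (int lane = 0; lane < LANES; lane++) {
                if (!active[lane]) {
                    continue; // masked-off lane: its state is no longer updated
                }
                double nzr = zr[lane] * zr[lane] - zi[lane] * zi[lane] + cr[lane];
                double nzi = 2.0 * zr[lane] * zi[lane] + ci;
                zr[lane] = nzr;
                zi[lane] = nzi;
                if (nzr * nzr + nzi * nzi > 4.0) {
                    active[lane] = false; // point escaped: clear the mask bit
                } else {
                    anyActive = true;
                }
            }
            if (!anyActive) {
                break; // every lane escaped, so no further work is needed
            }
        }

        int bits = 0;
        for (int lane = 0; lane < LANES; lane++) {
            if (active[lane]) {
                bits |= 0x80 >> lane;
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        double[] cr = {0.0, 2.0, 0.0, 2.0, 0.0, 2.0, 0.0, 2.0};
        System.out.println(Integer.toHexString(computeByte(cr, 0.0)));
    }
}
```

In a vectorized version the inner lane loop would become a handful of vector operations, and the per-lane `active` flags would become a mask that is updated with a comparison and tested for the early exit.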
> On Mar 21, 2021, at 10:19 AM, Piotr Tarsa <piotr.tarsa at gmail.com> wrote:
>
> Hi Paul,
>
> I've submitted my "mandelbrot" implementation (i.e. the one discussed
> in this topic) to the benchmark owner, but sadly he rejected it (due
> to usage of non-standard API, i.e. the Vector API):
> https://salsa.debian.org/benchmarksgame-team/benchmarksgame/-/issues/424
>
> I've set up a public repo on GitHub instead:
> https://github.com/tarsa/benchmarksgame-java-fast
> I hope it will help someone when experimenting with either Project
> Panama or Project Valhalla.
>
> Regards,
> Piotr
>
> On Tue, Feb 2, 2021 at 18:54, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>>
>> Hi Piotr,
>>
>> Thanks for the update.
>>
>> If the 30 iterations are performed from within Java itself I suspect you will see improvements, since the same compilation would not have to be performed every time. But, I dunno if that is within the benchmark rules.
>>
>> Paul.
>>
>>> On Jan 30, 2021, at 5:14 AM, Piotr Tarsa <piotr.tarsa at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I was busy with other things, so that's why I delayed the reply.
>>>
>>> On Tue, Jan 19, 2021 at 21:17, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>>>>
>>>> Hi Piotr,
>>>>
>>>> Thanks for further sharing. I am glad you managed to make progress. I was not aware there were some benchmark rules you needed to adhere to.
>>>
>>> Rules are here:
>>> https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/mandelbrot.html#mandelbrot
>>> They require bit perfect output and also the same algorithm. Well, in
>>> the end it's a benchmark of programming languages, not of algorithms.
>>>
>>>> Re: masks, yes there is still work to do for some mask operations.
>>>
>>> OK, good to know.
>>>
>>>> Re: execution from the command line. You can run with -XX:-TieredCompilation (Remi, thanks for the correction in the prior email :-) ), and it's also possible to reduce the compilation threshold (at the expense of potentially less accurate profiling information) using, say, -XX:CompileThreshold=1000 (the default is 10000).
>>>> It’s always a bit tricky to compare a static (Rust) vs. dynamic system that needs to warm up.
>>>>
>>>> Paul.
>>>
>>> I've tested the provided options, but they don't improve performance
>>> on the real benchmark:
>>>
>>> $ time for run in {1..30}; do ~/devel/jdk-16/bin/java
>>> -XX:-TieredCompilation -XX:CompileThreshold=1000 --add-modules
>>> jdk.incubator.vector -cp
>>> target/classes/:/home/piotrek/.m2/repository/org/openjdk/jmh/jmh-core/1.27/jmh-core-1.27.jar
>>> pl.tarsa.mandelbrot_simd_1 16000 > /dev/null; done
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> ... (repeated for each run)
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> real 0m42,213s
>>> user 2m35,740s
>>> sys 0m4,094s
>>>
>>> $ time for run in {1..30}; do ~/devel/jdk-16/bin/java
>>> -XX:-TieredCompilation --add-modules jdk.incubator.vector -cp
>>> target/classes/:/home/piotrek/.m2/repository/org/openjdk/jmh/jmh-core/1.27/jmh-core-1.27.jar
>>> pl.tarsa.mandelbrot_simd_1 16000 > /dev/null; done
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> ... (repeated for each run)
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> real 0m40,038s
>>> user 2m23,227s
>>> sys 0m3,515s
>>>
>>> $ time for run in {1..30}; do ~/devel/jdk-16/bin/java --add-modules
>>> jdk.incubator.vector -cp
>>> target/classes/:/home/piotrek/.m2/repository/org/openjdk/jmh/jmh-core/1.27/jmh-core-1.27.jar
>>> pl.tarsa.mandelbrot_simd_1 16000 > /dev/null; done
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> ... (repeated for each run)
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> WARNING: Using incubator modules: jdk.incubator.vector
>>> real 0m37,743s
>>> user 2m16,316s
>>> sys 0m3,758s
>>>
>>> Looks like the default settings yield the best performance. That's a
>>> positive thing, actually.
>>>
>>> I'll probably send my version to the benchmarks game maintainer when he
>>> switches to Java 16 and then leave tuning to others.
>>>
>>> Thanks for the conversation,
>>> Piotr
>>>
>>
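
As a rough illustration of the suggestion quoted above (performing the 30 iterations from within Java rather than launching 30 JVMs), here is a minimal sketch. The class name, the method names, and the small scalar Mandelbrot-style stand-in workload are hypothetical and are not taken from Piotr's repository. Whether this setup is allowed by the benchmark rules is a separate question.

```java
// Hypothetical harness: repeats a workload inside one JVM so JIT-compiled code is reused.
final class InJvmRepeat {
    // Keeps results observable so the JIT cannot discard the workload as dead code.
    static volatile int sink;

    // Stand-in workload (an assumption): counts grid points that stay bounded for 50 steps.
    public static int workload(int size) {
        int inside = 0;
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                double cr = 2.0 * x / size - 1.5;
                double ci = 2.0 * y / size - 1.0;
                double zr = 0.0;
                double zi = 0.0;
                int i = 0;
                while (i < 50 && zr * zr + zi * zi <= 4.0) {
                    double t = zr * zr - zi * zi + cr;
                    zi = 2.0 * zr * zi + ci;
                    zr = t;
                    i++;
                }
                if (i == 50) {
                    inside++;
                }
            }
        }
        return inside;
    }

    // Runs the workload `runs` times in this JVM and returns each run's elapsed nanoseconds.
    public static long[] timeRuns(int runs, int size) {
        long[] nanos = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            sink = workload(size);
            nanos[r] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRuns(30, 400);
        for (int r = 0; r < nanos.length; r++) {
            System.out.printf("run %2d: %.3f ms%n", r + 1, nanos[r] / 1e6);
        }
    }
}
```

With a harness like this, the first few runs typically include interpretation and compilation time, while later runs reuse the already compiled code. That is the effect a loop of separate `java` launches cannot show, because every launch starts cold.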