Slower loops for vector memory segment access (loop unrolling)

Radosław Smogura mail at smogura.eu
Fri Aug 26 20:55:12 UTC 2022


Hi all,

After detailed checking, I think these results are OK. The benchmarks use an array size of 1024, and the increase in time is caused by checking segments. The numbers look similar when bigger sizes such as 1M are used.

Sorry for the noise, and please ignore this thread.

Kind regards,
Radoslaw Smogura
From: Radosław Smogura<mailto:mail at smogura.eu>
Sent: Wednesday, August 24, 2022 10:02 PM
To: panama-dev at openjdk.java.net<mailto:panama-dev at openjdk.java.net>
Subject: Slower loops for vector memory segment access (loop unrolling)

Hi all,

I hope you have a good day!

While working on [1], I noticed that the directSegments benchmark is slower than the arrayCopy one. The expected behavior is that both benchmarks produce similar results.

I tracked it down to int vs long loops: the array copy benchmark produces the same results as directSegments when its int loop is replaced by a long counted loop.
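To make the two loop shapes concrete, here is a minimal sketch of the same copy written once with an int counter and once with a long counter. It is not the benchmark code from [1]: the class and method names (LoopShapes, copyIntLoop, copyLongLoop) are hypothetical, and plain byte arrays stand in for the vector and memory segment accesses. How the JIT unrolls each loop depends on the JVM version and flags, so this only shows the source-level difference.

```java
import java.util.Arrays;

public class LoopShapes {
    // Copy using an int loop counter, the usual shape for array code.
    public static void copyIntLoop(byte[] src, byte[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    // Same copy using a long loop counter, the shape that arises when
    // offsets are long (as memory segment offsets are).
    public static void copyLongLoop(byte[] src, byte[] dst) {
        long limit = src.length;
        for (long i = 0; i < limit; i++) {
            dst[(int) i] = src[(int) i];
        }
    }

    public static void main(String[] args) {
        byte[] src = new byte[1024]; // same size as mentioned in the thread
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        byte[] a = new byte[src.length];
        byte[] b = new byte[src.length];
        copyIntLoop(src, a);
        copyLongLoop(src, b);
        // Both loops copy the same data; only the counter type differs.
        System.out.println(Arrays.equals(a, b));
    }
}
```

Both methods do the same work, so any timing difference between them in a benchmark harness would come from how the compiler treats the counter type rather than from the copy itself.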

I gathered the generated graphs [2]: the first benchmark is unrolled 8 times, while the second one (with long loops) is unrolled only 4 times.

For now I don’t know the exact reason why there is such a difference.

[1] Split foreign vector load and store by null or not null base, openjdk/panama-foreign Pull Request #711: https://github.com/openjdk/panama-foreign/pull/711
[2] https://drive.google.com/drive/folders/1gaNz4qwc0e1un6oy7WunTHnYdBdKMIAb?usp=sharing

Kind regards,
Radoslaw Smogura


