Slower loops for vector memory segment access (loop unrolling)
Radosław Smogura
mail at smogura.eu
Wed Aug 24 20:01:22 UTC 2022
Hi all,
I hope you are having a good day!
While working on [1], I noticed that the directSegments benchmark is slower than the arrayCopy one. The expected behavior, however, is that both benchmarks produce similar results.
I tracked it down to int vs. long loops: arrayCopy produces the same results as directSegments when its int loop is replaced by a long counted loop.
I gathered the generated graphs [2]. The first benchmark is unrolled 8 times, while the second one (with long loops) is unrolled only 4 times.
For now I don't know the exact reason why there is such a difference.
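To illustrate the two loop shapes being compared, here is a minimal sketch. It is not the benchmark code from [1]: it uses plain byte arrays instead of vectors or memory segments, and the class and method names (LoopShapes, copyIntLoop, copyLongLoop) are made up for this example. The only difference between the two methods is the type of the induction variable.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the benchmark from the pull request.
public class LoopShapes {

    // Copy driven by an int induction variable (an int counted loop).
    static void copyIntLoop(byte[] src, byte[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    // The same copy driven by a long induction variable, which is the
    // shape that code using long offsets (as memory segments do) ends up with.
    static void copyLongLoop(byte[] src, byte[] dst) {
        long length = src.length;
        for (long i = 0; i < length; i++) {
            dst[(int) i] = src[(int) i];
        }
    }

    public static void main(String[] args) {
        byte[] src = new byte[1024];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        byte[] viaInt = new byte[src.length];
        byte[] viaLong = new byte[src.length];
        copyIntLoop(src, viaInt);
        copyLongLoop(src, viaLong);
        // Both variants must copy the same data; only the loop type differs.
        System.out.println(Arrays.equals(viaInt, viaLong));
    }
}
```

Both methods are functionally identical, so any difference in throughput would come from how the JIT compiles each loop shape, for example from the unroll factor.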
[1] Split foreign vector load and store by null or not null base, Pull Request #711, openjdk/panama-foreign: https://github.com/openjdk/panama-foreign/pull/711
[2] https://drive.google.com/drive/folders/1gaNz4qwc0e1un6oy7WunTHnYdBdKMIAb?usp=sharing
Kind regards,
Radoslaw Smogura