RFR: 8338967: Improve performance for MemorySegment::fill [v5]

Francesco Nigro duke at openjdk.org
Fri Aug 30 15:35:20 UTC 2024


On Fri, 30 Aug 2024 15:21:52 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>  in this case, we can't optimize as well, because we have different branches which get taken or not in a less predictable fashion.

Exactly - it has been designed to show the case where the conditions materialize (because the branches are taken) and are executed in an unpredictable sequence.
I couldn't find an easier way to make that happen reliably than preventing the root method from being inlined, with the limitation that the actual call to this root method is now part of the cost, but that should be an order of magnitude less important than a pipeline nuke.
I could have had the method C2-compiled during setup with all the branches taken, and then "probably" I could have dropped the DONTINLINE on the root method - but I'm not sure.
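
To make the shape of that setup concrete, here is a rough sketch under stated assumptions. It is not the benchmark from the PR: it uses plain Java rather than JMH, a byte[] rather than a MemorySegment, and every name in it (UnpredictableFillSketch, branchyFill, shuffledSizes, run) is made up for illustration. In a real JMH benchmark the root method would typically carry @CompilerControl(CompilerControl.Mode.DONT_INLINE).

```java
import java.util.Arrays;
import java.util.Random;

public class UnpredictableFillSketch {

    // Hypothetical "branchy" fill: the path taken depends on the length.
    // In a JMH benchmark this root method would be marked DONT_INLINE so the
    // JIT cannot specialize it for a single call site (assumption, see above).
    static void branchyFill(byte[] dst, int length, byte value) {
        if (length < 8) {
            // tiny sizes: simple byte-by-byte loop
            for (int i = 0; i < length; i++) {
                dst[i] = value;
            }
        } else if (length < 64) {
            // medium sizes: manually unrolled loop plus a tail
            int i = 0;
            for (; i + 4 <= length; i += 4) {
                dst[i] = value;
                dst[i + 1] = value;
                dst[i + 2] = value;
                dst[i + 3] = value;
            }
            for (; i < length; i++) {
                dst[i] = value;
            }
        } else {
            // large sizes: bulk fill, standing in for an intrinsic
            Arrays.fill(dst, 0, length, value);
        }
    }

    // Picks sizes pseudo-randomly so that consecutive calls take different
    // branches in a sequence the branch predictor cannot easily learn.
    static int[] shuffledSizes(int count, long seed) {
        int[] choices = {3, 7, 16, 48, 64, 256};
        Random random = new Random(seed);
        int[] sizes = new int[count];
        for (int i = 0; i < count; i++) {
            sizes[i] = choices[random.nextInt(choices.length)];
        }
        return sizes;
    }

    // Stand-in for the benchmark body: one fill per precomputed size.
    // Returns the total number of bytes written.
    static long run(byte[] dst, int[] sizes, byte value) {
        long total = 0;
        for (int size : sizes) {
            branchyFill(dst, size, value);
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] dst = new byte[256];
        int[] sizes = shuffledSizes(1_000_000, 42L);
        System.out.println("bytes filled: " + run(dst, sizes, (byte) 1));
    }
}
```

The sizes are generated up front so that the random number generation is not part of the measured loop; only the call and the branches inside branchyFill are.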

> The important question is (for this PR): does the work proposed here cause a regression in the case you have in mind? E.g. is the setMemory intrinsics better than the branchy logic we have here?

Good point: relative to the baseline, no, because the new version improves performance regardless, even when it incurs a high number of branch misses.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20712#issuecomment-2321630093

