RFR: 8338967: Improve performance for MemorySegment::fill [v10]
Maurizio Cimadamore
mcimadamore at openjdk.org
Mon Sep 2 09:39:20 UTC 2024
On Fri, 30 Aug 2024 22:04:39 GMT, Francesco Nigro <duke at openjdk.org> wrote:
> All of these strategies are better than what we have now, probably because the existing intrinsics still make some poor decisions, but I haven't yet dug into the perfasm output to see what they do wrong; maybe it is something which could be fixed in the intrinsic itself?
I'm no intrinsics expert, but if I had to guess I'd say that the intrinsics we have do not specialize for small sizes. Also, the use of vector instructions typically comes with additional alignment constraints - meaning that we need a pre-loop (and sometimes a post-loop). This logic, while faster for bigger sizes, has some drawbacks for smaller sizes.
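To make the trade-off concrete, here is a rough sketch of the shape I have in mind. It is not the code in this PR and not what the intrinsic does: it works on a plain byte[] rather than a MemorySegment, the class name FillSketch and the SMALL_THRESHOLD cutoff are made up for illustration, and the array index is only a stand-in for real address alignment. Small sizes take a plain byte loop; larger sizes pay for a pre-loop, a wide main loop, and a post-loop.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Objects;

final class FillSketch {
    // Hypothetical cutoff; a real break-even point would have to be measured.
    static final int SMALL_THRESHOLD = 32;

    // Views a byte[] as longs so the main loop can write 8 bytes at a time.
    static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    static void fill(byte[] dst, int offset, int length, byte value) {
        Objects.checkFromIndexSize(offset, length, dst.length);
        int end = offset + length;

        // Small sizes: a simple loop, with no alignment bookkeeping at all.
        if (length < SMALL_THRESHOLD) {
            for (int i = offset; i < end; i++) {
                dst[i] = value;
            }
            return;
        }

        int i = offset;
        // Pre-loop: advance byte by byte to the next multiple of 8
        // (index alignment here, standing in for address alignment).
        while (i < end && (i & 7) != 0) {
            dst[i++] = value;
        }
        // Main loop: replicate the byte into all 8 lanes of a long.
        long pattern = (value & 0xFFL) * 0x0101010101010101L;
        for (; i + Long.BYTES <= end; i += Long.BYTES) {
            LONG_VIEW.set(dst, i, pattern);
        }
        // Post-loop: whatever is left over after the last full long.
        for (; i < end; i++) {
            dst[i] = value;
        }
    }
}
```

For a fill of a handful of bytes, the pre-loop and post-loop checks in the second path are pure overhead, which is roughly why a dedicated small-size path can win.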
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20712#issuecomment-2324276883