RFR: 8338967: Improve performance for MemorySegment::fill [v5]
Francesco Nigro
duke at openjdk.org
Wed Aug 28 15:35:19 UTC 2024
On Wed, 28 Aug 2024 09:06:48 GMT, Per Minborg <pminborg at openjdk.org> wrote:
>> How fast do we need to be here given we are measuring in a few nanoseconds per operation?
>>
>> What if the goal is not to regress from say explicitly filling in a small sized segment or a comparable array (e.g., < 8 bytes) then maybe a loop suffices and the code is simple?
>
> Fair question. I have another version (called "patch bits" below) that is based on bit logic (first doing int ops, then short and lastly byte, similar to `ArraySupport::vectorizedMismatch`). This has slightly worse performance but is more scalable and perhaps simpler.
>
> 
@minborg Hi! I haven't checked the numbers with the benchmark I wrote at https://github.com/openjdk/jdk/pull/20712#discussion_r1732802685, which is meant to stress the branch predictor (without enough `samples` i.e. past 128K on my machine) - can you give it a shot on M1 🙏?
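
For readers following along, here is a rough sketch of what a "patch bits" style tail could look like, going by the description quoted above (int ops first, then short, then byte). It is not the code from the PR: the class `PatchBitsFill`, the method `fill`, and the use of a `ByteBuffer` instead of a `MemorySegment` are all assumptions made to keep the example self-contained.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the implementation under review.
final class PatchBitsFill {

    // Fills the whole buffer (index 0 .. capacity-1) with `value`.
    static void fill(ByteBuffer buf, byte value) {
        int len = buf.capacity();
        // Replicate the byte into all 8 lanes of a long.
        long pattern = (value & 0xFFL) * 0x0101010101010101L;

        // Bulk part: one 8-byte store per iteration.
        int offset = 0;
        for (; offset + Long.BYTES <= len; offset += Long.BYTES) {
            buf.putLong(offset, pattern);
        }

        // Tail part: 0..7 bytes remain. Each bit of `remaining` selects
        // exactly one store, so no loop is needed for the tail.
        int remaining = len - offset;
        if ((remaining & 4) != 0) {
            buf.putInt(offset, (int) pattern);
            offset += Integer.BYTES;
        }
        if ((remaining & 2) != 0) {
            buf.putShort(offset, (short) pattern);
            offset += Short.BYTES;
        }
        if ((remaining & 1) != 0) {
            buf.put(offset, value);
        }
    }
}
```

The three tail branches depend on the low bits of the length, so with sizes that vary unpredictably between calls they are the kind of branch the benchmark above is meant to stress.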
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20712#issuecomment-2315685287