RFR: 8338967: Improve performance for MemorySegment::fill [v4]
Per Minborg
pminborg at openjdk.org
Tue Aug 27 09:50:03 UTC 2024
On Mon, 26 Aug 2024 21:38:37 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
>> Here is a benchmark that fills segments of various random sizes:
>>
>>
>>
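>> import org.openjdk.jmh.annotations.Benchmark;
>> import org.openjdk.jmh.annotations.BenchmarkMode;
>> import org.openjdk.jmh.annotations.Fork;
>> import org.openjdk.jmh.annotations.Measurement;
>> import org.openjdk.jmh.annotations.Mode;
>> import org.openjdk.jmh.annotations.OutputTimeUnit;
>> import org.openjdk.jmh.annotations.Scope;
>> import org.openjdk.jmh.annotations.Setup;
>> import org.openjdk.jmh.annotations.State;
>> import org.openjdk.jmh.annotations.Warmup;
>>
>> import java.lang.foreign.MemorySegment;
>> import java.util.Random;
>> import java.util.concurrent.TimeUnit;
>> import java.util.stream.IntStream;
>>
>> @BenchmarkMode(Mode.AverageTime)
>> @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @State(Scope.Thread)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> @Fork(value = 3)
>> public class TestFill {
>>
>>     // Number of segments filled per benchmark invocation
>>     private static final int SIZE = 16;
>>     // Pseudo-random segment sizes in [0, 8), fixed seed for reproducibility
>>     private static final int[] INDICES = new Random(42).ints(0, 8)
>>             .limit(SIZE)
>>             .toArray();
>>
>>     private MemorySegment[] segments;
>>
>>     @Setup
>>     public void setup() {
>>         // One heap segment per entry, backed by a byte array of that size
>>         segments = IntStream.of(INDICES)
>>                 .mapToObj(i -> MemorySegment.ofArray(new byte[i]))
>>                 .toArray(MemorySegment[]::new);
>>     }
>>
>>     @Benchmark
>>     public void heap_segment_fill() {
>>         for (int i = 0; i < SIZE; i++) {
>>             segments[i].fill((byte) 0);
>>         }
>>     }
>>
>> }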
>>
>>
>> This produces the following on my Mac M1:
>>
>>
>> Benchmark                   Mode  Cnt   Score   Error  Units
>> TestFill.heap_segment_fill  avgt   30  59.054 ± 3.723  ns/op
>>
>>
>> On average, a single fill takes 59/16 ≈ 3.7 ns (including loop overhead).
>>
>> A test with the same size for every benchmark looks like this on my machine:
>>
>>
>> Benchmark                   (ELEM_SIZE)  Mode  Cnt  Score   Error  Units
>> TestFill.heap_segment_fill            0  avgt   30  1.112 ± 0.027  ns/op
>> TestFill.heap_segment_fill            1  avgt   30  1.602 ± 0.060  ns/op
>> TestFill.heap_segment_fill            2  avgt   30  1.583 ± 0.004  ns/op
>> TestFill.heap_segment_fill            3  avgt   30  1.909 ± 0.055  ns/op
>> TestFill.heap_segment_fill            4  avgt   30  1.605 ± 0.059  ns/op
>> TestFill.heap_segment_fill            5  avgt   30  1.900 ± 0.064  ns/op
>> TestFill.heap_segment_fill            6  avgt   30  1.891 ± 0.038  ns/op
>> TestFill.heap_segment_fill            7  avgt   30  2.237 ± 0.091  ns/op
>
> As discussed offline, can't we use a stable array of functions or something like that which can be populated lazily? That way you can access the function you want in a single array access, and we could put all these helper methods somewhere else.
Unfortunately, a stable array of functions/MethodHandles didn't work from a performance perspective.
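
For readers following along, the kind of lazily populated function table suggested in the quoted comment could look roughly like the sketch below. This is an illustration only and not the code from the PR: `FillDispatchSketch`, `Filler`, `SMALL_LIMIT` and `fillerFor` are made-up names, the limit of 8 is simply taken from the 0..7 sizes benchmarked above, and a plain array stands in for a `@Stable` one because that annotation is JDK-internal. It assumes JDK 22 or later, where `java.lang.foreign` is final.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.Arrays;

public class FillDispatchSketch {

    /** Hypothetical per-size fill routine; not a JDK type. */
    @FunctionalInterface
    interface Filler {
        void fill(MemorySegment segment, byte value);
    }

    // Assumption: sizes below this limit go through the table (0..7, as in the benchmark).
    private static final int SMALL_LIMIT = 8;

    // Lazily populated table. Unlike a JDK-internal @Stable array, a plain
    // array gives the JIT no guarantee that its elements are constant.
    private static final Filler[] FILLERS = new Filler[SMALL_LIMIT];

    private static Filler fillerFor(int size) {
        Filler f = FILLERS[size];
        if (f == null) {
            // Benign race: concurrent threads would install equivalent fillers.
            f = (segment, value) -> {
                for (int i = 0; i < size; i++) {
                    segment.set(ValueLayout.JAVA_BYTE, i, value);
                }
            };
            FILLERS[size] = f;
        }
        return f;
    }

    /** Small segments dispatch through the table; larger ones use MemorySegment::fill. */
    static void fill(MemorySegment segment, byte value) {
        long size = segment.byteSize();
        if (size < SMALL_LIMIT) {
            fillerFor((int) size).fill(segment, value);
        } else {
            segment.fill(value);
        }
    }

    public static void main(String[] args) {
        byte[] backing = new byte[5];
        fill(MemorySegment.ofArray(backing), (byte) 7);
        System.out.println(Arrays.toString(backing)); // prints [7, 7, 7, 7, 7]
    }
}
```

The single array access selects the routine for a given size, which is what makes the idea attractive on paper; the sketch says nothing about how such a table performs once compiled.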
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20712#discussion_r1732503991