[lworld] RFR: Prototype inline cursors for List
Sergey Kuksenko
sergey.kuksenko at oracle.com
Thu Mar 26 20:37:20 UTC 2020
It won't work for ArrayList.
I told about that on Valhalla meeting (half year ago) and showed it's in
my presentation about Valhalla performance
(http://cr.openjdk.java.net/~skuksenko/valhalla/talk/vp.pdf, slides 68-75).
There is no difference cost between memory access to stack and heap. You
won't get performance benefits on any range checks. Iterator/Cursor
memory allocation itself is very cheap.
The only place where it may have significant effect (and it has) is GC
write barriers. Simple not-tuned Cursor for HashMap give +40% boost. If
we want to see performance difference for Cursor we need an Iterator
where Iterator's state contains references and these references should
be updated on each "next()" invocation. ArrayList needs only update int
field - no GC barriers.
My raw experiments were published in last August
http://cr.openjdk.java.net/~skuksenko/valhalla/playground/.
It's still not enough to observe performance improvement on
microbenchmarks. HotSpot is really good at
Iterators-inside-microbenchmarks scalarization. Iterator should be
forced DO NOT to scalarize the iterator. The simplest way to do it:
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
private static Iterator<Integer> hide(Iterator<Integer> it) {
return it;
}
@Benchmark
public int sumIteratorHidden() {
int s = 0;
for (Iterator<Integer> iterator = hide(map.keyIterator()); iterator.hasNext(); ) {
s += iterator.next();
}
return s;
}
Minor note: Don't run it with ZGC. ZGC has no write barriers. Default G1
is the best option due to highest cost of write barriers. ParallelGC
write barriers has much less cost than G1 and Cursor still shows
performance boost on ParallelGC, but much less than on G1.
On 3/26/20 10:45 AM, Brian Goetz wrote:
> These don't look so good! I suspect that the Iterator is performing
> so well because it's getting good EA, but not sure why cursors aren't
> doing better? Guess we'll have to dig into the generated code....
>
> On 3/26/2020 1:14 PM, Roger Riggs wrote:
>> Implementation of Cursors and jmh tests comparing
>> typical List traversal via direct index, iterator,
>> inline cursor, and an iterator implemented on top of cursor.
>>
>> Sample results:
>>
>> (size) Mode Cnt Score
>> Error Units
>> XArrayListCursorTest.getViaArray 100000 avgt 5
>> 507793.484 7086.038 ns/op
>> XArrayListCursorTest.getViaCursorForLoop 100000 avgt 5
>> 656461.958 52488.547 ns/op
>> XArrayListCursorTest.getViaCursorWhileLoop 100000 avgt 5
>> 641963.323 32219.409 ns/op
>> XArrayListCursorTest.getViaIterator 100000 avgt 5
>> 558863.817 23539.256 ns/op
>> XArrayListCursorTest.getViaIteratorCurs 100000 avgt 5
>> 733161.466 33721.881 ns/op
>>
>> -------------
>>
>> Commit messages:
>> - Prototype inline cursors for List
>>
>> Changes: https://git.openjdk.java.net/valhalla/pull/5/files
>> Webrev: https://webrevs.openjdk.java.net/valhalla/5/webrev.00
>> Stats: 2139 lines in 3 files changed: 2139 ins; 0 del; 0 mod
>> Patch: https://git.openjdk.java.net/valhalla/pull/5.diff
>> Fetch: git fetch https://git.openjdk.java.net/valhalla
>> pull/5/head:pull/5
>>
>> PR: https://git.openjdk.java.net/valhalla/pull/5
>
More information about the valhalla-dev
mailing list