RFC (S): Prefetching during mark scans
Aleksey Shipilev
shade at redhat.com
Wed Nov 2 12:33:22 UTC 2016
Hi,
This describes work in progress, but I would like early feedback,
because re-running perf experiments is tedious, and every little change
there affects performance.
It is no surprise that our GC blows the CPU caches when walking the heap.
Within the mark phase, there is little we can do, because the object
graph is random in the worst case. But once we have marked, we have the
marked addresses bitmap in our hands, which we scan *linearly*. That
means, knowing that we will access oop fields, headers, etc. while
scanning that bitmap, we could prefetch oop contents in advance, long
before we actually reference them.
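
To make the idea concrete, here is a minimal standalone sketch of a linear
bitmap scan that issues prefetches a few marked objects ahead of the one
being visited. It is not the code from the webrev: MarkBitmap,
prefetch_read, marked_object_iterate_prefetch and the default distance of 8
are hypothetical names and values picked for illustration, and the
GCC/Clang __builtin_prefetch stands in for whatever prefetch primitive the
VM would actually use.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the mark bitmap: bit i set means that a marked
// object starts at heap word i.
struct MarkBitmap {
    std::vector<std::uint64_t> words;
    std::size_t size_in_bits;

    explicit MarkBitmap(std::size_t bits)
        : words((bits + 63) / 64, 0), size_in_bits(bits) {}

    void mark(std::size_t i) { words[i / 64] |= std::uint64_t{1} << (i % 64); }

    bool is_marked(std::size_t i) const {
        return ((words[i / 64] >> (i % 64)) & 1u) != 0;
    }

    // Index of the first marked bit in [from, size_in_bits), or size_in_bits
    // if there is none.
    std::size_t next_marked(std::size_t from) const {
        while (from < size_in_bits && !is_marked(from)) ++from;
        return from;
    }
};

// Read prefetch hint; compiles to a no-op on compilers without the builtin.
inline void prefetch_read(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p, 0, 3);
#else
    (void)p;
#endif
}

// Visits every marked object in address order. A second cursor ("ahead")
// runs `distance` marked objects in front of the visiting cursor and
// prefetches each object start it finds. Returns the number of objects
// visited.
template <typename Visitor>
std::size_t marked_object_iterate_prefetch(const MarkBitmap& bitmap,
                                           const std::uint64_t* heap_base,
                                           Visitor visit,
                                           std::size_t distance = 8) {
    const std::size_t end = bitmap.size_in_bits;

    // Warm-up: prefetch the first `distance` marked objects.
    std::size_t ahead = bitmap.next_marked(0);
    for (std::size_t i = 0; i < distance && ahead < end; ++i) {
        prefetch_read(heap_base + ahead);
        ahead = bitmap.next_marked(ahead + 1);
    }

    std::size_t visited = 0;
    for (std::size_t cur = bitmap.next_marked(0); cur < end;
         cur = bitmap.next_marked(cur + 1)) {
        // Keep the prefetch cursor `distance` objects in front of `cur`.
        if (ahead < end) {
            prefetch_read(heap_base + ahead);
            ahead = bitmap.next_marked(ahead + 1);
        }
        visit(heap_base + cur);
        ++visited;
    }
    return visited;
}
```

The sketch walks the bitmap twice (once per cursor), which is cheap compared
to the heap accesses it tries to hide. The right distance depends on the
hardware and on how much work each visit does, so it would have to be tuned
by experiment.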
This is the prototype patch that affects only mark-compact via
ShenandoahHeapRegion::marked_object_iterate:
http://cr.openjdk.java.net/~shade/shenandoah/markscan-prefetch/webrev.00/
It does improve Full GC times significantly, because the users of the
marked bitmap (Calculate Addresses, Adjust Pointers, Copy Objects) improve:
http://cr.openjdk.java.net/~shade/shenandoah/markscan-prefetch/prefetches
Roman is exploring whether we can merge ShenandoahHeapRegion and
ShenandoahHeap versions of marked_object_iterate, and I would
forward-port the patch there. After that, the prefetching would also
affect our regular concurrent GC (e.g. the scan for concurrent evacuation).
Thanks,
-Aleksey