RFR (S): Mark scan prefetch

Aleksey Shipilev shade at redhat.com
Tue Nov 8 11:47:46 UTC 2016


Hi,

It is no surprise that our GC blows the CPU caches when walking the heap.
Within the mark phase there is little we can do, because the object
graph is random in the worst case. But once marking is done, we have the
bitmap of marked addresses in our hands, and we scan it *linearly*. So,
knowing that we will access oop fields, headers, etc. while scanning
that bitmap, we can prefetch the oop contents in advance, well before we
actually reference them.

The answer is to prefetch when we get the "mark" from the bitmap:
 http://cr.openjdk.java.net/~shade/shenandoah/markscan-prefetch/webrev.01/
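
For readers without the webrev at hand, here is a rough standalone sketch of
the idea, not the actual patch. Everything in it is an assumption made for
illustration: the toy MarkBitmap type, the one-bit-per-heap-word layout, the
one-object lookahead, and the use of the GCC/Clang __builtin_prefetch
intrinsic. The point is only that a linear bitmap scan tells us the next
object's address early enough to issue a prefetch before we touch it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy model: one mark bit per heap word, packed into 64-bit words.
struct MarkBitmap {
    std::vector<std::uint64_t> bits;

    explicit MarkBitmap(std::size_t heap_words)
        : bits((heap_words + 63) / 64, 0) {}

    void mark(std::size_t idx) {
        bits[idx / 64] |= (std::uint64_t{1} << (idx % 64));
    }

    bool is_marked(std::size_t idx) const {
        return ((bits[idx / 64] >> (idx % 64)) & 1) != 0;
    }

    // Returns the first marked index in [from, limit), or limit if there is none.
    std::size_t next_marked(std::size_t from, std::size_t limit) const {
        for (std::size_t i = from; i < limit; ++i) {
            if (is_marked(i)) return i;
        }
        return limit;
    }
};

// Read prefetch hint; compiles to a no-op where the intrinsic is unavailable.
inline void prefetch_read(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p, 0, 3);
#else
    (void)p;
#endif
}

// Walks the bitmap linearly. As soon as the *next* marked address is known,
// it is prefetched, and only then is the *current* object visited.
// Returns the number of marked objects visited.
template <typename Visitor>
std::size_t scan_marked(const std::vector<std::uint64_t>& heap,
                        const MarkBitmap& bitmap,
                        Visitor visit) {
    const std::size_t limit = heap.size();
    std::size_t visited = 0;
    std::size_t cur = bitmap.next_marked(0, limit);
    while (cur < limit) {
        const std::size_t next = bitmap.next_marked(cur + 1, limit);
        if (next < limit) {
            prefetch_read(&heap[next]);
        }
        visit(cur, heap[cur]);
        ++visited;
        cur = next;
    }
    return visited;
}
```

A caller would pass the heap, the bitmap, and a lambda such as
[&](std::size_t idx, std::uint64_t word) { ... } as the visitor. A lookahead
of one object is the simplest choice for a sketch; how far ahead to prefetch
in practice depends on object sizes and memory latency.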

This improves Full GC times significantly, because the phases that use
the mark bitmap (Calculate Addresses, Adjust Pointers, Copy Objects) get
faster:
  http://cr.openjdk.java.net/~shade/shenandoah/markscan-prefetch/prefetches

Concurrent GC users (parallel_evacuate) are not affected, because they
have other bottlenecks, e.g. CASing the fwdptr.

Testing: hotspot_gc_shenandoah, jcstress-all (quick), microbenchmarks

Thanks,
-Aleksey


