RFR (S): Mark scan prefetch
Roman Kennke
rkennke at redhat.com
Tue Nov 8 14:53:45 UTC 2016
BTW: Are you also planning to add write prefetches? Most users of this
routine also write (copy-object writes to a different location, though).
I'd expect even more benefit from that.
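To make the write-prefetch idea concrete, here is a minimal sketch, not code from the webrev. It assumes the GCC/Clang builtin `__builtin_prefetch`, and the `Move` struct and `copy_with_write_prefetch` function are hypothetical names made up for illustration. While one object is being copied, the source of the next one is prefetched for reading and its destination for writing:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical description of one object move during compaction.
struct Move {
    const char* from;   // current location of the object
    char*       to;     // new (compacted) location
    std::size_t size;   // object size in bytes
};

// Copy objects in order; before copying move i, prefetch the source of
// move i+1 for reading and its destination for writing.
inline void copy_with_write_prefetch(const std::vector<Move>& moves) {
    for (std::size_t i = 0; i < moves.size(); ++i) {
        if (i + 1 < moves.size()) {
            __builtin_prefetch(moves[i + 1].from, 0, 3);  // rw = 0: read
            __builtin_prefetch(moves[i + 1].to,   1, 3);  // rw = 1: write
        }
        // memmove, because source and destination may overlap when sliding.
        std::memmove(moves[i].to, moves[i].from, moves[i].size);
    }
}
```

Whether the write hint pays off depends on how far the destination runs ahead of the cache, so this would need measuring.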
Also, what's the reason for not prefetching when doing accurate
traversal? (Not relevant for mark-compact, I guess.)
Roman
On Tuesday, 08.11.2016 at 12:47 +0100, Aleksey Shipilev wrote:
> Hi,
>
> It is no surprise that our GC blows the CPU caches when walking the
> heap. Within the mark phase, there is little we can do, because the
> object graph is random in the worst case. But once we have marked, we
> have the bitmap of marked addresses in our hands, which we scan
> *linearly*. This means that, knowing we will access oop fields,
> headers, etc. while scanning that bitmap, we can prefetch oop contents
> in advance, long before we actually reference them.
>
> The answer is to prefetch when we get the "mark" from the bitmap:
>   http://cr.openjdk.java.net/~shade/shenandoah/markscan-prefetch/webrev.01/
>
> It does improve Full GC times significantly, because the users of the
> marked bitmap (Calculate Addresses, Adjust Pointers, Copy Objects)
> improve:
>   http://cr.openjdk.java.net/~shade/shenandoah/markscan-prefetch/prefetches
>
> Concurrent GC users (parallel_evacuate) are not affected, because
> there are more bottlenecks in them, e.g. CASing the fwdptr.
>
> Testing: hotspot_gc_shenandoah, jcstress-all (quick), microbenchmarks
>
> Thanks,
> -Aleksey
>
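For readers without the webrev at hand, the idea of prefetching while scanning the mark bitmap linearly can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `Obj`, `scan_marked` and the `distance` lookahead parameter are hypothetical, one bitmap bit is assumed per object slot, and the GCC/Clang builtin `__builtin_prefetch` stands in for whatever prefetch primitive the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a heap object; not HotSpot's oopDesc.
struct Obj {
    std::uint64_t header;
    std::uint64_t field;
};

// Scan the mark bitmap linearly (one bit per slot of `heap`). Each marked
// object is prefetched as soon as its bit is found, but visited only after
// `distance` further marked objects have been found, so the cache line has
// time to arrive. Returns the sum of `field` over all marked objects.
inline std::uint64_t scan_marked(const std::vector<Obj>& heap,
                                 const std::vector<std::uint64_t>& bitmap,
                                 std::size_t distance = 8) {
    assert(bitmap.size() * 64 >= heap.size());
    std::vector<std::size_t> pending(distance + 1);  // ring buffer of indices
    std::size_t head = 0, count = 0;
    std::uint64_t sum = 0;

    auto visit_oldest = [&]() {
        sum += heap[pending[head]].field;            // the actual "work"
        head = (head + 1) % pending.size();
        --count;
    };

    for (std::size_t i = 0; i < heap.size(); ++i) {
        if (((bitmap[i / 64] >> (i % 64)) & 1u) == 0) continue;  // not marked
        __builtin_prefetch(&heap[i], 0, 3);          // read prefetch
        pending[(head + count) % pending.size()] = i;
        ++count;
        if (count > distance) visit_oldest();
    }
    while (count > 0) visit_oldest();                // drain the tail
    return sum;
}
```

With `distance = 0` this degenerates to prefetching right before the access, which gives the hardware no time to act; the useful lookahead is workload-dependent.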