RFC: Rebuilding remembered sets during concurrent mark

Erik Helin erik.helin at oracle.com
Wed Nov 9 15:18:06 UTC 2016


Hi all,

G1 maintains remembered sets for all old regions at all times. This imposes 
a significant footprint and throughput overhead. The footprint overhead 
comes from storing the remembered set information; the throughput 
overhead comes from executing the slow paths of the post-write barrier 
and from the refinement work.

Some remembered sets always need to be maintained, namely the ones for the 
young generation, to allow evacuation. However, G1 only actually needs 
the remembered sets for the old gen regions it intends to evacuate 
during the mixed GC phase.

In many larger applications the old gen remembered sets are much larger 
than the young gen ones.

Given a way to rebuild the remembered sets on demand, these observations 
may allow us to avoid managing the remembered sets for a large part of 
the heap most of the time:
- all of them during the young-only phase
- old regions without marking information ("to" regions since the last
   concurrent mark cycle)
- old regions with too many live objects to make them worth evacuating
- humongous regions
- any other regions that we are not going to collect for any reason.
   For example, the user could set a global/local footprint overhead, a
   liveness threshold, or supply other information about the application.
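
As a rough illustration of the list above, the decision could be phrased as a
per-region predicate. This is a minimal Python sketch, not G1 code: the
Region fields, the phase names, and the 0.85 liveness threshold are all
assumptions made up for the example, and it ignores the transition period in
which the sets are being rebuilt.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Kind(Enum):
    YOUNG = auto()
    OLD = auto()
    HUMONGOUS = auto()


class Phase(Enum):
    YOUNG_ONLY = auto()
    MIXED = auto()


@dataclass
class Region:
    kind: Kind
    has_marking_info: bool = False    # covered by the last concurrent mark cycle?
    live_fraction: float = 1.0        # live bytes / region size, from marking
    excluded_by_policy: bool = False  # any other reason not to collect it


# Hypothetical value; the proposal only says such a threshold could be set.
LIVENESS_THRESHOLD = 0.85


def needs_remembered_set(region: Region, phase: Phase) -> bool:
    """Return True if the region's remembered set must be kept up to date."""
    if region.kind is Kind.YOUNG:
        return True                   # always needed to evacuate young regions
    if phase is Phase.YOUNG_ONLY:
        return False                  # no old region is evacuated in this phase
    if region.kind is Kind.HUMONGOUS:
        return False
    if not region.has_marking_info:
        return False                  # e.g. "to" regions since the last cycle
    if region.live_fraction > LIVENESS_THRESHOLD:
        return False                  # too many live objects to be worth it
    return not region.excluded_by_policy
```

With these assumptions, an old region that is 30% live needs its remembered
set only once the mixed phase is reached, and a humongous region never does.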

The suggestion here is to use concurrent mark for rebuilding remembered 
sets. Concurrent mark already traverses the entire object graph, so, in 
conjunction with the barriers, it can determine the remembered sets for 
any given region. While this will significantly lengthen the concurrent 
mark phase, the benefits for throughput and memory footprint may 
outweigh the cost, resulting in an overall win.
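
To make the idea concrete, here is a toy Python sketch of collecting
remembered set entries as a side effect of a marking traversal. It is only an
illustration under simplifying assumptions: the function and parameter names
are hypothetical, it is single-threaded (so the role of the barriers during
concurrent execution is not modelled), and it records whole source regions
where a real collector would record finer-grained locations.

```python
from collections import defaultdict, deque


def rebuild_remembered_sets(roots, refs, region_of, tracked_regions):
    """Trace the object graph once and record cross-region references.

    roots:           iterable of object ids to start marking from
    refs:            dict mapping object id -> list of referenced object ids
    region_of:       dict mapping object id -> region id
    tracked_regions: set of region ids whose remembered sets we want rebuilt
    """
    remsets = defaultdict(set)
    marked = set(roots)
    worklist = deque(marked)
    while worklist:
        obj = worklist.popleft()
        for target in refs.get(obj, ()):
            src_region = region_of[obj]
            dst_region = region_of[target]
            # Only references that cross into a tracked region are recorded.
            if dst_region in tracked_regions and src_region != dst_region:
                remsets[dst_region].add(src_region)
            if target not in marked:
                marked.add(target)
                worklist.append(target)
    return dict(remsets)
```

Because only live objects are visited, references held by unreachable
objects never end up in the rebuilt sets, and regions outside
tracked_regions cost no remembered set memory at all.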

Thanks,
Erik



More information about the hotspot-gc-dev mailing list