Strawman for moving from SATB to minimal wavefront in concurrent mark.

Christine Flood cflood at redhat.com
Fri Oct 13 16:13:43 UTC 2017


Background:

>From www.memorymanagement.org

In order for the collector to miss a reachable object, the following two
conditions need to hold at some point during tracing:

The mutator stores a reference to a white object into a black object.
All paths from any gray objects to that white object are destroyed.

Snapshot-at-the-beginning algorithms ensure the second condition cannot
occur, by causing the collector to process any reference that the mutator
overwrites and that might be part of such a path.

Incremental-update algorithms ensure the first condition cannot occur, by
painting either the black or the white object gray (see Pirinen (1998) for
details).
They are so called because they incrementally update the collector’s view
of the graph to track changes made by the mutator.


Current Status:

We currently use Snapshot At The Beginning (SATB) and our write barriers
push the about-to-be-overwritten value onto our SATB queues to be
rescanned.  Everything allocated after init mark is considered live and
implicitly colored black without updating the marking bitmap.  This
implicit coloring is done by checking for addresses allocated after
Top-At-Mark-Start (TAMS).

Incremental Update would save us a significant amount of floating garbage.
Incremental Update write barriers push the new value being written onto a
scan queue.  However there is a problem with incremental update.  If a new
linked list is allocated during the concurrent mark, the only pointer to it
might be in the root set, and this pointer won't be found until final
mark.  Which means our final mark pause may be arbitrarily long as we trace
this large newly allocated data structure.

Minimal Wavefront is very similar to Incremental Update but we treat new
objects specially.  If a white object is written to, you don't do anything
in incremental update because that reference will be scanned when the
object is traced.  If a new ("yellow") object is written you need to scan
both the object and the newly referenced object.  This saves us from
potentially long final mark pauses.

Proposed Change:
Minimal Wavefront

Traditional Minimal Wavefront algorithms have a four color algorithm where
new objects are allocated yellow and treated specially.

00 => White  (object not visited may be trash).
01 => Yellow (new object, may be trash)
10 => Gray   (object is live but not yet scanned)
11 => Black  (object is live and has been scanned)

The write barrier distinguishes between yellow objects and white objects.
Writing to a white object does nothing due to the assumption that the
object will be traced when the marking algorithm discovers it.  Writing to
a yellow object causes both the object and the reference to be scanned.

Concurrent Marking

We don't really use an explicit gray in our algorithms

All nodes on the scan queue are black.
We pop an element off and visit each of it's children.
visit(white child) => color child black, add to scan queue
visit(yellow child)=> color child black, add to scan queue
visit(black child) => do nothing

Write Barrier:

write_ref(black object, black ref) => do nothing
write_ref(black object, white ref) => color ref black, add to scan queue
write_ref(black object, yellow ref)=> color ref black, add to scan queue

write_ref(yellow object, white ref) => color object black, add to scan
queue, color ref black, add to scan queue
write_ref(yellow object, yellow ref)=> color object black, add to scan
queue, color ref black, add to scan queue
write_ref(yellow object, black ref) => color object black, add to scan queue

write_ref(white object, black ref)  => do nothing, object will either be
found from another path, or not
write_ref(white object, yellow ref) => do nothing, object will either be
found from another path, or not
write_ref(white object, white ref) => do nothing, object will either be
found from another path, or not


We can get away with one bit in a marking bitmap.

Any object allocated after TAMS is considered yellow.  We use one bit in
the marking bitmap to distinguish between black and (white or yellow) and
then the address to distinguish between white and yellow.

This is the first step towards mixed partial and concurrent shenandoah
collections.  Objects copied during a partial collection will be colored
yellow and that's fine.  If they are unreachable we'll figure that out.  If
they are reachable we will find them during tracing.

Christine


More information about the shenandoah-dev mailing list