RFR: Traveral GC heuristics

Wed Jan 17 21:58:52 UTC 2018

>> Full:
>> http://cr.openjdk.java.net/~rkennke/traversal/webrev.02/
> 
> Exciting!
> 
> c1_Runtime1_x86.cpp:
> 
> *) Let's rewrite this:
> 
>     if (bs->kind() == BarrierSet::ShenandoahBarrierSet && !ShenandoahSATBBarrier &&
> !ShenandoahConditionalSATBBarrier && !ShenandoahStoreValEnqueueBarrier) {
> 
> into:
> 
>      if (bs->kind() == BarrierSet::ShenandoahBarrierSet && !(ShenandoahSATBBarrier ||
> ShenandoahConditionalSATBBarrier || ShenandoahStoreValEnqueueBarrier) {

Done.

> *) Re:
> 
> 1644           __ testb(gc_state, ShenandoahHeap::MARKING | ShenandoahHeap::TRAVERSAL);
> 
> So, set_concurrent_traversal_in_progress activates the SATB queues, and this is good. Why don't we
> set the ShenandoahHeap::MARKING bit to gc_state there, and avoid "| TRAVERSAL" all around the
> arch-specific code?

Done.

> *) shenandoahBarrierSet_x86.cpp: Pushes/pops around the call to g1_write_barrier_pre seem
> suspicious. How do we know we need to caller-save rbx, rcx, rdx, c_rarg1? Deserves a comment, maybe?

It's the same set of regs that need to be saved+restored in the write 
barrier, a few lines above.

> *) ShenandoahStoreValReadBarrier, ShenandoahStoreValWriteBarrier, ShenandoahStoreValReadBarrier
> exclusion tests

ShStoreValWB and ShStoreValEnq are not exclusive. I need them both in 
tandem. I added exclusion test for ShStoreValEnq against ShStoreValRB in 
shenandoahCollectorPolicy.cpp, but don't know how to 'encode' that in 
TestSelectiveBarrierFlags.java

> *) shenandoahBarrierSet.cpp: branches are the same, which looks like typo. Should be compound
> boolean predicate?
> 
>    42       if (ALWAYS_ENQUEUE && !oopDesc::is_null(o)) {
>    43         ShenandoahBarrierSet::enqueue(o);
>    44       } else if (evac) {
>    45         ShenandoahBarrierSet::enqueue(o);
>    46       }

Done.

> shenandoahCollectorPolicy.cpp
> 
>   *) Stray debugging lines:
> 
> 1367     // tty->print_cr("CSET regions:");
> 1376         // r->print_on(tty);

Removed.

>   *) Heuristics need work: I think it runs into problem that adaptive cset selection solves: it
> chooses either too big or too small cset. I wonder if you can actually reuse that in traversal
> heuristics

Yes, but we need degenerate GC for traversal first :-)

> shenandoahConcurrentThread.cpp:
> 
>   *) I think you want to introduce ShenandoahHeap::{vmop_entry,entry,op}_traversal family of methods,
> and call them, as we do with the rest of VM ops.

Done.

>   *) This is not needed anymore:
> 
>   207   // TODO: Call this properly with Shenandoah*CycleMark
>   208   heap->set_used_at_last_gc();

Removed.

> shenandoahHeap.cpp:
> 
>   *) As mentioned above, this:
> 
> 1683 void ShenandoahHeap::set_concurrent_traversal_in_progress(bool in_progress) {
> 1684   set_gc_state_bit(TRAVERSAL_BITPOS, in_progress);
> 1685   JavaThread::satb_mark_queue_set().set_active_all_threads(in_progress, !in_progress);
> 1686   set_evacuation_in_progress_at_safepoint(in_progress);
> 1687   set_has_forwarded_objects(in_progress);
> 1688 }
> 
> is probably just:
> 
>   void ShenandoahHeap::set_concurrent_traversal_in_progress(bool in_progress) {
>     set_gc_state_bit(TRAVERSAL_BITPOS, in_progress);
>     set_gc_state_bit(MARKING_BITPOS, in_progress);
>     set_gc_state_bit(HAS_FORWARDED_OBJECTS_BITPOS, in_progress);
>     set_gc_state_bit_at_safepoint(_gc_state.raw_value());
>     JavaThread::satb_mark_queue_set().set_active_all_threads(in_progress, !in_progress);
>   }

Done.

> shenandoahVerifier.cpp:
> 
> *) Why are we testing for "next" bitmap here?

Because traversal uses the next bitmap, and only this, and I don't care 
to swap with complete, but I want to verify it. Good?

> _verify_liveness_complete and the comment seem to
> disagree? Comments still mention "partial"?

Fixed.

> shenandoah_globals.hpp:
> 
>   *) Comment is duplicated:
> 
>   316                                                                             \
>   317   diagnostic(bool, ShenandoahStoreValEnqueueBarrier, false,                 \
>   318           "Turn on/off enqueuing of oops after write barriers (MWF)")       \
>   319                                                                             \
>   320   diagnostic(bool, ShenandoahMWF, false,                                    \
>   321           "Turn on/off enqueuing of oops after write barriers (MWF)")       \
> 
>

Fixed.

> graphKit.cpp:
> 
>   *) So we predicate shenandoah_enqueue_barrier with !ShenandoahMWF here:
> 
> 4887     if (ShenandoahStoreValEnqueueBarrier && !ShenandoahMWF) {
> 4888       shenandoah_enqueue_barrier(obj);
> 4889     }
> 
>   ...but not around other uses of ShenandoahStoreValEnqueueBarrier, e.g. in c1_LIRGenerator?
> 

I only implemented this sketchy MWF thing in C2 for now. This definitely 
needs more work, and I don't even know if it is correct.

> Other C2:
> 
>   *) Roland should take a look, but I find it uncomfortable to change do_unswitching,
> find_unswitching_candidate with new arguements...

This was actually done by Roland to get the new barriers to work and 
optimize well enough.

> sharedRuntime.cpp:
> 
>   *) Bug due to bad indentation and braces?
> 
>   215 if (UseShenandoahGC) { ShenandoahBarrierSet::enqueue(orig); }
>   216 return;

Yeah, this is not needed. I removed it.

> shenandoahTraversalGC*:
> 
>   *) Really, really unfortunate to duplicate a lot from shenandoahConcurrentMark. Maybe we should
> massage the codebase so that we could reuse significant chunks of the code?

Yes, maybe. But for the start, I did not want it to interfere with 
existing code if I can avoid it. For this reason, this looks like a 
copy+paste job from conc-mark and partial for some parts.

Thanks for reviewing and spotting all the issues. I could not really 
make a diff webrev, because I first had to pull -u your latest work, and 
this messed up my differential webrev... sorry. Only full webrev now:

http://cr.openjdk.java.net/~rkennke/traversal/webrev.03/