WB midpath: CSet check and RB reversal

Thu Jul 26 09:32:13 UTC 2018

On 26.07.2018 at 07:48, Aleksey Shipilev wrote:
> I am looking into the barriers profile, trying to understand where the overhead for the activated
> barriers is coming from. In very CSet-intensive microbenchmarks, it seems that taking the WB
> midpath consumes most of the time. And if we look into the profile, then RB is the hottest thing
> there. I remember from my update-refs experiments that changing the test from "in_cset +
> check_fwdptr" to "check_fwdptr + in_cset" degraded update-refs concurrent performance around 3x.
> 
> In the WB midpath code we do exactly that slow pattern:
> 
>  if (gcstate_bit_set(HAS_FORWARDED)) {
>    o = rb(o)                  // <--- this guy is hot
>    if (gcstate_bit_set(EVAC|TRAVERSAL)) {
>      if (in_cset(o)) {
>        o = call shenandoah_wb
>      }
>    }
>  }
> 
> ...maybe we should instead do:
> 
>  if (gcstate_bit_set(HAS_FORWARDED)) {
>    if (in_cset(o)) {          // <--- avoid touching the fwdptr if object cannot be forwarded
>      o = rb(o)
>      if (gcstate_bit_set(EVAC|TRAVERSAL)) {
>        if (in_cset(o)) {      // <--- avoid going to slowpath if object is evac'ed already
>          o = call shenandoah_wb
>        }
>      }
>    }
>  }
> 

Notice how this could be simplified for traversal GC:

 if (gcstate_bit_set(HAS_FORWARDED)) { // only 1 phase.
   if (in_cset(o)) {          // <--- avoid touching the fwdptr if object cannot be forwarded
     o = rb(o)
     if (in_cset(o)) {      // <--- avoid going to slowpath if object is evac'ed already
       o = call shenandoah_wb
     }
   }
 }
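
To make the difference between the orderings concrete, here is a toy C++ model
of the three midpath shapes discussed above. It is only a sketch, not HotSpot
code: Obj, Counters, slow_wb and the wb_midpath_* functions are hypothetical
names, the cset lookup is reduced to a flag on the object, and the slowpath only
counts calls instead of evacuating. It assumes an object outside the cset is
never forwarded, so its fwdptr points to itself.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for the gc-state bits used in the pseudocode above.
enum GCState : uint8_t { HAS_FORWARDED = 1, EVACUATION = 2, TRAVERSAL = 4 };

struct Obj {
    Obj* fwd;      // forwarding pointer; points to the object itself if not forwarded
    bool in_cset;  // stands in for the collection-set region lookup
};

struct Counters {
    int fwdptr_loads = 0;    // how often rb() touched the fwdptr
    int slowpath_calls = 0;  // how often we left the midpath
};

// Read barrier: resolve through the forwarding pointer.
static Obj* rb(Obj* o, Counters& c) {
    assert(o != nullptr && o->fwd != nullptr);
    ++c.fwdptr_loads;
    return o->fwd;
}

// Slowpath stand-in: only counted here, the real call would evacuate.
static Obj* slow_wb(Obj* o, Counters& c) {
    ++c.slowpath_calls;
    return o;
}

// Current shape: RB first, cset check second.
static Obj* wb_midpath_rb_first(Obj* o, uint8_t gcstate, Counters& c) {
    if (gcstate & HAS_FORWARDED) {
        o = rb(o, c);
        if (gcstate & (EVACUATION | TRAVERSAL)) {
            if (o->in_cset) o = slow_wb(o, c);
        }
    }
    return o;
}

// Proposed shape: cset check first, so the fwdptr is untouched for
// objects that cannot be forwarded.
static Obj* wb_midpath_cset_first(Obj* o, uint8_t gcstate, Counters& c) {
    if (gcstate & HAS_FORWARDED) {
        if (o->in_cset) {
            o = rb(o, c);
            if (gcstate & (EVACUATION | TRAVERSAL)) {
                if (o->in_cset) o = slow_wb(o, c);
            }
        }
    }
    return o;
}

// Traversal-only shape: a single phase, so the inner gc-state test is dropped.
static Obj* wb_midpath_traversal(Obj* o, uint8_t gcstate, Counters& c) {
    if (gcstate & HAS_FORWARDED) {
        if (o->in_cset) {
            o = rb(o, c);
            if (o->in_cset) o = slow_wb(o, c);
        }
    }
    return o;
}
```

In this model all three shapes return the same object for every input; the
cset-first shapes just skip the fwdptr load for objects outside the cset, which
is the case the profile suggests we should make cheap.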

Roman