RFR (M): Optimize object/array marking with bit-stealing task encoding

Roman Kennke rkennke at redhat.com
Mon Jan 16 16:06:55 UTC 2017


Excellent!

'512TB ought to be enough for anybody' ? ;-)

Good to go.

Needs revisiting in 100 years or so, if we are still stuck with 64bit
addressing then ;-)

Roman

On Monday, 16.01.2017, at 16:46 +0100, Aleksey Shipilev wrote:
> Hi,
> 
> Our mark stack contains ObjArrayFromToTask instances, which are the tuples
> <oop, from, to>. For arrays, from/to describe the chunk to process. For
> objects, from is always -1, indicating no chunk is expected.
> 
> Since the HS taskqueue employs copy constructors to poll/push tasks from/to
> the queue, we always copy the from/to fields, and the queue footprint
> always includes them as well. This is excessive for the prevailing case of
> regular oop marking. This is an attempt to improve the case for regular oops
> without regressing parallel array processing:
>   http://cr.openjdk.java.net/~shade/shenandoah/mark-objtask-regular/webrev.02/
> 
> This patch improves concurrent mark times significantly for regular oops:
> 
> retain.Tree -p size=50000000:
> 
>  Baseline: Concurrent Marking =    99.17 s (a =   826446 us) (n =   120)
>              (lvls, us =   806641,   826172,   839844,   841797,   887344)
> 
>   Patched: Concurrent Marking =    93.77 s (a =   774975 us) (n =   121)
>              (lvls, us =   753906,   771484,   785156,   787109,   837818)
> 
> ...and also ever-so-slightly improving for object arrays:
> 
> retain.RefArray -p size=2000000000:
> 
>  Baseline: Concurrent Marking =   157.29 s (a =   741921 us) (n =   212)
>              (lvls, us =   720703,   740234,   753906,   755859,   822552)
> 
>   Patched: Concurrent Marking =   158.64 s (a =   734448 us) (n =   216)
>              (lvls, us =   720703,   734375,   744141,   746094,   764200)
> 
> Less targeted workloads also improve concurrent mark times, e.g.
> Compiler.compiler:
> 
>  Baseline: Concurrent Marking =     3.87 s (a =   168337 us) (n =    23)
>              (lvls, us =    93750,   103516,   154297,   232422,   439476)
> 
>   Patched: Concurrent Marking =     2.53 s (a =   120386 us) (n =    21)
>              (lvls, us =    76953,    93164,   103516,   125000,   400385)
> 
> Testing: hotspot_gc_shenandoah, jcstress tests-all.
> 
> Thanks,
> -Aleksey
> 
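
For readers without the webrev at hand, here is a minimal sketch of what
bit-stealing task encoding can look like. It is not the code from the patch.
The 49-bit address width is inferred from the "512TB" remark above
(2^49 bytes = 512 TB). The MarkTask name, the chunk/pow fields, their 10-bit
and 5-bit widths, and the rule that chunk == 0 marks a regular object are all
assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout of one 64-bit task word (an assumption, not the webrev):
//   [ pow : 5 bits | chunk : 10 bits | object address : 49 bits ]
constexpr int      kPtrBits   = 49;  // 2^49 bytes = 512 TB of address space
constexpr int      kChunkBits = 10;
constexpr int      kPowBits   = 5;
constexpr uint64_t kPtrMask   = (uint64_t{1} << kPtrBits) - 1;
constexpr uint64_t kChunkMask = (uint64_t{1} << kChunkBits) - 1;
constexpr uint64_t kPowMask   = (uint64_t{1} << kPowBits) - 1;

class MarkTask {
  uint64_t bits_;  // the whole task: address plus stolen high bits

public:
  // Regular object: chunk == 0 means "no array chunk attached".
  explicit MarkTask(const void* obj) : MarkTask(obj, 0, 0) {}

  // Array chunk: covers elements [(chunk - 1) << pow, chunk << pow).
  MarkTask(const void* obj, unsigned chunk, unsigned pow) {
    uint64_t addr = reinterpret_cast<uintptr_t>(obj);
    assert((addr & ~kPtrMask) == 0 && "address needs more than 49 bits");
    assert(chunk <= kChunkMask && pow <= kPowMask);
    bits_ = addr
          | (static_cast<uint64_t>(chunk) << kPtrBits)
          | (static_cast<uint64_t>(pow) << (kPtrBits + kChunkBits));
  }

  const void* obj() const {
    return reinterpret_cast<const void*>(
        static_cast<uintptr_t>(bits_ & kPtrMask));
  }
  unsigned chunk() const {
    return static_cast<unsigned>((bits_ >> kPtrBits) & kChunkMask);
  }
  unsigned pow() const {
    return static_cast<unsigned>((bits_ >> (kPtrBits + kChunkBits)) & kPowMask);
  }
  bool is_chunked() const { return chunk() != 0; }

  // Mirrors the old tuple: from is -1 for regular objects.
  int64_t from() const {
    return is_chunked() ? (static_cast<int64_t>(chunk()) - 1) << pow() : -1;
  }
  int64_t to() const {
    return is_chunked() ? static_cast<int64_t>(chunk()) << pow() : -1;
  }
};

// Pushing or popping a task now copies a single machine word.
static_assert(sizeof(MarkTask) == sizeof(uint64_t), "task must fit in one word");
```

With this layout, MarkTask(p) for a regular object reports from() == -1, and
MarkTask(p, 3, 4) describes elements [32, 48) of the array at p. Either way
the queue slot holds one 64-bit word instead of an <oop, from, to> tuple.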

