RFR: Parallel heap region iteration

Roman Kennke rkennke at redhat.com
Thu Oct 25 15:34:22 UTC 2018


The patch looks good!

I have two suggestions/questions:
- Should we lift/raise the 32MB limit? It is arbitrary, basically
'derived' from the similar code in G1, and doesn't seem to make much
sense in the Shenandoah context.
- What would stop us from moving the 'build cset' stuff to the beginning
of the conc-evac phase? Are there any gotchas?

Thanks, Roman

> http://cr.openjdk.java.net/~shade/shenandoah/parallel-hri/webrev.01/
> 
> This pile of 4 changesets improves pause times with a huge number of regions. (For example, an 8 TB heap
> has 262K regions, 32M each.) It cleans up existing usages of heap_region_iterate, introduces its
> parallel version, and hooks up Clear/Complete liveness to it.
> 
> Motivational example with 400K regions, which simulates a ~12 TB heap:
> 
> -XX:ShenandoahParallelRegionStride=1M (no parallelism):
> 
> Pause Init Mark (N)   = 0.04 s (a =  7860 us) (n = 5) (lvls, us =  7598,  7598,  7734,  7754,  8159)
>   Clear Liveness      = 0.04 s (a =  7568 us) (n = 5) (lvls, us =  7422,  7422,  7500,  7520,  7795)
> Pause Final Mark (N)  = 0.17 s (a = 33306 us) (n = 5) (lvls, us = 30273, 30273, 31250, 31641, 36763)
>   Complete Liveness   = 0.03 s (a =  6146 us) (n = 5) (lvls, us =  5801,  5801,  6016,  6250,  6317)
>   Prepare Evacuation  = 0.13 s (a = 25518 us) (n = 5) (lvls, us = 23633, 23633, 24023, 24023, 29174)
> 
> -XX:ShenandoahParallelRegionStride=1K (new default):
> 
> Pause Init Mark (N)   = 0.01 s (a =  1980 us) (n = 5) (lvls, us =  1875,  1875,  1895,  1914,  2271)
>   Clear Liveness      = 0.01 s (a =  1707 us) (n = 5) (lvls, us =  1660,  1660,  1680,  1719,  1728)
> Pause Final Mark (N)  = 0.14 s (a = 27489 us) (n = 5) (lvls, us = 25977, 25977, 26172, 26172, 32521)
>   Complete Liveness   = 0.01 s (a =  1485 us) (n = 5) (lvls, us =  1426,  1426,  1426,  1484,  1542)
>   Prepare Evacuation  = 0.12 s (a = 24344 us) (n = 5) (lvls, us = 23828, 23828, 24023, 24023, 25099)
> 
> [Long cset selection times are left for a follow-up]
> 
> It does not regress pauses with our usual number of regions (2K). The stride is chosen below 2K to
> exercise that code anyway.
> 
> Testing: tier3_gc_shenandoah
> 
> Thanks,
> -Aleksey
> 
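
For readers skimming the thread, here is a minimal sketch of how stride-based
parallel region iteration can work in general. It is not the code from the
webrev: the names ParallelRegionIterator, claim and parallel_region_iterate
are made up for illustration, Python threads stand in for GC workers, and a
lock stands in for the atomic fetch-and-add a real implementation would
presumably use. Each worker repeatedly claims the next chunk of `stride`
region indices and visits them, so every region is visited exactly once.

```python
import threading


class ParallelRegionIterator:
    """Hands out [start, end) chunks of region indices, `stride` at a time.

    Hypothetical helper for illustration only.
    """

    def __init__(self, num_regions, stride):
        if stride <= 0:
            raise ValueError("stride must be positive")
        self.num_regions = num_regions
        self.stride = stride
        self._next = 0
        # The lock stands in for an atomic fetch-and-add on the claim cursor.
        self._lock = threading.Lock()

    def claim(self):
        """Return the next unclaimed (start, end) chunk, or None when done."""
        with self._lock:
            start = self._next
            if start >= self.num_regions:
                return None
            self._next = start + self.stride
        return start, min(start + self.stride, self.num_regions)


def parallel_region_iterate(num_regions, stride, num_workers, visit):
    """Call visit(index) exactly once per region, using num_workers threads."""
    it = ParallelRegionIterator(num_regions, stride)

    def worker():
        while True:
            chunk = it.claim()
            if chunk is None:
                return
            for index in range(*chunk):
                visit(index)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With this scheme, a stride at or above the region count (such as 1M for 400K
regions) lets the first worker claim everything in one chunk, which matches
the "no parallelism" setting above. A stride of 1K over 2K regions yields two
chunks, so the parallel path is still exercised at the usual region count.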


