RFR/RFC: Tax-and-Spend allocation pacing
Roman Kennke
rkennke at redhat.com
Fri Mar 9 10:47:43 UTC 2018
Hi Aleksey,
this is great stuff. The patch looks good to me.
I wonder what is needed to make this work with partial and traversal GC?
Or would that only be the boilerplate stuff like in
init_concurrent_normal_cycle() to set up pacing? Because the other hooks
(to report evacs and intercept allocs) are already in their right places?
Cheers, Roman
> http://cr.openjdk.java.net/~shade/shenandoah/tax-and-spend/webrev.01/
>
> Please review this at your leisure, as I would have time to revisit this only next week. There are
> lots of comments in the source code itself.
>
> This implements simple Tax-and-Spend allocation pacing. It is needed so that the GC can keep up
> with application allocations, either when the GC cycle is in progress (e.g. when LDS is high and
> not enough free space is available, or there is an allocation spike), or when we are idle (e.g. the
> heap is small, and the control loop has to react swiftly to start the GC).
>
> This plays into our usual degradation scheme: if we blow the pacing budget, we would allocate, and
> hope for the best. In the worst case, we would Degenerate, as usual, instead of stalling the
> application threads indefinitely.
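
The pacing scheme quoted above can be illustrated with a minimal sketch. This is not the code from the webrev (the actual patch lives in HotSpot C++); the class name PacerSketch, the methods reportProgress() and paceForAlloc(), and the taxRate and maxDelay parameters are all hypothetical, chosen only to show the idea: GC progress replenishes a budget, each allocation pays a tax from it, and an allocation that cannot pay waits for a bounded time and then proceeds anyway.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of tax-and-spend pacing; not the Shenandoah implementation.
class PacerSketch {
    private final AtomicLong budget;   // remaining budget; may go negative (debt)
    private final double taxRate;      // assumed: budget units charged per allocated byte
    private final long maxDelayNanos;  // assumed: upper bound on how long an allocation may stall

    PacerSketch(long initialBudget, double taxRate, long maxDelayMillis) {
        this.budget = new AtomicLong(initialBudget);
        this.taxRate = taxRate;
        this.maxDelayNanos = maxDelayMillis * 1_000_000L;
    }

    // GC side: reported progress (e.g. bytes marked or evacuated) replenishes the budget.
    void reportProgress(long bytes) {
        budget.addAndGet(bytes);
    }

    // Atomically pay the tax if the budget covers it.
    private boolean tryClaim(long tax) {
        long cur;
        do {
            cur = budget.get();
            if (cur < tax) {
                return false;
            }
        } while (!budget.compareAndSet(cur, cur - tax));
        return true;
    }

    // Allocation side: returns true if the tax was paid within the budget,
    // false if the deadline passed and the allocation proceeds in debt.
    boolean paceForAlloc(long bytes) {
        long tax = Math.max(1L, (long) (bytes * taxRate));
        long deadline = System.nanoTime() + maxDelayNanos;
        while (!tryClaim(tax)) {
            if (System.nanoTime() - deadline >= 0) {
                break; // blew the budget: stop stalling
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        if (System.nanoTime() - deadline < 0 && !Thread.currentThread().isInterrupted()) {
            return true; // loop exited because the claim succeeded
        }
        budget.addAndGet(-tax); // allocate anyway and hope for the best
        return false;
    }

    long budget() {
        return budget.get();
    }
}
```

In this sketch a false return corresponds to the "blow the pacing budget" case: the caller allocates regardless, and if that allocation then fails, the usual Degenerated GC path would take over rather than the thread stalling indefinitely.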
>
> It will naturally allow higher allocation rates with larger heaps, while keeping the application
> at bay with small-to-moderate heaps. I have not been able to turn the implementation into a
> performance bottleneck even in the torturous 1+ TB/sec allocation tests, with enough heap available.
>
> Passes hotspot_gc_shenandoah, but not all heuristics are implemented yet.
>
> Motivational examples ["Easy does it", "haste makes waste", Gil's Ferrari-around-the-tree and other
> adages apply when interpreting the results]:
>
> === Allocating "new Object()" in 16 threads and tiny -Xmx128m heap:
>
> NOTE: This test basically checks whether the heuristics are able to catch up fast enough.
>
> --- Before:
>
> Time per alloc: 34.192 ± 4.929 ns/op
> Allocation rate: 9730.235 ± 1526.052 MB/sec
>
> 945 successful concurrent GCs
> 0 invoked explicitly
>
> 1280 Degenerated GCs
> 1280 caused by allocation failure
> 208 happened at Outside of Cycle
> 778 happened at Mark
> 291 happened at Evacuation
> 3 happened at Update Refs
> 604 upgraded to Full GC
>
> 717 Full GCs
> 0 invoked explicitly
> 113 caused by allocation failure
> 604 upgraded from Degenerated GC
>
>
> --- After:
>
> Time per op: 175.004 ± 10.537 ns/op
> Allocation rate: 1900.956 ± 149.273 MB/sec
>
> 377 successful concurrent GCs
> 0 invoked explicitly
>
> 0 Degenerated GCs
> 0 caused by allocation failure
> 0 upgraded to Full GC
>
> 0 Full GCs
> 0 invoked explicitly
> 0 caused by allocation failure
> 0 upgraded from Degenerated GC
>
>
> === TreeFragger with 16 threads, ~20 GB LDS and -Xmx30g:
>
> NOTE: Allocation pacing provides *higher* average allocation rate, because STW GCs hurt.
>
> --- Before
>
> Time per op: 166.156 ± 94.360 ns/op
> Allocation rate: 2469.184 ± 1466.191 MB/sec
>
> 3 successful concurrent GCs
> 0 invoked explicitly
>
> 40 Degenerated GCs
> 40 caused by allocation failure
> 4 happened at Outside of Cycle
> 36 happened at Mark
> 1 upgraded to Full GC
>
> 11 Full GCs
> 0 invoked explicitly
> 10 caused by allocation failure
> 1 upgraded from Degenerated GC
>
>
>
> --- After
>
> Time per op: 62.819 ± 104.797 ns/op
> Allocation rate: 5716.089 ± 1748.301 MB/sec
>
> 97 successful concurrent GCs
> 0 invoked explicitly
>
> 0 Degenerated GCs
> 0 caused by allocation failure
> 0 upgraded to Full GC
>
> 0 Full GCs
> 0 invoked explicitly
> 0 caused by allocation failure
> 0 upgraded from Degenerated GC
>
>
>
> Thanks,
> -Aleksey
>
>
>