RFR/RFC: Tax-and-Spend allocation pacing
Aleksey Shipilev
shade at redhat.com
Thu Mar 8 17:48:23 UTC 2018
http://cr.openjdk.java.net/~shade/shenandoah/tax-and-spend/webrev.01/
Please review this at your leisure, as I will only have time to revisit it next week. There are
lots of comments in the source code itself.
This implements simple Tax-and-Spend allocation pacing. It is needed to keep up with application
allocations both when a GC cycle is in progress (e.g. when LDS is high and not enough free space
is available, or when there is an allocation spike) and when we are idle (e.g. when the heap is
small and the control loop has to react swiftly to start the GC).
This plays into our usual degradation scheme: if we blow the pacing budget, we allocate anyway and
hope for the best. In the worst case, we Degenerate, as usual, instead of stalling the
application threads indefinitely.
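To make the "blow the budget, allocate anyway" behaviour concrete, here is a minimal Java sketch of a
budget-based pacer. It is not the code in the webrev: the class PacerSketch, its methods, and the
fixed maximum delay are hypothetical names and choices made for illustration only. The assumed
reading is that GC progress replenishes a byte budget, allocations claim from it, and an allocation
that cannot claim waits for a bounded time before proceeding regardless.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical illustration of budget-based allocation pacing; not the webrev code. */
public class PacerSketch {
    private final AtomicLong budget;     // bytes that may be allocated without waiting
    private final long maxDelayNanos;    // assumed upper bound on a single stall

    public PacerSketch(long initialBudgetBytes, long maxDelayNanos) {
        this.budget = new AtomicLong(initialBudgetBytes);
        this.maxDelayNanos = maxDelayNanos;
    }

    /** GC side: report progress by adding bytes back to the budget. */
    public void replenish(long bytes) {
        budget.addAndGet(bytes);
    }

    /** Current budget; negative means earlier allocations went into debt. */
    public long budget() {
        return budget.get();
    }

    /** Atomically claim bytes if the budget covers them. */
    private boolean tryClaim(long bytes) {
        long cur;
        do {
            cur = budget.get();
            if (cur < bytes) {
                return false;
            }
        } while (!budget.compareAndSet(cur, cur - bytes));
        return true;
    }

    /**
     * Allocation side: returns true if the allocation fit into the budget,
     * false if the wait timed out and the budget was blown. In both cases
     * the caller goes on to allocate; it is never stalled indefinitely.
     */
    public boolean paceForAlloc(long bytes) {
        long deadline = System.nanoTime() + maxDelayNanos;
        while (!tryClaim(bytes)) {
            if (System.nanoTime() - deadline >= 0) {
                budget.addAndGet(-bytes);   // go into debt and hope for the best
                return false;
            }
            LockSupport.parkNanos(10_000);  // short nap, then retry the claim
        }
        return true;
    }
}
```

In this sketch a false return only signals that the budget was blown; whether the following
allocation succeeds or ends in a Degenerated GC is decided elsewhere.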
It naturally allows higher allocation rates with larger heaps, while keeping the application at bay
with small-to-moderate heaps. I have not been able to make the implementation a performance
bottleneck even in the torturous 1+ TB/sec allocation tests, with enough heap available.
Passes hotspot_gc_shenandoah, but not all heuristics are implemented yet.
Motivational examples ["Easy does it", "haste makes waste", Gil's Ferrari-around-the-tree and other
adages apply when interpreting the results]:
=== Allocating "new Object()" in 16 threads and tiny -Xmx128m heap:
NOTE: This test basically checks whether the heuristics are able to catch up fast enough.
--- Before:
Time per alloc: 34.192 ± 4.929 ns/op
Allocation rate: 9730.235 ± 1526.052 MB/sec
945 successful concurrent GCs
  0 invoked explicitly
1280 Degenerated GCs
  1280 caused by allocation failure
    208 happened at Outside of Cycle
    778 happened at Mark
    291 happened at Evacuation
    3 happened at Update Refs
  604 upgraded to Full GC
717 Full GCs
  0 invoked explicitly
  113 caused by allocation failure
  604 upgraded from Degenerated GC
--- After:
Time per op: 175.004 ± 10.537 ns/op
Allocation rate: 1900.956 ± 149.273 MB/sec
377 successful concurrent GCs
  0 invoked explicitly
0 Degenerated GCs
  0 caused by allocation failure
  0 upgraded to Full GC
0 Full GCs
  0 invoked explicitly
  0 caused by allocation failure
  0 upgraded from Degenerated GC
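For readers who want to try something similar to the 16-thread "new Object()" test above, here is a
rough plain-threads sketch in Java. It is not the harness that produced the numbers in this mail:
the class AllocSpikeSketch, the per-thread Sink holder and the batch size of 1024 are assumptions,
and a proper benchmark harness would guard against JIT optimizations far better than this does.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical allocation-spike driver; not the benchmark used for the results. */
public class AllocSpikeSketch {
    /** Per-thread holder; the volatile store is meant to keep the allocation alive. */
    static final class Sink {
        volatile Object o;
    }

    /** Allocates in the given number of threads for roughly millis ms; returns total allocations. */
    public static long run(int threads, long millis) {
        LongAdder ops = new LongAdder();
        long deadline = System.nanoTime() + millis * 1_000_000L;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                Sink sink = new Sink();
                long local = 0;
                while (System.nanoTime() - deadline < 0) {
                    // Allocate in batches so the clock is not read on every allocation.
                    for (int k = 0; k < 1024; k++) {
                        sink.o = new Object();
                    }
                    local += 1024;
                }
                ops.add(local);
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return ops.sum();
    }

    public static void main(String[] args) {
        int threads = 16;
        long millis = 5_000;
        long ops = run(threads, millis);
        // Average wall-clock time one thread spends per allocation.
        double nsPerOp = (double) threads * millis * 1_000_000.0 / ops;
        System.out.printf("Time per alloc: %.3f ns/op%n", nsPerOp);
    }
}
```

Running it with Shenandoah enabled and -Xmx128m should exercise the same "can the heuristics catch
up" situation, though the absolute numbers will not be comparable to the ones above.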
=== TreeFragger with 16 threads, ~20 GB LDS and -Xmx30g:
NOTE: Allocation pacing provides *higher* average allocation rate, because STW GCs hurt.
--- Before
Time per op: 166.156 ± 94.360 ns/op
Allocation rate: 2469.184 ± 1466.191 MB/sec
3 successful concurrent GCs
  0 invoked explicitly
40 Degenerated GCs
  40 caused by allocation failure
    4 happened at Outside of Cycle
    36 happened at Mark
  1 upgraded to Full GC
11 Full GCs
  0 invoked explicitly
  10 caused by allocation failure
  1 upgraded from Degenerated GC
--- After
Time per op: 62.819 ± 104.797 ns/op
Allocation rate: 5716.089 ± 1748.301 MB/sec
97 successful concurrent GCs
  0 invoked explicitly
0 Degenerated GCs
  0 caused by allocation failure
  0 upgraded to Full GC
0 Full GCs
  0 invoked explicitly
  0 caused by allocation failure
  0 upgraded from Degenerated GC
Thanks,
-Aleksey