RFC: TLAB size flapping

Tue Dec 6 17:17:11 UTC 2016

Hi,

So, if you run allocation tests under -Xlog:gc+tlab=trace, a funny story unfolds.
The interesting piece of code is below. It is polled by the TLAB allocation
machinery to figure out the max TLAB size allocatable without hassle:

size_t ShenandoahHeap::unsafe_max_tlab_alloc(Thread* thread) const {
  size_t idx = _free_regions->current_index();
  ShenandoahHeapRegion* current = _free_regions->get(idx);
  if (current == NULL) {
    return 0;
  } else if (current->free() > MinTLABSize) {
    return current->free();
  } else {
    return MinTLABSize;
  }
}

This is what happens next:

// Step 1: A TLAB refill request polls Shenandoah about the current free
// region. Shenandoah replies that the current free region has 256 words
// used (hm!). Okay, we claim the rest of the region for a TLAB then.
[2.328s][trace][gc,tlab] TLAB: fill thread: 0x00007ffb54594800 ...
[2.328s][trace][gc,tlab] ShenandoahHeap::unsafe_max_tlab_alloc: region = 1019,
capacity = 524288, used = 256, free = 524032
[2.328s][trace][gc,tlab] ThreadLocalAllocBuffer::compute_size(3) returns 524032
[2.328s][trace][gc,tlab] allocating new tlab of size 524032 at addr
0x00000006bec00800

// Step 2: Another TLAB request. No more space in the current region. But yeah,
// we return MinTLABSize (those 256 words!), and shared infra moves on, asking
// us to allocate a new TLAB of 256 words. Now, the current region is depleted,
// so we allocate those 256 words in the *next* region.
[2.328s][trace][gc,tlab] TLAB: fill thread: 0x00007ffb54594800 ...
[2.329s][trace][gc,tlab] ShenandoahHeap::unsafe_max_tlab_alloc: (failing) region
= 1019, capacity = 524288, used = 524288, free = 0
[2.329s][trace][gc,tlab] ThreadLocalAllocBuffer::compute_size(3) returns 256
[2.329s][trace][gc,tlab] allocating new tlab of size 256 at addr 0x00000006bf000000

// Step 1 again. The cycle continues. Another TLAB request, current region has
// 256 words used, claim the rest... goes on and on.
[2.329s][trace][gc,tlab] TLAB: fill thread: 0x00007ffb54594800 ...
[2.329s][trace][gc,tlab] ShenandoahHeap::unsafe_max_tlab_alloc: region = 1020,
capacity = 524288, used = 256, free = 524032
[2.329s][trace][gc,tlab] ThreadLocalAllocBuffer::compute_size(3) returns 524032
[2.329s][trace][gc,tlab] allocating new tlab of size 524032 at addr
0x00000006bf000800
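
For illustration, the cycle above can be reproduced with a toy model. This is
a sketch, not HotSpot code: ToyHeap, allocate() and simulate_tlab_sizes() are
hypothetical names, all sizes are in words as in the trace, and the only part
taken from the source is the branch structure of unsafe_max_tlab_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only. Sizes are in words, matching the numbers in the trace.
constexpr std::size_t kRegionWords  = 524288;  // region capacity from the trace
constexpr std::size_t kMinTlabWords = 256;     // MinTLABSize as seen in the trace

// Hypothetical stand-in for the heap: a row of bump-allocated regions.
struct ToyHeap {
  std::vector<std::size_t> used;  // words used in each region
  std::size_t current = 0;        // index of the current free region

  explicit ToyHeap(std::size_t regions) : used(regions, 0) {}

  std::size_t free_in(std::size_t idx) const { return kRegionWords - used[idx]; }

  // Same branch structure as the unsafe_max_tlab_alloc quoted above.
  std::size_t max_tlab_alloc() const {
    if (current >= used.size()) return 0;
    std::size_t f = free_in(current);
    return f > kMinTlabWords ? f : kMinTlabWords;
  }

  // Bump-allocates `words`, moving to the next region when the current one
  // cannot fit the request. Returns false when the heap is exhausted.
  bool allocate(std::size_t words) {
    while (current < used.size() && free_in(current) < words) current++;
    if (current >= used.size()) return false;
    used[current] += words;
    return true;
  }
};

// Returns the sizes of up to `refills` consecutive TLABs handed out.
std::vector<std::size_t> simulate_tlab_sizes(std::size_t regions, std::size_t refills) {
  assert(regions > 0);
  ToyHeap heap(regions);
  heap.used[0] = kMinTlabWords;  // as in the trace: 256 words already used
  std::vector<std::size_t> sizes;
  for (std::size_t i = 0; i < refills; i++) {
    std::size_t size = heap.max_tlab_alloc();
    if (size == 0 || !heap.allocate(size)) break;
    sizes.push_back(size);
  }
  return sizes;
}
```

In this model, simulate_tlab_sizes(4, 5) yields 524032, 256, 524032, 256,
524032: the same alternation as in the trace.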

So, this flaps TLAB allocations between the region size and MinTLABSize. Oops!
We enter the slow path *twice* per region, instead of doing it once. I think
returning MinTLABSize is wrong in the code above, and we have two options:
  a) Return 0 on the MinTLABSize branch. If I read the code right, this will bail
us out of the TLAB allocation path, which is undesirable;
  b) Advance to the next free region, and try to poll its free().
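
A minimal sketch of what option (b) could look like, on a toy model rather
than the real free set. ToyFreeSet, max_tlab_alloc_skipping() and
simulate_option_b() are hypothetical names, sizes are in words, and treating
a region with exactly MinTLABSize free as usable is an assumption of the
sketch, not something the code above does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only. Sizes are in words, matching the numbers in the trace.
constexpr std::size_t kRegionWords  = 524288;  // region capacity from the trace
constexpr std::size_t kMinTlabWords = 256;     // MinTLABSize as seen in the trace

// Hypothetical stand-in for the free region set; not the Shenandoah type.
struct ToyFreeSet {
  std::vector<std::size_t> free_words;  // free words in each region
  std::size_t current = 0;              // index of the current free region
};

// Option (b): look past regions that cannot fit a minimum TLAB and report
// the free space of the first region that can. Returns 0 if none is left.
std::size_t max_tlab_alloc_skipping(const ToyFreeSet& fs) {
  for (std::size_t idx = fs.current; idx < fs.free_words.size(); idx++) {
    if (fs.free_words[idx] >= kMinTlabWords) return fs.free_words[idx];
  }
  return 0;
}

// Bump-allocates `words`, moving to the next region when the current one
// cannot fit the request. Returns false when no region can.
bool allocate(ToyFreeSet& fs, std::size_t words) {
  while (fs.current < fs.free_words.size() && fs.free_words[fs.current] < words) {
    fs.current++;
  }
  if (fs.current == fs.free_words.size()) return false;
  fs.free_words[fs.current] -= words;
  return true;
}

// Returns the sizes of up to `refills` consecutive TLABs handed out.
std::vector<std::size_t> simulate_option_b(std::size_t regions, std::size_t refills) {
  assert(regions > 0);
  ToyFreeSet fs;
  fs.free_words.assign(regions, kRegionWords);
  fs.free_words[0] -= kMinTlabWords;  // as in the trace: 256 words already used
  std::vector<std::size_t> sizes;
  for (std::size_t i = 0; i < refills; i++) {
    std::size_t size = max_tlab_alloc_skipping(fs);
    if (size == 0 || !allocate(fs, size)) break;
    sizes.push_back(size);
  }
  return sizes;
}
```

In this model, simulate_option_b(4, 5) hands out 524032, 524288, 524288,
524288 and then stops because the toy heap is full: one refill per region
and no 256-word TLABs. Whether the real free set can be advanced or peeked
from a const method is left open here.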

G1 is susceptible to the same problem, as far as I can see.

Thanks,
-Aleksey