RFC: TLAB allocation and garbage-first policy

Wed Sep 20 13:53:42 UTC 2017

On Wed, Sep 20, 2017 at 8:58 AM, Aleksey Shipilev <shade at redhat.com> wrote:

> On 09/20/2017 02:44 PM, Christine Flood wrote:
> > I can think of several solutions.  One would be to cap the max TLAB size
> > as we discussed yesterday.
> > Having a TLAB be an entire region has some nice performance
> > characteristics but isn't really
> > necessary, nor is it in the spirit of TLABs.
>
> This seems to work badly for smaller heaps? E.g. we have 512K regions in
> 4G configuration. Reducing
> the TLAB size to, say, 1/10th of region size makes it only 50K, which is
> too low? We also do not
> want to hit the slowpath allocation (i.e. TLAB refill here) for every 50K
> allocated, no?
>

The original TLABs were something like 4K, if I remember correctly.  Yes,
there is a balancing act in not making them too small.  However, what's
the point of having TLABs if they are region sized? Why not just assign a
region per thread?  Perhaps there's a middle ground of, say, 1/4 of a region,
which would leave you in a better situation than you are in now.
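
To put rough numbers on that trade-off, here is a small Python sketch. It
is plain arithmetic, not HotSpot code: the function names are made up for
illustration, and only the 512K region size and the 1/10 and 1/4 fractions
come from this thread.

```python
# Toy arithmetic only; nothing here is a real JVM API.
KB = 1024

def max_tlab_bytes(region_bytes, divisor):
    """Cap the TLAB at region_bytes / divisor (rounded down)."""
    return region_bytes // divisor

def refills_per_region(region_bytes, tlab_bytes):
    """Slow-path TLAB refills needed to hand out one whole region.

    Uses ceiling division, so a leftover tail costs one extra refill.
    """
    return -(-region_bytes // tlab_bytes)

region = 512 * KB
for divisor in (1, 4, 10):
    tlab = max_tlab_bytes(region, divisor)
    refills = refills_per_region(region, tlab)
    print(f"1/{divisor} of a region: TLAB of {tlab} bytes, "
          f"{refills} refills per region")
```

The smaller the cap, the more often a thread hits the refill slow path, but
the less space is stranded when a region cannot fit one more TLAB.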

> It also seems inconsistent with the previously stated goal: (paraphrasing) we
> try not to make the
> allocators pay for our sins. Capping the TLABs seems to be doing exactly
> that?
>
> > Another potential solution would be to treat these regions specially.
> > When a TLAB allocation fails
> > in a region we could fill that particular region with a filler array.
> > Therefore we now have
> > garbage.  This differs from your solution in that regular regions that
> > are perfectly happy with
> > normal sized TLAB spaces available aren't going to get prematurely
> > compacted.
>
> Aha, sounds interesting. So the only thing that does seem to help is the
> half-full regions we never
> tried to allocate in, right? Otherwise it is the same as looking at "live"
> for cset selection.
>
What this does is avoid confusing our metrics for the sake of expediency.
We agreed earlier that in the single-region case copying the live data from
one region to another doesn't gain us anything.  I would argue that if I had
to choose between compacting 10 fragmented regions and 10 compacted regions
that are not large enough for a TLAB, we would be better off compacting the
fragmented regions, because that would leave us with more contiguous free
space.  Your proposed metric doesn't distinguish between the two.
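
To make the filler idea concrete, here is a toy Python model. It is only a
sketch under my own assumptions: the Region class, its fields, and
live_only_reclaimable are hypothetical names, not anything in the collector,
and fresh TLABs are simply counted as live.

```python
from dataclasses import dataclass

KB = 1024

@dataclass
class Region:
    capacity: int  # bytes in the region
    used: int      # bytes allocated so far (live + dead + filler)
    live: int      # bytes considered live

    @property
    def free(self):
        return self.capacity - self.used

    @property
    def garbage(self):
        # Allocated but not live; a filler array is counted here.
        return self.used - self.live

    def allocate_tlab(self, size):
        """Try to carve a TLAB of `size` bytes out of this region.

        On failure, plug the unusable tail with a filler array so it is
        accounted as garbage rather than as free space.
        """
        if size <= self.free:
            self.used += size
            self.live += size  # toy assumption: fresh TLABs count as live
            return True
        self.used = self.capacity  # filler array covers the tail
        return False

def live_only_reclaimable(region):
    # A metric that looks only at "live": free space and garbage look alike.
    return region.capacity - region.live

tight = Region(capacity=512 * KB, used=480 * KB, live=480 * KB)  # 32K tail
roomy = Region(capacity=512 * KB, used=256 * KB, live=256 * KB)  # never asked

tight.allocate_tlab(128 * KB)  # fails, so the 32K tail becomes filler
print(tight.garbage // KB, roomy.garbage // KB)  # prints "32 0"
print(live_only_reclaimable(tight) // KB,
      live_only_reclaimable(roomy) // KB)        # prints "32 256"
```

With the filler, only the region that actually failed a TLAB request shows
up as having garbage. The half-full region that still has room keeps a
garbage count of zero, whereas the live-only view makes it look like the
best candidate.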

I suppose there is a place for what you want.  If all I have left are
compacted regions that are not quite spacious enough, I would prefer to add
them to the cset instead of falling back on a full GC.  Perhaps there's a
heuristic that satisfies both constraints.
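
Purely as a strawman, one shape such a heuristic could take is sketched
below in Python. Everything in it is an assumption for the sake of
discussion: the choose_cset function, the tuple layout, the garbage
threshold, and the "needed" region count are all hypothetical. It prefers
regions with real garbage and only falls back to compacted-but-tight
regions when there are not enough of those.

```python
def choose_cset(regions, garbage_threshold, min_tlab, needed):
    """Pick up to `needed` regions for the collection set.

    regions: list of (name, garbage_bytes, free_bytes) tuples.
    """
    # First choice: regions with real garbage, most garbage first.
    fragmented = sorted(
        (r for r in regions if r[1] >= garbage_threshold),
        key=lambda r: r[1],
        reverse=True,
    )
    cset = fragmented[:needed]

    if len(cset) < needed:
        # Fallback instead of a full GC: regions whose free tail is
        # non-empty but too small to hold a TLAB of min_tlab bytes.
        tight = [r for r in regions
                 if r not in cset and 0 < r[2] < min_tlab]
        tight.sort(key=lambda r: r[2], reverse=True)
        cset += tight[:needed - len(cset)]

    return [r[0] for r in cset]

regions = [("frag", 300, 0), ("tight", 0, 30), ("roomy", 0, 250)]
print(choose_cset(regions, garbage_threshold=100, min_tlab=128, needed=1))
print(choose_cset(regions, garbage_threshold=100, min_tlab=128, needed=2))
```

Regions that still have room for a normal TLAB are never selected by the
fallback, so they are not compacted prematurely.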

> -Aleksey