RFC: TLAB allocation and garbage-first policy

Aleksey Shipilev shade at redhat.com
Wed Sep 20 14:49:56 UTC 2017


On 09/20/2017 03:53 PM, Christine Flood wrote:
> On Wed, Sep 20, 2017 at 8:58 AM, Aleksey Shipilev <shade at redhat.com> wrote:
> The original TLABs were something like 4 KB, if I remember correctly.  Yes, there is a balancing
> act with not making them too small.  However, what's the point of having TLABs if they are region
> sized? Why not just assign a region per thread?

For the same reason adaptive TLAB sizing exists: to not waste space. With that mechanism, you cannot
have more threads than regions, even if most of the threads are dormant.
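To put rough numbers on it (the heap geometry and thread count below are hypothetical, picked for illustration, not actual Shenandoah defaults):

```java
public class RegionSizedTlabs {
    // Hypothetical geometry for illustration; not Shenandoah defaults.
    static final long HEAP_BYTES   = 2L << 30; // 2 GB heap
    static final long REGION_BYTES = 2L << 20; // 2 MB regions

    public static void main(String[] args) {
        long regions = HEAP_BYTES / REGION_BYTES; // 1024 regions
        int threads  = 4000;                      // a thread-heavy application
        // With region-sized TLABs, each thread pins a whole region: 4000
        // threads cannot all get a TLAB out of 1024 regions, and dormant
        // threads would still keep theirs pinned.
        System.out.println(regions + " regions for " + threads + " threads");
    }
}
```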

> Perhaps there's a middle ground of, say, 1/4 of a region, which would leave you in a better situation than you are in now.

Maybe, let's make it our fallback plan: if everything else fails, we can trim down the TLABs.


>     > Another potential solution would be to treat these regions specially.  When a TLAB allocation
>     > fails in a region, we could fill that particular region with a filler array.  Therefore we now
>     > have garbage.  This differs from your solution in that regular regions that are perfectly happy,
>     > with normal-sized TLAB spaces available, aren't going to get prematurely compacted.
> 
>     Aha, sounds interesting. So the only thing that does seem to help is the half-full regions we never
>     tried to allocate in, right? Otherwise it is the same as looking at "live" for cset selection.
> 
> What this does is not confuse our metrics for the sake of expediency.  We agreed earlier that in a
> single-region case, copying the live data from one region to another doesn't gain us anything.  I
> would argue that if I had to choose between compacting 10 fragmented regions or 10 regions that are
> compacted but not large enough for a TLAB, we would be better off compacting the fragmented
> regions, because that would leave us with more contiguous free space.  Your proposed metric doesn't
> distinguish between the two.
> 
> I suppose there is a place for what you want.  If all I have left are compacted but not quite
> spacious enough regions, I would prefer to add them to the cset instead of falling back on a full
> GC.  Perhaps there's a heuristic that satisfies both constraints.

The example of the single fragmented region, while valid in itself, loses sight of the bigger
picture, I think. It seems to me that it *only* matters when the collection set contains just that
one fragmented region. If it contains more than one fragmented region, then it starts to make sense
to compact them together and free up one of the regions. If the cset contains additional full
regions, then the impact of the "wasteful" copy for that single fragmented region is very low.
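For what it's worth, the 10-vs-10 comparison can be put into rough numbers (the region and live
sizes below are made up for illustration):

```java
public class CsetChoice {
    static final long REGION = 1L << 20; // 1 MB regions (made-up size)

    // Space reclaimed by evacuating 'count' regions holding 'live' bytes
    // each, ignoring where the evacuated live data lands.
    static long reclaimed(int count, long live) {
        return count * (REGION - live);
    }

    public static void main(String[] args) {
        // 10 fragmented regions: 100 KB live each, the rest scattered garbage.
        long fragmented = reclaimed(10, 100L << 10);
        // 10 compacted regions: 900 KB live each, just no room left for a TLAB.
        long compacted  = reclaimed(10, 900L << 10);
        // Evacuating the fragmented regions frees ~9 MB of contiguous space,
        // the compacted ones only ~1.2 MB, though both count as 10 regions.
        System.out.println(fragmented > compacted); // prints "true"
    }
}
```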

With that in mind, how frequent is it to have a single fragmented region in the collection set,
compared to other cases? I would rather have static code that handles 99.99% of the cases and never
walks into a bad feedback loop than another heuristic that covers 0.01% of the cases and fails with
unforeseen feedback. Adding that heuristic feels like what you described: "<chf> I will grant you
that in your particular situation your solution looks attractive, but in a myriad of other
situations you are actually pessimizing GC performance in at least one metric".

Thanks,
-Aleksey



More information about the shenandoah-dev mailing list