[aarch64-port-dev ] Very large code caches

Tue Jan 7 03:03:03 PST 2014

On 01/07/2014 10:47 AM, Edward Nevill wrote:
> On Fri, 2014-01-03 at 12:53 +0000, Andrew Haley wrote:
>> The AArch64 immediate call instructions span +/- 128 Mbytes.  This is
>> a good match for us: the default ReservedCodeCacheSize is 48M, and it
>> takes a lot to fill 128M.
> 
> With tiered compilation the code cache size is set to 5 *
> ReservedCodeCacheSize. So we can currently have a code cache size of
> 240M.
> 
>> I've been kicking around a solution, attached here.  We can have a
>> long call instruction, and for the sake of the exercise I've been
>> trying lea(r16, dest); blr(r16) .
> 
> dest is a literal here so this ends up being
> 
> mov x0, .. / movk x0, .. / movk x0, .. / movk x0, ..
> 
> Can we use adrp / add instead, for 2 instructions rather than 4? This
> allows a code cache size of up to 4G (the max allowed in hotspot is 2G
> in any case).

I don't think so, because we can't update an adrp / add pair
atomically.  I think we have to use the constant pool.
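
To make the ranges in this exchange concrete, here is a minimal sketch of the arithmetic. The constant and function names are made up for illustration and are not HotSpot identifiers; the 48M and 5x figures are the ones quoted above.

```cpp
#include <cstdint>

constexpr int64_t kMB = 1024 * 1024;

// B/BL carry a signed 26-bit word offset: +/- 2^25 instructions of 4 bytes.
constexpr int64_t kBranchRange = (int64_t{1} << 25) * 4;    // 128 MB

// ADRP carries a signed 21-bit page offset: +/- 2^20 pages of 4 KB.
constexpr int64_t kAdrpRange = (int64_t{1} << 20) * 4096;   // 4 GB

// True if a single BL at `pc` can reach `target` (hypothetical helper).
constexpr bool reachable_by_bl(int64_t pc, int64_t target) {
  return target - pc >= -kBranchRange && target - pc < kBranchRange;
}

// Figures from the thread: 48M default, times 5 with tiered compilation.
constexpr int64_t kReserved = 48 * kMB;
constexpr int64_t kTieredCodeCache = 5 * kReserved;         // 240 MB
```

A call from one end of a 48M cache to the other fits in a BL; a call across a 240M cache does not, although it is still well inside adrp range. Range is not the obstacle to adrp / add; patching two instructions atomically is.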

>> The back end has to be changed in
>> quite a few places, but the real problem is that the resulting call
>> site is not MT-safe: it can't be patched atomically.  To make that
>> work we'd have to move the destination address into the constant pool.
>>
>> So, I'm envisaging a solution where we wait until, when patching, we
>> have the first branch out of range.  We then patch the site with a
>> trap that calls deoptimize.  We also set a flag in the assembler.  When
>> the method is recompiled after deoptimization it'll have long
>> branches, and from that time onwards all (inter-method) branches will
>> be long.
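
The scheme in the quoted paragraph can be sketched as a small piece of state. This is only an illustration under assumed names (BranchPolicy, PatchAction and the method names are hypothetical, not taken from the attached patch):

```cpp
#include <cstdint>

enum class PatchAction { PatchInPlace, TrapAndDeoptimize };

struct BranchPolicy {
  // Stands in for the "flag in the assembler": once set, it stays set.
  bool use_long_branches = false;

  static bool in_range(int64_t pc, int64_t target) {
    constexpr int64_t kRange = int64_t{128} * 1024 * 1024;  // BL reach
    int64_t off = target - pc;
    return off >= -kRange && off < kRange;
  }

  // Compile time: which call form should newly compiled methods use?
  bool emit_long_call() const { return use_long_branches; }

  // Patch time, for a call site that was compiled as a short branch.
  PatchAction patch_short_site(int64_t pc, int64_t target) {
    if (in_range(pc, target)) return PatchAction::PatchInPlace;
    use_long_branches = true;  // every later compilation goes long
    return PatchAction::TrapAndDeoptimize;
  }
};
```

Until the first out-of-range patch nothing changes; after it, the affected site traps and every later compilation asks emit_long_call() and gets true.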
> 
> Apologies if I have misunderstood but...
> 
> After the 1st out of range patchup all branches are compiled as long
> branches. Does this include branches which subsequently need patching?
> In that case this will again be non-MT-safe.

Not if we use the cpool.
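
A minimal sketch of why the constant pool helps, assuming the call site becomes a load of the destination from a pool slot followed by blr. CallSlot and the helpers are hypothetical names, and std::atomic stands in for the raw aligned 64-bit store a real patcher would do in the code cache:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical constant-pool slot holding one call destination.
struct CallSlot {
  explicit CallSlot(uint64_t d) : dest(d) {}
  std::atomic<uint64_t> dest;
};

// Patching is a single aligned 64-bit store, so a concurrent reader
// sees either the old destination or the new one, never a mixture.
void patch_call(CallSlot& slot, uint64_t new_dest) {
  slot.dest.store(new_dest, std::memory_order_release);
}

// What the call site's load would observe before branching.
uint64_t load_dest(const CallSlot& slot) {
  return slot.dest.load(std::memory_order_acquire);
}
```

Compare this with a mov / movk sequence or an adrp / add pair, where the address is spread over several instruction words and a thread executing the site mid-patch could combine old and new halves.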

>> A gnarly problem is nmethod::make_not_entrant_or_zombie().  At present
>> we place a single NOP at the entry point of every method, and when we
>> deoptimize it we patch it with a branch to handle_wrong_method.  We
>> can do this iff handle_wrong_method is reachable, i.e. less than
>> 128 Mbytes away.  We could deposit a copy of the handle_wrong_method
>> stub every 128 Mbytes or so, but a better plan is to replace our NOP
>> with a trap of some kind.  DCPS1 generates an illegal instruction
>> trap, so I've used that.  It seems to work.
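
For illustration, the two ways of overwriting the entry NOP can be sketched with raw A64 encodings. The helper names are hypothetical; the opcodes are the architectural encodings of NOP, B and DCPS1 with a zero immediate. The quoted plan is to use the trap, and the branch form is kept here only to show the reachability condition:

```cpp
#include <cstdint>

constexpr uint32_t kNop   = 0xD503201Fu;  // NOP, the entry-point placeholder
constexpr uint32_t kDcps1 = 0xD4A00001u;  // DCPS1 #0, traps as illegal

// Encode "B target" at address pc; fails if the offset is misaligned
// or outside the signed 26-bit word range (+/- 128 MB).
bool encode_branch(int64_t pc, int64_t target, uint32_t* insn) {
  constexpr int64_t kRange = int64_t{1} << 27;
  int64_t off = target - pc;
  if ((off & 3) != 0 || off < -kRange || off >= kRange) return false;
  *insn = 0x14000000u | (static_cast<uint32_t>(off >> 2) & 0x03FFFFFFu);
  return true;
}

// Word to store over the entry NOP when making a method not entrant:
// a direct branch if handle_wrong_method is reachable, else the trap.
uint32_t not_entrant_insn(int64_t entry, int64_t handle_wrong_method) {
  uint32_t insn;
  return encode_branch(entry, handle_wrong_method, &insn) ? insn : kDcps1;
}
```

Either way the patch is a single 32-bit store over the NOP, which is what keeps it safe to apply while other threads may be entering the method.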
>>
>> None of this is very nice, but it all works. It shouldn't affect
>> anything significant in most cases because our branch overflow will
>> never happen, and we'll not do anything different.
> 
> As you say it is not very nice, but it doesn't happen very often.
> 
> I assume the code you attached is experimental and will be tidied up
> before being pushed. There seems to be a lot of debug code / magic
> constants.

Indeed.

Andrew.