[aarch64-port-dev ] Very large code caches
Andrew Haley
aph at redhat.com
Tue Jan 7 03:03:03 PST 2014
On 01/07/2014 10:47 AM, Edward Nevill wrote:
> On Fri, 2014-01-03 at 12:53 +0000, Andrew Haley wrote:
>> The AArch64 immediate call instructions span +/- 128 Mbytes. This is
>> a good match for us: the default ReservedCodeCacheSize is 48M, and it
>> takes a lot to fill 128M.
>
> With tiered compilation the code cache size is set to 5 *
> ReservedCodeCacheSize. So we can currently have a code cache size of
> 240M.
>
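The figures above can be checked with a little arithmetic. This is only a sketch: it assumes the standard A64 BL encoding (a signed 26-bit immediate counted in 4-byte instructions) and the 5x tiered multiplier Edward mentions. The function names are made up for illustration and are not HotSpot code.

```python
# Assumption: BL/B carry a signed 26-bit immediate, scaled by the 4-byte instruction size.
BL_IMM_BITS = 26
INSN_BYTES = 4
M = 1024 * 1024

def bl_range_bytes():
    """Return (lowest, highest) byte offset reachable by a single BL."""
    lo = -(1 << (BL_IMM_BITS - 1)) * INSN_BYTES
    hi = ((1 << (BL_IMM_BITS - 1)) - 1) * INSN_BYTES
    return lo, hi

def bl_reachable(site, target):
    """True if a BL placed at `site` can reach `target` directly."""
    lo, hi = bl_range_bytes()
    off = target - site
    return off % INSN_BYTES == 0 and lo <= off <= hi

def tiered_code_cache_size(reserved=48 * M, multiplier=5):
    """Code cache size with tiered compilation, per the figures in this thread."""
    return reserved * multiplier

if __name__ == "__main__":
    print(bl_range_bytes())                  # (-134217728, 134217724)
    print(tiered_code_cache_size() // M)     # 240
    # A call from one end of a 240M cache to the other does not fit in a BL.
    print(bl_reachable(0, tiered_code_cache_size() - INSN_BYTES))  # False
```

So with the default 48M everything is in range, but a full 240M tiered cache is not.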
>> I've been kicking around a solution, attached here. We can have a
>> long call instruction, and for the sake of the exercise I've been
>> trying lea(r16, dest); blr(r16).
>
> dest is a literal here so this ends up being
>
> mov x0, .. / movk x0, .. / movk x0, .. / movk x0, ..
>
> Can we use adrp / add instead, for 2 instructions rather than 4? This
> allows a code cache size of up to 4G (the max allowed in hotspot is 2G
> in any case).

I don't think so, because an adrp / add pair can't be updated
atomically. I think we have to put the destination address in the
constant pool.

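To illustrate the atomicity point, here is a rough model. It assumes the usual A64 encodings (16-bit immediates for mov/movk, a signed 21-bit page delta for adrp) and treats patching as rewriting one instruction at a time. Real instruction-fetch visibility is more subtle than this, and the helper names are hypothetical.

```python
def movz_movk_chunks(addr):
    """Split a 64-bit address into the four 16-bit immediates of a
    mov/movk/movk/movk sequence, lowest chunk first."""
    assert 0 <= addr < (1 << 64)
    return [(addr >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]

def adrp_add_operands(pc, target):
    """Return (page_delta, low12) for an adrp/add pair, or None if the
    target is outside the +/- 4G reach of adrp."""
    page_delta = (target >> 12) - (pc >> 12)
    if not -(1 << 20) <= page_delta < (1 << 20):
        return None
    return page_delta, target & 0xFFF

def torn_values(old, new):
    """Addresses that are neither `old` nor `new` but could be observed
    if the immediates are rewritten one instruction at a time."""
    seen = []
    cur = movz_movk_chunks(old)
    for i, chunk in enumerate(movz_movk_chunks(new)):
        cur[i] = chunk
        value = sum(c << (16 * j) for j, c in enumerate(cur))
        if value not in (old, new) and value not in seen:
            seen.append(value)
    return seen

if __name__ == "__main__":
    old, new = 0x00007F0012345678, 0x00007F01ABCDEF00
    for v in torn_values(old, new):
        print(hex(v))
```

A single aligned 64-bit slot in the constant pool has no such intermediate states: a reader sees either the old address or the new one.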
>> The back end has to be changed in
>> quite a few places, but the real problem is that the resulting call
>> site is not MT-safe: it can't be patched atomically. To make that
>> work we'd have to move the destination address into the constant pool.
>>
>> So, I'm envisaging a solution where we wait until, when patching, we
>> have the first branch out of range. We then patch the site with a
>> trap that calls deoptimize. We also set a flag in the assembler. When
>> the method is recompiled after deoptimization it'll have long
>> branches, and from that time onwards all (inter-method) branches will
>> be long.
>
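The scheme in the quoted paragraph can be modelled as a one-way switch. This is a toy sketch only: the class, method names, and return strings are invented for illustration and do not correspond to anything in the attached patch.

```python
BL_RANGE = 128 * 1024 * 1024  # +/- 128 Mbytes, as stated earlier in the thread

class BranchPolicy:
    """Toy model: emit short calls until one patch target is out of range."""

    def __init__(self):
        # Stands in for the "flag in the assembler".
        self.use_long_branches = False

    def emit_call(self):
        """Kind of call site a newly compiled method would get."""
        return "long" if self.use_long_branches else "short"

    def patch_call(self, kind, site, target):
        """Patch an existing call site; returns what happens to it."""
        if kind == "long":
            return "patched"          # destination lives in the constant pool
        off = target - site
        if -BL_RANGE <= off < BL_RANGE:
            return "patched"          # still fits in the immediate branch
        self.use_long_branches = True  # sticky from now on
        return "deoptimize"           # site is replaced by a trap

if __name__ == "__main__":
    policy = BranchPolicy()
    print(policy.emit_call())                           # short
    print(policy.patch_call("short", 0, 200 * 2**20))   # deoptimize
    print(policy.emit_call())                           # long
```

The flag never goes back, so after the first overflow every recompiled method uses long branches.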
> Apologies if I have misunderstood but...
>
> After the 1st out of range patchup all branches are compiled as long
> branches. Does this include branches which subsequently need patching?
> If so, this will again be non MT safe.

Not if we use the cpool.

>> A gnarly problem is nmethod::make_not_entrant_or_zombie(). At present
>> we place a single NOP at the entry point of every method, and when we
>> deoptimize it we patch it with a branch to handle_wrong_method. We
>> can do this iff handle_wrong_method is reachable, i.e. less than
>> 128Mbytes away. We could deposit a copy of the handle_wrong_method
>> stub every 128Mbytes or so, but a better plan is to replace our NOP
>> with a trap of some kind. DCPS1 generates an illegal instruction
>> trap, so I've used that. It seems to work.
>>
>> None of this is very nice, but it all works. It shouldn't affect
>> anything significant in most cases because our branch overflow will
>> never happen, and we'll not do anything different.
>
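For the entry-point patch, the choice of instruction word might look like the sketch below. It assumes the standard A64 encodings for B (0x14000000 plus a 26-bit word offset) and DCPS1 (0xD4A00001). Falling back to the trap only when the stub is out of reach is an assumption made for illustration; the mail may equally mean that the trap is always used. The function names are hypothetical.

```python
NOP = 0xD503201F     # what sits at the method entry before patching
DCPS1 = 0xD4A00001   # used here as an illegal-instruction trap

def encode_b(offset):
    """Encode an unconditional B to pc + offset (offset in bytes)."""
    assert offset % 4 == 0
    imm26 = offset // 4
    assert -(1 << 25) <= imm26 < (1 << 25)
    return 0x14000000 | (imm26 & 0x03FFFFFF)

def entry_patch_word(entry, stub):
    """Word to write over the entry NOP when a method becomes not entrant:
    a direct branch if the stub is within +/- 128 Mbytes, else the trap."""
    off = stub - entry
    if off % 4 == 0 and -(1 << 27) <= off < (1 << 27):
        return encode_b(off)
    return DCPS1

if __name__ == "__main__":
    print(hex(entry_patch_word(0x1000, 0x1008)))   # 0x14000002
    print(hex(entry_patch_word(0, 1 << 27)))       # 0xd4a00001
```

Either way the patch is a single 32-bit store over the NOP, so the entry point stays MT-safe.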
> As you say it is not very nice, but it doesn't happen very often.
>
> I assume the code you attached is experimental and will be tidied up
> before being pushed. There seems to be a lot of debug code / magic
> constants.

Indeed.

Andrew.