[aarch64-port-dev ] AARCH64: 8064611: Changes to HotSpot shared code
Lindenmaier, Goetz
goetz.lindenmaier at sap.com
Fri Nov 14 11:27:23 UTC 2014
Hi,
on PPC, we solved this with the trampoline stubs we introduced.
Short calls are done directly. If we need a longer call, we jump to the
trampoline stub that does the load from the constant pool.
So short calls are efficient, but longer ones have an overhead of a short
branch. Another advantage for us is that we can schedule the short
branch for Power6 well.
The drawback is that a trampoline stub is emitted for every call,
wasting code cache and constant pool entries.
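
To make the scheme concrete, here is a minimal sketch in Python of the decision made when a call is bound to its target. It is a toy model, not the PPC port's code: `CallSite`, `bind_call`, `in_reach` and the choice to always write the constant pool slot are hypothetical, and the reach constant assumes the +/-32 MiB range of a PPC `bl`.

```python
from dataclasses import dataclass

# Assumption: a PPC `bl` carries a 24-bit word offset, so it reaches +/-32 MiB.
SHORT_BRANCH_REACH = 32 * 1024 * 1024


@dataclass
class CallSite:
    pc: int             # address of the call instruction
    trampoline_pc: int  # address of this call's trampoline stub (always emitted)
    pool_slot: int      # constant pool entry the stub loads its target from


def in_reach(pc, target, reach=SHORT_BRANCH_REACH):
    """True if a single short branch at `pc` can reach `target`."""
    offset = target - pc
    return -reach <= offset < reach


def bind_call(site, target, constant_pool):
    """Return the address the call instruction should branch to.

    The real destination is always stored in the constant pool slot
    (a modelling assumption), so the stub works whenever it is used.
    """
    constant_pool[site.pool_slot] = target
    if in_reach(site.pc, target):
        return target             # short call: branch directly
    return site.trampoline_pc     # long call: short branch to the stub
```

The stub and the pool slot exist even for calls that end up direct, which is the wasted space mentioned above.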
Best regards,
Goetz.
-----Original Message-----
From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Andrew Haley
Sent: Friday, 14 November 2014 11:41
To: Vladimir Kozlov; hotspot-dev Source Developers; aarch64-port-dev at openjdk.java.net
Subject: Re: AARCH64: 8064611: Changes to HotSpot shared code
On 14/11/14 03:10, Vladimir Kozlov wrote:
> Is it really 128MB max value for ReservedCodeCacheSize on aarch64?
Well, here's the story. Branches can reach 128M. One of the core
assumptions HotSpot makes (for inline caches and a few other things)
is that you can atomically patch a branch or call. Patching
multi-word blocks of code on AArch64 is very hard because there is no
ordering of memory access between cores and no synchronization between
instruction and data caches. And you can only patch nops, branches,
and traps: anything else is undefined behaviour.
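
The 128M figure follows from the instruction encoding: an AArch64 `B` or `BL` holds a signed 26-bit word offset, so it reaches 2^25 words of 4 bytes in either direction. As a rough illustration in Python (the helper names `encode_b` and `retarget` are made up for this sketch and are not HotSpot functions), retargeting such a branch changes only the immediate field of one 32-bit word, which is why it can be patched with a single store:

```python
B_OPCODE = 0x14000000        # AArch64 `B`; `BL` is 0x94000000
IMM26_MASK = (1 << 26) - 1   # low 26 bits hold the signed word offset
REACH = 1 << 27              # 2**25 words * 4 bytes = 128 MiB each way


def encode_b(pc, target):
    """Encode `B target` for an instruction located at `pc`."""
    offset = target - pc
    if offset % 4 != 0:
        raise ValueError("branch targets must be 4-byte aligned")
    if not -REACH <= offset < REACH:
        raise ValueError("target is out of reach of a single branch")
    return B_OPCODE | ((offset >> 2) & IMM26_MASK)


def retarget(insn, pc, new_target):
    """Re-point an existing B/BL; the result is still one 32-bit word."""
    opcode = insn & ~IMM26_MASK
    return opcode | (encode_b(pc, new_target) & IMM26_MASK)
```

A target further than 128 MiB away makes `encode_b` fail, and any replacement sequence is longer than one word, which is where the patching problem starts.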
So, we need to patch running code. If branches are over 128M, we're
going to find it hard. The only decent (and architecturally
well-defined) way I found was to use a load from the constant pool to
supply the destination. And that causes a delay, even when reading
from L1 cache. Every call is potentially a far call, and (once you're
over 128M) so is every branch from compiled code into the runtime.
(There are several other ways to handle far branches, but they're all
pretty unpleasant. For example, it is possible to handle it
optimistically: compile short branches and assume every branch will
reach, and deoptimize if we get unlucky, but eww.)
I have written code to handle a large code cache and tried various
ideas, but I abandoned it. The key insight for me was the realization
that the code cache is just that: it's a cache. And IMO it makes more
sense to live with a smaller code cache than to pessimize everything.
Having said all that, I admit the decision to limit the cache to 128M
might be the wrong choice for some workloads, so I am quite happy to
revisit this problem at a later date, but I don't think it's critical
right now.
> What is default ReservedCodeCacheSize size?
I don't quite understand what you're asking. On AArch64, or other
systems? Default is 64M * 5 for C2.
Andrew.