AARCH64: 8064611: Changes to HotSpot shared code

Fri Nov 14 17:06:18 UTC 2014

Thank you, Edward and Andrew, for the explanation.

So the defaults are as follows:

aarch64(C1) ReservedCodeCacheSize = 32*M
aarch64(C2) ReservedCodeCacheSize = 48*M

I understand that *5 will exceed the 128MB limit. My concern is that, in our experience, TieredCompilation will easily
eat 200MB. So with 128MB you may hit performance issues during long runs (C1 does not stop compiling) because
there will not be enough space for code, and we will start evicting old compiled code or stop compiling.
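
The *5 concern can be checked with a little arithmetic. This is a minimal sketch, assuming the tiered configuration simply multiplies the base ReservedCodeCacheSize by 5 and that 128MB is a hard limit; the helper names are hypothetical and are not HotSpot code:

```cpp
#include <cstdint>

// Size unit, named after the HotSpot convention used in this thread.
constexpr uint64_t M = 1024 * 1024;

// Assumption: the AArch64 code cache cannot grow past 128MB.
constexpr uint64_t kAArch64Limit = 128 * M;

// Hypothetical helper: base size scaled by 5 for tiered compilation.
constexpr uint64_t tiered_size(uint64_t base) { return base * 5; }

// Hypothetical helper: does a requested size fit under the limit?
constexpr bool fits_limit(uint64_t size) { return size <= kAArch64Limit; }

// 32*M * 5 = 160*M and 48*M * 5 = 240*M, so both scaled defaults overshoot.
static_assert(!fits_limit(tiered_size(32 * M)), "C1 default * 5 exceeds 128MB");
static_assert(!fits_limit(tiered_size(48 * M)), "C2 default * 5 exceeds 128MB");
```

Both scaled values land above 128MB, which is why the multiplier cannot be applied as-is on this platform.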

On 11/14/14 2:34 AM, Edward Nevill wrote:
 >> >You may need to change the following code if you can allocate only 128MB:
 >> >
 >> >2547   } else if (ReservedCodeCacheSize > 2*G) {
 >> >2548     // Code cache size larger than MAXINT is not supported.
 >> >2549     jio_fprintf(defaultStream::error_stream(),
 >> >
 >> >I think you need to add a new platform-specific flag, CodeCacheSizeLimit,
 >> >and use it instead of our hard-coded 2GB (maxint).
 > OK. So what you are suggesting is adding CodeCacheSizeLimit as a product_pd to globals.hpp, then adding a
 > define_pd_global to each of globals_aarch64.hpp, globals_x86.hpp, ....? Or something else?

Yes, I thought about product_pd() and define_pd_global(). But then some crazy (security) guys may start playing with it
and file P1 bugs. Okay, how about a constant (#define CODE_CACHE_SIZE_LIMIT NOT_AARCH64(2*G) AARCH64_ONLY(128*M)) in this
place, so you can use it in these two checks?
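
A rough sketch of how such a constant could be used, assuming the AArch64 limit is the 128MB discussed in this thread. The macro definitions below are stand-ins modeled on HotSpot's platform macros so that the snippet compiles on its own, and reserved_size_ok() is a hypothetical helper, not the actual check in the arguments code:

```cpp
#include <cstdint>

// Size units, named after the HotSpot convention.
constexpr uint64_t M = 1024 * 1024;
constexpr uint64_t G = 1024 * M;

// Stand-in definitions (assumption): select code by target platform.
#ifdef AARCH64
#define AARCH64_ONLY(code) code
#define NOT_AARCH64(code)
#else
#define AARCH64_ONLY(code)
#define NOT_AARCH64(code) code
#endif

// Exactly one of the two macros expands to its argument.
#define CODE_CACHE_SIZE_LIMIT (NOT_AARCH64(2 * G) AARCH64_ONLY(128 * M))

// Hypothetical helper: accept a requested ReservedCodeCacheSize only if
// it does not exceed the platform limit.
constexpr bool reserved_size_ok(uint64_t reserved) {
  return reserved <= CODE_CACHE_SIZE_LIMIT;
}
```

A compile-time constant cannot be changed from the command line, which is the point of preferring it over a product_pd() flag here.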

 >
 > All the best,
 > Ed.

Thanks,
Vladimir

On 11/14/14 2:40 AM, Andrew Haley wrote:
> On 14/11/14 03:10, Vladimir Kozlov wrote:
>> Is it really 128MB max value for ReservedCodeCacheSize on aarch64?
>
> Well, here's the story.  Branches can reach 128M.  One of the core
> assumptions HotSpot makes (for inline caches and a few other things)
> is that you can atomically patch a branch or call.  Patching
> multi-word blocks of code on AArch64 is very hard because there is no
> ordering of memory access between cores and no synchronization between
> instruction and data caches.  And you can only patch nops, branches,
> and traps: anything else is undefined behaviour.
>
> So, we need to patch running code.  If branches are over 128M, we're
> going to find it hard.  The only decent (and architecturally
> well-defined) way I found was to use a load from the constant pool to
> supply the destination.  And that causes a delay, even when reading
> from L1 cache.  Every call is potentially a far call, and (once you're
> over 128M) so is every branch from compiled code into the runtime.
> (There are several other ways to handle far branches, but they're all
> pretty unpleasant.  For example, it is possible to handle it
> optimistically: compile short branches and assume every branch will
> reach, and deoptimize if we get unlucky, but eww.)
>
> I have written code to handle a large code cache and tried various
> ideas, but I abandoned it.  The key insight for me was the realization
> that the code cache is just that: it's a cache.  And IMO it makes more
> sense to live with a smaller code cache than to pessimize everything.
>
> Having said all that, I admit the decision to limit the cache to 128M
> might be the wrong choice for some workloads, so I am quite happy to
> revisit this problem at a later date, but I don't think it's critical
> right now.
>
>> What is default ReservedCodeCacheSize size?
>
> I don't quite understand what you're asking.  On AArch64, or other
> systems?  Default is 64M * 5 for C2.
>
> Andrew.
>
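
For reference, the 128M reach Andrew describes follows from the AArch64 B and BL encoding: a signed 26-bit immediate counted in 4-byte instructions, giving a byte offset in [-2^27, 2^27 - 4]. A minimal sketch of the reachability test, with a hypothetical function name that does not come from the port:

```cpp
#include <cstdint>

// Reach of an AArch64 B/BL instruction: signed 26-bit word offset,
// i.e. a byte offset in [-2^27, 2^27 - 4].
constexpr int64_t kBranchRange = int64_t(1) << 27;  // 128MB

// Hypothetical helper: can a single branch at address `from` reach `to`?
// Both addresses are assumed to be instruction addresses (4-byte aligned).
constexpr bool branch_reaches(int64_t from, int64_t to) {
  return (to - from) >= -kBranchRange &&
         (to - from) < kBranchRange &&
         (to - from) % 4 == 0;
}
```

If every code cache address lies within one 128MB region, this test holds for any pair of addresses in it, so no call needs the far-branch sequence.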