Question about ccs reservation, CDS and aarch64 specifics

Ioi Lam ioi.lam at oracle.com
Thu Apr 16 17:46:37 UTC 2020


(I suppose you mean "compressed class space" by "ccs" :-)

On 4/16/20 9:24 AM, Andrew Haley wrote:
> Hi,
>
> On 4/16/20 4:18 PM, Thomas Stüfe wrote:
>> I am currently trying to wrap my head around the various ways the
>> CompressedClassSpace is reserved. Coding has grown a bit in complexity with
>> the advent of CDS/AppCDS and recently some aarch64 changes atop of that,
>> changing behavior for aarch64 and ppc64. Maybe someone can enlighten me a
>> bit.
>>
>> Specifically, I am looking at
>>
>> Metaspace::reserve_space
>>
>> and its aarch64-specific outgrowth
>>
>> Metaspace::reserve_preferred_space
> Yowza. This one is mine, I think.
>
>> Despite its generic-sounding name, these functions can only be used to
>> allocate ccs. They lack any interface description, so I parsed the code to
>> understand their behavior.
>>
>> So I tried and here is how I think Metaspace::reserve_space works for the
>> various combinations of input parameters:
> Bear in mind that this was changed recently. It was (even more)
> complicated before.
>
> Please read the discussion at
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html
>
>> A) requested_addr == NULL && use_requested_addr == false:
>> [aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent
>> allocation points. Failing that, return an unreserved space.
>> [others]: Reserve a space anywhere.
>>
>> B) requested_addr == NULL && use_requested_addr == true:
>> [aarch64, ppc64]: Does nothing, returns an unreserved space immediately. I
>> assume this would be an invalid combination, but since it is not asserted I
>> am not sure.
>> [others]: Reserve a space anywhere (use_requested_addr is ignored).
>>
>> C) requested_addr != NULL && use_requested_addr == false:
>> [aarch64, ppc64]: First attempt to reserve at the requested address, but
>> only if that would cause the space to fall into the lower 4G. Failing
>> that, allocate at one of the preferred OS dependent allocation points.
>> Failing that, return an unreserved space.
> Yes. We have to do that, because we can't cope with the heap base
> being anything other than a multiple of 4*G. We've got rid of
> rheapbase, in other words, for all compiled code.
>
>> [others]: Attempt to reserve at requested_addr. Failing that, return an
>> unreserved space.
>>
>> D) requested_addr != NULL && use_requested_addr == true:
>> [aarch64, ppc64]: First attempt to reserve at the requested address, but
>> only if that would cause the space to fall into the lower 4G. Failing
>> that, return an unreserved space.
>> [others]: Attempt to reserve at requested_addr. Failing that, return an
>> unreserved space. (use_requested_addr is ignored).
>>
>> Note the many subtle platform differences. E.g. on aarch64 we honor the
>> requested address only if ccs would fall below 4G, for all other platforms
>> we always honor them. Or how for most platforms the parameter
>> "use_requested_addr" is just ignored.
>>
>> Or that on aarch64 we never seem to "try anywhere", we just try a
>> fixed set of attachment points and if these are all occupied we
>> fail. Is this a bug or by design?
> It's by design. We looked at it and decided that we would always be
> able to allocate one of our "nice" points: they are spaced 4G apart,
> and it's very unlikely that any Linux system (which is all we support)
> would fail to map any of the possibilities.
>
>> Can we always rely on at least one of the attachment points being
>> unoccupied? Looking at the options for
>> MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using
>> base + (x << shift) seems not to be wanted? There seems to be no fall
>> back mode which would work with any value of base/shift?
> That is correct.
>
>> About reserve_preferred_space(), I was confused why a separate
>> "use_requested_addr" was even needed - requested_addr!=NULL would be a
>> perfectly valid way to communicate that the requested address should be
>> used. I wish we could simplify the coding to just two cases:
>> - hand down a requested address, which is to be taken-or-fail (somewhat
>> like case D)
>> - hand down NULL, which means "try whatever": which for most OSes would be
>> really anywhere, for aarch64 could be the fixed set of attachment points.
>> This would be case (A).
>>
>> About case (C): under which circumstances does it happen that caller code
>> hands down a requested address below 4G which happens to be free?
> I don't know.
I am not even sure if case (C) can happen at all.

I admit that I've been guilty of making the interface even more complicated
with JDK-8231610 <https://bugs.openjdk.java.net/browse/JDK-8231610>
(Relocate the CDS archive if it cannot be mapped to the requested address).
Looks like now is a good time to clean up.

One thing that can be cleaned up is the call to
Metaspace::allocate_metaspace_compressed_klass_ptrs:

(a) when CDS is enabled:

     Metaspace::global_initialize()
     -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
        -> ... MetaspaceShared::map_archives()
          -> ... reserve the space, eventually calling
             Metaspace::reserve_space
          -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()

(b) when CDS is disabled:

     Metaspace::global_initialize()
     -> allocate_metaspace_compressed_klass_ptrs
        -> (if cds is not enabled) Metaspace::reserve_space()


In case (b), we should first reserve the space, and then call into
allocate_metaspace_compressed_klass_ptrs. This will simplify the arguments
of allocate_metaspace_compressed_klass_ptrs, and will also limit the
variations of calls to Metaspace::reserve_space(). I think this will make it
possible to drop the use_requested_addr argument and rely simply on
(requested_addr != NULL).
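
A rough sketch of what the two-case interface could look like, as a toy
model rather than HotSpot code. Everything here is hypothetical and only for
illustration: Reservation, ToyAddressSpace, the candidates list and this
reserve_space() signature are made up, and a requested address of 0 plays
the role of NULL.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a reserved range; NOT HotSpot's ReservedSpace.
struct Reservation {
  uint64_t base;
  uint64_t size;
  bool is_reserved() const { return size != 0; }
};

// Toy address space that only remembers which ranges are already taken.
struct ToyAddressSpace {
  std::vector<Reservation> taken;

  // True if [base, base + size) overlaps none of the taken ranges.
  bool is_free(uint64_t base, uint64_t size) const {
    for (const Reservation& r : taken) {
      if (base < r.base + r.size && r.base < base + size) return false;
    }
    return true;
  }

  // Reserve exactly at base, or return an unreserved Reservation.
  Reservation try_reserve_at(uint64_t base, uint64_t size) {
    if (!is_free(base, size)) return Reservation{};
    Reservation r{base, size};
    taken.push_back(r);
    return r;
  }
};

// requested_addr != 0: take-or-fail at exactly that address (like case D).
// requested_addr == 0: "try whatever" (like case A); in this toy model that
// means walking an assumed list of candidate attachment points in order.
Reservation reserve_space(ToyAddressSpace& as, uint64_t size,
                          uint64_t requested_addr,
                          const std::vector<uint64_t>& candidates) {
  if (requested_addr != 0) {
    return as.try_reserve_at(requested_addr, size);
  }
  for (uint64_t c : candidates) {
    Reservation r = as.try_reserve_at(c, size);
    if (r.is_reserved()) return r;
  }
  return Reservation{};  // nothing free: the caller sees an unreserved space
}
```

With this shape there is no separate use_requested_addr flag: the caller
either passes an address and accepts failure, or passes 0 and lets the
platform-specific policy pick.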

Thanks
- Ioi


>> Does that make sense? In other words, if the whole point of
>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>> to find a good address", would it not make sense to just try a low
>> address as part of the try-addresses-loop?
> We certainly don't want to have to use a dedicated heapbase register
> or a shift. Just give us a multiple of 4*G and we're happy.
>
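
To illustrate the "multiple of 4*G" point with plain arithmetic (a minimal
sketch, assuming an unshifted 32-bit narrow value; the function names are
made up and this is not the aarch64 MacroAssembler code): when the base has
its low 32 bits clear, adding the narrow value and OR-ing it in give the
same result, so decoding does not need a general add against a base
register.

```cpp
#include <cassert>
#include <cstdint>

const uint64_t G = 1ULL << 30;

// True if base is a multiple of 4G, i.e. its low 32 bits are all zero.
bool is_multiple_of_4g(uint64_t base) {
  return (base & (4 * G - 1)) == 0;
}

// General decoding with zero shift: needs a 64-bit add with the base.
uint64_t decode_add(uint64_t base, uint32_t narrow) {
  return base + narrow;
}

// Decoding that only combines bits; equivalent to decode_add() only when
// base is a multiple of 4G, because then base and narrow share no set bits.
uint64_t decode_or(uint64_t base, uint32_t narrow) {
  return base | narrow;
}
```

For a base that is not 4G-aligned the two functions can disagree, which is
why an arbitrary "anywhere" address does not fit this scheme.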


