Question about ccs reservation, CDS and aarch64 specifics

Ioi Lam ioi.lam at oracle.com
Thu Apr 16 18:14:03 UTC 2020



On 4/16/20 10:51 AM, Thomas Stüfe wrote:
> Hi Andrew,
>
> Thanks for the prompt answer. See my answers inline.
>
> On Thu, Apr 16, 2020 at 6:24 PM Andrew Haley <aph at redhat.com> wrote:
>
>> Hi,
>>
>> On 4/16/20 4:18 PM, Thomas Stüfe wrote:
>>> I am currently trying to wrap my head around the various ways the
>>> CompressedClassSpace is reserved. Coding has grown a bit in complexity with
>>> the advent of CDS/AppCDS and recently some aarch64 changes atop of that,
>>> changing behavior for aarch64 and ppc64. Maybe someone can enlighten me a
>>> bit.
>>>
>>> Specifically, I am looking at
>>>
>>> Metaspace::reserve_space
>>>
>>> and its aarch64-specific outgrow
>>>
>>> Metaspace::reserve_preferred_space
>> Yowza. This one is mine, I think.
>>
>>> Despite its generic-sounding name, these functions can only be used to
>>> allocate ccs. They lack any interface description, so I parsed the code to
>>> understand their behavior.
>>>
>>> So I tried and here is how I think Metaspace::reserve_space works for the
>>> various combinations of input parameters:
>> Bear in mind that this was changed recently. It was (even more)
>> complicated before.
>>
>>
> Yes, ccs reservation is complex, and the aarch64 parts are only a small
> part of it. I would love to simplify it a bit but it's not that easy.
>
>
>> Please read the discussion at
>>
>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html
>>
>>
> Thank you for pointing me to the discussion. We have missed that review.
>
> The reason I looked at this coding was due to the new Metaspace
> implementation. In the new allocator it is possible to allocate a Klass at
> ccs offset zero (currently this never happens, but only by accident). If
> CompressedKlassPointers::base() points to the start of ccs, the resulting
> narrow Klass pointer would be 0. But the VM cannot tell that apart from a
> real NULL reference. So far my cheap fix has been to
> move CompressedKlassPointers::base() a bit below the start of the ccs. But
> as I saw yesterday that breaks the 4G-alignment-assumption on aarch64.
> Never mind, there are different ways to solve that, but I wondered why
> aarch64 could not handle this crooked base address.
Wouldn't it be easier to just allocate 8 bytes of dummy data at the 
beginning of the CCS?
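
To make the suggestion concrete, here is a minimal sketch of the problem and of the dummy-data workaround. It is a toy model, not HotSpot code: NarrowKlassCodec and ClassSpace are hypothetical names, and the base address and the shift of 3 are arbitrary values picked for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of narrow Klass pointer encoding (not HotSpot code).
struct NarrowKlassCodec {
    std::uint64_t base;  // encoding base, here equal to the start of the ccs
    unsigned shift;      // encoding shift (assumed value, for illustration)

    std::uint32_t encode(std::uint64_t klass_addr) const {
        return static_cast<std::uint32_t>((klass_addr - base) >> shift);
    }
    std::uint64_t decode(std::uint32_t narrow) const {
        return base + (static_cast<std::uint64_t>(narrow) << shift);
    }
};

// Hypothetical bump allocator over the class space. reserved_prefix is the
// number of dummy bytes skipped at the start so that no Klass lives at
// offset zero.
struct ClassSpace {
    std::uint64_t top;

    ClassSpace(std::uint64_t start, std::uint64_t reserved_prefix)
        : top(start + reserved_prefix) {}

    std::uint64_t allocate(std::uint64_t bytes) {
        std::uint64_t result = top;
        top += bytes;
        return result;
    }
};
```

With reserved_prefix == 0 the first Klass encodes to 0, which cannot be told apart from NULL. With reserved_prefix == 8 the first Klass encodes to a nonzero value, and the base can stay at the start of the ccs, so its alignment is untouched.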

Thanks
- Ioi

>
>
>>> A) requested_addr == NULL && use_requested_addr == false:
>>> [aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent
>>> allocation points. Failing that, return an unreserved space.
>>> [others]: Reserve a space anywhere.
>>>
>>> B) requested_addr == NULL && use_requested_addr == true:
>>> [aarch64, ppc64]: Does nothing, returns an unreserved space immediately. I
>>> assume this would be an invalid combination, but since it is not asserted I
>>> am not sure.
>>> [others]: Reserve a space anywhere (use_requested_addr is ignored).
>>>
>>> C) requested_addr != NULL && use_requested_addr == false:
>>> [aarch64, ppc64]: First attempt to reserve at the requested address, but
>>> only if that would cause the space to fall into the lower 4G. Failing
>>> that, allocate at one of the preferred OS dependent allocation points.
>>> Failing that, return an unreserved space.
>> Yes. We have to do that, because we can't cope with the heap base
>> being anything other than a multiple of 4*G. We've got rid of
>> rheapbase, in other words, for all compiled code.
>>
>>> [others]: Attempt to reserve at requested_addr. Failing that, return an
>>> unreserved space.
>>>
>>> D) requested_addr != NULL && use_requested_addr == true:
>>> [aarch64, ppc64]: First attempt to reserve at the requested address, but
>>> only if that would cause the space to fall into the lower 4G. Failing
>>> that, return an unreserved space.
>>> [others]: Attempt to reserve at requested_addr. Failing that, return an
>>> unreserved space. (use_requested_addr is ignored).
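
Cases A) to D) above can be condensed into a small decision function. This is only a sketch of the behavior as described in the quoted text, not the actual Metaspace::reserve_space code; Attempt and reservation_plan are hypothetical names, and restricted_platform stands for the [aarch64, ppc64] branch.

```cpp
#include <vector>

// Hypothetical labels for the reservation attempts described in cases A-D.
enum class Attempt {
    RequestedAddrIfBelow4G,  // honor requested_addr only if the space ends up below 4G
    RequestedAddr,           // honor requested_addr unconditionally
    PreferredPoints,         // try the fixed, OS dependent attachment points
    Anywhere                 // let the OS pick any address
};

// Returns the attempts in the order they would be tried. An empty plan, or a
// plan whose attempts all fail, corresponds to returning an unreserved space.
std::vector<Attempt> reservation_plan(bool restricted_platform,
                                      bool has_requested_addr,
                                      bool use_requested_addr) {
    std::vector<Attempt> plan;
    if (restricted_platform) {
        if (has_requested_addr) {
            plan.push_back(Attempt::RequestedAddrIfBelow4G);
        }
        if (!use_requested_addr) {
            plan.push_back(Attempt::PreferredPoints);
        }
    } else {
        // use_requested_addr is ignored on the other platforms.
        plan.push_back(has_requested_addr ? Attempt::RequestedAddr
                                          : Attempt::Anywhere);
    }
    return plan;
}
```

Written this way, case B on the restricted platforms shows up as an empty plan, and Attempt::Anywhere never appears for them.
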
>>>
>>> Note the many subtle platform differences. E.g. on aarch64 we honor the
>>> requested address only if ccs would fall below 4G, for all other platforms
>>> we always honor them. Or how for most platforms the parameter
>>> "use_requested_addr" is just ignored.
>>>
>>> Or that on aarch64 we never seem to "try anywhere", we just try a
>>> fixed set of attachment points and if these are all occupied we
>>> fail. Is this a bug or by design?
>> It's by design. We looked at it and decided that we would always be
>> able to allocate one of our "nice" points: they are spaced 4G apart,
>> and it's very unlikely that any Linux system (which is all we support)
>> would fail to map any of the possibilities.
>>
>>
> Thank you for clarifying. This clearly distinguishes aarch64 from at least
> AIX and possibly Linux ppc, not sure - there we clearly want a fallback
> "try anywhere". We have to look at the code again.
>
> I believe either Goetz or me wrote the original AIX version but my memory
> is dim. I am at a loss why we restricted this to AIX only. I have to talk
> this over with Goetz.
>
>
>>> Can we always rely on at least one of the attachment points being
>>> unoccupied? Looking at the options for
>>> MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using
>>> x + base >> shift seems not to be wanted? There seems to be no fall
>>> back mode which would work with any value of base/shift?
>> That is correct.
>>
>>> About reserve_preferred_space(), I was confused why a separate
>>> "use_requested_addr" was even needed - requested_addr!=NULL would be a
>>> perfectly valid way to communicate that the requested address should be
>>> used. I wish we could simplify the coding to just two cases:
>>> - hand down a requested address, which is to be taken-or-fail (somewhat
>>> like case D)
>>> - hand down NULL, which means "try whatever": which for most OSes would be
>>> really anywhere, for aarch64 could be the fixed set of attachment points.
>>> This would be case (A).
>>>
>>> About case (C): under which circumstances does it happen that caller code
>>> hands down a requested address below 4G which happens to be free?
>> I don't know.
>>
>>> Does that make sense? In other words, if the whole point of
>>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>> to find a good address", would it not make sense to just try a low
>>> address as part of the try-addresses-loop?
>> We certainly don't want to have to use a dedicated heapbase register
>> or a shift. Just give us a multiple of 4*G and we're happy.
>>
>>
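
As a side note on why a multiple of 4*G is convenient: if the low 32 bits of the base are zero and the narrow value is an unshifted 32-bit quantity, adding the base and OR-ing it in give the same result, so decoding only has to fill in the upper half of the address. The sketch below shows just that arithmetic; it assumes a shift of zero, the function names are hypothetical, and it makes no claim about the instructions the aarch64 port actually emits.

```cpp
#include <cstdint>

constexpr std::uint64_t k4G = std::uint64_t{1} << 32;

// True if the low 32 bits of base are all zero.
constexpr bool is_4g_aligned(std::uint64_t base) {
    return (base & (k4G - 1)) == 0;
}

// General decode with an arbitrary base: needs a full 64-bit add.
constexpr std::uint64_t decode_add(std::uint64_t base, std::uint32_t narrow) {
    return base + narrow;
}

// Decode for a 4G-aligned base: the upper 32 bits come from the base and the
// lower 32 bits from the narrow value, so a bitwise OR is enough.
constexpr std::uint64_t decode_combine(std::uint64_t base, std::uint32_t narrow) {
    return base | narrow;
}

// The two agree whenever the base is 4G-aligned ...
static_assert(decode_add(3 * k4G, 0x1234) == decode_combine(3 * k4G, 0x1234),
              "aligned base: add and OR agree");
// ... and can differ as soon as the base has low bits set.
static_assert(decode_add(3 * k4G + 0x1000, 0x1000) !=
              decode_combine(3 * k4G + 0x1000, 0x1000),
              "unaligned base: add and OR differ");
```

This is also why moving the base slightly below the start of the ccs breaks the scheme: the base then has low bits set and the shortcut no longer holds.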
> Good to know. So, zero based encoding does not have any special place in
> your heart? 4G aligned base works just as well?
>
> Thanks, Thomas
>
>


