Question about ccs reservation, CDS and aarch64 specifics

Thomas Stüfe thomas.stuefe at gmail.com
Thu Apr 16 17:51:04 UTC 2020


Hi Andrew,

Thanks for the prompt answer. See my answers inline.

On Thu, Apr 16, 2020 at 6:24 PM Andrew Haley <aph at redhat.com> wrote:

> Hi,
>
> On 4/16/20 4:18 PM, Thomas Stüfe wrote:
> >
> > I am currently trying to wrap my head around the various ways the
> > CompressedClassSpace is reserved. Coding has grown a bit in complexity
> > with the advent of CDS/AppCDS and recently some aarch64 changes atop of
> > that, changing behavior for aarch64 and ppc64. Maybe someone can
> > enlighten me a bit.
> >
> > Specifically, I am looking at
> >
> > Metaspace::reserve_space
> >
> > and its aarch64-specific outgrow
> >
> > Metaspace::reserve_preferred_space
>
> Yowza. This one is mine, I think.
>
> > Despite their generic-sounding names, these functions can only be used
> > to allocate ccs. They lack any interface description, so I parsed the
> > code to understand their behavior.
> >
> > So I tried and here is how I think Metaspace::reserve_space works for the
> > various combinations of input parameters:
>
> Bear in mind that this was changed recently. It was (even more)
> complicated before.
>
>
Yes, ccs reservation is complex, and the aarch64 parts are only a small
part of it. I would love to simplify it a bit but it's not that easy.


> Please read the discussion at
>
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html
>
>
Thank you for pointing me to the discussion. We have missed that review.

The reason I looked at this coding was the new Metaspace implementation.
In the new allocator it is possible to allocate a Klass at ccs offset
zero (currently this never happens, but only by accident). If
CompressedKlassPointers::base() points to the start of ccs, the resulting
narrow Klass pointer would be 0, and the VM cannot tell that apart from a
real NULL reference. So far my cheap fix has been to move
CompressedKlassPointers::base() a bit below the start of the ccs. But as I
saw yesterday, that breaks the 4G-alignment assumption on aarch64. Never
mind, there are different ways to solve that, but I wondered why aarch64
could not handle this crooked base address.
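
To make the problem concrete, here is a minimal sketch of the encoding
arithmetic. It is a simplified model, not the HotSpot code: the struct
NarrowKlassEncoding, the helper is_4g_aligned and the 4096-byte offset used
in the note below are all made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t G = 1024ULL * 1024 * 1024;

// Hypothetical, simplified model of narrow Klass pointer encoding.
// "base" plays the role of CompressedKlassPointers::base().
struct NarrowKlassEncoding {
  uint64_t base;   // encoding base
  unsigned shift;  // encoding shift

  // Compress a full Klass address into a 32-bit narrow value.
  uint32_t encode(uint64_t klass_addr) const {
    assert(klass_addr >= base);
    return static_cast<uint32_t>((klass_addr - base) >> shift);
  }

  // Expand a narrow value back into a full address.
  uint64_t decode(uint32_t narrow) const {
    return base + (static_cast<uint64_t>(narrow) << shift);
  }
};

// True if addr is a multiple of 4G, which is what aarch64 wants for the base.
inline bool is_4g_aligned(uint64_t addr) {
  return (addr % (4 * G)) == 0;
}
```

With base equal to the ccs start, a Klass at offset zero encodes to 0, the
same value as a NULL narrow pointer. Lowering the base by, say, 4096 bytes
makes the encoded value nonzero, but the base is then no longer a multiple
of 4G.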


> > A) requested_addr == NULL && use_requested_addr == false:
> > [aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent
> > allocation points. Failing that, return an unreserved space.
> > [others]: Reserve a space anywhere.
> >
> > B) requested_addr == NULL && use_requested_addr == true:
> > [aarch64, ppc64]: Does nothing, returns an unreserved space immediately.
> > I assume this would be an invalid combination, but since it is not
> > asserted I am not sure.
> > [others]: Reserve a space anywhere (use_requested_addr is ignored).
> >
> > C) requested_addr != NULL && use_requested_addr == false:
> > [aarch64, ppc64]: First attempt to reserve at the requested address, but
> > only if that would cause the space to fall into the lower 4G. Failing
> > that, allocate at one of the preferred OS dependent allocation points.
> > Failing that, return an unreserved space.
>
> Yes. We have to do that, because we can't cope with the heap base
> being anything other than a multiple of 4*G. We've got rid of
> rheapbase, in other words, for all compiled code.
>
> > [others]: Attempt to reserve at requested_addr. Failing that, return an
> > unreserved space.
> >
> > D) requested_addr != NULL && use_requested_addr == true:
> > [aarch64, ppc64]: First attempt to reserve at the requested address, but
> > only if that would cause the space to fall into the lower 4G. Failing
> > that, return an unreserved space.
> > [others]: Attempt to reserve at requested_addr. Failing that, return an
> > unreserved space (use_requested_addr is ignored).
> >
> > Note the many subtle platform differences. E.g. on aarch64 we honor the
> > requested address only if ccs would fall below 4G, while on all other
> > platforms we always honor it. Or how for most platforms the parameter
> > "use_requested_addr" is just ignored.
> >
> > Or that on aarch64 we never seem to "try anywhere", we just try a
> > fixed set of attachment points and if these are all occupied we
> > fail. Is this a bug or by design?
>
> It's by design. We looked at it and decided that we would always be
> able to allocate one of our "nice" points: they are spaced 4G apart,
> and it's very unlikely that any Linux system (which is all we support)
> would fail to map any of the possibilities.
>
>
Thank you for clarifying. This clearly distinguishes aarch64 from at least
AIX and possibly Linux ppc, not sure - there we clearly want a fallback
"try anywhere". We have to look at the code again.
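
To spell out the difference I mean, here is a rough sketch of the two
strategies. It is not the real Metaspace code: the function names, the
callback types and the number of attachment points are assumptions for
illustration, and a return value of 0 stands in for "unreserved space".

```cpp
#include <cstdint>
#include <functional>

constexpr uint64_t G = 1024ULL * 1024 * 1024;

// Hypothetical stand-in for an OS reservation attempt at a fixed address.
// Returns true if [addr, addr + size) could be reserved.
using TryReserveAt = std::function<bool(uint64_t addr, uint64_t size)>;

// Hypothetical stand-in for "let the OS pick any address".
// Returns the reserved address, or 0 on failure.
using ReserveAnywhere = std::function<uint64_t(uint64_t size)>;

// Fixed-set strategy: walk attachment points spaced 4G apart and take the
// first one that is free. Returns 0 if all of them are occupied.
uint64_t reserve_at_attach_points(uint64_t size, unsigned max_points,
                                  const TryReserveAt& try_reserve_at) {
  for (unsigned i = 1; i <= max_points; ++i) {
    const uint64_t addr = static_cast<uint64_t>(i) * 4 * G;
    if (try_reserve_at(addr, size)) {
      return addr;
    }
  }
  return 0;
}

// Same as above, but with a "try anywhere" fallback for platforms that can
// decode an arbitrary base.
uint64_t reserve_with_fallback(uint64_t size, unsigned max_points,
                               const TryReserveAt& try_reserve_at,
                               const ReserveAnywhere& reserve_anywhere) {
  const uint64_t addr = reserve_at_attach_points(size, max_points, try_reserve_at);
  return addr != 0 ? addr : reserve_anywhere(size);
}
```

The first function models the behavior described for aarch64 (fail if no
attachment point is free), the second the fallback we would want elsewhere.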

I believe either Goetz or I wrote the original AIX version but my memory
is dim. I am at a loss why we restricted this to AIX only. I have to talk
this over with Goetz.


> > Can we always rely on at least one of the attachment points being
> > unoccupied? Looking at the options for
> > MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using
> > x + base >> shift seems not to be wanted? There seems to be no fall
> > back mode which would work with any value of base/shift?
>
> That is correct.
>
> > About reserve_preferred_space(), I was confused why a separate
> > "use_requested_addr" was even needed - requested_addr!=NULL would be a
> > perfectly valid way to communicate that the requested address should be
> > used. I wish we could simplify the coding to just two cases:
> > - hand down a requested address, which is to be taken-or-fail (somewhat
> > like case D)
> > - hand down NULL, which means "try whatever": which for most OSes would
> > be really anywhere, for aarch64 could be the fixed set of attachment
> > points. This would be case (A).
> >
> > About case (C): under which circumstances does it happen that caller code
> > hands down a requested address below 4G which happens to be free?
>
> I don't know.
>
> > Does that make sense? In other words, if the whole point of
> > Metaspace::reserve_preferred_space() is "OS knows better, let it try
> > to find a good address", would it not make sense to just try a low
> > address as part of the try-addresses-loop?
>
> We certainly don't want to have to use a dedicated heapbase register
> or a shift. Just give us a multiple of 4*G and we're happy.
>
>
Good to know. So, zero-based encoding does not have any special place in
your heart? A 4G-aligned base works just as well?
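
My understanding of why that would be so, as a small arithmetic sketch. It
assumes a shift of zero and is not the aarch64 MacroAssembler code; the
function names are made up. If the base is a multiple of 4G, its low 32
bits are all zero, so a 32-bit narrow value can be merged in without a
general add, and a zero base is just one instance of that.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kFourG = 1ULL << 32;

// General decode with an arbitrary base (shift assumed to be zero).
inline uint64_t decode_add(uint64_t base, uint32_t narrow) {
  return base + narrow;
}

// Decode for a 4G-aligned base: the low 32 bits of base are zero, so
// OR-ing in the narrow value gives the same result as adding it.
inline uint64_t decode_4g_aligned(uint64_t base, uint32_t narrow) {
  assert((base & (kFourG - 1)) == 0);  // base must be a multiple of 4G
  return base | narrow;
}
```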

Thanks, Thomas


-- 
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
>

