Question about ccs reservation, CDS and aarch64 specifics
Thomas Stüfe
thomas.stuefe at gmail.com
Thu Apr 16 18:22:26 UTC 2020
On Thu, Apr 16, 2020 at 8:15 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>
>
> On 4/16/20 10:51 AM, Thomas Stüfe wrote:
> > Hi Andrew,
> >
> > Thanks for the prompt answer. See my answers inline.
> >
> > On Thu, Apr 16, 2020 at 6:24 PM Andrew Haley <aph at redhat.com> wrote:
> >
> >> Hi,
> >>
> >> On 4/16/20 4:18 PM, Thomas Stüfe wrote:
> >>> I am currently trying to wrap my head around the various ways the
> >>> CompressedClassSpace is reserved. Coding has grown a bit in complexity
> >>> with the advent of CDS/AppCDS and recently some aarch64 changes on top
> >>> of that, changing behavior for aarch64 and ppc64. Maybe someone can
> >>> enlighten me a bit.
> >>>
> >>> Specifically, I am looking at
> >>>
> >>> Metaspace::reserve_space
> >>>
> >>> and its aarch64-specific outgrow
> >>>
> >>> Metaspace::reserve_preferred_space
> >> Yowza. This one is mine, I think.
> >>
> >>> Despite their generic-sounding names, these functions can only be used
> >>> to allocate ccs. They lack any interface description, so I parsed the
> >>> code to understand their behavior.
> >>>
> >>> So I tried and here is how I think Metaspace::reserve_space works for
> >>> the various combinations of input parameters:
> >> Bear in mind that this was changed recently. It was (even more)
> >> complicated before.
> >>
> >>
> > Yes, ccs reservation is complex, and the aarch64 parts are only a small
> > part of it. I would love to simplify it a bit but it's not that easy.
> >
> >
> >> Please read the discussion at
> >>
> >>
> >> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html
> >>
> >>
> > Thank you for pointing me to the discussion. We have missed that review.
> >
> > The reason I looked at this coding was due to the new Metaspace
> > implementation. In the new allocator it is possible to allocate a Klass
> > at ccs offset zero (currently this never happens, but only by accident).
> > If CompressedKlassPointers::base() points to the start of ccs, the
> > resulting narrow Klass pointer would be 0. But the VM cannot tell that
> > apart from a real NULL reference. So far my cheap fix has been to
> > move CompressedKlassPointers::base() a bit below the start of the ccs.
> > But as I saw yesterday, that breaks the 4G-alignment assumption on
> > aarch64. Never mind, there are different ways to solve that, but I
> > wondered why aarch64 could not handle this crooked base address.
> Wouldn't it be easier to just allocate 8 bytes of dummy data at the
> beginning of the CCS?
>
>
Essentially this is what I am doing now:
http://hg.openjdk.java.net/jdk/sandbox/rev/89069367b3b4
save that I have to allocate a full (tiny) chunk instead of 8 bytes. At
this point there is no CLD and no SpaceManager I could allocate 8 bytes
from - in other words, the upper layer which provides word-grained
metaspace allocation does not exist yet. So I have to allocate a full 1K
chunk.
Another possibility would have been to check the returned pointer after
every allocation for (p == ccs base) and discard that allocation if true.
But that seemed stupid for something which only ever happens once.
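
To illustrate the problem and the dummy-chunk workaround, here is a minimal
sketch. It is a toy model, not HotSpot code: ToyClassSpace, its members, the
start address and the 512-byte Klass size are all assumptions made for
illustration. Only the 1K dummy chunk size comes from the description above.

```cpp
#include <cstddef>
#include <cstdint>

// Toy model only: these names are hypothetical and are not HotSpot's API.
struct ToyClassSpace {
  uint64_t start;  // first address of the reserved class space
  uint64_t top;    // next free address (simple bump-pointer allocation)

  explicit ToyClassSpace(uint64_t start_addr)
      : start(start_addr), top(start_addr) {}

  // Hand out the next 'bytes' bytes and advance the bump pointer.
  uint64_t allocate(size_t bytes) {
    uint64_t p = top;
    top += bytes;
    return p;
  }

  // Narrow pointer = offset from the encoding base. The base is assumed to be
  // the start of the space, and no shift is applied.
  uint32_t encode(uint64_t klass_addr) const {
    return static_cast<uint32_t>(klass_addr - start);
  }
};

const uint64_t kToyCcsStart = 0x800000000ULL;  // assumed address, 4G-aligned
const size_t kDummyChunkBytes = 1024;          // the full 1K chunk mentioned above
const size_t kToyKlassBytes = 512;             // assumed size of one Klass

// Without protection the first Klass lands at offset 0, so its narrow
// pointer is indistinguishable from NULL.
uint32_t first_narrow_klass_without_dummy() {
  ToyClassSpace ccs(kToyCcsStart);
  return ccs.encode(ccs.allocate(kToyKlassBytes));
}

// Burning one dummy chunk first keeps the encoding base at the 4G-aligned
// start of the space while guaranteeing a non-zero narrow pointer.
uint32_t first_narrow_klass_with_dummy() {
  ToyClassSpace ccs(kToyCcsStart);
  ccs.allocate(kDummyChunkBytes);  // never handed out to any class
  return ccs.encode(ccs.allocate(kToyKlassBytes));
}
```

In this model the cost is one wasted chunk per VM, and no allocation path
needs an extra check.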
Cheers, Thomas
> Thanks
> - Ioi
>
> >
> >
> >>> A) requested_addr == NULL && use_requested_addr == false:
> >>> [aarch64, ppc64]: Attempt to reserve at one of the preferred
> >>> OS-dependent allocation points. Failing that, return an unreserved space.
> >>> [others]: Reserve a space anywhere.
> >>>
> >>> B) requested_addr == NULL && use_requested_addr == true:
> >>> [aarch64, ppc64]: Does nothing, returns an unreserved space immediately.
> >>> I assume this would be an invalid combination, but since it is not
> >>> asserted I am not sure.
> >>> [others]: Reserve a space anywhere (use_requested_addr is ignored).
> >>>
> >>> C) requested_addr != NULL && use_requested_addr == false:
> >>> [aarch64, ppc64]: First attempt to reserve at the requested address,
> >>> but only if that would cause the space to fall into the lower 4G.
> >>> Failing that, allocate at one of the preferred OS-dependent allocation
> >>> points. Failing that, return an unreserved space.
> >> Yes. We have to do that, because we can't cope with the heap base
> >> being anything other than a multiple of 4*G. We've got rid of
> >> rheapbase, in other words, for all compiled code.
> >>
> >>> [others]: Attempt to reserve at requested_addr. Failing that, return
> >>> an unreserved space.
> >>>
> >>> D) requested_addr != NULL && use_requested_addr == true:
> >>> [aarch64, ppc64]: First attempt to reserve at the requested address,
> >>> but only if that would cause the space to fall into the lower 4G.
> >>> Failing that, return an unreserved space.
> >>> [others]: Attempt to reserve at requested_addr. Failing that, return
> >>> an unreserved space (use_requested_addr is ignored).
> >>>
> >>> Note the many subtle platform differences. E.g. on aarch64 we honor the
> >>> requested address only if ccs would fall below 4G, for all other
> >>> platforms we always honor them. Or how for most platforms the parameter
> >>> "use_requested_addr" is just ignored.
> >>>
> >>> Or that on aarch64 we never seem to "try anywhere", we just try a
> >>> fixed set of attachment points and if these are all occupied we
> >>> fail. Is this a bug or by design?
> >> It's by design. We looked at it and decided that we would always be
> >> able to allocate one of our "nice" points: they are spaced 4G apart,
> >> and it's very unlikely that any Linux system (which is all we support)
> >> would fail to map any of the possibilities.
> >>
> >>
> > Thank you for clarifying. This clearly distinguishes aarch64 from at
> > least AIX and possibly Linux ppc, not sure - there we clearly want a
> > fallback "try anywhere". We have to look at the code again.
> >
> > I believe either Goetz or I wrote the original AIX version but my memory
> > is dim. I am at a loss why we restricted this to AIX only. I have to talk
> > this over with Goetz.
> >
> >
> >>> Can we always rely on at least one of the attachment points being
> >>> unoccupied? Looking at the options for
> >>> MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using
> >>> x + base >> shift seems not to be wanted? There seems to be no fall
> >>> back mode which would work with any value of base/shift?
> >> That is correct.
> >>
> >>> About reserve_preferred_space(), I was confused why a separate
> >>> "use_requested_addr" was even needed - requested_addr!=NULL would be a
> >>> perfectly valid way to communicate that the requested address should be
> >>> used. I wish we could simplify the coding to just two cases:
> >>> - hand down a requested address, which is to be taken-or-fail (somewhat
> >>> like case D)
> >>> - hand down NULL, which means "try whatever": which for most OSes would
> >>> be really anywhere, for aarch64 could be the fixed set of attachment
> >>> points. This would be case (A).
> >>>
> >>> About case (C): under which circumstances does it happen that caller
> >>> code hands down a requested address below 4G which happens to be free?
> >> I don't know.
> >>
> >>> Does that make sense? In other words, if the whole point of
> >>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
> >>> to find a good address", would it not make sense to just try a low
> >>> address as part of the try-addresses-loop?
> >> We certainly don't want to have to use a dedicated heapbase register
> >> or a shift. Just give us a multiple of 4*G and we're happy.
> >>
> >>
> > Good to know. So, zero based encoding does not have any special place in
> > your heart? 4G aligned base works just as well?
> >
> > Thanks, Thomas
> >
> >
>
>
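
On the 4G-alignment point in the quoted discussion: the thread does not spell
out why a 4G-aligned base is so convenient. One plausible reading, which is an
assumption here and not something stated above, is that with such a base the
lower 32 bits of the base are all zero, so a 32-bit narrow value can be
combined with it using plain bit operations instead of a full add through a
dedicated base register. A small sketch of the arithmetic, with hypothetical
helper names and no shift:

```cpp
#include <cstdint>

constexpr uint64_t G = 1024ULL * 1024 * 1024;

// True if the lower 32 bits of 'base' are all zero.
constexpr bool is_4g_aligned(uint64_t base) {
  return (base & (4 * G - 1)) == 0;
}

// General decoding: works for any base.
constexpr uint64_t decode_add(uint64_t base, uint32_t narrow) {
  return base + narrow;
}

// Bitwise decoding: only equivalent to decode_add when base is 4G-aligned,
// because then base and narrow never share a set bit.
constexpr uint64_t decode_or(uint64_t base, uint32_t narrow) {
  return base | narrow;
}
```

With a base moved slightly below a 4G boundary, as in the "crooked base"
experiment, the two decodings no longer agree, which would be consistent with
the breakage described in the thread.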