[aarch64-port-dev ] Question about CompressedKlassPointers::range

Tue Apr 21 17:24:33 UTC 2020

On Tue, Apr 21, 2020 at 5:59 PM Andrew Haley <aph at redhat.com> wrote:

> Hi,
>
> On 4/21/20 3:31 PM, Thomas Stüfe wrote:
> > this is a follow-up question, mainly for aarch64, to
> > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
> >
> > CompressedKlassPointers has a range field, only used by aarch64 afaics,
> > introduced with "8193266: AArch64: TestOptionsWithRanges.java SIGSEGV".
> >
> > I read its bug description and the patch. If I understand the problem,
> > before CDS the assumption was that CompressedClassSpaceSize is synonymous
> > with the range of values narrow Klass pointers could have; which seems
> > logical, but that assumption was broken since CDS and now the encoding
> > range must span both the ccs and the cds archives.
> >
> > The range is used inside MacroAssembler::klass_decode_mode() to decide
> > whether to use the OR mode.
> >
> > I see this being set in three places:
> > 1) at cds dumptime, to 4G
> > 2) at cds runtime, to CompressedClassSpaceSize, and
> > 3) if cds is disabled it keeps its default value of 4G.
> >
> > I may be missing something here. Would (2) not be too small? Should that
> > size not include the size of the archives?
>
> I believe so.
>
> > We map the archives first, let's say they are 300MB, and after that the
> > ccs, let's say 1G by default. Would that not mean any Klass residing
> > toward the end of the ccs - if it were to fill up, which it almost never
> > does - would have an offset larger than the initially assumed range and
> > hence no longer be correctly OR-able with the base?
>
> How would that happen? If someone maps CDS space miles from CCS,
> you mean? OK, but that'd be a pointless thing to do.
>

I thought this could happen by filling up ccs.

At CDS runtime (-Xshare:on) we map the cds archive, followed by the ccs:

Encoding base
|
v
+------+----------------------------+
|  cds |   ccs                      |
+------+----------------------------+
+----------------------------+
                             A

The size of the ccs is CompressedClassSpaceSize. Address A is Encoding
base + CompressedClassSpaceSize, as in case (2), without archive size taken
into account.

ccs fills up at runtime, starting at the bottom, if more non-shared classes
are loaded. E.g. lots of lambdas or reflection glue classes, or just
application classes. When ccs fills up beyond point A, the assumption that
no Klass ever has an offset larger than CompressedKlassPointers::range is
broken and the OR mode may not work anymore.
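
To put numbers on this, here is a minimal sketch using the example sizes from
above (300MB of archives, 1G of ccs). It assumes the ccs starts directly after
the cds archive; the function names are made up for illustration and are not
HotSpot code.

```python
MB = 1 << 20
GB = 1 << 30


def assumed_range(ccs_size):
    """Range as set in case (2): CompressedClassSpaceSize only."""
    return ccs_size


def needed_range(cds_size, ccs_size):
    """Range spanning both the cds archive and the ccs.

    Assumption: the ccs is mapped directly after the cds archive.
    """
    return cds_size + ccs_size


def ccs_bytes_below_a(cds_size, ccs_size):
    """How much of the ccs lies below A = encoding base + assumed range."""
    return max(0, assumed_range(ccs_size) - cds_size)


if __name__ == "__main__":
    cds, ccs = 300 * MB, 1 * GB
    # MB needed to cover cds + ccs
    print(needed_range(cds, ccs) // MB)
    # MB of ccs that still fall inside the assumed range
    print(ccs_bytes_below_a(cds, ccs) // MB)
```

With these sizes, only the part of the ccs below A is covered by the assumed
range; any Klass allocated above it has an offset the range did not account
for.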

However I see now that we would only have a problem if the encoding base
had a non-zero bit set right above the end of the offset mask. But if the
encoding base on aarch64 is always 4G aligned, and a narrow Klass pointer
cannot be larger than 4G, the OR would still work. So, this is only a
theoretical problem.
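
A small sketch of that reasoning, assuming shift 0 for simplicity: OR gives
the same result as ADD exactly when the base and the offset share no set bits.
The helper names and the example base addresses are hypothetical and only
model the arithmetic, not the actual MacroAssembler logic.

```python
GB = 1 << 30


def decode_add(base, narrow, shift=0):
    """Reference decoding: base plus shifted narrow Klass pointer."""
    return base + (narrow << shift)


def decode_or(base, narrow, shift=0):
    """OR-mode decoding: only correct if base and offset bits do not overlap."""
    return base | (narrow << shift)


def or_is_safe(base, max_offset):
    """True if OR equals ADD for every offset in [0, max_offset).

    All offsets below max_offset only use bits below
    (max_offset - 1).bit_length(), so the base must be zero in those bits.
    """
    mask = (1 << (max_offset - 1).bit_length()) - 1
    return (base & mask) == 0


if __name__ == "__main__":
    aligned_base = 8 * GB    # 4G-aligned, hypothetical
    awkward_base = 33 * GB   # 1G-aligned only, hypothetical
    offset = 1 * GB + 4096   # a Klass just beyond a 1G assumed range

    print(decode_or(aligned_base, offset) == decode_add(aligned_base, offset))
    print(decode_or(awkward_base, offset) == decode_add(awkward_base, offset))
```

The second base passes a check against a 1G range but has the 1G bit set, so
an offset beyond 1G collides with it. With a 4G-aligned base and offsets below
4G there is no such bit to collide with.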

> > And I'm not sure (3) is correct either since the range we could encode in
> > theory is 32G with shift=3. In practice this is no problem today:
> > CompressedClassSpaceSize is artificially capped at 3G. If that were ever
> > to change, and someone set it to >4G, this should cause problems too, no?
>
> Yes, it would. It'd be a fool thing to do, but that doesn't mean it
> won't happen. We really don't need more than 3G, after all.
>
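
For reference, the arithmetic behind the 32G figure, as a minimal sketch. It
assumes 32-bit narrow Klass pointers; the function names are illustrative
only.

```python
GB = 1 << 30


def encodable_bytes(narrow_bits=32, shift=3):
    """Size of the address range reachable by a shifted narrow pointer."""
    return (1 << narrow_bits) << shift


def offsets_fit(range_bytes, ccs_size, cds_size=0):
    """True if every offset into cds + ccs stays within the assumed range."""
    return cds_size + ccs_size <= range_bytes


if __name__ == "__main__":
    print(encodable_bytes() // GB)          # encodable range in GB with shift=3
    print(offsets_fit(4 * GB, 3 * GB))      # today's 3G cap against a 4G range
    print(offsets_fit(4 * GB, 5 * GB))      # a hypothetical >4G ccs
```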

> > If my assumption about (2) is correct, it could be that the error is just
> > well hidden, either because MacroAssembler::_klass_decode_mode is already
> > initialized using the default value (3), or because it is difficult to
> > allocate enough classes to trigger it.
>
> (2) looks wrong.
>
> > As a more general question: CompressedKlassPointers::range(), as in
> > "the expected range of narrow Klass pointer values", I guess it
> > makes sense to keep it as small as possible, right? Instead of, say
> > hard-coding it to 32G?
>
> Yes, it does.
>
> > Since the smaller the expected range of narrow pointers is, the more
> > probable we could choose the OR mode?
>
> At the moment the probability of being able to do that is so high that
> if it fails I'd expect it'd be a bug.
>
> > Oh, and on aarch64, how "good" is the OR mode compared with the "movk"
> > mode, since it seems to be preferred?
>
> A shift is sometimes slower than a simple XOR, so a shift is never
> preferred. Beyond that it's impossible to say for sure because there
> are many independent implementations, some of which I have never seen,
> but I doubt that there's a huge difference. Any 4G range is probably
> OK.
>
> Bear in mind, though, that people designing AArch64 hardware today are
> benchmarking OpenJDK and making decisions based on what HotSpot
> does. For that reason, changing what we do without a really good
> reason isn't the best idea.
>

I had no idea. Thank you. I will be very careful.

..Thomas

>
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
>