[aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation

Thu Apr 30 05:55:16 UTC 2020

On 04/29/20 20:56 pm, Andrew Haley wrote:
>
> That's right: I'm looking at AOT-compiled code (after applying your
> patch) and by default it uses a shift of 3, no offset. If I then run
> the AOT-compiled code with -Xmx31G I get:
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x0000ffffa142bd3c, pid=9965, tid=10174
> #
> # JRE version:  (15.0) (slowdebug build )
> # Java VM: OpenJDK 64-Bit Server VM (slowdebug 15-internal+0-adhoc.aph.jdk-tmp, mixed mode, aot, tiered, compressed oops, g1 gc, linux-aarch64)
> # Problematic frame:
> # A 388  java.lang.Thread.setPriority(I)V java.base (56 bytes) @ 0x0000ffffa142bd3c [0x0000ffffa142bac0+0x000000000000027c]
>
>    0x0000ffffa142bd30 <+624>:	ldr	w1, [x4, #56]
>    0x0000ffffa142bd34 <+628>:	cbz	w1, 0xffffa142bd84 <java.lang.Thread.setPriority(I)V+708>
>    0x0000ffffa142bd38 <+632>:	lsl	x1, x1, #3
>    0x0000ffffa142bd3c <+636>:	ldr	w0, [x1, #12]
>
> ... so the AOT-compiled code is still trying to use the shift of 3,
> but it is not adding in the base, which is 0x1000000000. I guess this
> is pilot error, but I'm trying to understand what gets checked and
> when.

No this looks like a real bug: jaotc is using the value of the heap base
in the VM where jaotc is run to decide whether to emit the add or
not. If you run `jaotc -J-Xmx31g` it works.

I'm not very familiar with Graal but I believe this fixes it:

--- a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.aarch64/src/org/graalvm/compiler/hotspot/aarch64/AArch64HotSpotMove.java
+++ b/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.aarch64/src/org/graalvm/compiler/hotspot/aarch64/AArch64HotSpotMove.java
@@ -139,8 +139,9 @@ public class AArch64HotSpotMove {
             Register resultRegister = asRegister(result);
             Register ptr = asRegister(input);
             Register base = (isRegister(baseRegister) ? asRegister(baseRegister) : zr);
+            boolean pic = GeneratePIC.getValue(crb.getOptions());
             // result = (ptr - base) >> shift
-            if (!encoding.hasBase()) {
+            if (!pic && !encoding.hasBase()) {
                 if (encoding.hasShift()) {
                     masm.lshr(64, resultRegister, ptr, encoding.getShift());
                 } else {
@@ -189,7 +190,8 @@ public class AArch64HotSpotMove {
         public void emitCode(CompilationResultBuilder crb, AArch64MacroAssembler masm) {
             Register inputRegister = asRegister(input);
             Register resultRegister = asRegister(result);
-            Register base = encoding.hasBase() ? asRegister(baseRegister) : null;
+            boolean pic = GeneratePIC.getValue(crb.getOptions());
+            Register base = pic || encoding.hasBase() ? asRegister(baseRegister) : null;
             emitUncompressCode(masm, inputRegister, resultRegister, base, encoding.getShift(), nonNull);
         }


I've made a JBS issue to track this:
https://bugs.openjdk.java.net/browse/JDK-8244164

>
>> But for compressed class pointers we never need to address more than
>> 4G so maybe it's better to use 0 shift instead of logKlassAlignment
>> above? With CDS the default shared base address is 0x80000000 which
>> doesn't allow a zero base anyway.
>
> Maybe. What actually happens when we decode compressed class pointers
> in AOT-compiled code is:
>
> Load the klass pointer from an Object:
>
>   532440:       b940082a        ldr     w10, [x1,#8]
>
> Load the compressed class base:
>
>   532444:       90055e68        adrp    x8, b0fe000 <A.meta.got+0x16e000>
>   532448:       9104c108        add     x8, x8, #0x130
>   53244c:       f9400108        ldr     x8, [x8]
>
> Shift and add:
>
>   532450:       8b2a6d0a        add     x10, x8, x10, uxtx #3
>
> ... none of which is very nice, but the expensive part is loading the
> compressed classw base and doing the add, so I guess we don't care
> that there is a shift as well.

Yes but if we can avoid the shift here then CDS can also use zero shift
by default. Which avoids the problem of having compressed class pointers
with both shift and base non-zero in the Hotspot-generated code.


Thanks,
Nick