[master] RFR: JDK-8324523: Lilliput: if +UseCOH, always use the archive's encoding base and shift

Wed Jan 24 09:48:57 UTC 2024

On Tue, 23 Jan 2024 15:38:16 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> We have two ways to initialize narrow Klass encoding: either we let the JVM choose base and shift freely, or we dictate base and shift. The former gives the JVM more leeway, e.g. to go with unscaled encoding. The latter, however, is required if we load a CDS archive and that archive contains precomputed narrow Klass IDs.
>> 
>> In the Legacy VM, this can only happen if the archive contains heap objects. In Lilliput, the markword carries the nKlass, and therefore the prototype baked into archived Klass structures carries it also. Therefore, we must always choose the strict initialization when +UseCOH.
>
> Change looks ok.
> 
> Questions, though: what is the impact of this? Is it a bug? Does it improve or regress performance? Should it be backported?

> @rkennke :
> 
> > Questions, though: what is the impact of this? Is it a bug?
> 
> Okay, this bugged me, and I just had to know for sure. This bug is confirmed. In the traditional Lilliput VM this leads to early crashes in rare cases if a couple of conditions hold:
> 
> * we generate and run with an archive that had been created using +UseCOH (my Smaller Classpointers patch generates such archives)
> * we don't use CDS heap archiving (Windows, or did not build with G1 support)
> * when reserving the class space, we optimize for zero- or unscaled encoding even though we run with CDS. This is highly CPU-specific; on some platforms, we do that today (eg PPC). On x64, it will be the case once [JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible jdk#17340](https://github.com/openjdk/jdk/pull/17340) is merged.
> 
> If all these conditions hold, we will generate the archive using precomputed narrow Klass IDs in prototype headers based on the assumption that the later encoding base is the Klass range start. But at runtime, we set the encoding base to zero, and now the narrow Klass IDs don't match up, and we get a crash right away:
> 
> ```
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (/shared/projects/openjdk/lilliput/source/src/hotspot/share/classfile/javaClasses.cpp:1274), pid=192787, tid=192788
> #  assert(is_instance(java_class)) failed: must be a Class object
> ```
> 
> > Does it improve or regress performance?
> 
> Neither. If the bug hits, we crash. If it does not, we already do the right thing today, nothing changes with this patch.
> 
> Another answer is that it can improve performance, since it makes +UseCOH able to run with CDS archive, so it improves startup.
> 
> > Should it be backported?
> 
> Yes, but this is based upon the big classspace rework patch from last summer (JDK-8312018) and a bunch of other patches. If they had been backported too, this should be backported as well.

Ok, thank you very much for the clarification, this is very useful! Patch is good to go!

-------------

PR Comment: https://git.openjdk.org/lilliput/pull/124#issuecomment-1907769184