[master] RFR: JDK-8325104: Lilliput: Shrink Classpointers

Wed Feb 7 19:34:16 UTC 2024

On Thu, 1 Feb 2024 10:23:04 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> Hi,
> 
> I wanted to get input on the following improvement for Lilliput. Testing is still ongoing, but things look really good, so this patch is hopefully near its final form (barring any objections from reviewers, of course).
> 
> Note: I have a companion patch prepared for upstream, minus the markword changes. I will attempt to get that one upstream quickly in order to not have a large delta between upstream and lilliput, especially in Metaspace.
> 
> ## High-Level Overview
> 
> (for a short sequence of slides, please see https://github.com/tstuefe/fosdem24/blob/master/classpointers-and-liliput.pdf - these accompanied a talk we held at FOSDEM 24).
> 
> We want to reduce the bit size of narrow Klass to free up bits in the MarkWord. 
> 
> We cannot just reduce the Klass encoding range size (well, we could, and maybe we will later, but for now we decided not to). We instead increase the alignment Klass is stored at, and use that alignment shadow to store other information.
> 
> In other words, this patch changes the narrow Klass Pointer to a Klass ID, since now (almost) every value in its value range points to a different class. Therefore, we use the value range of nKlass much more efficiently.
> 
> We then use the newly freed bits in the MarkWord to restore the iHash to 31 bits: 
> 
> 
> [ 22-bit nKlass | 31-bit iHash | 4 free bits | age | fwd | lck ]
> 
> nKlass gets reduced to 22 bits. Identity hash gets re-inflated to 31 bits. Preceding iHash are now 4 unused bits. Rest is unchanged.
> 
> (Note: I originally wanted to swap iHash and nKlass such that either of them could be loaded with a 32-bit load, but I found that tricky since C2 seems to rely on the nKlass offset in the Markword being > 0.)
> 
> ## nKlass reduction:
> 
> The reduction in nKlass size is made by only storing them at 10-bit aligned addresses. That alignment (1KB) works well in practice since Klass - although var sized - typically is between 512 bytes and 1KB in size. Outliers are possible, but the size distribution is bell-curvish [1], so far-away outliers are very rare. 
> 
> To not lose memory to alignment waste, metaspace is reshaped to handle arbitrarily aligned allocations efficiently. Basically, we allow the non-Klass arena of a class loader to steal the alignment waste storage from the class arena. So, alignment waste blocks are filled with non-Klass metadata. That works very well in practice since non-Klass metadata is numerous and fine-granular compared to the big Klass blocks. Total footprint loss in metaspace is, therefore, almost ...

Very nice work!

One comment on your description:
> when decoding an nKlass, the shift is not mandatory.

Did you mean to say 'is *now* mandatory'? Otherwise it doesn't seem to make much sense.

I did a first pass over the changes. Comments inline.

make/Images.gmk line 196:

> 194:       )
> 195:     endif
> 196:   endif

Is creating the COH archives necessary for this change to work or work well? If not, I would suggest to make this a separate change or a prerequisite change.

src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 2262:

> 2260:   C2LoadNKlassStub* stub = new (Compile::current()->comp_arena()) C2LoadNKlassStub(dst);
> 2261:   Compile::current()->output()->add_stub(stub);
> 2262: 

There is a stray whitespace change here.

src/hotspot/cpu/aarch64/compressedKlass_aarch64.cpp line 82:

> 80:   return result;
> 81: }
> 82: 

More unnecessary whitespace.

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4426:

> 4424: void MacroAssembler::load_nklass(Register dst, Register src) {
> 4425:   assert(UseCompactObjectHeaders, "expects UseCompactObjectHeaders");
> 4426: 

More whitespace.

src/hotspot/cpu/x86/sharedRuntime_x86.cpp line 80:

> 78:   // because it could be larger than 32 bits in a 64-bit vm. See markWord.hpp.
> 79:   if (UseCompactObjectHeaders) {
> 80:     STATIC_ASSERT(markWord::hash_mask_compact < nth_bit(32));

This makes me think if the i-hash should go/stay in its original position, and we could get rid of hash_code_compact and related changes in Lilliput? Or would that not work for some reason?

src/hotspot/share/cds/archiveBuilder.cpp line 642:

> 640:     // Allocate space for the future InstanceKlass with proper alignment
> 641: #ifndef _LP64
> 642:     const size_t al = SharedSpaceObjectAlignment;

'al' seems a weird variable name ('al Bundy'? ;-) ). Maybe 'align' or 'alignment'?

src/hotspot/share/gc/shared/c2/barrierSetC2.cpp line 661:

> 659:   if (base_off % BytesPerLong != 0) {
> 660:     assert(UseCompressedClassPointers, "");
> 661:     assert(!UseCompactObjectHeaders, "");

What is that for? Is it important?

src/hotspot/share/memory/metaspace/fence.hpp line 1:

> 1: /*

Can you explain the purpose of this? It looks like something that you moved out of test and into main sources. I can't tell from the looks of it what it is.
EDIT: Ok maybe it's not really moved, except that the filenames match, but the contents is wholly replaced? In any case, GitHub thinks it's a moved file.

src/hotspot/share/memory/metaspace/fence.inline.hpp line 2:

> (failed to retrieve contents of file, check the PR for context)
Can we simply change the copyright of the file?

src/hotspot/share/memory/metaspace/metachunk.hpp line 31:

> 29: #include "memory/metaspace/chunklevel.hpp"
> 30: #include "memory/metaspace/counters.hpp"
> 31: #include "memory/metaspace/metablock.hpp"

Why is the include needed here?

src/hotspot/share/memory/metaspace/metachunk.hpp line 354:

> 352:     return base() <= p && p < committed_top();
> 353:   }
> 354: 

Stray whitespace change.

src/hotspot/share/oops/markWord.hpp line 123:

> 121:   static const int age_shift                      = self_forwarded_shift + self_forwarded_bits;
> 122:   static const int hash_shift                     = age_shift + age_bits + unused_gap_bits;
> 123:   static const int hash_shift_compact             = 11;

It seems odd to have a hole there in the middle. Could we have the hole in the uppermost bits? If we really have to have it here, use a new unused_gap_bits_compact here that has 4 instead of 1?

src/hotspot/share/oops/markWord.hpp line 141:

> 139:   // 32 bits and rightshift by the lower 10 foreign bits.
> 140: 
> 141:   // These are for loading the nKlass with a 32-bit load and subsequent masking of the lower

I should mention that currently it is not safe to load the nKlass using 32-bit load. We currently always need to check for monitor. If you do that anywhere, this needs changing.
(That being said, there is ongoing work to get rid of header displacement for Lilliput, at which point we *could* do the 32-bit load, if that fits.)

src/hotspot/share/oops/markWord.inline.hpp line 81:

> 79:   return set_narrow_klass(nklass);
> 80: }
> 81: 

Stray whitespace.

-------------

Changes requested by rkennke (Lead).

PR Review: https://git.openjdk.org/lilliput/pull/128#pullrequestreview-1868532124
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481928734
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481929421
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481929817
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481930621
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481936940
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481939135
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481946926
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481956504
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481957053
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481961436
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481961660
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481972306
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481976686
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1481977521